I need to match only the first occurrence of an HTML link that has a 'data-{someData}' attribute. The regular expression I wrote is:
\<a\s+(.+)\s+data-\s*(.+)\s*>(.+)<\/a>
It works on a piece of HTML that contains only one link:
SOME TEXT/HTML
<a href="~/link.aspx?_id=B0B5056BD5984878BEB5C92AF6B74DB3&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{B0B5056B-D598-4878-BEB5-C92AF6B74DB3}"
data-dms-event="Content button">Link1
</a>
SOME TEXT/HTML
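The problem appears when the HTML contains more than one link: the regular expression then matches all the way to the last occurrence of </a>.
So, from the following HTML: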
SOME TEXT/HTML
<a href="~/link.aspx?_id=B0B5056BD5984878BEB5C92AF6B74DB3&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{B0B5056B-D598-4878-BEB5-C92AF6B74DB3}"
data-dms-event="Content button">Link1
</a>
SOME TEXT/HTML
<a href="~/link.aspx?_id=1256272320C4429DAB8A1F40D429C841&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{12562723-20C4-429D-AB8A-1F40D429C841}"
data-dms-event="Content button">Link2
</a>
SOME TEXT/HTML
I need to fix my regular expression so that it matches only:
<a href="~/link.aspx?_id=B0B5056BD5984878BEB5C92AF6B74DB3&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{B0B5056B-D598-4878-BEB5-C92AF6B74DB3}"
data-dms-event="Content button">Link1
</a>
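The over-matching described above is what greedy quantifiers do: each `.+` grabs as much text as it can, so the match runs to the last `</a>`. Below is a minimal sketch of one possible fix, assuming Python's `re` module (the question does not say which regex engine is in use). The names `greedy_pattern` and `first_link_pattern` are made up for illustration. The idea is to keep the attribute part inside the opening tag with `[^>]*` and make the link text lazy with `.*?`.

```python
import re

html = '''SOME TEXT/HTML
<a href="~/link.aspx?_id=B0B5056BD5984878BEB5C92AF6B74DB3&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{B0B5056B-D598-4878-BEB5-C92AF6B74DB3}"
data-dms-event="Content button">Link1
</a>
SOME TEXT/HTML
<a href="~/link.aspx?_id=1256272320C4429DAB8A1F40D429C841&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{12562723-20C4-429D-AB8A-1F40D429C841}"
data-dms-event="Content button">Link2
</a>
SOME TEXT/HTML'''

# The pattern from the question: every (.+) is greedy, so with DOTALL
# the match extends to the last </a> in the input.
greedy_pattern = re.compile(r'\<a\s+(.+)\s+data-\s*(.+)\s*>(.+)<\/a>', re.DOTALL)

# Hypothetical replacement (an assumption, not the asker's final regex):
#   [^>]*   cannot leave the opening tag, because it stops at the first '>'
#   \sdata- requires at least one data-* attribute in that tag
#   (.*?)   is lazy, so the match ends at the first </a>
first_link_pattern = re.compile(r'<a\b[^>]*?\sdata-[^>]*>(.*?)</a>', re.DOTALL)

greedy = greedy_pattern.search(html)
first = first_link_pattern.search(html)

print(first.group(0))          # the whole first <a ...>...</a> element
print(first.group(1).strip())  # the link text of that element
```

`re.search` returns only the leftmost match, so no extra logic is needed to stop after the first link. This still assumes that no attribute value contains a `>` character and that links are not nested; a real parser does not have those limits.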
Why don't you use a DOM parser to parse the HTML?
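A rough sketch of what the parser route could look like, assuming Python and its standard-library `html.parser` module (an event-based parser rather than a full DOM, used here only because it needs no third-party package). The class name `FirstDataLink` and its fields are hypothetical, and the sample is a shortened copy of the HTML from the question.

```python
from html.parser import HTMLParser

html = '''SOME TEXT/HTML
<a href="~/link.aspx?_id=B0B5056BD5984878BEB5C92AF6B74DB3&_z=z"
data-dms="{6782B150-F6FA-49E6-A2FF-6D6014470373}"
data-targetid="{B0B5056B-D598-4878-BEB5-C92AF6B74DB3}"
data-dms-event="Content button">Link1
</a>
SOME TEXT/HTML
<a href="~/link.aspx?_id=1256272320C4429DAB8A1F40D429C841&_z=z"
data-targetid="{12562723-20C4-429D-AB8A-1F40D429C841}"
data-dms-event="Content button">Link2
</a>
SOME TEXT/HTML'''


class FirstDataLink(HTMLParser):
    """Records the first <a> element that carries any data-* attribute."""

    def __init__(self):
        super().__init__()
        self.attrs = None     # attributes of the first matching <a>, as a dict
        self.text = ""        # text found inside that element
        self._inside = False  # True while we are inside the matching <a>
        self._done = False    # True once the matching <a> has been closed

    def handle_starttag(self, tag, attrs):
        if self._done or self._inside or tag != "a":
            return
        if any(name.startswith("data-") for name, _ in attrs):
            self.attrs = dict(attrs)
            self._inside = True

    def handle_data(self, data):
        if self._inside:
            self.text += data

    def handle_endtag(self, tag):
        if self._inside and tag == "a":
            self._inside = False
            self._done = True  # ignore every later link


parser = FirstDataLink()
parser.feed(html)
parser.close()

print(parser.attrs["data-targetid"])
print(parser.text.strip())
```

Because the parser tokenizes tags and attributes itself, line breaks inside the tag or a `>` inside an attribute value do not need special handling in the matching logic.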