2012-07-19 60 views
0

Can't find the correct regular expression

Sample text:

'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">6. </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">Oktober </span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">1997</span> 
<span style="font-family:'Times New Roman'; font-size:6pt; letter-spacing:0.05pt; vertical-align:super">2</span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">),</sp 

Now my regex should match:

<span style="font-family:'Times New Roman'; font-size:6pt; letter-spacing:0.05pt; vertical-align:super">2</span> 
<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">) 

My regex is:

<span.*?>\d+?</span><span.*?>\) 

Result:

<span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">test </span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">6. </span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">Oktober </span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">1997</span>***<span style="font-family:'Times New Roman'; font-size:6pt; letter-spacing:0.05pt; vertical-align:super">2</span><span style="font-family:'Times New Roman'; font-size:9pt; letter-spacing:0.05pt">) 

I have tried a lot by now, but I can't get it to work.

Thanks for your help.

+2

Don't mix regular expressions and HTML! – hsz 2012-07-19 09:23:15

+0

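Use an HTML parser for whatever language you are working in. Since you haven't mentioned which one that is, I can't suggest a specific parser right now. – 2012-07-19 09:25:49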

+0

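Not a direct answer, but a hint: maybe use an XML parser object to hold the HTML, then loop over each tag and apply a regex to it. Otherwise you may have to write a more complex regex. Just my opinion. – 2012-07-19 09:26:19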
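The parser-based approach suggested in the comments could look roughly like the sketch below. The question does not name a language, so Python and its standard `html.parser` module are assumptions here, as are the names `SpanCollector`, `find_footnote_pairs` and `sample`. The style attributes in the sample are shortened for readability.

```python
import re
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collect [attrs, text] for every <span> element, in document order."""

    def __init__(self):
        super().__init__()
        self.spans = []      # one [attrs_dict, text] entry per <span>
        self._open = False   # True while we are inside a <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.spans.append([dict(attrs), ""])
            self._open = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._open = False

    def handle_data(self, data):
        if self._open:
            self.spans[-1][1] += data


def find_footnote_pairs(html):
    """Return (digits, following_text) for each digit-only span
    that is directly followed by a span starting with ')'."""
    parser = SpanCollector()
    parser.feed(html)
    parser.close()
    pairs = []
    for (_, first), (_, second) in zip(parser.spans, parser.spans[1:]):
        # The regex is applied to the text of one tag only, never to the markup.
        if re.fullmatch(r"\d+", first.strip()) and second.lstrip().startswith(")"):
            pairs.append((first.strip(), second.strip()))
    return pairs


# Hypothetical, shortened version of the sample text from the question.
sample = (
    '<span style="font-size:9pt">Oktober </span>\n'
    '<span style="font-size:9pt">1997</span>\n'
    '<span style="font-size:6pt; vertical-align:super">2</span>\n'
    '<span style="font-size:9pt">),</span>\n'
)

pairs = find_footnote_pairs(sample)
```

Because the parser splits the markup into tags first, the pattern only ever sees the text inside a single span, so it cannot run across tag boundaries the way the original regex does.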

Answers

1

That is hard to read precisely, but try:

<span[^>]*>\d+?</span>.*<span[^>]*>

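Searching for "any character that is not the closing bracket" makes it clearer what you are matching. I also took the liberty of allowing for insignificant whitespace.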
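To see why the negated character class helps, here is a minimal Python sketch (the language is an assumption, since the question does not name one). The `strict` pattern is not the answer's exact expression: it combines the answer's `[^>]*` idea with the trailing `\)` from the question and an optional `\s*` between the tags. The style attributes in the sample are shortened.

```python
import re

# Hypothetical, shortened version of the sample text from the question.
html = (
    '<span style="font-size:9pt">Oktober </span>'
    '<span style="font-size:9pt">1997</span>'
    '<span style="font-size:6pt; vertical-align:super">2</span>'
    '<span style="font-size:9pt">),</span>'
)

# Pattern from the question: ".*?" is lazy, but it may still run past ">"
# and swallow whole tags until the rest of the pattern can match.
lazy = re.compile(r'<span.*?>\d+?</span><span.*?>\)')

# Variant with a negated class: "[^>]*" can never leave the opening tag.
strict = re.compile(r'<span[^>]*>\d+?</span>\s*<span[^>]*>\)')

lazy_match = lazy.search(html).group(0)
strict_match = strict.search(html).group(0)

print(lazy_match)
print(strict_match)
```

The lazy version starts matching at the first `<span` in the text, because `.*?` is allowed to grow across tag boundaries. The negated class keeps each `<span ...>` confined to one tag, so the match starts at the superscript span.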

+0

'<span[^>]*>\d*?</span><span[^>]*>\)' works, thanks for your help :) – user1519979 2012-07-19 09:36:32