VB.NET regex - replace <a> tags without replacing the <span> tags

-2

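My function needs to replace the <a> tags in a string when the text extracted from them looks like a URL. For example:

<a href=www.cnn.com>www.cnn.com</a>

will be replaced with:

www.cnn.com

This works fine, but when I have a string like this:

<a href=www.cnn.com><span style="color: rgb(255, 0, 0);">www.cnn.com</span></a>

I only get:

www.cnn.com

when what I actually want to end up with is:

<span style="color: rgb(255, 0, 0);">www.cnn.com</span>

What do I need to add to the code to make this work?

Here is my function: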

Dim ret As String = text 

'If it looks like a URL 
Dim regURL As New Regex("(www|\.org\b|\.com\b|http)") 
'Gets a Tags regex 
Dim rxgATags = New Regex("<[^>]*>", RegexOptions.IgnoreCase) 

'Gets all matches of <a></a> and adds them to a list 
Dim matches As MatchCollection = Regex.Matches(ret, "<a\b[^>]*>(.*?)</a>") 

'For each <a></a> in the text, check its content; if it looks like a URL, remove the <a></a> 
For Each m In matches 
'tmpText holds the data extracted within the a tags. /visit at.../www.applyhere.com 
     Dim tmpText = rxgATags.Replace(m.ToString, "") 

     If regURL.IsMatch(tmpText) Then 
      ret = ret.Replace(m.ToString, tmpText) 
     End If 
Next 

Return ret

Source

2015-04-05 Offir Pe'er

Use the regex @"</?a\b[^>]*>". – 2015-04-05 11:17:20

I added this to my code:

'Matches only the a tags themselves, not the data between them 
Dim rxgATagsOnly = New Regex("</?a\b[^>]*>", RegexOptions.IgnoreCase) 

For Each m In matches 
    'tmpText holds the data extracted within the a tags. /visit at.../www.applyhere.com 
    '(rxgATags is the all-tags regex declared in the question's function) 
    Dim tmpText = rxgATags.Replace(m.ToString, "") 

    'If the data between the tags looks like a URL, take off the a tags without touching the span tags 
    If regURL.IsMatch(tmpText) Then 
        'Strip only the opening and closing a tags, keeping everything else 
        Dim noATagsStr As String = rxgATagsOnly.Replace(m.ToString, "") 
        'Replace the original <a>...</a> string with the version that keeps its span tags 
        ret = ret.Replace(m.ToString, noATagsStr) 
    End If 
Next

So from the string:

<a href=www.cnn.com><span style="color: rgb(255, 0, 0);">www.cnn.com</span></a>

I selected only the a tags using Avinash Raj's regex and then replaced them with "". Thanks, everyone, for answering.
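For reference, the same idea can be written as a single pass with a MatchEvaluator instead of looping and calling String.Replace. This is only a sketch and not the code from the answer above: the module name, the function name UnwrapUrlLinks and the field names are made up for illustration, and the URL test is the regURL pattern from the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UnwrapUrlLinksDemo
    ' Matches a whole <a>...</a> element; group 1 is the inner HTML.
    Private ReadOnly AnchorElement As New Regex("<a\b[^>]*>(.*?)</a>",
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)
    ' Matches any tag; used only to get the visible text for the URL test.
    Private ReadOnly AnyTag As New Regex("<[^>]*>")
    ' Same "looks like a URL" heuristic as regURL in the question.
    Private ReadOnly LooksLikeUrl As New Regex("(www|\.org\b|\.com\b|http)")

    ' Hypothetical helper: drops the <a> wrapper (keeping inner tags such as
    ' <span>) only when the link's visible text looks like a URL.
    Function UnwrapUrlLinks(text As String) As String
        Return AnchorElement.Replace(text,
            Function(m As Match) As String
                Dim inner As String = m.Groups(1).Value
                Dim visibleText As String = AnyTag.Replace(inner, "")
                Return If(LooksLikeUrl.IsMatch(visibleText), inner, m.Value)
            End Function)
    End Function

    Sub Main()
        ' Prints: <span style="color: rgb(255, 0, 0);">www.cnn.com</span>
        Console.WriteLine(UnwrapUrlLinks("<a href=www.cnn.com><span style=""color: rgb(255, 0, 0);"">www.cnn.com</span></a>"))
        ' Prints the input unchanged, because "About us" does not look like a URL
        Console.WriteLine(UnwrapUrlLinks("<a href=about.html>About us</a>"))
    End Sub
End Module
```

Because the evaluator rewrites each match in place, two identical links in the same text are handled independently, which a plain String.Replace on the matched text does not guarantee.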

Source

2015-04-05 12:52:57

The following regex will remove all HTML tags:

// Requires: using System.Text.RegularExpressions; 
string someString = "<a href=www.one.co.il><span style=\"color: rgb(255, 0, 255);\">www.visitus.com</span></a>"; 

// Strip every tag, leaving only the text content 
string target = Regex.Replace(someString, @"<[^>]*>", "", RegexOptions.Compiled);

This is the regex you want: <[^>]*>

Result: www.visitus.com

Source

2015-04-05 11:33:31

You can use the following regex - <a\s*[^<>]*>|</a> - which matches all <a> nodes, both the opening and the closing tags.
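As a quick illustration of that pattern, here is a minimal sketch that applies it to the sample string from the question. The module name and the helper StripAnchorTags are hypothetical names chosen for the example. One caveat: because \s* also matches zero characters, this pattern would match tags such as <abbr> as well; <a\b[^<>]*> is stricter.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripAnchorsDemo
    ' Hypothetical helper: removes opening and closing <a> tags and keeps
    ' everything else (text and other tags) untouched.
    Function StripAnchorTags(html As String) As String
        Return Regex.Replace(html, "<a\s*[^<>]*>|</a>", "", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim input As String = "<a href=www.cnn.com><span style=""color: rgb(255, 0, 0);"">www.cnn.com</span></a>"
        ' Prints: <span style="color: rgb(255, 0, 0);">www.cnn.com</span>
        Console.WriteLine(StripAnchorTags(input))
    End Sub
End Module
```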

You don't need regURL; it can be built into the rxgATags regex. We can make sure it is a URL-referencing <a> tag by checking the href against the regURL alternatives, then grab everything between the opening and closing tags and use only that content.

Dim ret As String = "<a href=www.one.co.il><span style=""color: rgb(255, 0, 255);"">www.visitus.com</span></a>" 
'Gets a Tags regex 
Dim rxgATags = New Regex("(<a\s*[^<>]*href=[""']?(?:www|\.org\b|\.com\b|http)[^<>]*>)((?>\s*<(?<t>[\w.-]+)[^<>]*?>[^<>]*?</\k<t>>\s*)+)(</a>)", RegexOptions.IgnoreCase) 
Dim replacement As String = "$2" 
ret = rxgATags.Replace(ret, replacement)


Source

2015-04-05 12:02:27
