2015-10-15 66 views
-1

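I have the following scenario: getting a value from multi-line text using a regular expression.

I want to use a regular expression to extract all the strings between double quotes that follow .html( in the JavaScript code below.

This is what I have done so far: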

(\.html\()?"(.+)"\s*\+* 
\.html\("(.+)"(\s*\+\s*\n\s*"(.+)")* 

But it does not work for all the lines.

Any help would be greatly appreciated.

Thanks.

JavaScript code

 sym.getSymbol("popup").$("main").html("Delamination at Sharp Points and Corners"); 
    sym.getSymbol("popup").$("sub").html("<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>" + 
     "<span style='font-family: abel_probold'>Common cause:</span> Very little adhesive area to hold the application in place<br><br>" + 
     "<span style='font-family: abel_probold'>Corrective action:</span> When possible, eliminate sharp points. Validate bond performance with wash testing.</div>"); 
+0

Parsing JavaScript and HTML with a regular expression is usually not a great idea? – adeneo

+0

Do you want to extract only the '.html("..."' strings? – sln

+0

@sln Yes, including the ones joined by + – caesar

Answers

2

You can match everything between quotes that is not itself a quote:

string.match(/\"([^\"]*)\"/g) 
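A minimal sketch of how that call behaves, assuming a made-up input held in a variable named source (both the name and the sample text are illustrative, not from the question). With the g flag, match() returns the whole matches, quotes included; matchAll() is one way to get at capture group 1 instead.

```javascript
// Hypothetical sample input; `source` stands for the JavaScript text being searched.
const source = 'a.html("one"); b.html("two" + \n "three");';

// With the g flag, match() returns the full matches, quotes included.
const quoted = source.match(/"([^"]*)"/g);

// matchAll() exposes capture group 1, the text without the quotes.
const inner = Array.from(source.matchAll(/"([^"]*)"/g), m => m[1]);

console.log(quoted); // [ '"one"', '"two"', '"three"' ]
console.log(inner);  // [ 'one', 'two', 'three' ]
```

Note that this picks up every double-quoted string in the input, not only the ones passed to .html(.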
+0

It works, but I need the strings that come after ".html(". How can I do that? – caesar

+0

You can do 'string.split('html(')[1]' to keep only what comes after it; otherwise the final regex gets too ugly – floribon
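One possible way to combine the split idea with the quote regex, as a rough sketch rather than a robust parser. The helper name htmlArguments and the sample input are made up for illustration. It splits on '.html(', keeps only the run of string literals at the start of each piece (so later strings such as "popup" are ignored), and joins literals that are concatenated with +.

```javascript
// Hypothetical helper name; not from the thread.
function htmlArguments(source) {
  // One string literal: quote, then escaped chars or non-quote chars, then quote.
  const literal = /"((?:\\[\s\S]|[^"\\])*)"/g;
  // A leading run of literals joined by +, anchored at the start of a piece.
  const leadingRun = /^\s*"(?:\\[\s\S]|[^"\\])*"(?:\s*\+\s*"(?:\\[\s\S]|[^"\\])*")*/;

  return source
    .split('.html(')
    .slice(1)                      // drop the text before the first call
    .map(piece => piece.match(leadingRun))
    .filter(run => run !== null)   // skip calls whose argument is not a literal
    .map(run => Array.from(run[0].matchAll(literal), m => m[1]).join(''));
}

// Shortened, made-up input in the same shape as the code in the question.
const sample = [
  'sym.getSymbol("popup").$("main").html("Title");',
  'sym.getSymbol("popup").$("sub").html("<b>Defect:</b> one<br>" +',
  '    "<b>Cause:</b> two");'
].join('\n');

console.log(htmlArguments(sample));
// [ 'Title', '<b>Defect:</b> one<br><b>Cause:</b> two' ]
```

Escape sequences inside the literals are returned as written, and a '.html(' that appears inside a comment or another string would also be picked up, so this only suits input as regular as the sample.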

-1

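If you just want to parse JS, you can use a modified C/C++ comment parser.
Since it matches all the text in the input string, you just sit in a loop
and check whether capture group 1 matched.

If group 1 matched, you have found a .html("..." call. Group 1 contains the quotes
plus the text; group 2 is just the inner text.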

Formatted and tested:

# (?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*) 

    (?:        # Comments 
     /\*        # Start /* .. */ comment 
     [^*]* \*+ 
     (?: [^/*] [^*]* \*+)* 
     /        # End /* .. */ comment 
     |         # or, 
     //        # Start // comment 
     (?: [^\\] | \\ \n?)*?   # Possible line-continuation 
     \n        # End // comment 
    ) 
|         # or, 
    (?:        # Non-Comments 
     \.html\(
     (        # (1 start), Html double quoted strings 
       " 
       (        # (2 start), Inner text 
        (?: \\ [\S\s] | [^"\\])* 
      )        # (2 end) 
       " 
     )        # (1 end) 
     |         # or, 
     "        # Other double quoted strings 
     (?: \\ [\S\s] | [^"\\])* 
     " 
     |         # or, 
     [\S\s]       # Any other char 
     (?: 
       (?! \.html\()     # Give html strings a chance above 
       [^/"\\]       # Chars which don't start a comment, string, escape, 
               # or line continuation (escape + newline) 
     )* 
    ) 

Example:

** Grp 0 - (pos 36 , len 48) 
.html("Delamination at Sharp Points and Corners" 
** Grp 1 - (pos 42 , len 42) 
"Delamination at Sharp Points and Corners" 
** Grp 2 - (pos 43 , len 40) 
Delamination at Sharp Points and Corners 


** Grp 0 - (pos 124 , len 130) 
.html("<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>" 
** Grp 1 - (pos 130 , len 124) 
"<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>" 
** Grp 2 - (pos 131 , len 122) 
<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br> 
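The loop described above could look roughly like this in JavaScript. This is a sketch under a few assumptions: the one-line pattern is copied verbatim from the answer, the helper name findHtmlStrings is made up, and the sample input is a shortened stand-in for the code in the question with a comment line added to show that comments are skipped.

```javascript
// Pattern copied verbatim from the answer; String.raw keeps the backslashes intact.
const pattern = String.raw`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*)`;

// Hypothetical helper name, introduced only for this sketch.
function findHtmlStrings(source) {
  const re = new RegExp(pattern, 'g');
  const found = [];
  let m;
  // Every alternative consumes at least one character, so the loop always advances.
  while ((m = re.exec(source)) !== null) {
    // Group 1 is set only when the match is a .html("...") call.
    if (m[1] !== undefined) {
      found.push(m[2]); // group 2: the text without the surrounding quotes
    }
  }
  return found;
}

// Made-up input: a commented-out call followed by the first line from the question.
const sample = [
  '// el.html("inside a comment")',
  'sym.getSymbol("popup").$("main").html("Delamination at Sharp Points and Corners");'
].join('\n');

console.log(findHtmlStrings(sample));
// [ 'Delamination at Sharp Points and Corners' ]
```

As in the output listed above, group 2 holds only the first literal of each call; pieces appended with + are matched as ordinary strings and would need separate handling.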
+1

I like regex humor – 2015-10-16 12:16:26

+0

This seems hard to deal with. – sln