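If you just want to parse JS, you could use a modified C/C++ comment parser.
Since it matches every character of the input string, you just sit in a loop
and check whether capture group 1 matched.
If group 1 matched, you have found a .html("..." call.
Group 1 contains the quotes plus the text; group 2 is just the inner text.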
Formatted and tested:
# (?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*)
(?: # Comments
/\* # Start /* .. */ comment
[^*]* \*+
(?: [^/*] [^*]* \*+)*
/ # End /* .. */ comment
| # or,
// # Start // comment
(?: [^\\] | \\ \n?)*? # Possible line-continuation
\n # End // comment
)
| # or,
(?: # Non-Comments
\.html\(
( # (1 start), Html double quoted strings
"
( # (2 start), Inner text
(?: \\ [\S\s] | [^"\\])*
) # (2 end)
"
) # (1 end)
| # or,
" # Other double quoted strings
(?: \\ [\S\s] | [^"\\])*
"
| # or,
[\S\s] # Any other char
(?:
(?! \.html\() # Give html strings a chance above
[^/"\\] # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
)*
)
Example:
** Grp 0 - (pos 36 , len 48)
.html("Delamination at Sharp Points and Corners"
** Grp 1 - (pos 42 , len 42)
"Delamination at Sharp Points and Corners"
** Grp 2 - (pos 43 , len 40)
Delamination at Sharp Points and Corners
** Grp 0 - (pos 124 , len 130)
.html("<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>"
** Grp 1 - (pos 130 , len 124)
"<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>"
** Grp 2 - (pos 131 , len 122)
<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>
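A minimal sketch of the loop described above, in JavaScript. It uses the compact one-line form of the pattern unchanged; the names `pattern`, `extractHtmlStrings`, and `sample` are made up for illustration and are not part of the original answer.

```javascript
// Compact form of the pattern, copied verbatim from the one-line comment
// above the expanded regex. String.raw keeps every backslash as written.
const pattern = new RegExp(
  String.raw`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*)`,
  "g"
);

// Hypothetical helper: walk every match and keep only those where
// capture group 1 took part, i.e. a .html("...") call outside
// comments and outside other double-quoted strings.
function extractHtmlStrings(source) {
  const results = [];
  for (const m of source.matchAll(pattern)) {
    if (m[1] !== undefined) {
      results.push({
        quoted: m[1], // group 1: quotes plus text
        inner: m[2],  // group 2: inner text only
        index: m.index,
      });
    }
  }
  return results;
}

// Illustrative input: the same call appears in a line comment, in a
// block comment, in real code, and inside an ordinary string literal.
const sample = [
  '// $("#a").html("inside a line comment")',
  '/* $("#b").html("inside a block comment") */',
  '$("#title").html("Delamination at Sharp Points and Corners");',
  'var s = "a plain string with .html(\\"fake\\") inside";',
  '',
].join("\n");

for (const hit of extractHtmlStrings(sample)) {
  console.log(hit.index, hit.inner);
}
```

Only the call on the third line of `sample` is reported. The two comments are consumed by the comment branch, and the ordinary string literal is consumed by the "other double quoted strings" branch, so neither ever reaches group 1.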
Source
2015-10-15 20:21:08
sln
Isn't parsing JavaScript and HTML with regular expressions generally a bad idea? – adeneo
Do you want to extract only the '.html("..."' strings? – sln
@sln Yes, including those joined by + – caesar