Pyparsing: dblQuotedString parsing differently in nestedExpr

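I'm working on a grammar to parse search queries (not to evaluate them, just to break them up into components). Right now I'm using nestedExpr simply to grab the different "levels" of each term, but I seem to have a problem when the first part of a term is in double quotes.

A simplified version of the grammar: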

from pyparsing import Combine, OneOrMore, QuotedString, Word, alphas8bit, dblQuotedString, nestedExpr, printables

# remove_curlies is my own parse action (not shown here)
QUOTED = QuotedString(quoteChar = '“', endQuoteChar = '”', unquoteResults = False).setParseAction(remove_curlies)
WWORD = Word(alphas8bit + printables.replace("(", "").replace(")", ""))
WORDS = Combine(OneOrMore(dblQuotedString | QUOTED | WWORD), joinString = ' ', adjacent = False)
TERM = OneOrMore(WORDS)
NESTED = OneOrMore(nestedExpr(content = TERM))

query = '(dog* OR boy girl w/3 ("girls n dolls" OR friends OR "best friend" OR (friends w/10 enemies)))'

Calling NESTED.parseString(query) returns:

[['dog* OR boy girl w/3', ['"girls n dolls"', 'OR friends OR "best friend" OR', ['friends w/10 enemies']]]]

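The first dblQuotedString instance is split off from the rest of the term at the same nesting level. This does not happen to the second dblQuotedString instance, and it also does not happen if the quoted parts are QUOTED instances (with curly quotes) rather than dblQuotedString.

Is there something about dblQuotedString that I'm missing?

NOTE: I know that operatorPrecedence can break up search terms like this, but I have some constraints on what can be broken apart, so I'm testing whether nestedExpr can work within those constraints.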

2017-04-10 allonsyechoes

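nestedExpr takes an optional keyword argument, ignoreExpr. It is an expression that nestedExpr uses to skip over text that would otherwise be interpreted as nesting openers or closers. The default is pyparsing's quotedString, which is defined as sglQuotedString | dblQuotedString. This is there to handle strings like: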

(this has a tricky string "string with)")

Because the default ignoreExpr is quotedString, the ")" inside the quotes is not misinterpreted as a closing parenthesis.
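As a small illustration of that default, the sketch below runs a bare nestedExpr() over the example string. It assumes pyparsing is installed and uses the camelCase names from the question (nestedExpr, parseString); the variable names are only for this example.

```python
from pyparsing import nestedExpr

# Default arguments: opener "(", closer ")", ignoreExpr=quotedString.
parser = nestedExpr()

text = '(this has a tricky string "string with)")'
result = parser.parseString(text).asList()

# The quoted string is matched as a single token by the ignore expression,
# so the ")" inside it does not close the group early.
print(result)
# [['this', 'has', 'a', 'tricky', 'string', '"string with)"']]
```

Note that the quoted string still shows up in the results as its own token, quotes included. That detail is what matters for the question.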

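However, your content argument also matches dblQuotedString. Inside nestedExpr, the ignore expression is tried before your content, so a quoted string at the start of a group is consumed by the ignore expression (whose job is to skip over quoted strings that may contain "()") and comes back as a separate token. Only after that is your content matched, and it matches the later quoted strings as part of a term. You can suppress nestedExpr's ignore expression by passing a NoMatch: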

NESTED = OneOrMore(nestedExpr(content = TERM, ignoreExpr=NoMatch()))

This should now give you:

[['dog* OR boy girl w/3', 
['"girls n dolls" OR friends OR "best friend" OR', ['friends w/10 enemies']]]]
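For a side-by-side comparison, here is a minimal self-contained sketch of the grammar with and without the default ignore expression. It is a simplification, not the asker's full grammar: QUOTED and its remove_curlies parse action are left out because their definitions are not shown in the question, and the names NESTED_DEFAULT, NESTED_FIXED, default_result and fixed_result are made up for this example.

```python
from pyparsing import (
    Combine, NoMatch, OneOrMore, Word,
    alphas8bit, dblQuotedString, nestedExpr, printables,
)

# Simplified grammar from the question (QUOTED omitted, see note above).
WWORD = Word(alphas8bit + printables.replace("(", "").replace(")", ""))
WORDS = Combine(OneOrMore(dblQuotedString | WWORD), joinString=" ", adjacent=False)
TERM = OneOrMore(WORDS)

# Default ignoreExpr (quotedString) versus an ignore expression that never matches.
NESTED_DEFAULT = OneOrMore(nestedExpr(content=TERM))
NESTED_FIXED = OneOrMore(nestedExpr(content=TERM, ignoreExpr=NoMatch()))

query = ('(dog* OR boy girl w/3 ("girls n dolls" OR friends OR '
         '"best friend" OR (friends w/10 enemies)))')

default_result = NESTED_DEFAULT.parseString(query).asList()
fixed_result = NESTED_FIXED.parseString(query).asList()

# With the default, the leading quoted string is split off as its own token.
print(default_result)
# With NoMatch(), the quoted string stays inside the combined term.
print(fixed_result)
```

The only difference between the two parsers is the ignoreExpr argument, so any change in the output comes from the ignore expression alone.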

You'll find more details and examples at https://pythonhosted.org/pyparsing/pyparsing-module.html#nestedExpr

2017-04-13 06:49:53 PaulMcG

Ah, I see, that makes sense. Thank you very much! – allonsyechoes
