How do I write a PLY grammar to parse paths?

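I am trying to write a grammar with PLY to parse the paths in a file. I am running into shift/reduce conflicts and I don't know how to change the grammar to fix them. Here is an example of the file I am trying to parse. The path/filename can be any acceptable Linux path.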

file : ../../dir/filename.txt 
file : filename.txt 
file : filename

Here is the grammar I wrote.

header : ID COLON path 

path : pathexpr filename 

pathexpr : PERIOD PERIOD DIVIDE pathexpr 
      | PERIOD DIVIDE pathexpr 
      | ID DIVIDE pathexpr 
      | 
filename : ID PERIOD ID 
      | ID

Here are my tokens. I am using the ctokens library that ships with PLY, just to save the effort of writing my own.

t_ID = r'[A-Za-z_][A-Za-z0-9_]*' 
t_PERIOD = r'\.' 
t_DIVIDE = r'/' 
t_COLON = r':'
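To see which token stream the grammar actually receives, here is a minimal standard-library sketch that applies the same four patterns with `re`. It is not PLY's lexer: `tokenize` and `TOKEN_RE` are hypothetical names, and skipping whitespace is an assumption (in PLY that would be handled by `t_ignore`).

```python
import re

# Same patterns as the t_ID / t_PERIOD / t_DIVIDE / t_COLON rules above.
# The WS alternative is an assumption: whitespace is matched and discarded.
TOKEN_RE = re.compile(
    r"(?P<ID>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<PERIOD>\.)"
    r"|(?P<DIVIDE>/)"
    r"|(?P<COLON>:)"
    r"|(?P<WS>\s+)"
)

def tokenize(text):
    """Return a list of (token_type, value) pairs for one input line."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

for kind, value in tokenize("file : ../../dir/filename.txt"):
    print(kind, value)
```

Note that `..` arrives as two separate PERIOD tokens and `filename.txt` as ID PERIOD ID, so the parser sees an ID at the start of both a directory segment and the filename.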

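I believe this is a shift/reduce conflict in the "filename" rule, because the parser does not know whether to reduce the token "ID" or to shift for "ID PERIOD ID". I think there is another problem when there is no path (just "filename"): the parser will consume the token in pathexpr instead of reducing pathexpr to empty.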

How can I fix my grammar to handle these cases? Maybe I need to change my tokens?

Source

2015-09-28 jjm012

Simple solution: use left recursion instead of right recursion.

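LR parsers (like PLY and yacc) prefer left recursion, because it avoids having to grow the parser stack. It is also usually closer to the semantics of the expression, which is useful when you want to actually interpret the language and not merely recognize it, and it often, as in this case, avoids the need to left-factor.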

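For example, in this case each path segment needs to be applied to the preceding pathexpr, by looking up the segment's directory inside the directory found so far. The parser action is clear: look up $2 in $1. How would you write the action for the right-recursive version?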
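A minimal sketch of that idea, assuming the segments have already been recognized: a left fold stands in for the parser's successive reductions, so each step receives the directory resolved so far ($1) and the next segment ($2). `lookup` is a hypothetical helper that only joins strings; it does not touch the filesystem.

```python
from functools import reduce
import posixpath

def lookup(current_dir, segment):
    """Stand-in for the action of a left-recursive rule: look up $2 in $1."""
    return posixpath.join(current_dir, segment)

# Segments in the order a left-recursive pathexpr would reduce them.
segments = ["..", "..", "dir"]

# The initial "" corresponds to the empty pathexpr production.
resolved = reduce(lookup, segments, "")
print(resolved)  # ../../dir
```

With right recursion the innermost reduction happens on the last segment first, so an action at that point does not yet know which directory it is relative to.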

So, a simple transformation:

header : ID COLON path 

path  : pathexpr filename 

pathexpr : pathexpr PERIOD PERIOD DIVIDE 
     | pathexpr PERIOD DIVIDE 
     | pathexpr ID DIVIDE 
     | 
filename : ID PERIOD ID 
     | ID
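The following is a hand-written sketch, not PLY output, that mirrors the left-recursive `path` rule to show that one token of lookahead is enough after each shift. `parse_path` and its helpers are hypothetical names, and the tokenizer is a compact stand-in for the ID, PERIOD and DIVIDE tokens. Like the grammar, it does not accept absolute paths or names containing characters outside the ID pattern.

```python
import re

# Compact stand-in for the ID, PERIOD and DIVIDE tokens.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[./]")

def parse_path(text):
    """Return (directories, filename) for input such as '../dir/name.txt'."""
    toks = TOKEN.findall(text)
    if "".join(toks) != text:
        raise SyntaxError(f"illegal character in {text!r}")
    pos = 0

    def peek():
        # None plays the role of the end-of-input marker.
        return toks[pos] if pos < len(toks) else None

    def shift():
        nonlocal pos
        tok = toks[pos]
        pos += 1
        return tok

    def expect(want):
        if peek() != want:
            raise SyntaxError(f"expected {want!r}, got {peek()!r}")
        return shift()

    def is_id(tok):
        return tok not in (None, ".", "/")

    dirs = []  # value of pathexpr; starts as the empty production
    while True:
        tok = peek()
        if tok == ".":
            shift()
            if peek() == ".":          # pathexpr : pathexpr PERIOD PERIOD DIVIDE
                shift()
                expect("/")
                dirs.append("..")
            else:                      # pathexpr : pathexpr PERIOD DIVIDE
                expect("/")
                dirs.append(".")
        elif is_id(tok):
            name = shift()
            if peek() == "/":          # pathexpr : pathexpr ID DIVIDE
                shift()
                dirs.append(name)
            elif peek() == ".":        # filename : ID PERIOD ID
                shift()
                if not is_id(peek()):
                    raise SyntaxError("expected an extension after '.'")
                name += "." + shift()
                if peek() is not None:
                    raise SyntaxError(f"unexpected {peek()!r} after filename")
                return dirs, name
            elif peek() is None:       # filename : ID
                return dirs, name
            else:
                raise SyntaxError(f"unexpected {peek()!r}")
        else:
            raise SyntaxError(f"unexpected {tok!r}")

print(parse_path("../../dir/filename.txt"))
```

After an ID has been shifted, the next token alone decides what to do: DIVIDE extends the directory list, PERIOD starts an extension, and end of input finishes a bare filename.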

Source

2015-09-28 19:38:17 rici

Thanks for your help! Changing from right recursion to left recursion solved the problem. – jjm012

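Judging by those "t_xxx" names, I guess you may be using PLY and not pyparsing. But here is a pyparsing solution to your problem; see the comments in the code for explanation: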

""" 
header : ID COLON path 

path : pathexpr filename 

pathexpr : PERIOD PERIOD DIVIDE pathexpr 
      | PERIOD DIVIDE pathexpr 
      | ID DIVIDE pathexpr 
      | 
filename : ID PERIOD ID 
      | ID 
""" 

from pyparsing import * 

ID = Word(alphanums) 
PERIOD = Literal('.') 
DIVIDE = Literal('/') 
COLON = Literal(':') 

# move this to the top, so we can reference it in a negative 
# lookahead while parsing the path 
file_name = ID + Optional(PERIOD + ID) 

# simple path_element - not sufficient, as it will consume 
# trailing ID that should really be part of the filename 
path_element = PERIOD+PERIOD | PERIOD | ID 

# more complex path_element - adds lookahead to avoid consuming 
# filename as a part of the path 
path_element = (~(file_name + WordEnd())) + (PERIOD+PERIOD | PERIOD | ID) 

# use repetition for these kind of expressions, not recursion 
path_expr = path_element + ZeroOrMore(DIVIDE + path_element) 

# use Combine so that all the tokens will get returned as a 
# contiguous string, not as separate path_elements and slashes 
path = Combine(Optional(path_expr + DIVIDE) + file_name) 

# define header - note the use of results names, which will allow 
# you to access the separate fields by name instead of by position 
# (similar to using named groups in regexp's) 
header = ID("id") + COLON + path("path") 

tests = """\
file: ../../dir/filename.txt
file: filename.txt
file: filename""".splitlines()

# print() calls so the script also runs on Python 3
for t in tests:
    print(t)
    print(header.parseString(t).dump())
    print()

This prints:

file: ../../dir/filename.txt 
['file', ':', '../../dir/filename.txt'] 
- id: file 
- path: ../../dir/filename.txt 

file: filename.txt 
['file', ':', 'filename.txt'] 
- id: file 
- path: filename.txt 

file: filename 
['file', ':', 'filename'] 
- id: file 
- path: filename

Source

2015-09-28 12:51:26 PaulMcG

Thanks for your response! Sorry, yes, I meant PLY. I originally wanted to use pyparsing but later switched to PLY, and I accidentally mixed up the names. – jjm012

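I believe this grammar should work. It has the added benefit of being able to recognize the parts of a path, such as the extension, the directories, the drive, and so on. I have not built the parser yet, only this grammar.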

fullfilepath : path SLASH filename 
path : root 
    | root SLASH directories 
root : DRIVE 
    | PERCENT WIN_DEF_DIR PERCENT 
directories : directory 
      | directory SLASH directories 
directory : VALIDNAME 
filename : VALIDNAME 
     | VALIDNAME DOT EXTENSION
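The grammar above is not implemented here. As a rough illustration of the components its nonterminals name (root or drive, directories, filename, extension), the standard library's `pathlib.PureWindowsPath` exposes comparable pieces. This swaps in `pathlib` instead of a parser built from the grammar, and the sample path is made up.

```python
from pathlib import PureWindowsPath

# Made-up example path; PureWindowsPath only manipulates the string,
# it never touches the filesystem.
p = PureWindowsPath("C:/projects/demo/report.txt")

components = {
    "root": p.drive,                          # roughly the DRIVE token
    "directories": list(p.parent.parts[1:]),  # the directory names after the root
    "filename": p.stem,                       # VALIDNAME part of filename
    "extension": p.suffix.lstrip("."),        # EXTENSION part of filename
}
print(components)
```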

Source

2017-05-24 12:09:46 Shan
