我不認爲NLTK CFG語法讀者可以用方括號讀取CFG的格式。
首先讓我們嘗試CFG語法不加方括號:
from nltk.grammar import CFG
grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
'''
grammar = CFG.fromstring(grammar_string)
print grammar
[出]:
Grammar with 18 productions (start state = S)
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'
PN -> 'dinesh'
PRO -> 'she'
PRO -> 'he'
PRO -> 'we'
A -> 'tall'
A -> 'naughty'
A -> 'long'
A -> 'three'
A -> 'black'
P -> 'with'
P -> 'in'
P -> 'from'
P -> 'at'
QP -> 'some'
現在,讓我們把方括號:
from nltk.grammar import CFG
grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
N[NUM=pl] -> 'dogs'|'cats'
'''
grammar = CFG.fromstring(grammar_string)
print grammar
[出]:
Traceback (most recent call last):
File "test.py", line 33, in <module>
grammar = CFG.fromstring(grammar_string)
File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 519, in fromstring
encoding=encoding)
File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 1273, in read_grammar
(linenum+1, line, e))
ValueError: Unable to parse line 10: N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
Expected an arrow
再回到你的語法,好像你正在使用的方括號表示約束或uncontraints,因此該解決方案將是:
- 使用強調了contrainted非終端和
- 做出了unconstrainted非終端
規則,以便您的CFG規則將如下這樣:
from nltk.parse import RecursiveDescentParser
from nltk.grammar import CFG
grammar_string = '''
S -> NP
NP -> PN | PRO | D N | D A N | D N PP | QP N | A N | D NOM PP | D NOM
PP -> P NP
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
D -> D_def | D_sg
D_def -> 'the'
D_sg -> 'a'
N -> N_sg | N_pl
N_sg -> 'boy'|'girl'|'room'|'garden'|'hair'
N_pl -> 'dogs'|'cats'
'''
grammar = CFG.fromstring(grammar_string)
rdparser = RecursiveDescentParser(grammar)
sent = "a dogs".split()
trees = rdparser.parse(sent)
for tree in trees:
print (tree)
[出]:
(S (NP (D (D_sg a)) (N (N_pl dogs))))
請同時發佈代碼中的完整錯誤追溯。 – alvas 2014-10-22 13:54:13