CFG using POS tags in NLTK

-4

I am trying to use NLTK to check whether a given sentence is grammatical.

Examples:

OK: The whale licks the sadness

Not OK: The best I ever

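I know I can POS-tag the sentence and then run a CFG parser over the tags to check it, but I have not yet found a CFG that uses POS tags rather than actual words as its terminals.

Is there a CFG anyone can recommend? Writing my own seems silly: I am not a linguist and would probably miss important structures.

Also, in my application the system should ideally reject many sentences and approve only those it is very sure about.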

Thanks :D

Source

2013-02-21 Sam

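Have you seen this related StackOverflow discussion? http://stackoverflow.com/questions/10252448/how-to-check-whether-a-sentence-is-correct-simple-grammar-check-in-python – stepthom 2013-02-21 18:33:28

The terminal nodes of a CFG can be anything, even POS tags. As long as your phrase rules take POS tags rather than words as input, there should be no problem declaring the grammar in terms of POS tags.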

import nltk

# Define the CFG. The terminals are POS tags rather than words.
# (NLTK 3 replaced nltk.parse_cfg with nltk.CFG.fromstring.)
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'DT' 'NN'
VP -> 'VB'
VP -> 'VB' 'NN'
""")

# Turn the POS-tagged sentence into a list of tag tokens.
sentence = "DT NN VB NN".split(" ")

# Load the grammar into the ChartParser.
cp = nltk.ChartParser(grammar)

# Print every parse tree the grammar yields for the tag sequence.
# (NLTK 3 replaced nbest_parse with parse, which returns an iterator.)
for tree in cp.parse(sentence):
    print(tree)
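Since the question asks for a yes/no check rather than the trees themselves, here is a minimal sketch of how the same toy grammar could be wrapped into one. It assumes NLTK 3 (where grammars are built with `CFG.fromstring`), and the helper name `accepts` is made up for illustration. A tag sequence is rejected both when no parse exists and when it contains a tag the grammar does not know, which is the case where NLTK raises `ValueError`.

```python
import nltk

# Same toy grammar as in the answer; the terminals are POS tags.
GRAMMAR = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'DT' 'NN'
VP -> 'VB'
VP -> 'VB' 'NN'
""")
PARSER = nltk.ChartParser(GRAMMAR)


def accepts(tags):
    """Return True if the tag sequence has at least one parse (hypothetical helper)."""
    try:
        # parse() returns an iterator over trees; one tree is enough to accept.
        return next(PARSER.parse(list(tags)), None) is not None
    except ValueError:
        # Raised when a tag is not covered by the grammar's terminals.
        return False


print(accepts("DT NN VB NN".split()))  # covered by S -> NP VP
print(accepts("NN DT".split()))        # known tags, but no parse
print(accepts("DT JJ NN".split()))     # 'JJ' is unknown to this grammar
```

Treating unknown tags as a rejection fits the stated goal of approving only sentences the system is sure about.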

Source

2013-02-23 02:01:29 alvas

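To get POS tags from a sentence, there are many POS taggers for English, for example http://code.google.com/p/hunpos/ – alvas 2013-02-23 02:03:29

I know how to get POS tags, but how do I get a CFG for English that uses POS tags as terminals? – Sam 2013-02-23 21:31:12

POS-tag every sentence in a corpus and extract the most frequently occurring POS patterns. – alvas 2013-02-24 01:38:15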
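As a rough sketch of that last suggestion, the snippet below counts whole-sentence tag patterns and accepts a new tag sequence only if the pattern was seen often enough. The corpus is a hypothetical, hand-tagged toy list using the same simplified tags as the answer's grammar. The names `count_patterns`, `is_frequent` and the `min_count` threshold are assumptions for illustration. In practice the counts would come from a real tagged corpus.

```python
from collections import Counter

# Hypothetical hand-tagged toy corpus: each sentence is a list of (word, tag) pairs.
# The tags are simplified to match the answer's grammar.
TAGGED_SENTENCES = [
    [("the", "DT"), ("whale", "NN"), ("licks", "VB"), ("sadness", "NN")],
    [("a", "DT"), ("dog", "NN"), ("barks", "VB")],
    [("the", "DT"), ("cat", "NN"), ("eats", "VB"), ("fish", "NN")],
    [("the", "DT"), ("bird", "NN"), ("sings", "VB")],
    [("the", "DT"), ("fox", "NN"), ("sees", "VB"), ("grass", "NN")],
    [("best", "JJ"), ("the", "DT"), ("ever", "RB")],
]


def count_patterns(tagged_sentences):
    """Count how often each full tag sequence occurs in the corpus."""
    return Counter(
        tuple(tag for _word, tag in sentence) for sentence in tagged_sentences
    )


def is_frequent(tags, pattern_counts, min_count=2):
    """Accept a tag sequence only if it occurred at least min_count times."""
    return pattern_counts[tuple(tags)] >= min_count


PATTERN_COUNTS = count_patterns(TAGGED_SENTENCES)

# The two most common patterns in the toy corpus.
print(PATTERN_COUNTS.most_common(2))
print(is_frequent(["DT", "NN", "VB", "NN"], PATTERN_COUNTS))
print(is_frequent(["JJ", "DT", "RB"], PATTERN_COUNTS))
```

Raising `min_count` makes the check stricter, which matches the requirement to reject many sentences and approve only the ones the system is very sure about.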
