2017-04-16
0

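Difference between similar() and concordance() in NLTK

I have read about text1.similar('monstrous') and text1.concordance('monstrous') from this.

I could not find a satisfactory answer on the difference between text1.concordance('monstrous') and text1.similar('monstrous') in Python's Natural Language Toolkit.

Could you explain it in detail with an example?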

Answers

1

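concordance(token) shows you the context around the argument token: it prints every occurrence of token together with a fixed-width window of the surrounding text.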
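To make the idea concrete, here is a minimal pure-Python sketch of a concordance. It is not NLTK's implementation (NLTK aligns its output by character width); the function name concordance_lines, the window parameter, and the toy token list are all assumptions made up for illustration.

```python
def concordance_lines(tokens, word, window=3):
    """Return each occurrence of `word` with up to `window` tokens on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == word.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append(" ".join(left + [tok] + right))
    return lines


# Toy token list, invented for this example (not the real text1).
tokens = "one was of a most monstrous size and that monstrous bulk of the whale".split()

for line in concordance_lines(tokens, "monstrous"):
    print(line)
# of a most monstrous size and that
# size and that monstrous bulk of the
```

Each printed line is one occurrence of the word shown in its surroundings, which is all a concordance is.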

similar(token) prints a list of words that appear in the same contexts as token. Here, a context is simply the pair of words on either side of token.
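The notion of "same context" can be sketched in a few lines of plain Python. This is only an approximation of the idea, not NLTK's actual code: the function name similar_words, the ranking by number of shared contexts, and the toy token list are assumptions for illustration.

```python
from collections import defaultdict


def similar_words(tokens, word, num=5):
    """Rank other words by how many (left, right) neighbour pairs they share with `word`."""
    tokens = [t.lower() for t in tokens]
    word = word.lower()

    # Map every word to the set of (left neighbour, right neighbour) pairs it occurs between.
    # The first and last tokens have no full context and are skipped in this sketch.
    contexts = defaultdict(set)
    for left, mid, right in zip(tokens, tokens[1:], tokens[2:]):
        contexts[mid].add((left, right))

    target = contexts[word]
    shared = {
        w: len(ctx & target)
        for w, ctx in contexts.items()
        if w != word and ctx & target
    }
    # Most shared contexts first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:num]]


# Toy token list, invented for this example (not the real text1).
tokens = ("the monstrous pictures and the true pictures of "
          "a monstrous size and a curious size").split()

print(similar_words(tokens, "monstrous"))  # ['curious', 'true']
```

In the toy text, 'true' shares the context the _ pictures with 'monstrous', and 'curious' shares a _ size, so both are reported as similar even though their meanings differ.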

So, looking at the text of Moby Dick (text1), we can check the concordance of 'monstrous':

from nltk.book import text1  # text1 is Moby Dick
text1.concordance('monstrous')

# returns: 
Displaying 11 of 11 matches: 
ong the former , one was of a most monstrous size . ... This came towards us , 
ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r 
ll over with a heathenish array of monstrous clubs and spears . Some were thick 
d as you gazed , and wondered what monstrous cannibal and savage could ever hav 
that has survived the flood ; most monstrous and most mountainous ! That Himmal 
they might scout at Moby Dick as a monstrous fable , or still worse and more de 
th of Radney .'" CHAPTER 55 Of the Monstrous Pictures of Whales . I shall ere l 
ing Scenes . In connexion with the monstrous pictures of whales , I am strongly 
ere to enter upon those still more monstrous stories of them which are to be fo 
ght have been rummaged out of this monstrous cabinet there is no telling . But 
of Whale - Bones ; for Whales of a monstrous size are oftentimes cast up dead u 

Then we can get the list of words that appear in contexts similar to those of 'monstrous'. The context of the first concordance line above is 'most _____ size'.

text1.similar('monstrous') 

# returns: 
determined maddens contemptible modifies abundant tyrannical puzzled 
trustworthy impalpable gamesome curious mean pitiable untoward 
christian subtly passing domineering uncommon true 

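If we take the word 'true' and check its concordance with text1.concordance('true'), we get back the first 25 of 87 uses of the word 'true'. That is not very useful on its own, but NLTK also provides a method called common_contexts, which shows the contexts that every word in a list shares.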

text1.common_contexts(['monstrous', 'true']) 

# returns: 
the_pictures 

This result tells us that the phrases "the monstrous pictures" and "the true pictures" both appear in Moby Dick.
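The same result can be reproduced by hand with a small sketch. Again, this is an illustration under stated assumptions rather than NLTK's implementation: the function name shared_contexts and the toy token list are invented here, and a context is taken to be the pair of immediate neighbours.

```python
def shared_contexts(tokens, words):
    """Return the left_right contexts in which every word in `words` occurs."""
    tokens = [t.lower() for t in tokens]
    per_word = {w.lower(): set() for w in words}
    if not per_word:
        return []

    # Collect the (left, right) neighbour pairs for each word of interest.
    for left, mid, right in zip(tokens, tokens[1:], tokens[2:]):
        if mid in per_word:
            per_word[mid].add((left, right))

    # Keep only the contexts common to all of the words.
    common = set.intersection(*per_word.values())
    return sorted(f"{left}_{right}" for left, right in common)


# Toy token list, invented for this example (not the real text1).
tokens = ("in connexion with the monstrous pictures of whales "
          "he saw the true pictures").split()

print(shared_contexts(tokens, ["monstrous", "true"]))  # ['the_pictures']
```

The underscore marks the slot where either word can appear, matching the the_pictures notation shown above.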

+0

Thank you for the excellent explanation. I am still not completely clear on similar(), though, so could you elaborate on it? @James – dex