2011-03-30 56 views
6

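WordNet selectional restrictions in NLTK

Is there a way to get WordNet selectional restrictions (such as +animate, +human, etc.) from synsets through NLTK? Or is there any other way of obtaining semantic information about a synset? The closest I could get was the hypernym relations.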

Answers

4

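It depends on what you mean by "selectional restrictions", which I would call semantic features. In classical semantics there is a world of concepts, and to compare concepts we have to find:

  • discriminating features (i.e. features of the concepts that are used to distinguish them from each other)
  • similarity features (i.e. features that the concepts have in common, which highlight the need to differentiate them)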

For example:

Man is [+HUMAN], [+MALE], [+ADULT] 
Woman is [+HUMAN], [-MALE], [+ADULT] 

[+HUMAN] and [+ADULT] are the similarity features
[+-MALE] is the discriminating feature
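
The Man/Woman comparison above can be sketched in a few lines of plain Python. This is only an illustration: representing a feature bundle as a dict of booleans, and the helper name `compare_features`, are assumptions made for this sketch and not anything WordNet or NLTK provides.

```python
# Illustrative feature bundles. The feature names come from the example above;
# the dict-of-booleans representation is an assumption made for this sketch.
man = {"HUMAN": True, "MALE": True, "ADULT": True}
woman = {"HUMAN": True, "MALE": False, "ADULT": True}


def compare_features(a, b):
    """Split the features both concepts define into similar and discriminating sets."""
    shared_keys = a.keys() & b.keys()
    similar = {k for k in shared_keys if a[k] == b[k]}
    discriminating = {k for k in shared_keys if a[k] != b[k]}
    return similar, discriminating


similar, discriminating = compare_features(man, woman)
print(sorted(similar))         # ['ADULT', 'HUMAN']
print(sorted(discriminating))  # ['MALE']
```

The hard part, as discussed next, is not the comparison itself but deciding which features belong in the bundles in the first place.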

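The common problem with classical semantics, and with applying this theory in computational semantics, is the question:

"Is there a specific list of features that we can use to compare any concepts? If so, what are the features on this list?"

(see www.acl.ldc.upenn.edu/E/E91/E91-1034.pdf for details)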

Getting back to WordNet, I can suggest two methods for resolving the "selectional restrictions".

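First, check the hypernyms for the discriminating features, but before that you must decide what the discriminating features are. To differentiate an animal from a human, let's take the discriminating features to be [+-human] and [+-animal].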

from nltk.corpus import wordnet as wn

# Concepts to compare
dog_sense = wn.synsets('dog')[0]  # It's http://goo.gl/b9sg9X
jb_sense = wn.synsets('James_Baldwin')[0]  # It's http://goo.gl/CQQIG9

# hypernym_paths() returns a list of paths (a synset can reach the root by
# more than one route), so take the first path here.
dog_hypernyms = dog_sense.hypernym_paths()[0]
jb_hypernyms = jb_sense.hypernym_paths()[0]

# Discriminating features in terms of concepts in WordNet
human = wn.synset('person.n.01')  # i.e. [+human]
animal = wn.synset('animal.n.01')  # i.e. [+animal]

if human in jb_hypernyms and animal not in jb_hypernyms:
    print("James Baldwin is human")
else:
    print("James Baldwin is not human")

# The dog test looks for [+animal] and the absence of [+human]
if animal in dog_hypernyms and human not in dog_hypernyms:
    print("Dog is an animal")
else:
    print("Dog is not an animal")
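
A possible generalisation of the hypernym check, offered as a sketch rather than a definitive implementation: it looks at every hypernym path instead of only the first one, and accepts any number of feature synsets. The helper names `has_feature` and `classify` are hypothetical; the only NLTK call relied on is `Synset.hypernym_paths()`.

```python
def has_feature(synset, feature):
    """Return True if `feature` lies on any hypernym path of `synset`.

    `synset` only needs a hypernym_paths() method, as NLTK Synset objects have.
    """
    return any(feature in path for path in synset.hypernym_paths())


def classify(synset, features):
    """Return the labels of all features found among the synset's ancestors.

    `features` maps a label such as '+human' to a feature synset
    (the labels are an assumption of this sketch, not WordNet data).
    """
    return [label for label, feat in features.items() if has_feature(synset, feat)]
```

With NLTK and the WordNet corpus installed, this could be called as `classify(wn.synsets('dog')[0], {'+human': wn.synset('person.n.01'), '+animal': wn.synset('animal.n.01')})`. Note that each path returned by `hypernym_paths()` ends with the synset itself, so a synset always "has" itself as a feature.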

Second, check the similarity measures, as @Jacob suggested.

dog_sense = wn.synsets('dog')[0]  # It's http://goo.gl/b9sg9X
jb_sense = wn.synsets('James_Baldwin')[0]  # It's http://goo.gl/CQQIG9

# Features to check against whether the 'dubious' concept is a human or an animal
human = wn.synset('person.n.01')  # i.e. [+human]
animal = wn.synset('animal.n.01')  # i.e. [+animal]

# Compute each Wu-Palmer similarity once, then compare
dog_animal = dog_sense.wup_similarity(animal)
dog_human = dog_sense.wup_similarity(human)
if dog_animal > dog_human:
    print("Dog is more of an animal than human")
elif dog_animal < dog_human:
    print("Dog is more of a human than animal")
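
The same comparison can be wrapped in a small helper that picks the closest of several feature synsets. This is a sketch under assumptions: `closest_feature` is a hypothetical name, and treating a `None` similarity as 0.0 is a choice made here, not NLTK behaviour.

```python
def closest_feature(synset, features):
    """Return (label, score) for the feature synset most similar to `synset`.

    `features` maps labels to feature synsets. wup_similarity() can return
    None when no path connects two synsets, so None is treated as 0.0 here.
    """
    scores = {
        label: (synset.wup_similarity(feat) or 0.0)
        for label, feat in features.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For instance, `closest_feature(dog_sense, {'animal': animal, 'human': human})` would return the label with the higher Wu-Palmer score. Unlike the if/elif above, a tie is resolved silently in favour of whichever label comes first in the dict.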

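Thank you for the detailed answer. I realized some time ago that, for the reasons you mention, I cannot find similarity/discriminating features in WordNet. – erickrf 2013-12-04 17:38:31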

0

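You could try using some of the similarity functions with handpicked synsets, and use that to filter. But it's essentially the same as following the hypernym tree - afaik all the WordNet similarity functions use hypernym distance in their calculations. Also, a synset has a lot of optional attributes that might be worth exploring, but their presence can be very inconsistent.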
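
One way to explore those optional attributes is to call a handful of relation methods and keep only the ones that return something. The sketch below is an assumption-laden illustration: `available_relations` is a hypothetical helper, and the `RELATIONS` list is just a selection of methods that NLTK's Synset objects provide, not an exhaustive one.

```python
# A selection of methods available on NLTK Synset objects (not exhaustive).
RELATIONS = [
    "lexname",             # lexicographer file, e.g. a coarse semantic class
    "hypernyms",
    "instance_hypernyms",
    "hyponyms",
    "part_meronyms",
    "member_holonyms",
    "attributes",
    "entailments",
    "similar_tos",
]


def available_relations(synset, relation_names=RELATIONS):
    """Return {name: result} for the relations that are non-empty on `synset`."""
    found = {}
    for name in relation_names:
        method = getattr(synset, name, None)
        if callable(method):
            result = method()
            if result:  # skip empty lists and empty strings
                found[name] = result
    return found
```

Running `available_relations(wn.synsets('dog')[0])` on a few synsets gives a quick picture of how unevenly these relations are populated. The `lexname()` value may be the closest thing to a coarse semantic class, though it is much less fine-grained than a feature such as [+human].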