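How do I preserve the order of dependencies?

I have the following code, which opens the files in a directory, runs spaCy NLP on them, and writes the dependency parse information to files in a new directory.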

import os
import spacy

nlp = spacy.load('en')

path1 = 'C:/Path/to/my/input'
path2 = '../output'
for file in os.listdir(path1):
    # os.listdir() returns bare file names, so join them with the input directory
    with open(os.path.join(path1, file), encoding='utf-8') as text:
        doc = nlp(text.read())
    # open the output file once per input file; the with block closes it
    with open(os.path.join(path2, file), 'a+', encoding='utf-8') as f:
        for sent in doc.sents:
            for token in sent:
                f.write(file + '\t' + str(token.dep_) + '\t' + str(token.head) + '\t' + str(token.right_edge) + '\n')

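The problem is that this does not preserve the order of the dependencies in the output file. I can't seem to find any reference to character position in the API documentation.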

2016-11-04 Shane

The character index is token.idx. The word index is token.i. I know that's not particularly intuitive.

Tokens also compare by position, so you can do this:

for child in sent: 
    word1, word2 = sorted((child, child.head))

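This gives you each dependency arc with its two tokens in document order. I'm not sure exactly what you're trying to do, though, so this may not be quite what you want.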
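To make the difference between the two indices and the arc ordering concrete, here is a minimal sketch. It assumes nothing beyond the attributes named above (`i`, `idx`, `head`, `dep_`, plus `text`) and uses hypothetical stand-in tokens built by a `make_tokens` helper so it runs without a spaCy model. The `arc_rows` helper is also an illustration, not part of spaCy; it passes an explicit sort key instead of relying on token comparison, which should give the same ordering.

```python
from types import SimpleNamespace


def make_tokens(words, heads, deps):
    """Build stand-in tokens exposing only .text, .i, .idx, .dep_ and .head."""
    tokens, offset = [], 0
    for i, (word, dep) in enumerate(zip(words, deps)):
        tokens.append(SimpleNamespace(text=word, i=i, idx=offset, dep_=dep))
        offset += len(word) + 1  # assumption: exactly one space between words
    for token, head_index in zip(tokens, heads):
        token.head = tokens[head_index]  # the root points at itself
    return tokens


def arc_rows(tokens):
    """Return one (idx, i, dep, first, second) row per token, in document order."""
    rows = []
    for child in sorted(tokens, key=lambda t: t.i):
        # Order the two ends of the arc by word position.
        first, second = sorted((child, child.head), key=lambda t: t.i)
        rows.append((child.idx, child.i, child.dep_, first.text, second.text))
    return rows


tokens = make_tokens(
    words=["She", "ate", "the", "pizza"],
    heads=[1, 1, 3, 1],
    deps=["nsubj", "ROOT", "det", "dobj"],
)
rows = arc_rows(tokens)
for row in rows:
    print("\t".join(str(field) for field in row))
```

With real spaCy output, the tokens of a parsed document or sentence could be passed to `arc_rows` in place of the stand-ins, since the helper only reads those five attributes.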

2016-11-04 20:07:48

Thanks, syllogism_! That works well. I ended up with the following:

for child in sent:
    head = child.head
    head_pos = child.head.tag_
    const = child
    const_pos = child.tag_
    f.write(file + '\t' + str(child.idx) + '\t' + str(child.dep_) + '\t' + str(head) + '\t' + str(head_pos) + '\t' + str(const) + '\t' + str(const_pos) + '\n')

– Shane
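As a side note on the final snippet, one possible way to write the same tab-separated columns is Python's standard `csv` module, which avoids the long string concatenation. This is only a sketch: `write_rows` and the sample rows are hypothetical, and an in-memory buffer stands in for the real output file.

```python
import csv
import io


def write_rows(rows, handle):
    """Write each row as one tab-separated line to an already-open file handle."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(row)


# Hypothetical rows: (file name, character offset, dependency,
#                     head, head tag, token, token tag)
sample_rows = [
    ("a.txt", 0, "nsubj", "ate", "VBD", "She", "PRP"),
    ("a.txt", 4, "ROOT", "ate", "VBD", "ate", "VBD"),
]

buffer = io.StringIO()  # stands in for open(path, "w", encoding="utf-8", newline="")
write_rows(sample_rows, buffer)
output = buffer.getvalue()
print(output, end="")
```

One design consequence: with the default quoting, a token that itself contains a tab or a double quote is quoted in the output rather than silently shifting the columns.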
