How do I pass a string to AbstractSequenceClassifier.classifyAndWriteAnswersKBest in CoreNLP?

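AbstractSequenceClassifier.classifyAndWriteAnswersKBest accepts either a file name or an ObjectBank<List<IN>>, but the ObjectBank documentation does not make clear how to create such an ObjectBank without involving a file. How can I pass a string to AbstractSequenceClassifier.classifyAndWriteAnswersKBest in CoreNLP?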

I'm using CoreNLP 3.7.0 with Java 8.


2017-03-18 Karl Richter

You should just use this method instead:

Counter<List<IN>> classifyKBest(List<IN> doc, Class<? extends CoreAnnotation<String>> answerField, int k)

It returns a Counter that maps each returned sequence to its score.

With this line of code you can turn that Counter into a sorted list of sequences:

List<List<IN>> sorted = Counters.toSortedList(kBest);
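To illustrate what that sort does without needing CoreNLP on the classpath, here is a minimal stand-alone sketch. It assumes the sorted list runs from the highest score to the lowest, and it uses a plain Map<List<String>, Double> as a stand-in for Counter<List<IN>>. The class name KBestSortSketch, the helper sortByScoreDescending, and the tag sequences and scores are all made up for illustration; none of them are part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KBestSortSketch {

    // Hypothetical helper (not a CoreNLP API): returns the keys of the map
    // ordered from the highest score to the lowest.
    static <K> List<K> sortByScoreDescending(Map<K, Double> scores) {
        List<Map.Entry<K, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<K, Double>comparingByValue().reversed());
        List<K> keys = new ArrayList<>();
        for (Map.Entry<K, Double> entry : entries) {
            keys.add(entry.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        // Made-up tag sequences and scores standing in for the k-best Counter.
        Map<List<String>, Double> kBest = new LinkedHashMap<>();
        kBest.put(Arrays.asList("O", "PERSON", "O"), -2.5);
        kBest.put(Arrays.asList("O", "O", "O"), -0.7);
        kBest.put(Arrays.asList("O", "LOCATION", "O"), -4.1);

        List<List<String>> sorted = sortByScoreDescending(kBest);

        // The first element is the best-scoring sequence; look its score up in the map.
        List<String> best = sorted.get(0);
        System.out.println(best + " " + kBest.get(best)); // prints: [O, O, O] -0.7
    }
}
```

The point of the sketch is that the sorted list only holds the sequences; to get a score you still look the sequence up in the original map (or, with CoreNLP, call getCount on the Counter).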

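I'm not sure exactly what you are trying to do, but generally IN is a CoreLabel. The key here is to turn your String into a list of IN's. That should be a list of CoreLabel, but I don't know the full details of the AbstractSequenceClassifier you are working with.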

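If you want to run your sequence classifier on a sentence, you can first tokenize it with a pipeline and then pass the list of tokens to classifyKBest(...).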

For instance, if in your example you are trying to get the k-best named entity tag sequences:

import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.stats.Counter;
import edu.stanford.nlp.stats.Counters;

// set up a pipeline that only tokenizes
Properties props = new Properties();
props.setProperty("annotators", "tokenize");
StanfordCoreNLP tokenizerPipeline = new StanfordCoreNLP(props);

// get list of tokens for example sentence
String exampleSentence = "...";
// wrap sentence in an Annotation object
Annotation annotation = new Annotation(exampleSentence);
// tokenize sentence
tokenizerPipeline.annotate(annotation);
// get the list of tokens
List<CoreLabel> tokens = annotation.get(CoreAnnotations.TokensAnnotation.class);

//...
// classifier should be an AbstractSequenceClassifier

// get the k best sequences from your abstract sequence classifier
Counter<List<CoreLabel>> kBestSequences = classifier.classifyKBest(tokens, CoreAnnotations.NamedEntityTagAnnotation.class, 10);
// sort the k-best examples
List<List<CoreLabel>> sortedKBest = Counters.toSortedList(kBestSequences);
// example: getting the second best list
List<CoreLabel> secondBest = sortedKBest.get(1);
// example: print out the tags for the second best list
System.out.println(secondBest.stream().map(token -> token.get(CoreAnnotations.NamedEntityTagAnnotation.class)).collect(Collectors.joining(" ")));
// example: print out the score for the second best list
System.out.println(kBestSequences.getCount(secondBest));

If you have any more questions, please let me know and I can help out!


2017-03-21 01:02:52 StanfordNLPHelp

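Thanks! The problem is that ["Annotations are the data structure which hold the results of annotators"](http://stanfordnlp.github.io/CoreNLP/api.html) doesn't suggest that an `Annotation` can also hold input, because of the word "results", even though the example further down that page shows exactly that.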
