Evaluation in a spaCy NER model

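I am evaluating a trained NER model built with the spaCy library. For this kind of problem you would normally use the F1 score (the harmonic mean of precision and recall). I could not find an accuracy function for a trained NER model in the documentation.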
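For reference, here is a minimal sketch of how precision, recall and F1 relate to each other, computed from raw counts. The function name `precision_recall_f1` and the example counts are made up for illustration and are not part of spaCy or scikit-learn.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts.

    Hypothetical helper for illustration only.
    """
    # Precision: share of predicted entities that are correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of gold entities that were found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of the two, defined as 0 when both are 0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts: 8 correct predictions, 2 spurious, 4 missed
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(round(p, 3), round(r, 3), round(f1, 3))
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two values, so a model cannot score well by maximising only precision or only recall.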

I am not sure whether this is correct, but I am trying to do it the following way (as an example), using f1_score from sklearn:

from sklearn.metrics import f1_score 
import spacy 
from spacy.gold import GoldParse 


nlp = spacy.load("en") #load NER model 
test_text = "my name is John" # text to test accuracy 
doc_to_test = nlp(test_text) # transform the text to spacy doc format 

# we create a golden doc where we know the tagged entity for the text to be tested 
doc_gold_text= nlp.make_doc(test_text) 
entity_offsets_of_gold_text = [(11, 15,"PERSON")] 
gold = GoldParse(doc_gold_text, entities=entity_offsets_of_gold_text) 

# bring the data in a format acceptable for sklearn f1 function 
y_true = ["PERSON" if "PERSON" in x else 'O' for x in gold.ner] 
y_predicted = [x.ent_type_ if x.ent_type_ !='' else 'O' for x in doc_to_test] 
f1_score(y_true, y_predicted, average='macro')
> 1.0

Any thoughts or insights would be useful.

Source

2017-06-29 Mpizos Dimitris

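For anyone with the same question, see the following link:

spaCy/scorer.py

There you can find different metrics, including F-score, recall and precision. An example using Scorer: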

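from spacy.gold import GoldParse
from spacy.scorer import Scorer

def evaluate(ner_model, examples):
    scorer = Scorer()
    for input_, annot in examples:
        # Build an unannotated Doc and attach the gold entity offsets to it
        doc_gold_text = ner_model.make_doc(input_)
        gold = GoldParse(doc_gold_text, entities=annot)
        # Run the model on the raw text and score its prediction against the gold parse
        pred_value = ner_model(input_)
        scorer.score(pred_value, gold)
    return scorer.scores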

where input_ is the text (e.g. "my name is John") and annot is the list of annotations (e.g. [(11, 15, "PERSON")]).
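As a sanity check on the annotation format, the character offsets are end-exclusive, so slicing the text with them should return exactly the entity string. The sketch below assumes `examples` is a list of (text, annotations) pairs as described above; `check_offsets` is a hypothetical helper, not a spaCy function.

```python
# Hypothetical evaluation data: (text, [(start_char, end_char, label), ...])
examples = [
    ("my name is John", [(11, 15, "PERSON")]),
]


def check_offsets(examples):
    """Return the (substring, label) pair covered by each annotation.

    Raises ValueError if an offset pair falls outside the text.
    """
    covered = []
    for text, annotations in examples:
        for start, end, label in annotations:
            if not (0 <= start < end <= len(text)):
                raise ValueError(
                    "offsets (%d, %d) do not fit text of length %d"
                    % (start, end, len(text))
                )
            covered.append((text[start:end], label))
    return covered


print(check_offsets(examples))
```

Running a check like this before evaluation helps catch off-by-one offsets, which would otherwise silently lower the scores.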

scorer.scores returns multiple scores. The example is taken from the spaCy example on GitHub (the link no longer works).
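To make the entity-level numbers easier to interpret, here is a rough sketch of exact-match entity scoring in plain Python. It is an illustration of the general idea only, not spaCy's implementation; `score_entities` and the example spans are hypothetical.

```python
def score_entities(gold_spans, pred_spans):
    """Exact-match entity scoring over (start_char, end_char, label) tuples.

    Hypothetical helper: a predicted entity counts as correct only if its
    start, end and label all match a gold entity.
    """
    gold = set(gold_spans)
    pred = set(pred_spans)
    tp = len(gold & pred)   # predicted and in gold
    fp = len(pred - gold)   # predicted but not in gold
    fn = len(gold - pred)   # in gold but not predicted
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up prediction: the gold entity plus one spurious entity
gold_spans = [(11, 15, "PERSON")]
pred_spans = [(11, 15, "PERSON"), (0, 2, "ORG")]
print(score_entities(gold_spans, pred_spans))
```

Note that this differs from the token-level comparison in the question: here an entity with the wrong boundaries counts as both a false positive and a false negative, rather than being partially credited per token.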

Source

2017-06-30 07:59:48

1. Your GitHub link is broken. 2. What is self in this context? Where can I find self.make_gold? – farlee2121

@farlee2121 I have updated the answer to make it clearer.
