2017-05-09

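I want to create my own serialized training model for Stanford NLP, but the training code takes a very long time to run: the Stanford NLP CRFClassifier is taking too much time.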

My configuration is as follows:

# location of the training file 
trainFile=dictionary.tsv 

# location where you would like to save (serialize) your 
# classifier; adding .gz at the end automatically gzips the file, 
# making it smaller, and faster to load 
serializeTo=dictionary.ser.gz 

# structure of your training file; this tells the classifier that 
# the word is in column 0 and the correct answer is in column 1 
map=word=0,answer=1 

# This specifies the order of the CRF: order 1 means that features 
# apply at most to a class pair of previous class and current class 
# or current class and next class. 
maxLeft=1 

# these are the features we'd like to train with 
# some are discussed below, the rest can be 
# understood by looking at NERFeatureFactory 
useClassFeature=true 
useWord=true 
# word character ngrams will be included up to length 6 as prefixes 
# and suffixes only 
useNGrams=true 
noMidNGrams=true 
maxNGramLeng=6 
usePrev=true 
useNext=true 
useDisjunctive=true 
useSequences=true 
usePrevSequences=true 
# the last 4 properties deal with word shape features 
useTypeSeqs=true 
useTypeSeqs2=true 
useTypeySequences=true 
wordShape=chris2useLC 

type=crf 
useQN=true 
QNsize=2 
featureDiffThresh=0.05 
saveFeatureIndexToDisk=true 
readerAndWriter=edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter 
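Since `map=word=0,answer=1` expects a word in column 0 and a label in column 1, one cheap thing to rule out before a long training run is a malformed training file. Here is a minimal sketch of such a check. It assumes the columns are tab-separated (as the `.tsv` extension suggests) and that blank lines are separators. `ColumnCheck` and `malformedLines` are hypothetical names of my own, not part of Stanford NLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Stanford NLP): sanity-checks a column-format training file. */
public class ColumnCheck {

    /**
     * Returns the 1-based numbers of non-blank lines whose first minColumns
     * tab-separated fields are missing or empty. Blank lines are skipped
     * because they are assumed to act as separators.
     */
    public static List<Integer> malformedLines(List<String> lines, int minColumns) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue;
            }
            // A limit of -1 keeps trailing empty fields, so "word\t" yields two fields.
            String[] fields = line.split("\t", -1);
            boolean ok = fields.length >= minColumns;
            for (int c = 0; ok && c < minColumns; c++) {
                ok = !fields[c].trim().isEmpty();
            }
            if (!ok) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the trainFile value from the configuration above.
        String path = args.length > 0 ? args[0] : "dictionary.tsv";
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
        // word=0,answer=1 means every token line needs at least two columns.
        List<Integer> bad = malformedLines(lines, 2);
        System.out.println(bad.isEmpty()
                ? "All token lines have a word and an answer column"
                : "Malformed lines: " + bad);
    }
}
```

Run it with the path to the training file as the only argument. An empty result only shows that the column layout matches the `map` setting; it says nothing about training speed by itself.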

Can anyone help me with this? Would anyone like to see my training file to understand the problem?


To add, I am getting the following exception: – user3279692

Answers


To add, I am getting the following exception:


Exception in thread "main" java.lang.RuntimeException: Got NaN for prob in CRFLogConditionalObjectiveFunction.calculate() - this may well indicate numeric underflow due to overly long documents. 
     at edu.stanford.nlp.ie.crf.CRFLogConditionalObjectiveFunction.calculate(CRFLogConditionalObjectiveFunction.java:427) 
     at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.ensure(AbstractCachingDiffFunction.java:140) 
     at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.valueAt(AbstractCachingDiffFunction.java:145) 
     at edu.stanford.nlp.optimization.QNMinimizer.lineSearchMinPack(QNMinimizer.java:1460) 
     at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:1008) 
     at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:857) 
     at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:851) 
     at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:93) 
     at edu.stanford.nlp.ie.crf.CRFClassifier.trainWeights(CRFClassifier.java:1919) 
     at edu.stanford.nlp.ie.crf.CRFClassifier.train(CRFClassifier.java:1726) 
     at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:758) 
     at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:746) 
     at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:3034)
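
The exception message itself points at "overly long documents", so it may be worth measuring how long the documents in the training file actually are. Below is a minimal sketch, assuming that blank lines mark document boundaries in the column format (my understanding of `ColumnDocumentReaderAndWriter`, not something confirmed in this thread). `DocLengths` and `documentLengths` are hypothetical names, not part of Stanford NLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of Stanford NLP): reports document sizes in a column-format file. */
public class DocLengths {

    /** Returns the number of token lines in each blank-line-delimited block. */
    public static List<Integer> documentLengths(List<String> lines) {
        List<Integer> lengths = new ArrayList<>();
        int current = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                // A blank line closes the current document, if it has any tokens.
                if (current > 0) {
                    lengths.add(current);
                    current = 0;
                }
            } else {
                current++;
            }
        }
        // The last document may not be followed by a blank line.
        if (current > 0) {
            lengths.add(current);
        }
        return lengths;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the trainFile value from the configuration in the question.
        String path = args.length > 0 ? args[0] : "dictionary.tsv";
        List<Integer> lengths = documentLengths(
                Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8));
        if (lengths.isEmpty()) {
            System.out.println("No token lines found");
            return;
        }
        System.out.println("documents: " + lengths.size()
                + ", longest: " + Collections.max(lengths) + " tokens");
    }
}
```

If this reports a single document containing every line of the file, the file has no blank lines at all, and under the assumption above the whole file would be treated as one very long sequence. That would be consistent with the underflow hint in the exception message, though it is not proof that this is the cause.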