The Word2Vec object in gensim has a `null_word` parameter that is not explained in the documentation. What is the `null_word` parameter in gensim Word2Vec?
class gensim.models.word2vec.Word2Vec(sentences=None, size=100, alpha=0.025, window=5, min_count=5, max_vocab_size=None, sample=0.001, seed=1, workers=3, min_alpha=0.0001, sg=0, hs=0, negative=5, cbow_mean=1, hashfxn=, iter=5, null_word=0, trim_rule=None, sorted_vocab=1, batch_words=10000)
What is the null_word parameter used for?
Checking the code at https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L680, it states:
    if self.null_word:
        # create null pseudo-word for padding when using concatenative L1 (run-of-words)
        # this word is only ever input – never predicted – so count, huffman-point, etc doesn't matter
        word, v = '\0', Vocab(count=1, sample_int=0)
        v.index = len(self.wv.vocab)
        self.wv.index2word.append(word)
        self.wv.vocab[word] = v
What is "concatenative L1"?
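
For reference, here is my reading of what the quoted snippet does mechanically, as a minimal standalone sketch. It does not use gensim at all: the `Vocab` class below is a hypothetical stand-in for the library's record type, and `add_null_word`, `vocab`, and `index2word` are names I made up to mirror `self.wv.vocab` and `self.wv.index2word`. It only shows that the `'\0'` pseudo-word gets appended as one extra vocabulary entry; it says nothing about how training uses it.

```python
# Hypothetical stand-in for gensim's Vocab record (not the library class).
class Vocab:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def add_null_word(vocab, index2word):
    # Mirrors the quoted snippet: the pseudo-word takes the next free index
    # and is registered in both the lookup dict and the index list.
    word, v = '\0', Vocab(count=1, sample_int=0)
    v.index = len(vocab)
    index2word.append(word)
    vocab[word] = v
    return v


# Build a tiny made-up vocabulary of three ordinary words.
vocab = {}
index2word = []
for i, w in enumerate(["the", "cat", "sat"]):
    entry = Vocab(count=10, sample_int=2**32)
    entry.index = i
    vocab[w] = entry
    index2word.append(w)

null_entry = add_null_word(vocab, index2word)
print(null_entry.index)       # index assigned to the pseudo-word
print(repr(index2word[-1]))   # last entry in the index list
```

So, as far as I can tell, setting `null_word` to a truthy value just adds one more slot at the end of the vocabulary, which the comment says is used only as input for padding.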