2014-10-19

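No output for idf_ in scikit-learn

I am using TfidfVectorizer in scikit-learn. I am trying to include the tf-idf weighting by setting "use_idf=True". According to the documentation, result.idf_ should then return the array of my idf weights and its shape, but I get "None". My input and output are below. (Ultimately I want to judge how min_df and max_df affect my results, so for now they are just arbitrary values.)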

tester =TfidfVectorizer(docs_train, min_df=.2, max_df=.8, use_idf=True) 

print tester 

TfidfVectorizer(analyzer=u'word', binary=False, charset=None, 
     charset_error=None, decode_error=u'strict', 
     dtype=<type 'numpy.int64'>, encoding=u'utf-8', 
     input=["today , war became a reality to me after seeing a screening of saving  priivate ryan . \nsteve spielberg goes beyond reality with his latest production . \nthe audience is tossed about the theatre witnessing the horror of war . \nplease keep the kids home as the r rating is for reality . \nto...esting motif out of the ubiquitous palmetto bugs-but nothing can freshen up this stale script . \n'], 
    lowercase=True, max_df=0.8, max_features=None, min_df=0.2, 
    ngram_range=(1, 1), norm=u'l2', preprocessor=None, smooth_idf=True, 
    stop_words=None, strip_accents=None, sublinear_tf=False, 
    token_pattern=u'(?u)\\b\\w\\w+\\b', tokenizer=None, use_idf=True, 
    vocabulary=None) 

print tester.idf_ 

None 

Answers


You have not given the vectorizer any data yet. You should call fit or fit_transform.
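A minimal sketch of what that looks like, assuming a small made-up corpus in place of the real docs_train (the four sentences below are hypothetical). As the printed repr in the question shows, passing docs_train to the constructor only set the input parameter; no vocabulary or idf weights were learned. Here the documents go to fit_transform instead, and the settings are passed as keyword arguments:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for docs_train from the question.
docs_train = [
    "the war film was a harsh reality",
    "the audience liked the film",
    "the script was stale",
    "a stale film with a fresh motif",
]

# Configure the vectorizer with keyword arguments only.
# The documents are NOT passed here; they are passed to fit / fit_transform.
tester = TfidfVectorizer(min_df=0.2, max_df=0.8, use_idf=True)

# fit_transform learns the vocabulary and idf weights, then returns
# the tf-idf matrix (one row per document, one column per term).
X = tester.fit_transform(docs_train)

# idf_ is only populated after fitting: one weight per vocabulary term.
print(tester.idf_.shape)
print(X.shape)

# vocabulary_ maps each term to its column index, so the idf weight
# of a term can be looked up through that index.
for term, idx in sorted(tester.vocabulary_.items()):
    print(term, tester.idf_[idx])
```

The length of idf_ matches the number of columns in X, so changing min_df or max_df and refitting is one way to see how many terms each setting keeps. If the thresholds prune every term, fitting fails with an error rather than returning an empty vocabulary.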


Yes, I completely missed that part, and it solved my problem. Thanks! – lilyrobin 2014-10-21 00:01:37