2017-06-02

TFIDF in Python

Hello, below is my function in Python:

def tf_idf(self,job_id,method='local'): 
    jobtext = self.get_job_text (job_id , method=method) 
    tfidf_vectorizer = TfidfVectorizer(max_df=0.8 , max_features=200000 , 
             min_df=0.2 , stop_words='english' , 
             use_idf=True , tokenizer=self.tokenize_and_stem(jobtext), ngram_range=(1, 3)) 
    #tfidf_vectorizer.fit(jobtext) 
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses 
    print(tfidf_matrix.shape) 

When I create the TF-IDF matrix, I get the following error:

Traceback (most recent call last):

  File ".../employment_skills_extraction-master/api/process_request.py", line 206, in <module>
    main()
  File ".../employment_skills_extraction-master/api/process_request.py", line 202, in main
    print pr.process(json.dumps(test))
  File ".../employment_skills_extraction-master/api/process_request.py", line 188, in process
    termVector=self.tf_idf(job_id)
  File ".../employment_skills_extraction-master/api/process_request.py", line 174, in tf_idf
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 1285, in fit_transform
    X = super(TfidfVectorizer, self).fit_transform(raw_documents)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 804, in fit_transform
    self.fixed_vocabulary_)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 739, in _count_vocab
    for feature in analyze(doc):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 236, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
TypeError: 'list' object is not callable

Please help me understand why I am getting this error.

Answers

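TypeError: 'list' object is not callable looks like the relevant part of the error, and it may involve your variable job_id, which might not be what you think it is. Whatever it is supposed to be, it may be a list (I don't know how long) that contains the thing you want.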
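For context on what the message itself means: Python raises this error when a list is called as if it were a function. The last traceback frame is the call tokenize(...), so it may also be worth checking what was passed as tokenizer=. Below is a minimal sketch of that situation. It does not use scikit-learn: analyze is a hypothetical stand-in for the library code in the last frame, and tokenize_words is a made-up tokenizer.

```python
def tokenize_words(text):
    """Hypothetical tokenizer: lowercase a document and split it on whitespace."""
    return text.lower().split()


def analyze(doc, tokenize):
    """Stand-in for the library code in the last traceback frame, which calls tokenize(...)."""
    return tokenize(doc)


doc = "Python developer with SQL skills"

# Passing the function itself works, because analyze() is able to call it.
tokens = analyze(doc, tokenize_words)

# Passing the *result* of the function hands over a list, and a list cannot be called.
try:
    analyze(doc, tokenize_words(doc))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(tokens)
print(error_message)
```

If the same pattern applies to the question's code, the thing to compare is an argument written as a function name versus one written as a function call.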

If you insert a line as the second line of the function and change one variable name to keep it tidy, like this:

job_id_element = job_id[0]
jobtext = self.get_job_text(job_id_element, method=method)

it might work.

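Just check the contents of the variable job_id and consider whether you want its first element (the 0 I wrote), its last element (index len(job_id) - 1), or some other index instead of 0.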
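A quick way to run that check, as a sketch only: the job_id value below is made-up sample data, and pick_job_id is a hypothetical helper that is not part of the original code.

```python
def pick_job_id(job_id, index=0):
    """Return one element if job_id is a list or tuple, otherwise return it unchanged."""
    if isinstance(job_id, (list, tuple)):
        return job_id[index]
    return job_id


job_id = ["job-17", "job-42", "job-99"]  # made-up sample value

# Inspect what the variable actually holds before indexing into it.
print(type(job_id).__name__, len(job_id))

first = pick_job_id(job_id)       # index 0
last = pick_job_id(job_id, -1)    # same element as index len(job_id) - 1
single = pick_job_id("job-17")    # a plain value passes through untouched
```

Note that indexing with len(job_id) itself would raise an IndexError, since valid indices stop at len(job_id) - 1.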