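I am new to sklearn pipelines and am learning them from the sklearn documentation. I am using one for sentiment analysis on movie review data. The data contains two columns: the first is class and the second is text. My sklearn pipeline is not working.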
import pandas as pd

input_file_df = pd.read_csv("movie-pang.csv")
x_train = input_file_df["text"]  # the complete data set is used as training data
y_train = input_file_df["class"]
I use only one feature: a sentiment score for each sentence.
I wrote this custom transformer:
import numpy as np
from nltk.corpus import sentiwordnet as swn  # assumed source of the swn alias
from sklearn.base import BaseEstimator, TransformerMixin

class GetWorldLevelSentiment(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass

    def get_word_level_sentiment(self, word_list):
        sentiment_score = 1
        for word in word_list:
            # list() so that len() and indexing work even if NLTK returns an iterator
            word_sentiment = list(swn.senti_synsets(word))
            if len(word_sentiment) > 0:
                word_sentiment = word_sentiment[0]  # use the first synset only
            else:
                continue  # word not found in SentiWordNet
            if word_sentiment.pos_score() > word_sentiment.neg_score():
                word_sentiment_score = word_sentiment.pos_score()
            elif word_sentiment.pos_score() < word_sentiment.neg_score():
                word_sentiment_score = word_sentiment.neg_score()*(-1)
            else:
                word_sentiment_score = word_sentiment.pos_score()
            print(word, " ", word_sentiment_score)
            if word_sentiment_score != 0:
                sentiment_score = sentiment_score * word_sentiment_score
        return sentiment_score

    def transform(self, review_list, y=None):
        sentiment_score_list = list()
        for review in review_list:
            sentiment_score_list.append(self.get_word_level_sentiment(review.split()))
        return np.asarray(sentiment_score_list)

    def fit(self, x, y=None):
        return self
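The pipeline I use:
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ("word_level_sentiment", GetWorldLevelSentiment()),
    ("clf", MultinomialNB())])
Then I fit the pipeline: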
pipeline.fit(x_train, y_train)
But this gives me the following error:
This MultinomialNB instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
Can someone point out what I am doing wrong? Any help would be much appreciated.
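Two properties of the transformer's output may be worth checking here, although neither is confirmed as the cause of the exact error quoted. scikit-learn estimators generally expect X as a 2-D array of shape (n_samples, n_features), and MultinomialNB expects non-negative feature values. The sketch below mirrors the multiplicative scoring rule from the question in plain Python. TOY_SCORES and review_score are hypothetical stand-ins for the SentiWordNet lookup, not part of the original code.

```python
# Hypothetical per-word scores standing in for SentiWordNet lookups.
TOY_SCORES = {"good": 0.75, "bad": -0.625, "dull": -0.5}


def review_score(review):
    """Multiply the nonzero word scores of a review, starting from 1."""
    score = 1.0
    for word in review.split():
        word_score = TOY_SCORES.get(word, 0.0)  # unknown words score 0
        if word_score != 0:
            score *= word_score
    return score


reviews = ["good film", "bad film", "bad dull film"]

# A flat list with one float per review, like the 1-D array in the question.
scores = [review_score(r) for r in reviews]

# One row per sample and one column per feature.
column = [[s] for s in scores]

# True when at least one review score is below zero.
has_negative = any(s < 0 for s in scores)

print(scores)
print(column)
print(has_negative)
```

With these toy values the second review gets a negative score, while the third one, which contains two negative words, comes out positive because the two signs cancel in the product.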
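Please post the full stack trace of the error and complete code that reproduces the behavior. –
Try removing the parentheses like this: ("clf", MultinomialNB) – CrazyElf
@CrazyElf Removing the parentheses does not work. The pipeline needs an instance, not a class. –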
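To illustrate the point about instances, here is a minimal sketch of a pipeline that fits on a few made-up reviews. Several things are assumptions rather than the asker's code: TOY_LEXICON and ToySentimentScore are hypothetical, the score is a plain sum of toy word scores instead of the SentiWordNet product, and MinMaxScaler is added only to map the single feature into a non-negative range before MultinomialNB sees it.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy lexicon standing in for SentiWordNet scores.
TOY_LEXICON = {"good": 0.75, "great": 0.875, "bad": -0.625, "awful": -0.875}


class ToySentimentScore(BaseEstimator, TransformerMixin):
    """Turn each review into one summed sentiment score."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        scores = [
            sum(TOY_LEXICON.get(word, 0.0) for word in review.lower().split())
            for review in X
        ]
        # Return shape (n_samples, 1) instead of a flat 1-D array.
        return np.asarray(scores, dtype=float).reshape(-1, 1)


# Made-up data for illustration only.
reviews = ["a great and good film", "bad plot and awful acting", "good fun", "awful"]
labels = ["pos", "neg", "pos", "neg"]

# Every step is an instance (note the parentheses), not a class.
pipeline = Pipeline([
    ("sentiment", ToySentimentScore()),
    ("scale", MinMaxScaler()),  # maps the training scores into [0, 1]
    ("clf", MultinomialNB()),
])

pipeline.fit(reviews, labels)
predictions = pipeline.predict(["good film", "bad film"])
print(predictions)
```

This only shows that the pipeline can be fitted. With a single feature column, MultinomialNB's per-class feature probabilities carry no information, so its predictions follow the class priors; a classifier suited to one real-valued feature, such as LogisticRegression, may be a better fit for this design.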