How to stem each row in a CSV file?
I have a CSV file with two columns that contain sentences. For example, Test.csv:
Col[1]
----------------------
This trip was amazing.
Col[2]
--------------------
The cats are playing.
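So I did some NLP processing:
import codecs
import csv
import re

from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

with codecs.open('test.csv', 'r', encoding='utf-8', errors='ignore') as myfile:
    data = csv.reader(myfile, delimiter=',')
    next(data)  # skip the header row
    stops = set(stopwords.words("english"))
    stemmer = PorterStemmer()
    for row in data:
        # column indices as posted; adjust them to the layout of your file
        word_tokens1 = word_tokenize(row[1].lower())
        word_tokens2 = word_tokenize(row[2].lower())
        # keep only tokens made up entirely of ASCII letters
        remo1 = [w for w in word_tokens1 if w in re.sub("[^a-zA-Z]", " ", w)]
        remo2 = [w for w in word_tokens2 if w in re.sub("[^a-zA-Z]", " ", w)]
        # drop English stop words
        list1 = [w for w in remo1 if w not in stops]
        list2 = [w for w in remo2 if w not in stops]
        for w in list1:
            l = stemmer.stem(w)
            print(l)  # prints one stem per line
        for w in list2:
            l2 = stemmer.stem(w)
            print(l2)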
My problem is that after stemming, when I print, I get:
trip
amazi
cat
play
It prints each word on its own line. How can I stem the words and then get each cell back as a sentence, like this:
Col[1]:
-------------------
trip amazi
Col[2]:
-------------------
cat play
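One possible way to get that layout is to collect the stems of a cell and join them once, instead of printing inside the loop. The sketch below is only an illustration under stated assumptions: `stem_cell` and `toy_stem` are hypothetical names, the regex tokenization is a simple stand-in for `word_tokenize` plus the letter filter, and the small stop-word set is made up for the demo.

```python
import re


def stem_cell(text, stem, stops=frozenset()):
    """Return the stems of one cell joined into a single string."""
    # Keep alphabetic tokens only (stand-in for word_tokenize + the regex filter).
    tokens = re.findall(r"[a-z]+", text.lower())
    # Stem each kept token, then join once rather than printing per word.
    return " ".join(stem(w) for w in tokens if w not in stops)


def toy_stem(word):
    # Hypothetical placeholder stemmer: strips a couple of common suffixes only.
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Assumed sample row and stop words, for illustration only.
row = ["This trip was amazing.", "The cats are playing."]
stops = {"this", "was", "the", "are"}

for i, cell in enumerate(row, start=1):
    print(f"Col[{i}]:")
    print(stem_cell(cell, toy_stem, stops))
```

With NLTK installed, `PorterStemmer().stem` can be passed as `stem` and `set(stopwords.words("english"))` as `stops`; the exact stems then depend on the Porter algorithm rather than on the toy rules above.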
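Can you show an example of the file? I wonder why you are using the csv package. As far as I understand, you care about rows. In a CSV file, columns are separated by commas and rows are separated by newlines. – MAZDAK
It's in a different color, sorry, I wrote it as code.. –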
So each line looks like "This trip was amazing, the cats are playing"? – MAZDAK
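To make the point about commas and newlines concrete, here is a minimal sketch of how `csv.reader` splits such a file. The contents of `sample` are an assumption about what Test.csv might look like, not the actual file. Note that the resulting lists are zero-indexed, so if the file really has only two columns, the cells would be `row[0]` and `row[1]`.

```python
import csv
import io

# Assumed file contents: a header line plus one data row with two columns.
sample = "Col[1],Col[2]\nThis trip was amazing.,The cats are playing.\n"

reader = csv.reader(io.StringIO(sample), delimiter=",")
header = next(reader)  # skip the header row, as next(data) does in the question
rows = list(reader)    # each row is a list with one string per column

for row in rows:
    print(row)
```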