2016-12-06 71 views

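pandas to_csv(): TypeError: coercing to Unicode: need string or buffer, list found

I want to merge multiple CSV files into one large CSV for my dataset. What I am after is to take a few columns from several CSV files and build a dataset from them. I don't want every column in the final dataset, only a few selected ones. When reading the CSV files I used the names parameter in pandas, and the read works fine, but I cannot create a new CSV from the data I extracted. What am I doing wrong here? I have added the stack trace at the bottom.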

import glob 
import pandas as pd 
import os 
import time 
from datetime import datetime 
import numpy as np 

path = "C:\Users\lenovo\Downloads\Compressed\LoanStats3a.csv_2\csv" 
class MergeCsvFiles:
    def MergeCsv(self):
        allFiles = glob.glob(os.path.join(path, "LoanStats3a.csv"))
        print 'allFiles',allFiles

        for file_ in allFiles:
            print 'file_ ######### ',file_

            # merge_df = pd.DataFrame.from_csv(file_)
            # print merge_df
            fileToSave = glob.glob(os.path.join(path, "merge.csv"))
            print 'filrToSave #### ', fileToSave
            np_array_list = []

            df = pd.read_csv(file_, skipinitialspace=True,low_memory=False,header=0,index_col=None)
            np_array_list.append(df.as_matrix())
            comb_np_array = np.vstack(np_array_list)
            big_frame = pd.DataFrame(comb_np_array)
            # big_frame.columns = fields
            print 'big_frame#### ', big_frame
            big_frame.to_csv(fileToSave)

            # See the keys
            print 'df.keys########',df.keys()
            print 'df @@@@@', df
            frame = pd.DataFrame()
            list_ = []

            list_.append(df)
            frame = pd.concat(list_)
            # print 'frame#### ',frame

            frame.to_csv(fileToSave)

if __name__ == "__main__": 
    s = MergeCsvFiles() 
    s.MergeCsv() 

Stack trace:

Traceback (most recent call last): 
    File "C:/Users/lenovo/Downloads/Video/Machine Learning/MLPredictiveAnalysis/MergeCsv.py", line 59, in <module> 
    s.MergeCsv() 
    File "C:/Users/lenovo/Downloads/Video/Machine Learning/MLPredictiveAnalysis/MergeCsv.py", line 39, in MergeCsv 
    big_frame.to_csv(fileToSave) 
    File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1344, in to_csv 
    formatter.save() 
    File "C:\Python27\lib\site-packages\pandas\formats\format.py", line 1526, in save 
    compression=self.compression) 
    File "C:\Python27\lib\site-packages\pandas\io\common.py", line 426, in _get_handle 
    f = open(path, mode) 
TypeError: coercing to Unicode: need string or buffer, list found 

glob.glob returns a list. You need to pass a path name string to big_frame.to_csv. Why do you even need glob? big_frame.to_csv(os.path.join(path, "merge.csv")) should work.


Thanks. It worked. – Cyclotron3x3

Answer


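glob.glob returns a list, and to_csv needs a single path name string, which is why open() fails with "need string or buffer, list found". Pass the path string to big_frame.to_csv instead. Why do you even need glob for the output file? big_frame.to_csv(os.path.join(path, "merge.csv")) should work.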
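
To make the difference concrete, here is a minimal sketch, assuming Python 3 and a recent pandas rather than the Python 2.7 setup in the question. The scratch directory and the toy DataFrame are made up for illustration; they stand in for the asker's path and big_frame.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical scratch directory standing in for the asker's `path`.
out_dir = tempfile.mkdtemp()

# glob.glob always returns a list of matching paths. Here the list is
# empty because merge.csv does not exist yet, but it is still a list,
# not a file name that to_csv can open.
file_to_save = glob.glob(os.path.join(out_dir, "merge.csv"))
print(type(file_to_save), file_to_save)

# Building the path as a plain string is what to_csv expects.
out_path = os.path.join(out_dir, "merge.csv")
big_frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})  # toy data
big_frame.to_csv(out_path, index=False)
```

glob is only useful for finding input files that already exist; the output path can be built directly with os.path.join.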

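You are also writing the same file at the bottom of the loop with frame.to_csv(fileToSave). Because the file is rewritten on every iteration, only the data from the last iteration ends up saved.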
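
One way to avoid the overwrite is to collect the frames inside the loop and write once after it. The following is a rough sketch, assuming Python 3 and a recent pandas; merge_csv, the file names, and the column names are hypothetical, and usecols is used here to keep only the selected columns the question asks about.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csv(src_dir, pattern, columns, out_path):
    """Read the chosen columns from every matching CSV and write one merged file."""
    frames = []
    for file_ in sorted(glob.glob(os.path.join(src_dir, pattern))):
        # usecols keeps only the selected columns from each input file.
        frames.append(pd.read_csv(file_, usecols=columns, skipinitialspace=True))
    # Concatenate once, after the loop, and write the result once.
    merged = pd.concat(frames, ignore_index=True)
    merged.to_csv(out_path, index=False)
    return merged


# Toy input files with made-up column names.
src = tempfile.mkdtemp()
pd.DataFrame({"id": [1, 2], "amount": [10, 20], "note": ["x", "y"]}).to_csv(
    os.path.join(src, "part1.csv"), index=False)
pd.DataFrame({"id": [3], "amount": [30], "note": ["z"]}).to_csv(
    os.path.join(src, "part2.csv"), index=False)

# The output goes to a separate directory so it is never picked up as input.
out_path = os.path.join(tempfile.mkdtemp(), "merge.csv")
merged = merge_csv(src, "part*.csv", ["id", "amount"], out_path)
print(merged)
```

Note that pd.concat raises an error on an empty list, so a real script would want to check that the glob pattern matched at least one file.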
