Subsetting in chunks in pandas
Here is my code:
import glob
import pandas as pd

path = 'C:\\Users\\Daniil\\Desktop\\dw_payments'
# list of all CSV files in the folder:
all_files = glob.glob(path + '/*.csv')
all_payments_data = pd.DataFrame()
dfs = []
for file in all_files:
    df = pd.read_csv(file, index_col=None, chunksize=200000)
    df_f = df[df['CUSTOMER_NO'] == 20069675]  # this line raises the TypeError
    df_f = pd.concat(df_f, ignore_index=True)
    dfs.append(df_f)
all_payments_data = pd.concat(dfs)
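As you can see in the line df_f = df[df['CUSTOMER_NO'] == 20069675], I want to select a specific customer within each chunk and then merge the result into the empty DataFrame. I want to repeat this process many times (there are a lot of files).
But it raises an error: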
TypeError: 'TextFileReader' object is not subscriptable
How can I fix this?
Is it a large file? If not, you can omit chunksize.
@cᴏʟᴅsᴘᴇᴇᴅ Very large. That is the problem.
Ah, using chunksize makes `df` an iterator over chunks rather than a DataFrame. After `df = pd.read_csv(...)`, try `df_f = [x[x['CUSTOMER_NO'] == 20069675] for x in df]`? – Zero
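The suggestion in the comment above can be sketched roughly as follows. This is only a minimal illustration, not a tested solution for the real files: the helper name filter_customer and the small demo CSV files are hypothetical and exist only so the snippet runs on its own. The idea is that each chunk is filtered separately and the filtered pieces are concatenated once at the end.

```python
import glob
import os
import tempfile

import pandas as pd


def filter_customer(csv_paths, customer_no, chunksize=200000):
    """Collect the rows for one customer from several large CSV files.

    Hypothetical helper for illustration; the column name CUSTOMER_NO
    is taken from the question.
    """
    pieces = []
    for csv_path in csv_paths:
        # With chunksize set, read_csv returns an iterator of DataFrames,
        # so the boolean mask has to be applied to each chunk in turn.
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            pieces.append(chunk[chunk['CUSTOMER_NO'] == customer_no])
    if not pieces:
        # No files were given, so there is nothing to concatenate.
        return pd.DataFrame()
    return pd.concat(pieces, ignore_index=True)


# Small demonstration on throwaway files (made-up data, not the real payments).
demo_dir = tempfile.mkdtemp()
pd.DataFrame(
    {'CUSTOMER_NO': [20069675, 1, 20069675], 'AMOUNT': [10, 20, 30]}
).to_csv(os.path.join(demo_dir, 'a.csv'), index=False)
pd.DataFrame(
    {'CUSTOMER_NO': [2, 20069675], 'AMOUNT': [40, 50]}
).to_csv(os.path.join(demo_dir, 'b.csv'), index=False)

all_files = sorted(glob.glob(os.path.join(demo_dir, '*.csv')))
all_payments_data = filter_customer(all_files, 20069675, chunksize=2)
print(all_payments_data)
```

For the real data, all_files would come from the glob call in the question and chunksize could stay at 200000. Only the rows that match the customer are kept in memory, which is presumably the reason for reading in chunks in the first place.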