2016-06-13 56 views

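Creating a new DataFrame from an existing one - SettingWithCopyWarning

I have a CSV file that I import as a DataFrame. This DataFrame goes through several filtering steps. Data is also moved between columns based on conditions.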

import numpy as np 
import pandas as pd 

df = pd.read_csv('my_csv_file.csv', names=headers) 
df2 = df.drop_duplicates(['Column_X']) 
series1 = df2.loc[df2['Column_Y'] == 'Category1', 'Column_X'] 
df2.loc[df2['Column_Y'] == 'Category1', 'Column_Z'] = series1 
... 

After entering the last line at the command prompt, I get a SettingWithCopyWarning.

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame. 
Try using .loc[row_indexer, col_indexer] = value instead. 

Note that I am already using .loc in my code.

Doing the following instead does not raise the warning:

df.loc[df['Column_Y'] == 'Category1', 'Column_Z'] = series1 

This makes me think the problem lies in using df2 as a new DataFrame.

Answer


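I think the problem is that df2 is a view of df, so pandas cannot tell whether the assignment should also affect the original. Instead, add .copy() to the end of the .drop_duplicates call so that df2 is an independent DataFrame: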

df2 = df.drop_duplicates(['Column_X']).copy()
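
For completeness, here is a minimal, self-contained sketch of the whole flow with the .copy() fix applied. The small in-memory DataFrame is a made-up stand-in for my_csv_file.csv (its values are purely illustrative), and the `mask` variable is just a convenience name that does not appear in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV file from the question.
df = pd.DataFrame({
    "Column_X": ["a", "a", "b", "c"],
    "Column_Y": ["Category1", "Category1", "Category2", "Category1"],
    "Column_Z": [None, None, None, None],
})

# Take an explicit copy so df2 is independent of df.
df2 = df.drop_duplicates(["Column_X"]).copy()

# Build the row selector once and reuse it on both sides of the assignment.
mask = df2["Column_Y"] == "Category1"

# Move Column_X values into Column_Z for the selected rows only.
df2.loc[mask, "Column_Z"] = df2.loc[mask, "Column_X"]

print(df2)
```

Because df2 is an explicit copy, the assignment only changes df2 and the original df keeps its empty Column_Z.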