2016-01-22 118 views

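Filtering a Pandas DataFrame using a while loop

I'm trying to access filtered versions of a dataframe, using a list that holds the filter values.

I'm using a while loop that I thought would plug the appropriate list value into the dataframe filter one at a time. This code prints the first one fine, but then prints 4 empty dataframes.

I'm sure it's a quick fix, but I haven't been able to find it.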

boatID = [342, 343, 344, 345, 346] 
i = 0 
while i < len(boatID): 
    df = df[(df['boat_id']==boatID[i])] 
    #run some code, i'm printing DF.head to test it works 
    print(df.head()) 
    i = i + 1 

Example dataframe:

   boat_id  activity  speed  heading
0      342         1   3.34   270.00
1      343         1   0.02     0.00
2      344         1   0.01   270.00
3      345         1   8.41   293.36
4      346         1   0.03    90.00

Thanks for the suggestion. I'm not trying to return the booleans generated by 'isin'; I'm trying to filter the DF where 'boat_id' == certain numbers. – hselbie


Update: using 'int(boatID[i])' doesn't work. – hselbie

Answer


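I think you overwrite df with df = df[(df['boat_id']==boatID[i])]. After the first iteration df holds only the rows for boat 342, so every later filter runs against that reduced frame and returns nothing.

Maybe you need to assign the output to a new dataframe, e.g. df1: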

boatID = [342, 343, 344, 345, 346] 
i = 0 
while i < len(boatID): 
    df1 = df[(df['boat_id']==boatID[i])] 
    #run some code, i'm printing DF.head to test it works 
    print(df1.head()) 
    i = i + 1 

# boat_id activity speed heading 
#0  342   1 3.34  270 
# boat_id activity speed heading 
#1  343   1 0.02  0 
# boat_id activity speed heading 
#2  344   1 0.01  270 
# boat_id activity speed heading 
#3  345   1 8.41 293.36 
# boat_id activity speed heading 
#4  346   1 0.03  90 
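
To make the difference concrete, here is a minimal sketch that rebuilds the example data from the question and counts the rows each iteration produces. The helper names make_df, lengths_when_overwriting and lengths_with_separate_name are made up for this illustration, and a for loop stands in for the while loop.

```python
import pandas as pd

def make_df():
    # Example data copied from the question
    return pd.DataFrame({
        "boat_id": [342, 343, 344, 345, 346],
        "activity": [1, 1, 1, 1, 1],
        "speed": [3.34, 0.02, 0.01, 8.41, 0.03],
        "heading": [270.00, 0.00, 270.00, 293.36, 90.00],
    })

boatID = [342, 343, 344, 345, 346]

def lengths_when_overwriting(df, ids):
    # Hypothetical helper: mimics the question's loop, which reassigns df
    lengths = []
    for boat in ids:
        df = df[df["boat_id"] == boat]  # df now holds only this boat's rows
        lengths.append(len(df))
    return lengths

def lengths_with_separate_name(df, ids):
    # Hypothetical helper: keeps df intact and filters into df1
    lengths = []
    for boat in ids:
        df1 = df[df["boat_id"] == boat]
        lengths.append(len(df1))
    return lengths

print(lengths_when_overwriting(make_df(), boatID))    # [1, 0, 0, 0, 0]
print(lengths_with_separate_name(make_df(), boatID))  # [1, 1, 1, 1, 1]
```

The first list matches the symptom in the question: one populated frame followed by four empty ones.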

If you need to filter the dataframe df on column boat_id by the list boatID, use isin:

df1 = df[(df['boat_id'].isin(boatID))] 
print(df1)
# boat_id activity speed heading 
#0  342   1 3.34 270.00 
#1  343   1 0.02  0.00 
#2  344   1 0.01 270.00 
#3  345   1 8.41 293.36 
#4  346   1 0.03 90.00 
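
On its own, isin returns a boolean Series; it is indexing df with that Series that keeps only the matching rows. A small sketch, assuming the example data from the question and a made-up subset list named wanted:

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "boat_id": [342, 343, 344, 345, 346],
    "activity": [1, 1, 1, 1, 1],
    "speed": [3.34, 0.02, 0.01, 8.41, 0.03],
    "heading": [270.00, 0.00, 270.00, 293.36, 90.00],
})

wanted = [343, 345]  # hypothetical subset of boat ids, for illustration only

mask = df["boat_id"].isin(wanted)  # boolean Series, one value per row
subset = df[mask]                  # rows where the mask is True

print(mask.tolist())               # [False, True, False, True, False]
print(subset["boat_id"].tolist())  # [343, 345]
```

With the full boatID list every row matches, which is why the output above is the whole frame.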

Edit:

I think you can use a dictionary of dataframes:

print(df)
    boat_id activity speed heading 
0  342   1 3.34 270.00 
1  343   1 0.02  0.00 
2  344   1 0.01 270.00 
3  345   1 8.41 293.36 
4  346   1 0.03 90.00 

boatID = [342, 343, 344, 345, 346] 

dfs = ['df' + str(x) for x in boatID] 
dicdf = dict() 

print(dfs)
['df342', 'df343', 'df344', 'df345', 'df346'] 

i = 0 
while i < len(boatID): 
    print(dfs[i])
    dicdf[dfs[i]] = df[(df['boat_id']==boatID[i])]
    #run some code, i'm printing DF.head to test it works
    # print(df1.head())
    i = i + 1
print(dicdf)
{'df344': boat_id activity speed heading 
2  344   1 0.01  270, 'df345': boat_id activity speed heading 
3  345   1 8.41 293.36, 'df346': boat_id activity speed heading 
4  346   1 0.03  90, 'df342': boat_id activity speed heading 
0  342   1 3.34  270, 'df343': boat_id activity speed heading 
1  343   1 0.02  0} 

print(dicdf['df342'])
    boat_id activity speed heading 
0  342   1 3.34  270 
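
The same dictionary can also be built more compactly. The sketch below is an alternative, not the code above: it uses a dict comprehension and, as a swapped-in technique, DataFrame.groupby, which splits the frame by boat_id in one pass. The name by_group is made up for this illustration, and the data is the example from the question.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "boat_id": [342, 343, 344, 345, 346],
    "activity": [1, 1, 1, 1, 1],
    "speed": [3.34, 0.02, 0.01, 8.41, 0.03],
    "heading": [270.00, 0.00, 270.00, 293.36, 90.00],
})
boatID = [342, 343, 344, 345, 346]

# One filtered frame per id in the list, keyed 'df342', 'df343', ...
dicdf = {"df" + str(boat): df[df["boat_id"] == boat] for boat in boatID}

# Alternative: let groupby do the splitting (covers every id present in df)
by_group = {"df" + str(boat): group for boat, group in df.groupby("boat_id")}

print(sorted(dicdf))   # ['df342', 'df343', 'df344', 'df345', 'df346']
print(dicdf["df342"])
```

The comprehension only creates entries for ids in boatID, while the groupby version creates one per distinct boat_id found in df; for this example the two coincide.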

Thanks. This looks like it concatenates all the DFs into 5 dtype objects within one DF, is that correct? – hselbie


Now it creates 5 new dataframes 'df1' from 'df'. On each pass through the loop, the old 'df1' is overwritten by the new 'df1'. – jezrael


Instead of overwriting df1 every time the loop runs, is it possible to create 5 dataframes with unique names such as df342, df343, etc., something like 'df[i]'? – hselbie