Python pandas -> select columns by a condition on the column names

I have a DataFrame with column names 'a', 'b', 'c', ... 'z'.

print(my_df.columns) 
Index(['a', 'b', 'c', ... 'y', 'z'], 
    dtype='object', name=0)

I have functions that determine which columns should be displayed. For example:

start = con_start() 
stop = con_stop() 
print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False True True 
True True False False]

My goal is to display the DataFrame with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

0                                      a              b
index1       index2
New York     New York           0.000000       0.000000
California   Los Angeles   207066.666667  214466.666667
Illinois     Chicago       138400.000000  143633.333333
Pennsylvania Philadelphia   53000.000000   53633.333333
Arizona      Phoenix       111833.333333  114366.666667

Source

2017-04-04 Cezary.Sz

I want to make this robust and rely on as few assumptions as possible.

Option 1
Use iloc with positional slicing
Assumptions:

my_df.columns.is_unique evaluates to True
the columns are already in order

start = df.columns.get_loc(con_start()) 
stop = df.columns.get_loc(con_stop()) 

df.iloc[:, start:stop + 1]
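
To try Option 1 end to end, here is a minimal sketch. The sample data and the bodies of con_start() and con_stop() are stand-ins I made up for illustration; only the get_loc/iloc pattern comes from the answer.

```python
import pandas as pd

# Hypothetical data; the real frame in the question has columns 'a'..'z'.
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

def con_start():
    # Stand-in for the asker's function (assumption).
    return "b"

def con_stop():
    # Stand-in for the asker's function (assumption).
    return "d"

# get_loc returns a single integer position only when the label is unique,
# which is why the answer assumes columns.is_unique.
start = df.columns.get_loc(con_start())
stop = df.columns.get_loc(con_stop())

# Positional slices exclude the end position, hence the + 1.
selected = df.iloc[:, start:stop + 1]
print(selected)
```

If a label were duplicated, get_loc would return a slice or a boolean mask instead of an integer, and the arithmetic on stop would no longer work.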

Option 2
Use loc with a boolean mask
Assumptions:

the column values are comparable

start = con_start() 
stop = con_stop() 

c = df.columns.values 
m = (start <= c) & (stop >= c) 

df.loc[:, m]
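
Unlike Option 1, the mask approach does not need the columns to be in order. A small sketch, assuming a hypothetical frame with shuffled columns and hard-coded start/stop values in place of con_start() and con_stop(). It builds the mask from the Index itself, as the question does, rather than from .values:

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately out of order.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["d", "a", "c", "b"])

# Hard-coded stand-ins for con_start() / con_stop() (assumption).
start, stop = "b", "c"

# Element-wise comparison against every column label.
m = (df.columns >= start) & (df.columns <= stop)

# Keep every row, and only the columns where the mask is True.
selected = df.loc[:, m]
print(list(selected.columns))
```

The selected columns keep the order they had in the original frame; the mask only decides which ones survive.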

Source

2017-04-04 22:13:22 piRSquared

You can achieve this by slicing with .loc:

df.loc[:,'a':'b']
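
One thing worth knowing here: label slices with .loc include both endpoints, unlike positional slices. A quick sketch on a made-up four-column frame (the data is an assumption, not from the question):

```python
import pandas as pd

# Hypothetical frame with ordered, unique column labels.
df = pd.DataFrame([[1, 2, 3, 4]], columns=list("abcd"))

# Label-based slice: both 'a' and 'b' are included.
by_label = df.loc[:, "a":"b"]

# Positional slice over the same range needs an end of 2, not 1.
by_position = df.iloc[:, 0:2]

print(by_label.equals(by_position))
```

The slice follows the order of the columns in the frame, so this relies on the columns being arranged the way you expect.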

Source

2017-04-04 22:00:48

Generate a list of the columns to display:

cols = [x for x in my_df.columns if start <= x <= stop]

Then use only those columns from your DataFrame:

my_df[cols]
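
If you need this in more than one place, the two steps can be wrapped in a helper. This is only a sketch: the name select_between and the sample frame are made up for illustration.

```python
import pandas as pd

def select_between(frame, start, stop):
    """Return only the columns whose names fall within [start, stop]."""
    # Plain Python comparison on each column name, both bounds inclusive.
    cols = [x for x in frame.columns if start <= x <= stop]
    return frame[cols]

# Hypothetical data standing in for my_df.
my_df = pd.DataFrame([[1, 2, 3, 4]], columns=list("abcd"))
subset = select_between(my_df, "b", "c")
print(subset)
```

If no column name falls in the range, cols is empty and the result is a frame with the same index and no columns.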

Source

2017-04-04 22:02:27 acidtobi

Assuming result is your [True/False] array and letters is [a...z]:

res=[letters[i] for i,r in enumerate(result) if r] 
new_df=df[res]
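
A runnable version of the same idea, with a three-column frame and a hand-written result list standing in for the real mask (both are assumptions). It also shows that passing the mask straight to .loc gives the same selection:

```python
import pandas as pd

# Hypothetical frame and mask; in the question the mask comes from the
# comparison against start and stop.
df = pd.DataFrame([[1, 2, 3]], columns=list("abc"))
letters = list(df.columns)
result = [False, True, True]

# Translate the True positions into column names, then select by name.
res = [letters[i] for i, r in enumerate(result) if r]
new_df = df[res]

# Same selection without the intermediate list of names.
same = df.loc[:, result]
print(new_df.equals(same))
```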

Source

2017-04-04 22:04:14

If your condition is about as complex as the one in your example, you don't need any extra functions; just filter directly, e.g.

sweet_and_red_fruit = fruit[(fruit["sweet"] == 1) & (fruit["colour"] == "red")] 
print(sweet_and_red_fruit)

Or, if you only want to print:

print(fruit[(fruit["sweet"] == 1) & (fruit["colour"] == "red")])
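
For completeness, a self-contained sketch of this filter. The fruit frame and its values are invented for illustration. Note that this selects rows by the values in the "sweet" and "colour" columns, which is a different operation from selecting columns by their names as in the question.

```python
import pandas as pd

# Hypothetical data; the column names follow the snippet above.
fruit = pd.DataFrame({
    "name": ["apple", "lemon", "cherry"],
    "sweet": [1, 0, 1],
    "colour": ["red", "yellow", "red"],
})

# Each comparison yields a boolean Series; the parentheses are required
# because & binds more tightly than ==.
sweet_and_red_fruit = fruit[(fruit["sweet"] == 1) & (fruit["colour"] == "red")]
print(sweet_and_red_fruit)
```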

Source

2017-08-25 15:16:34
