2017-08-10 108 views

np.where with multiple variables

I have a DataFrame:

customer_id [1,2,3,4,5,6,7,8,9,10] 
feature1 [0,0,1,1,0,0,1,1,0,0] 
feature2 [1,0,1,0,1,0,1,0,1,0] 
feature3 [0,0,1,0,0,0,1,0,0,0] 

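From this I want to create a new variable (say new_var) such that when feature1 is 1, new_var = 1; otherwise if feature2 = 1, new_var = 2; otherwise if feature3 = 1, new_var = 3; else 4. I am trying np.where, and although it raises no error, it does not do the right thing, so I suspect that nested np.where only works on a single variable. What is the best way to do a nested if/case in pandas in this situation?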

My np.where code looks like this:

df['new_var'] = np.where(df['feature1']==1, '1', np.where(df['feature2']==1, '2', np.where(df['feature3']==1, '3', '4')))

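Just to more or less answer my own question: the nested np.where solution I mentioned above does work. It was not giving me the right result because the dtype of feature1 was string, not integer. So for anyone looking at a similar problem, both the nested np.where solution and the numpy.select solution that jezrael mentioned work. – Shraddha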
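The dtype problem described in this comment can be reproduced with a small sketch. The frame below is hypothetical (the asker's real data is not shown), and converting with `astype(int)` assumes every value is exactly '0' or '1':

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which the 0/1 flags were loaded as strings
df = pd.DataFrame({'feature1': ['0', '0', '1', '1'],
                   'feature2': ['1', '0', '1', '0'],
                   'feature3': ['0', '0', '1', '0']}, dtype=object)

# A string never equals the integer 1, so this mask is False for every row
# and every row falls through to the default value
all_false = not (df['feature1'] == 1).any()
print(all_false)

# Convert the flags to integers first (assumes only '0' and '1' occur)
df = df.astype(int)

df['new_var'] = np.where(df['feature1'] == 1, '1',
                np.where(df['feature2'] == 1, '2',
                np.where(df['feature3'] == 1, '3', '4')))
print(df['new_var'].tolist())
```

Checking `df.dtypes` before building the masks is a cheap way to spot this kind of mismatch.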

Answer

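I think you need numpy.select. For each row it takes the choice belonging to the first condition that is True; any later conditions are ignored: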

m1 = df['feature1']==1 
m2 = df['feature2']==1  
m3 = df['feature3']==1 
df['new_var'] = np.select([m1, m2, m3], ['1', '2', '3'], default='4') 

Sample:

import numpy as np
import pandas as pd

customer_id = [1,2,3,4,5,6,7,8,9,10] 
feature1 = [0,0,1,1,0,0,1,1,0,0] 
feature2 = [1,0,1,0,1,0,1,0,1,0] 
feature3 = [0,0,1,0,0,0,1,0,0,0] 

df = pd.DataFrame({'customer_id':customer_id, 
        'feature1':feature1, 
        'feature2':feature2, 
        'feature3':feature3}) 

m1 = df['feature1']==1 
m2 = df['feature2']==1  
m3 = df['feature3']==1 
df['new_var'] = np.select([m1, m2, m3], ['1', '2', '3'], default='4') 
print(df)
   customer_id  feature1  feature2  feature3 new_var
0            1         0         1         0       2
1            2         0         0         0       4
2            3         1         1         1       1
3            4         1         0         0       1
4            5         0         1         0       2
5            6         0         0         0       4
6            7         1         1         1       1
7            8         1         0         0       1
8            9         0         1         0       2
9           10         0         0         0       4

If the features contain only 1 and 0, it is also possible to convert 0 to False and 1 to True:

m1 = df['feature1'].astype(bool) 
m2 = df['feature2'].astype(bool) 
m3 = df['feature3'].astype(bool) 
df['new_var'] = np.select([m1, m2, m3], ['1', '2', '3'], default='4') 
print(df)
   customer_id  feature1  feature2  feature3 new_var
0            1         0         1         0       2
1            2         0         0         0       4
2            3         1         1         1       1
3            4         1         0         0       1
4            5         0         1         0       2
5            6         0         0         0       4
6            7         1         1         1       1
7            8         1         0         0       1
8            9         0         1         0       2
9           10         0         0         0       4
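Since the thread states that both nested np.where and numpy.select work on integer columns, a quick check that the two agree on the sample data may be useful. This is only a sketch; the names `nested` and `selected` are illustrative and do not appear in the original post:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'customer_id': range(1, 11),
                   'feature1': [0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
                   'feature2': [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
                   'feature3': [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]})

m1 = df['feature1'] == 1
m2 = df['feature2'] == 1
m3 = df['feature3'] == 1

# Nested np.where: each inner call is the "else" branch of the outer one
nested = np.where(m1, '1', np.where(m2, '2', np.where(m3, '3', '4')))

# np.select: the first True condition in the list wins for each row
selected = np.select([m1, m2, m3], ['1', '2', '3'], default='4')

# Both approaches should label every row identically
print((nested == selected).all())
```

With more than two or three conditions, numpy.select tends to be easier to read, because the conditions and their labels sit in two parallel lists instead of a chain of nested calls.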

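Thanks @jezrael. It seems to work fine when I try this example, but not in my own code, and I am trying to figure out why. Also, does this solution handle rows where features 1, 2 and 3 are not all 1 and only the first one is set (for example, row 3)? – Shraddha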

It works now! I had the 0/1 values stored as strings, which is why it returned the default value 4 every time. Thanks! – Shraddha

Glad I could help! Have a nice day! – jezrael