Reordering items in two pandas DataFrame columns

I have a pandas DataFrame with three columns, part of which is shown below:

import pandas as pd

data = {'T1': {0: 'Belarus', 1: 'Netherlands', 2: 'France', 3: 'Faroe Islands',
     4: 'Hungary'}, 'T2': {0: 'Sweden', 1: 'Bulgaria', 2: 'Luxembourg',
     3: 'Andorra', 4: 'Portugal'}, 'score': {0: -4, 1: 2, 2: 0, 3: 1, 4: -1}}
df = pd.DataFrame(data)
#   T1    T2 score
#0  Belarus  Sweden  -4 
#1 Netherlands Bulgaria  2 
#2   France Luxembourg  0 
#3 Faroe Islands  Andorra  1 
#4  Hungary Portugal  -1

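For any row where the items in T1 and T2 are not in alphabetical order (e.g., "Netherlands" and "Bulgaria"), I want to swap the items and also flip the sign of score.

I was able to come up with a monster: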

df.apply(lambda x: 
      pd.Series([x["T2"], x["T1"], -x["score"]]) 
      if (x["T1"] > x["T2"]) 
      else pd.Series([x["T1"], x["T2"], x["score"]]), 
     axis=1) 
#   0    1 2 
#0 Belarus   Sweden -4 
#1 Bulgaria Netherlands -2 
#2 France  Luxembourg 0 
#3 Andorra Faroe Islands -1 
#4 Hungary  Portugal -1

Is there a better way to get the same result? (Performance is not an issue.)

2017-09-15 DyZ

Option 1
Boolean indexing.

m = df.T1 > df.T2 
m 

0 False 
1  True 
2 False 
3  True 
4 False 
dtype: bool 

df.loc[m, 'score'] = df.loc[m, 'score'].mul(-1) 
df.loc[m, ['T1', 'T2']] = df.loc[m, ['T2', 'T1']].values 
df 

     T1    T2 score 
0 Belarus   Sweden  -4 
1 Bulgaria Netherlands  -2 
2 France  Luxembourg  0 
3 Andorra Faroe Islands  -1 
4 Hungary  Portugal  -1
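
Note that the options in this answer modify df in place, so each one assumes it starts from the original, unswapped frame. As a minimal sketch of the same boolean-mask technique that leaves the input untouched, the steps can be wrapped in a function. The name swap_unordered is hypothetical and not part of the answer; the column names T1, T2 and score come from the question.

```python
import pandas as pd


def swap_unordered(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which T1 <= T2 in every row.

    Rows that had to be swapped get their score negated.
    (Hypothetical helper, shown for illustration only.)
    """
    out = frame.copy()
    mask = out["T1"] > out["T2"]  # True where the pair is out of order
    # .to_numpy() drops the column labels, so the values are assigned
    # by position instead of being re-aligned to T1/T2 by name.
    out.loc[mask, ["T1", "T2"]] = out.loc[mask, ["T2", "T1"]].to_numpy()
    out.loc[mask, "score"] = -out.loc[mask, "score"]
    return out


df = pd.DataFrame({
    "T1": ["Belarus", "Netherlands", "France", "Faroe Islands", "Hungary"],
    "T2": ["Sweden", "Bulgaria", "Luxembourg", "Andorra", "Portugal"],
    "score": [-4, 2, 0, 1, -1],
})
result = swap_unordered(df)
print(result)
```

Because the function works on a copy, it can be called repeatedly on the same input, whereas the in-place version finds nothing left to swap on a second run.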

Option 2
df.eval

m = df.eval('T1 > T2') 
df.loc[m, 'score'] = df.loc[m, 'score'].mul(-1) 
df.loc[m, ['T1', 'T2']] = df.loc[m, ['T2', 'T1']].values 
df 

     T1    T2 score 
0 Belarus   Sweden  -4 
1 Bulgaria Netherlands  -2 
2 France  Luxembourg  0 
3 Andorra Faroe Islands  -1 
4 Hungary  Portugal  -1

Option 3
df.query

idx = df.query('T1 > T2').index 
idx 
Int64Index([1, 3], dtype='int64') 

df.loc[idx, 'score'] = df.loc[idx, 'score'].mul(-1) 
df.loc[idx, ['T1', 'T2']] = df.loc[idx, ['T2', 'T1']].values 
df 

     T1    T2 score 
0 Belarus   Sweden  -4 
1 Bulgaria Netherlands  -2 
2 France  Luxembourg  0 
3 Andorra Faroe Islands  -1 
4 Hungary  Portugal  -1

2017-09-15 03:19:44

Not as neat as @cᴏʟᴅsᴘᴇᴇᴅ's answer, but it works.

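import numpy as np

df1 = df[['T1', 'T2']].copy()  # explicit copy so df itself is not modified
df1[['T1', 'T2']] = np.sort(df1.values, axis=1)  # sort each row alphabetically
df1['new'] = np.where((df1 != df[['T1', 'T2']]).any(axis=1), -df.score, df.score)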

df1 
Out[102]: 
     T1    T2 new 
0 Belarus   Sweden -4 
1 Bulgaria Netherlands -2 
2 France  Luxembourg 0 
3 Andorra Faroe Islands -1 
4 Hungary  Portugal -1

2017-09-15 03:49:56 Wen

You need to print out df1 :) –

@cᴏʟᴅsᴘᴇᴇᴅ Yes, you are right~ :) – Wen

Using loc

cond = df.T1 > df.T2 
df.loc[cond, 'score'] = df['score'] * -1
df.loc[cond, ['T1', 'T2']] = df.loc[cond, ['T2', 'T1']].values

Output

T1   T2    score 
0 Belarus  Sweden   -4 
1 Bulgaria Netherlands  -2 
2 France  Luxembourg  0 
3 Andorra  Faroe Islands -1 
4 Hungary  Portugal  -1
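
The score line above assigns the full negated column, yet only the rows selected by cond change. That works because loc aligns the right-hand Series on the index before writing. A minimal sketch of just that step, using a hypothetical one-column frame named scores:

```python
import pandas as pd

# Hypothetical one-column frame holding the scores from the question.
scores = pd.DataFrame({"score": [-4, 2, 0, 1, -1]})
cond = pd.Series([False, True, False, True, False])

# The right-hand side covers all five rows; loc aligns it on the index
# and writes only the rows where cond is True (rows 1 and 3).
scores.loc[cond, "score"] = scores["score"] * -1
print(scores["score"].tolist())
```

The T1/T2 swap, by contrast, needs .values on the right-hand side: without it pandas would align on the column labels and put every value back where it started.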

2017-09-15 04:21:54 Vaishali

loc has already been mentioned here: https://stackoverflow.com/a/46231172/4909087 –

But... thanks to this I realized I also needed to swap the values, so no problem ;-) –

Here is a fun and creative way using numpy tools

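import numpy as np

t = df[['T1', 'T2']].values
a = t.argsort(1)  # per-row column order that sorts each pair

# reorder each row of t by its own argsort result
df[['T1', 'T2']] = t[np.arange(len(t))[:, None], a]
# the @ operator requires Python 3.5+ (thanks @cᴏʟᴅsᴘᴇᴇᴅ)
# otherwise use
# df['score'] *= a.dot([-1, 1])
df['score'] *= a @ [-1, 1]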

df 

     T1    T2 score 
0 Belarus   Sweden  -4 
1 Bulgaria Netherlands  -2 
2 France  Luxembourg  0 
3 Andorra Faroe Islands  -1 
4 Hungary  Portugal  -1
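
To see why multiplying by a @ [-1, 1] flips the right signs, it may help to trace the trick on two rows from the question. This is an illustrative sketch only; the names sorted_pairs and sign are not part of the answer.

```python
import numpy as np

# Two rows from the question: one already ordered, one not.
t = np.array([["Belarus", "Sweden"],
              ["Netherlands", "Bulgaria"]], dtype=object)

a = t.argsort(axis=1)
# a[i] is [0, 1] when row i is already in order, [1, 0] when it must be swapped.

# Fancy indexing: the row index i is paired with the column order a[i].
sorted_pairs = t[np.arange(len(t))[:, None], a]

# [0, 1] . [-1, 1] = 1 (keep the sign); [1, 0] . [-1, 1] = -1 (flip it).
sign = a.dot([-1, 1])

print(a.tolist())
print(sorted_pairs.tolist())
print(sign.tolist())
```

The dot product turns each row's argsort result into +1 or -1, which is then multiplied into score.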

2017-09-15 06:00:29 piRSquared

'@'? What syntax is this? –

Python 3 array multiplication... I should have said (-: – piRSquared

Do you mean 3.6? This throws a syntax error on any Python <= 3.4 –
