Pivot a series of rating columns in pandas

In pandas, I have a DataFrame in which each row corresponds to a user and each column holds a variable about that user, including how they rated certain things:

+----------------+-------------------+----------+----------+
| name           | email             | rating_a | rating_b |
+----------------+-------------------+----------+----------+
| Someone        | [email protected] |      7.8 |      9.9 |
| Someone Else   | [email protected] |      2.4 |      9.2 |
| Another Person | [email protected] |      3.5 |      7.5 |
+----------------+-------------------+----------+----------+

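I want to pivot the table so that one column holds the type of rating (a or b), another holds the rating value (7.8, 3.5, etc.), and the remaining columns stay as they are, like this: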

+----------------+-------------------+-------------+--------------+
| name           | email             | rating_type | rating_value |
+----------------+-------------------+-------------+--------------+
| Someone        | [email protected] | a           |          7.8 |
| Someone        | [email protected] | b           |          9.9 |
| Someone Else   | [email protected] | a           |          2.4 |
| Someone Else   | [email protected] | b           |          9.2 |
| Another Person | [email protected] | a           |          3.5 |
| Another Person | [email protected] | b           |          7.5 |
+----------------+-------------------+-------------+--------------+

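The pandas melt method seems to be the right tool, but I'm not sure what my id_vars and value_vars should be in this case. It also seems to drop every column that is in neither of those two groups, such as the email address, and I want to keep all of that information.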

How can I do this with pandas?

Source

2017-05-29 Miguel

You can use melt, with str.replace to change the column names first:

# Strip the 'rating_' prefix so the rating columns are named just 'a' and 'b'
df.columns = df.columns.str.replace('rating_', '')
# Every column not listed in id_vars is melted into rating_type / rating_value
df = df.melt(id_vars=['name', 'email'], var_name='rating_type', value_name='rating_value')
print(df)
             name              email rating_type  rating_value
0         Someone  [email protected]           a           7.8
1    Someone Else  [email protected]           a           2.4
2  Another Person  [email protected]           a           3.5
3         Someone  [email protected]           b           9.9
4    Someone Else  [email protected]           b           9.2
5  Another Person  [email protected]           b           7.5
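To address the id_vars / value_vars question directly, here is a minimal self-contained sketch. The email addresses are hypothetical placeholders, since the real ones are redacted in the question. id_vars lists the columns to keep as they are (so email is preserved), while value_vars lists the columns to unpivot. When value_vars is omitted, melt uses every column that is not in id_vars.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are redacted in the question.
df = pd.DataFrame({
    "name": ["Someone", "Someone Else", "Another Person"],
    "email": ["someone@example.com", "someone.else@example.com",
              "another.person@example.com"],
    "rating_a": [7.8, 2.4, 3.5],
    "rating_b": [9.9, 9.2, 7.5],
})

long_df = df.melt(
    id_vars=["name", "email"],            # kept as identifier columns
    value_vars=["rating_a", "rating_b"],  # unpivoted into rows
    var_name="rating_type",
    value_name="rating_value",
)

# Turn 'rating_a' / 'rating_b' into 'a' / 'b' after melting.
long_df["rating_type"] = long_df["rating_type"].str.replace(
    "rating_", "", regex=False
)
print(long_df)
```

Stripping the prefix after melting, rather than renaming the columns first, leaves the original df untouched.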

Another solution, with set_index + stack + rename_axis + reset_index:

df.columns = df.columns.str.replace('rating_', '')
# Parentheses let the method chain span several lines
df = (df.set_index(['name', 'email'])
        .stack()
        .rename_axis(['name', 'email', 'rating_type'])
        .reset_index(name='rating_value'))
print(df)
             name              email rating_type  rating_value
0         Someone  [email protected]           a           7.8
1         Someone  [email protected]           b           9.9
2    Someone Else  [email protected]           a           2.4
3    Someone Else  [email protected]           b           9.2
4  Another Person  [email protected]           a           3.5
5  Another Person  [email protected]           b           7.5
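As a quick sanity check, the sketch below runs the melt and stack approaches on the same sample frame and confirms they hold the same rows once both are sorted the same way. The email addresses are again hypothetical placeholders, and the variable names are chosen for illustration only.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are redacted in the question.
df = pd.DataFrame({
    "name": ["Someone", "Someone Else", "Another Person"],
    "email": ["someone@example.com", "someone.else@example.com",
              "another.person@example.com"],
    "rating_a": [7.8, 2.4, 3.5],
    "rating_b": [9.9, 9.2, 7.5],
})
df.columns = df.columns.str.replace("rating_", "", regex=False)

melted = df.melt(id_vars=["name", "email"],
                 var_name="rating_type", value_name="rating_value")

stacked = (
    df.set_index(["name", "email"])
      .stack()
      .rename_axis(["name", "email", "rating_type"])
      .reset_index(name="rating_value")
)

# Put both results in the same row order before comparing.
key = ["name", "email", "rating_type"]
melted_sorted = melted.sort_values(key).reset_index(drop=True)
stacked_sorted = stacked.sort_values(key).reset_index(drop=True)

# Raises AssertionError if the two frames differ in values.
pd.testing.assert_frame_equal(melted_sorted, stacked_sorted, check_dtype=False)
```

The two approaches differ only in row order: melt groups rows by rating type, while stack keeps each user's ratings together.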

A solution with melt if you need to change the row order:

df.columns = df.columns.str.replace('rating_', '')
# reset_index() exposes the original row position as an 'index' column,
# which is used to sort the melted rows and is then dropped
df = (df.reset_index()
        .melt(id_vars=['index', 'name', 'email'],
              var_name='rating_type',
              value_name='rating_value')
        .sort_values(['index', 'rating_type'])
        .drop('index', axis=1)
        .reset_index(drop=True))
print(df)
             name              email rating_type  rating_value
0         Someone  [email protected]           a           7.8
1         Someone  [email protected]           b           9.9
2    Someone Else  [email protected]           a           2.4
3    Someone Else  [email protected]           b           9.2
4  Another Person  [email protected]           a           3.5
5  Another Person  [email protected]           b           7.5
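A variation on the same idea, not part of the original answer: on pandas 1.1 or later, melt accepts ignore_index=False, which keeps the original row labels on the melted rows, so a stable sort on the index groups each user's ratings together without a temporary 'index' column. This is a sketch under that version assumption, with hypothetical placeholder emails.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are redacted in the question.
df = pd.DataFrame({
    "name": ["Someone", "Someone Else", "Another Person"],
    "email": ["someone@example.com", "someone.else@example.com",
              "another.person@example.com"],
    "rating_a": [7.8, 2.4, 3.5],
    "rating_b": [9.9, 9.2, 7.5],
})

ordered = (
    df.rename(columns=lambda c: c.replace("rating_", ""))
      .melt(id_vars=["name", "email"],
            var_name="rating_type",
            value_name="rating_value",
            ignore_index=False)      # keep original row labels 0, 1, 2
      .sort_index(kind="mergesort")  # stable sort: 'a' stays before 'b' per user
      .reset_index(drop=True)
)
print(ordered)
```

Because mergesort is stable, rows with the same original label keep the order melt produced them in, which follows the column order of the input frame.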

Source

2017-05-29 07:38:00 jezrael
