2016-09-28

Expanding rows in a pandas DataFrame

I have the following data:

product Sales_band Hour_id sales 
prod_1 HIGH   1 200 
prod_1 HIGH   3 100 
prod_1 HIGH   4 300 
prod_1 VERY HIGH  2 100 
prod_1 VERY HIGH  5 253 
prod_1 VERY HIGH  6 234 
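
For anyone who wants to reproduce the sample, it could be built roughly like this. This is only a sketch: the variable name `df` matches the one used in the answer, and the integer dtypes are an assumption, since the post shows only printed values.

```python
import pandas as pd

# Sample data copied from the table above; dtypes are assumed to be integers.
df = pd.DataFrame({
    "product": ["prod_1"] * 6,
    "Sales_band": ["HIGH"] * 3 + ["VERY HIGH"] * 3,
    "Hour_id": [1, 3, 4, 2, 5, 6],
    "sales": [200, 100, 300, 100, 253, 234],
})

print(df)
```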

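I want to add rows based on the Hour_id value. Hour_id can take values from 1 to 10, so the data above should be expanded wherever an hour id is missing. Dummy output (sales = 0 for the missing hour ids):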

product Sales_band Hour_id sales 
prod_1 HIGH   1 200 
prod_1 HIGH   2 0 
prod_1 HIGH   3 100 
prod_1 HIGH   4 300 
prod_1 HIGH   5 0 
prod_1 HIGH   6 0 
prod_1 HIGH   7 0 
prod_1 HIGH   8 0 
prod_1 HIGH   9 0 
prod_1 HIGH   10 0 
prod_1 VERY HIGH  1 0 
prod_1 VERY HIGH  2 100 
prod_1 VERY HIGH  3 0 
prod_1 VERY HIGH  4 0 
prod_1 VERY HIGH  5 253 
prod_1 VERY HIGH  6 234 
prod_1 VERY HIGH  7 0 
prod_1 VERY HIGH  8 0 
prod_1 VERY HIGH  9 0 
prod_1 VERY HIGH  10 0 

How can I do this with a pandas DataFrame in Python?

So you should end up with 10 rows for each product and sales band? –

Yes, that would be the ideal final output – Mukul

Answer

Use groupby with reindex:

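# Per (product, Sales_band) group, index by Hour_id and reindex to hours 1..10,
# filling the hours that were missing with sales = 0.
print(df.groupby(['product', 'Sales_band'])[['Hour_id', 'sales']]
        .apply(lambda x: x.set_index('Hour_id').reindex(range(1, 11), fill_value=0))
        .reset_index())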

    product Sales_band Hour_id sales 
0 prod_1  HIGH  1 200 
1 prod_1  HIGH  2  0 
2 prod_1  HIGH  3 100 
3 prod_1  HIGH  4 300 
4 prod_1  HIGH  5  0 
5 prod_1  HIGH  6  0 
6 prod_1  HIGH  7  0 
7 prod_1  HIGH  8  0 
8 prod_1  HIGH  9  0 
9 prod_1  HIGH  10  0 
10 prod_1 VERY HIGH  1  0 
11 prod_1 VERY HIGH  2 100 
12 prod_1 VERY HIGH  3  0 
13 prod_1 VERY HIGH  4  0 
14 prod_1 VERY HIGH  5 253 
15 prod_1 VERY HIGH  6 234 
16 prod_1 VERY HIGH  7  0 
17 prod_1 VERY HIGH  8  0 
18 prod_1 VERY HIGH  9  0 
19 prod_1 VERY HIGH  10  0 
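
The same result can probably also be reached without `groupby`, by reindexing once against an explicit index of every observed (product, Sales_band) pair crossed with hours 1 to 10. The following is a sketch of that alternative, not the method above; the names `pairs`, `full_index` and `expanded` are made up for illustration, and it assumes each (product, Sales_band, Hour_id) combination appears at most once in the input.

```python
import pandas as pd

# Sample data from the question (integer dtypes assumed).
df = pd.DataFrame({
    "product": ["prod_1"] * 6,
    "Sales_band": ["HIGH"] * 3 + ["VERY HIGH"] * 3,
    "Hour_id": [1, 3, 4, 2, 5, 6],
    "sales": [200, 100, 300, 100, 253, 234],
})

# Distinct (product, Sales_band) pairs, in order of first appearance.
pairs = df[["product", "Sales_band"]].drop_duplicates().itertuples(index=False)

# Every observed pair combined with every hour from 1 to 10.
full_index = pd.MultiIndex.from_tuples(
    [(p, b, h) for p, b in pairs for h in range(1, 11)],
    names=["product", "Sales_band", "Hour_id"],
)

# Reindex against the full index; rows that did not exist get sales = 0.
expanded = (
    df.set_index(["product", "Sales_band", "Hour_id"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)

print(expanded)
```

Building the index from observed pairs, rather than from the product of all unique products and all unique bands, avoids creating rows for combinations that never occur in the data.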

Thanks a lot. It worked. I will read more about set_index and reindex. – Mukul

Thank you for accepting! Have a nice day! – jezrael