Plotting the number of occurrences of column values

I hope the title is accurate enough, but I'm not quite sure how to phrase it.

Anyway, my problem is that I have a pandas DataFrame that looks like this:

    Customer Source  CustomerSource
0      Apple      A             141
1      Apple      B              36
2  Microsoft      A             143
3     Oracle      C             225
4        Sun      C             151

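This DataFrame is derived from a larger dataset. The value of CustomerSource is the accumulated count of all occurrences of a given Customer and Source pair. For example, in this case there are 141 occurrences of Apple with Source A, 36 of Apple with Source B, 225 of Oracle with Source C, and so on.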
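For readers who want to reproduce a table like this, here is a minimal sketch of how such counts could be accumulated. The question does not show the larger dataset, so the `raw` frame below and its contents are purely hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical raw data (not from the question): one row per occurrence
# of a (Customer, Source) pair.
raw = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Apple", "Oracle", "Oracle"],
    "Source":   ["A", "A", "B", "C", "C"],
})

# Count how many times each (Customer, Source) pair occurs and store the
# count in a column named CustomerSource, as in the question's table.
counts = (
    raw.groupby(["Customer", "Source"])
       .size()
       .reset_index(name="CustomerSource")
)
print(counts)
```

The result has one row per pair, in the same long format as the DataFrame shown above.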

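What I want to do is make a stacked bar plot with all Customers on the x-axis and the CustomerSource values for each Source stacked on top of each other on the y-axis. Any hints on how I should go about this?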


2017-09-05 Khaine775

You can reshape with pivot or unstack, and then use DataFrame.plot.bar:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.set_index(['Customer','Source'])['CustomerSource'].unstack().plot.bar(stacked=True)
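As a self-contained sketch of the reshape-then-plot approach, the following rebuilds the question's DataFrame and plots it. The `matplotlib.use("Agg")` line and the y-axis label are my own additions, and `wide` is just an illustrative variable name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in a notebook
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# Long -> wide: one row per Customer, one column per Source.
# Pairs that do not occur (e.g. Sun with Source A) become NaN.
wide = df.pivot(index="Customer", columns="Source", values="CustomerSource")
print(wide)

# Each row becomes one bar on the x-axis; each column becomes one
# stacked segment of that bar.
ax = wide.plot.bar(stacked=True)
ax.set_ylabel("CustomerSource")
```

In an interactive session you would typically follow this with `matplotlib.pyplot.show()` or save the figure via `ax.get_figure().savefig(...)`.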

Or, if there are duplicate Customer, Source pairs, use pivot_table or groupby with the aggregate function sum:

print(df)
    Customer Source  CustomerSource
0      Apple      A             141  <- same Apple, A
1      Apple      A             200  <- same Apple, A
2      Apple      B              36
3  Microsoft      A             143
4     Oracle      C             225
5        Sun      C             151

df1 = df.pivot_table(index='Customer', columns='Source', values='CustomerSource', aggfunc='sum')
print(df1)
Source         A     B      C
Customer
Apple      341.0  36.0    NaN  <- 141 + 200 = 341
Microsoft  143.0   NaN    NaN
Oracle       NaN   NaN  225.0
Sun          NaN   NaN  151.0


(df.pivot_table(index='Customer', columns='Source', values='CustomerSource', aggfunc='sum')
   .plot.bar(stacked=True))

df.groupby(['Customer','Source'])['CustomerSource'].sum().unstack().plot.bar(stacked=True)
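To check that the two aggregating variants give the same table on the duplicated data, here is a small sketch. The `fill_value=0` variant at the end is an optional extra that is not part of the answer above; it only replaces the NaN cells with zeros. The variable names are illustrative.

```python
import pandas as pd

# The DataFrame with a duplicated (Apple, A) pair, as printed above.
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "A", "B", "A", "C", "C"],
    "CustomerSource": [141, 200, 36, 143, 225, 151],
})

# Variant 1: pivot_table sums the duplicates while reshaping.
by_pivot = df.pivot_table(index="Customer", columns="Source",
                          values="CustomerSource", aggfunc="sum")

# Variant 2: sum per (Customer, Source) first, then move Source to columns.
by_groupby = (df.groupby(["Customer", "Source"])["CustomerSource"]
                .sum()
                .unstack())

# Optional (my addition): show missing pairs as 0 instead of NaN.
filled = df.pivot_table(index="Customer", columns="Source",
                        values="CustomerSource", aggfunc="sum",
                        fill_value=0)

print(by_pivot)
print(by_groupby)
print(filled)
```

Any of the three results can be passed to `.plot.bar(stacked=True)` in the same way as before.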

You can also swap the columns:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.pivot(index='Source', columns='Customer', values='CustomerSource').plot.bar(stacked=True)
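To illustrate what swapping does, the sketch below compares the two pivots on the question's data. With Source as the index, the x-axis shows sources and the stacked segments are customers; the table is simply the transpose of the first one. The variable names are illustrative.

```python
import pandas as pd

# The DataFrame from the question (no duplicate pairs).
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# Customers on the x-axis, one stacked segment per Source.
by_customer = df.pivot(index="Customer", columns="Source",
                       values="CustomerSource")

# Sources on the x-axis, one stacked segment per Customer.
by_source = df.pivot(index="Source", columns="Customer",
                     values="CustomerSource")

print(by_source)
```

Since one table is the transpose of the other, `by_customer.T.plot.bar(stacked=True)` should draw the same chart as plotting `by_source`.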


2017-09-05 12:25:35 jezrael
