How to group and display the top X categories using ggplot?

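I am trying to use ggplot to plot production data by company, using the color of the points to indicate the year. The following chart shows an example based on sample data: [Figure: example plot of Quantity by Company, colored by Year]

However, my real data often has 50-60 different companies, which makes the company names on the Y axis tightly packed and not very aesthetically pleasing.

What is the simplest way to show data for only the top 5 companies (ranked by 2011 quantity) and then show the rest aggregated and labelled as "Other"?

Below is some sample data and the code I used to create the example chart: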

library(ggplot2)

# create some sample data
company_names <- c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")

df1 <- data.frame(Company = company_names,
                  Quantity = c(1,2,3,4,5,6,7,8,9,10),
                  Year = 2010)

df2 <- data.frame(Company = company_names,
                  Quantity = c(3,4,7,8,5,14,7,13,2,1),
                  Year = 2011)

df <- rbind(df1, df2)

# create plot: one point per company and year, colored by year
p <- ggplot(data = df, aes(Quantity, Company)) +
    geom_point(aes(color = factor(Year)), size = 4)
p

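I started going down a brute-force path, but figured there is probably a simple and elegant way to do this that I should learn. Any help would be appreciated.

Source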

2012-04-19 MikeTP

How about this:

df2011 <- subset (df, Year == 2011)
companies <- df2011$Company [order (df2011$Quantity, decreasing = TRUE)]
ggplot (data = subset (df, Company %in% companies [1 : 5]),
        aes (Quantity, Company)) +
  geom_point (aes (color = factor (Year)), size = 4)

BTW: for code to be called elegant, spend a few spaces; they are not expensive...

Source

2012-04-19 19:42:28 cbeleites

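Nice, but I'd like not to just drop the small companies but rather aggregate them and show them as "Other". – MikeTP 2012-04-19 19:46:54

Sorry, I did overlook that part of the question. But in any case, it's not clear to me how exactly you want to aggregate them. Box plot? Plot all points, but in one row? Mean? Median? – cbeleites 2012-04-19 20:26:23

Basically, just create a new company name called "Other" that aggregates the companies not in the "top 5". So along the company axis there would be a total of 6 "companies": the individual top 5 companies plus "Other", which is the sum of all the non-top-5 companies. I think basically just do something like.... df2 = subset(df, !(Company %in% companies[1:5])) – MikeTP 2012-04-19 20:40:00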
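The idea in the last comment can be sketched as follows. This is only an illustration under the question's sample data; the names `df_rest` and `other_totals` are made up here and do not appear in the thread.

```r
# Sample data from the question
company_names <- c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")
df <- rbind(
  data.frame(Company = company_names, Quantity = c(1,2,3,4,5,6,7,8,9,10), Year = 2010),
  data.frame(Company = company_names, Quantity = c(3,4,7,8,5,14,7,13,2,1), Year = 2011)
)

# Rank companies by their 2011 quantity, highest first
df2011 <- subset(df, Year == 2011)
companies <- df2011$Company[order(df2011$Quantity, decreasing = TRUE)]

# Rows for every company outside the top five (hypothetical name df_rest)
df_rest <- subset(df, !(Company %in% companies[1:5]))

# Total quantity of those companies per year: the values for "Other"
other_totals <- tapply(df_rest$Quantity, df_rest$Year, sum)
other_totals
```

With the sample data, the "Other" totals come out as 27 for 2010 and 15 for 2011.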

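See if this is what you want. It takes your df data frame and some of the ideas already suggested by @cbeleites. The steps are:

1. Select the 2011 data and order the companies from highest to lowest quantity.

2. Split df into two parts: dftop, which contains the data for the top 5 companies; and dfother, which contains the aggregated data for the remaining companies (using ddply() from the plyr package).

3. Put the two data frames together to give dfnew.

4. Set the order in which the company levels are plotted: from top to bottom, highest to lowest, followed by "Other". The order is given by companies, with "Other" added.

5. Plot as before.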

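library(ggplot2)
library(plyr)

# Step 1: rank companies by their 2011 quantity, highest first
df2011 <- subset(df, Year == 2011)
companies <- df2011$Company[order(df2011$Quantity, decreasing = TRUE)]

# Step 2: rows for the top 5 companies. as.character() works whether Company
# is a factor or a character vector (droplevels() fails on character vectors,
# which is what data.frame() creates by default since R 4.0)
dftop <- subset(df, Company %in% companies[1:5])
dftop$Company <- as.character(dftop$Company)

# Sum the remaining companies per year and label the result "Other"
dfother <- ddply(subset(df, !(Company %in% companies[1:5])), .(Year),
                 summarise, Quantity = sum(Quantity))
dfother$Company <- "Other"

# Step 3: rbind() matches the columns by name
dfnew <- rbind(dftop, dfother)

# Step 4: the first level is drawn at the bottom of the axis,
# so "Other" comes first and the top company last
dfnew$Company <- factor(dfnew$Company,
                        levels = c("Other", rev(as.character(companies)[1:5])))
levels(dfnew$Company) # Check that the levels are in the correct order

# Step 5
p <- ggplot(data = dfnew, aes(Quantity, Company)) +
     geom_point(aes(color = factor(Year)), size = 4)
p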

The code produces:

[Figure: resulting plot of the top five companies and "Other"]

Source
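Since the question asks about the top X categories in general, the same steps could be wrapped in a small helper without plyr. This is a sketch, not part of the original answer: the function name `top_n_with_other` and its arguments `n` and `ref_year` are hypothetical, and it assumes the columns are called Company, Quantity and Year as in the sample data.

```r
# Hypothetical helper: keep the n largest companies in ref_year,
# sum all remaining companies into "Other" for each year
top_n_with_other <- function(data, n = 5, ref_year = 2011) {
  ref <- data[data$Year == ref_year, ]
  ranked <- as.character(ref$Company[order(ref$Quantity, decreasing = TRUE)])
  top <- head(ranked, n)

  # Relabel everything outside the top n, then sum per company and year
  data$Company <- ifelse(data$Company %in% top, as.character(data$Company), "Other")
  out <- aggregate(Quantity ~ Company + Year, data = data, FUN = sum)

  # First level is drawn at the bottom of the axis: "Other", then lowest to highest
  out$Company <- factor(out$Company, levels = c("Other", rev(top)))
  out
}

# Sample data from the question
company_names <- c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")
df <- rbind(
  data.frame(Company = company_names, Quantity = c(1,2,3,4,5,6,7,8,9,10), Year = 2010),
  data.frame(Company = company_names, Quantity = c(3,4,7,8,5,14,7,13,2,1), Year = 2011)
)

res <- top_n_with_other(df, n = 3)
res
```

The returned data frame can be passed to the same ggplot() call as in Step 5. Ties in the reference year are broken by row order, which may or may not be what you want for your data.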

2012-04-20 05:40:26
