2011-07-11 92 views
70

Showing data values on stacked bar chart in ggplot2

I want to show data values on a stacked bar chart in ggplot2. Here is the code I tried:

Year  <- c(rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)) 
Category <- c(rep(c("A", "B", "C", "D"), times = 4)) 
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251) 
Data  <- data.frame(Year, Category, Frequency) 
library(ggplot2) 
p <- qplot(Year, Frequency, data = Data, geom = "bar", fill = Category,  theme_set(theme_bw())) 
p + geom_text(aes(label = Frequency), size = 3, hjust = 0.5, vjust = 3, position =  "stack") 


I want to show these data values in the middle of each segment. Any help in this regard would be greatly appreciated. Thanks.

+0

Related question: http://stackoverflow.com/questions/18994631/center-labels-stacked-bar-counts-ggplot2/18994840?noredirect=1#18994840

+0

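Not really the place for a debate, but I wonder whether it is possible to be too prescriptive about this, especially for a more general audience. [Here is a nice example](http://gyazo.com/d24ae31837cdf57457337328d4ce87b4) - the numbers show percentages that can be remembered, which removes the need for a scale whose small numerals some readers may find hard to read? – geotheory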

Answers

117

From ggplot2 2.2.0 onward, labels can easily be stacked by using position = position_stack(vjust = 0.5) in geom_text().

ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) + 
    geom_bar(stat = "identity") + 
    geom_text(size = 3, position = position_stack(vjust = 0.5)) 


Also note that "position_stack() and position_fill() now stack values in the reverse order of the grouping, which makes the default stack order match the legend."
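Since the note above mentions position_fill() alongside position_stack(), here is a minimal sketch of the same centering idea for bars scaled to proportions. It assumes ggplot2 2.2.0 or later and reuses the question's data; the choice of geom_col() and a filled position is an illustration, not something the answer itself shows.

```r
library(ggplot2)

# Same data as in the question
Year <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)

# Bars are rescaled to proportions; the labels use the matching
# position_fill(vjust = 0.5) so each one sits mid-segment
p <- ggplot(Data, aes(x = Year, y = Frequency, fill = Category)) +
    geom_col(position = "fill") +
    geom_text(aes(label = Frequency), size = 3,
              position = position_fill(vjust = 0.5))
print(p)
```

The label position has to match the bar position (stack with stack, fill with fill); otherwise the labels are placed on a different scale than the segments.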


Answer valid for older versions of ggplot2:

Here is one approach, which calculates the midpoints of the bars.

library(ggplot2) 
library(plyr) 

# calculate midpoints of bars (simplified using comment by @DWin) 
Data <- ddply(Data, .(Year), 
    transform, pos = cumsum(Frequency) - (0.5 * Frequency) 
) 

# library(dplyr) ## If using dplyr... 
# Data <- group_by(Data,Year) %>% 
# mutate(pos = cumsum(Frequency) - (0.5 * Frequency)) 

# plot bars and add text 
p <- ggplot(Data, aes(x = Year, y = Frequency)) + 
    geom_bar(aes(fill = Category), stat="identity") + 
    geom_text(aes(label = Frequency, y = pos), size = 3) 

Resultant chart
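To make the midpoint arithmetic concrete, here is a small base-R sketch of the same calculation, using ave() in place of ddply() (a swapped-in alternative, not what the answer uses). It assumes the rows within each Year are in the order the segments are stacked from the bottom up, which was the case for the older ggplot2 versions this answer targets.

```r
# Same data as in the question
Year <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)

# Midpoint of each segment = top of the segment (running total within
# the year) minus half of the segment's own height
Data$pos <- ave(Data$Frequency, Data$Year,
                FUN = function(f) cumsum(f) - 0.5 * f)

# For 2006-07 the running totals are 168, 427, 653, 993,
# so the midpoints are 84, 297.5, 540, 823
subset(Data, Year == "2006-07")
```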

+0

Thanks for your answer. I used 'data.table' instead of 'plyr' to do something similar, like this: 'Data.dt[, list(Category, Frequency, pos = cumsum(Frequency) - 0.5 * Frequency), by = Year]' – atomicules

16

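As Hadley mentioned, there are more effective ways of communicating your message than labels on a stacked bar chart. In fact, stacked charts are not very effective, because the bars (each Category) do not share a common baseline, so comparison is hard.

It is almost always better to use two graphs in these cases, sharing a common axis. In your example I am assuming that you want to show the overall totals and then the proportion each Category contributed in a given year.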

library(ggplot2)
library(grid)
library(gridExtra)
library(plyr)
library(scales)  # provides percent_format()

# create a new column with proportions
prop <- function(x) x / sum(x)
Data <- ddply(Data, "Year", transform, Share = prop(Frequency))

# create the component graphics
totals <- ggplot(Data, aes(Year, Frequency)) +
    geom_bar(fill = "darkseagreen", stat = "identity") +
    xlab("") + labs(title = "Frequency totals in given Year")
proportion <- ggplot(Data, aes(x = Year, y = Share, group = Category, colour = Category)) +
    geom_line() + scale_y_continuous(labels = percent_format()) +
    theme(legend.position = "bottom") +
    labs(title = "Proportion of total Frequency accounted by each Category in given Year")

# bring them together
grid.arrange(totals, proportion)

This will give you a two-panel display like this:

Vertically stacked 2 panel graphic

If you want to add the Frequency values, a table is the best format.
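One way to produce such a table, sketched with base R's xtabs() and addmargins() (the answer does not say which tool it had in mind, so this choice is an assumption):

```r
# Same data as in the question
Year <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)

# Cross-tabulate: one row per Category, one column per Year
tab <- xtabs(Frequency ~ Category + Year, data = Data)

# Append a "Sum" row holding the total for each Year
with_totals <- addmargins(tab, margin = 1)
print(with_totals)
```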