2017-04-21

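Manipulating boxplot aesthetics in R

I'm stuck on some details of a boxplot. I'm working with the following data (this is only a sample of it, of course):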

dput(birds[1:20,]) 
structure(list(status = c(1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L), length = c(1520L, 
1250L, 870L, 720L, 820L, 770L, 50L, 570L, 580L, 480L, 470L, 450L, 
435L, 275L, 256L, 230L, 330L, 330L, 300L, 180L), mass = c(9600, 
5000, 3360, 2517, 3170, 4390, 1930, 1020, 910, 590, 539, 940, 
684, 230, 162, 170, 501, 439, 386, 95), range = c(1.21, 0.56, 
0.07, 1.1, 3.45, 2.96, 0.01, 9.01, 7.9, 4.33, 1.04, 2.17, 4.81, 
0.31, 0.24, 0.77, 2.23, 0.22, 2.4, 0.69), migr = c(1L, 1L, 1L, 
3L, 3L, 2L, 1L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L), insect = c(12L, 0L, 0L, 12L, 0L, 0L, 0L, 6L, 6L, 0L, 12L, 
12L, 12L, 3L, 3L, 3L, 3L, 3L, 3L, 12L), diet = c(2L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 
2L)), .Names = c("status", "length", "mass", "range", "migr", 
"insect", "diet"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 22L 
), class = "data.frame") 

and created a plot like this:

give.n <- function(x){ 
    return(c(y = mean(x), label = length(x))) 
} 

plot2 <- ggplot(birds, aes(x = factor(diet, labels = c("herbivorous", "omnivorous","carnivorous")), 
          y = range, fill = factor(status,labels = c("absent", "present")))) + 
      geom_boxplot() + labs(x = "Diet", fill = "Status") + 
      stat_summary(fun.data = give.n, geom = "text") + 
      geom_jitter() 
plot2 


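Here is where I'm stuck:

  • I would like the jittered points to use different colours depending on the value of migr (note: migr is 1 - sedentary, 2 - sedentary & migratory, 3 - migratory), ideally with a legend as well. I tried adding + geom_jitter(birds, aes(x = factor(diet))), without success.
  • How can I move the numbers (the observation counts) to the middle of each boxplot? I tried different variations of position, but had no luck there either.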
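Could you add what 'give.n' is? I can give you an answer, but I need example data to check whether the positioning works. – Paolo

@pdil Sorry! It seems I forgot a piece of the code. I have now added it to the original question. Thanks for pointing it out. – Danka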

Answer

I suggest converting the variables to factors outside the ggplot call:

library(ggplot2) 

df <- birds 
df$diet <- factor(df$diet, levels = 1:3, labels = c("herbivorous", "omnivorous","carnivorous")) 
df$status <- factor(df$status, levels = 0:1, labels = c("absent", "present")) 
df$migr <- factor(df$migr, levels = 1:3, labels = c('sedentary', 'sedentary & migratory', 'migratory')) 

give.n <- function(x){ 
    return(c(y = mean(x), label = length(x))) 
} 

ggplot(df, aes(x = diet, y = range, fill = status)) + 
    geom_boxplot() + labs(x = "Diet", fill = "Status") + 
    stat_summary(fun.data = give.n, geom = "text", 
       position = position_dodge(width = 0.75)) + 
    geom_jitter(aes(color = migr)) + 
    scale_color_brewer(palette = 'Set1') 
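
One reason to set `levels` explicitly when building the factors: the 20-row sample above contains only diets 1 and 2, so `factor()` would infer two levels and the three labels would not fit. The following is a minimal base-R sketch of that behaviour; the short `diet` vector and the name `labs3` are illustrative stand-ins, not part of the original data.

```r
# Sample-like data: only diets 1 and 2 occur, as in the first 20 rows of `birds`
diet  <- c(2L, 1L, 1L, 2L, 1L)
labs3 <- c("herbivorous", "omnivorous", "carnivorous")

# Without `levels`, factor() infers two levels, so three labels raise an error;
# tryCatch() stores the error message instead of stopping the script
implicit <- tryCatch(factor(diet, labels = labs3),
                     error = function(e) conditionMessage(e))

# With explicit levels, all three labels are kept, even though one is unused
explicit <- factor(diet, levels = 1:3, labels = labs3)
levels(explicit)
table(explicit)
```

With the full data set, where all three diets presumably occur, both forms would give the same result; the explicit form simply does not depend on which values happen to be present.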

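Since the boxes are dodged and the points are jittered at the same time, it would be nice if the points lined up with their corresponding boxes. So we have to tell geom_jitter to group the points by status and to dodge them as well: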

ggplot(df, aes(x = diet, y = range, fill = status)) + 
    geom_boxplot() + labs(x = "Diet", fill = "Status") + 
    stat_summary(fun.data = give.n, geom = "text", 
       position = position_dodge(width = 0.75)) + 
    geom_jitter(aes(color = migr, group = status), 
       position = position_jitterdodge(dodge.width = 0.75)) + 
    scale_color_brewer(palette = 'Set1') 

If you want to change the jitter width, change the jitter.width argument of position_jitterdodge.
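
As a rough sketch of that adjustment, the call below narrows the horizontal spread of the points while keeping the same dodge width as the boxes. The data frame `toy` and the value 0.1 are assumptions chosen for illustration; they stand in for `df` and for whatever width suits the real plot.

```r
library(ggplot2)

# Toy data standing in for `df` (hypothetical values, for illustration only)
set.seed(1)
toy <- data.frame(
  diet   = factor(rep(c("herbivorous", "omnivorous"), each = 20)),
  status = factor(rep(c("absent", "present"), times = 20)),
  range  = runif(40, 0, 10)
)

p <- ggplot(toy, aes(x = diet, y = range, fill = status)) +
  geom_boxplot() +
  # jitter.width controls the horizontal spread of the points;
  # dodge.width = 0.75 matches the default dodging of the boxes
  geom_jitter(aes(group = status),
              position = position_jitterdodge(jitter.width = 0.1,
                                              dodge.width = 0.75))
p
```

Smaller values of jitter.width keep the points closer to the centre of their box; larger values spread them out until they start to overlap the neighbouring box.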

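This works like a charm, thank you! One more thing: the outliers are drawn in black. How can I get them displayed in their respective colours (and ideally add a label, so I know which observations are the outliers)? – Danka

Are you using a different data set? – mt1022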