2017-07-26 83 views

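Problem ordering a geom_segment chart

I would be grateful for any input. I'm new to ggplot!

I am trying to create a Cleveland dot plot faceted by a cluster variable with 3 levels. I have 3 problems that I am struggling with:

  1. Within each cluster, I would like the points ordered by my continuous x-var. The code below does not order them correctly.

  2. Is it possible to change the point shape depending on whether the y-var ends in 0 (does not have the feature) or 1 (does have the feature)?

  3. I have a variable in my dataset (Population) that gives the overall percentage of the population with each feature. I would like to see whether a feature is over- or under-represented in the cluster compared with the population, so I want to add a point for it on the same row as each y-var.

Here is my code: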

ggplot(cl1, aes(x=Cluster_prop, y=reorder(Var, Cluster_prop)))+ 
    geom_segment(aes(yend=Var), xend=0, colour="grey50")+ 
    geom_point(size=3, aes(colour=Cluster))+ 
    facet_grid(Cluster~., scales="free_y", space="free_y") + 
    ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature") 

Here is my data:

> dput(cl1) 
structure(list(Var = structure(c(2L, 3L, 5L, 7L, 14L, 16L, 18L, 
19L, 20L, 22L, 15L, 9L, 7L, 6L, 21L, 13L, 17L, 12L, 4L, 11L, 
15L, 17L, 21L, 1L, 13L, 4L, 10L, 12L, 6L, 8L), .Label = c("asthdoc_1", 
"AttacksOnExer_1_0", "AttacksTTT_1_0", "AttacksTTT_1_1", "Breath0rmal_1_0", 
"Breath0rmal_1_1", "CAsthmaMed_1_0", "CAsthmaMed_1_1", "CCurrentAsthma_1_0", 

"CCurrentAsthma_1_1", "CongColds_1_1", "CoughNight_1_1", 
"CoughWithColds_1_1", 
"EverWheeze_1_0", "EverWheeze_1_1", "Wheeze6M_1_0", "Wheeze6M_1_1", 
"WheezeMostDays_1_0", "WheezeOcc_1_0", "WheezeWithColds_1_0", 
"WheezeWithColds_1_1", "WheezeWithShort_1_0"), class = "factor"), 
    Cluster_prop = c(100, 100, 100, 100, 100, 100, 100, 100, 
    100, 100, 100, 99.4219653, 98.8439306, 95.3757225, 94.7976879, 
    83.2369942, 79.1907514, 53.7572254, 50.867052, 50.867052, 
    100, 100, 100, 93.103448, 89.655172, 86.206897, 86.206897, 
    82.758621, 79.310345, 79.310345), Population = c(96.131528, 
    78.143133, 63.636364, 95.16441, 60.928433, 67.891683, 97.485493, 
    89.555126, 62.669246, 90.32882, 39.071567, 94.584139, 95.16441, 
    36.363636, 37.330754, 68.665377, 32.108317, 43.520309, 21.856867, 
    42.166344, 39.071567, 32.108317, 37.330754, 9.864603, 68.665377, 
    21.856867, 5.415861, 43.520309, 36.363636, 4.83559), Cluster = 
structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1", 
"2", "3"), class = "factor")), .Names = c("Var", "Cluster_prop", 
"Population", "Cluster"), row.names = c(NA, -30L), vars = "Cluster", drop = 
TRUE, indices = list(
0:9, 10:19, 20:29), group_sizes = c(10L, 10L, 10L), biggest_group_size = 
10L, labels = structure(list(
Cluster = 1:3), row.names = c(NA, -3L), class = "data.frame", vars = 
"Cluster", drop = TRUE, .Names = "Cluster"), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame")) 

Any suggestions are much appreciated!


Answer


For your second (edit: and third) issue:

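library(tidyverse) 
library(stringr)   # provides str_sub()

# str_sub(x, start = -1, end = -1) returns the last character of each string,
# which here is the trailing 0 or 1 of Var.
cl2 <- cl1 %>% mutate(Shape = str_sub(Var, start = -1, end = -1)) 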


ggplot(cl2, aes(x=Cluster_prop, y=reorder(Var, Cluster_prop)))+ 
    geom_segment(aes(yend=Var), xend=0, colour="grey50")+ 
    geom_point(size=3, aes(colour=Cluster, shape = Shape))+ 
    geom_point(aes(x = Population), size = 2, color = "black")+ 
    facet_grid(Cluster~., scales="free_y", space="free_y") + 
    ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature") 
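
To make the shape mapping easier to read in the legend, the trailing character can be turned into a labelled factor. This is only a minimal sketch on a made-up three-row data frame; the names toy, last_char and has_feature, and the legend labels, are assumptions for illustration and are not in the original post. It uses base R substr() in place of str_sub().

```r
library(ggplot2)

# Three variable names in the same style as the question's Var column.
vars <- c("AttacksTTT_1_0", "AttacksTTT_1_1", "asthdoc_1")

# Last character of each name: "0" = feature absent, "1" = feature present.
last_char <- substr(vars, nchar(vars), nchar(vars))

# Labelled factor so the legend shows words instead of 0/1 (labels are made up).
has_feature <- factor(last_char, levels = c("0", "1"),
                      labels = c("No feature", "Has feature"))

# Made-up percentages, only to have something to draw.
toy <- data.frame(Var = vars, Cluster_prop = c(100, 50, 93),
                  has_feature = has_feature)

# Open circle (1) for "No feature", filled circle (16) for "Has feature".
p <- ggplot(toy, aes(x = Cluster_prop, y = Var)) +
  geom_point(size = 3, aes(shape = has_feature)) +
  scale_shape_manual(values = c("No feature" = 1, "Has feature" = 16))
```

Printing p draws the plot. The same factor could replace the Shape column in cl2 if named legend entries are preferred.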



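Hi @sara_khan, if this solved your problem, please consider accepting it by clicking the check mark. This signals to the wider community that you have found a solution, and it gives some reputation to both the answerer and yourself. There is no obligation to do so. – AntoineBic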


I'm not sure why I can't order my data the way I want, but you solved the other problems perfectly. Thanks again! –
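
On the ordering problem from the last comment: a likely cause is that reorder(Var, Cluster_prop) computes a single score per Var (by default the mean of Cluster_prop over all rows), so a Var that appears in more than one cluster gets one position that every facet shares. One common workaround is to build one key per Var/Cluster pair, order that key, and strip the suffix from the axis labels. The sketch below uses a made-up toy data frame; toy, key and strip_suffix are hypothetical names and are not in the original post.

```r
library(ggplot2)

# Made-up data: "b" appears in both clusters with very different values.
toy <- data.frame(
  Var          = c("a", "b", "c", "b", "d", "e"),
  Cluster      = factor(c("1", "1", "1", "2", "2", "2")),
  Cluster_prop = c(90, 100, 50, 40, 80, 60)
)

# One unique key per (Var, Cluster) row, ordered by that row's own value.
toy$key <- paste(toy$Var, toy$Cluster, sep = "___")
toy$key <- reorder(toy$key, toy$Cluster_prop)

# Drop the "___<cluster>" suffix again when drawing the axis labels.
strip_suffix <- function(x) sub("___.*$", "", x)

p <- ggplot(toy, aes(x = Cluster_prop, y = key)) +
  geom_segment(aes(yend = key), xend = 0, colour = "grey50") +
  geom_point(size = 3, aes(colour = Cluster)) +
  scale_y_discrete(labels = strip_suffix) +
  facet_grid(Cluster ~ ., scales = "free_y", space = "free_y")
```

With the real data the same idea should carry over to cl1, probably after dplyr::ungroup() since cl1 is a grouped data frame. The "___" separator is arbitrary; it only needs to be a string that never occurs in Var.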