2017-07-27
Group by in dplyr and calculate percentages

I have the following data frame in R:

Service  Container_Pick_Day 
    ABC    0 
    ABC    1 
    ABC    1 
    ABC    2 
    ABC    NA 
    ABC    0 
    ABC    1 
    DEF    NA 
    DEF    0 
    DEF    1 
    DEF    1 
    DEF    1 
    DEF    2 
    DEF    1 

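Container_Pick_Day is numeric and contains NA values. What I want is to calculate, per Service, the percentage of containers picked up on day 0, after 1 day, after 2 days, and so on, ignoring NA.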

The desired data frame would be:

Service  Container_Pick_Day  Percentage 
    ABC    0    (2/6)*100 = 33.33 
    ABC    1    (3/6)*100 = 50 
    ABC    2    (1/6)*100 = 16.67 
    DEF    0    (1/6)*100 = 16.67 
    DEF    1    (4/6)*100 = 66.67 
    DEF    2    (1/6)*100 = 16.67 

I tried the following in R, but it produces NA output:

df%>% 
    group_by(Service) %>% 
    summarise(pick_day_perc = n()/sum(Container_Pick_Day),na.rm=T) %>% 
    as.data.frame() 

Do I have to group by both Service and Container_Pick_Day?

It looks like what you need is `sum(Container_Pick_Day, na.rm = TRUE)`?

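I want the Service-wise percentage of containers by pick-up day, i.e. the share of containers picked up on the same day, on day 1, and on day 2. – Neil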

Yes, I understand, but you said "but it produces NA output", so what I meant was to replace your sum() part with the one in my previous comment.
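To make the point from the comments concrete, here is a minimal base R sketch using the Container_Pick_Day values for Service ABC copied from the question. The vector name `x` is only illustrative.

```r
# Container_Pick_Day values for Service ABC, copied from the question
x <- c(0, 1, 1, 2, NA, 0, 1)

# A single NA makes the whole sum NA
sum_with_na <- sum(x)

# na.rm = TRUE must be an argument of sum() itself to drop the NA
sum_without_na <- sum(x, na.rm = TRUE)

print(sum_with_na)
print(sum_without_na)
```

In the original attempt, `na.rm=T` sits outside the closing parenthesis of `sum()`, so it is handed to `summarise()` rather than to `sum()`. Note also that `n()/sum(Container_Pick_Day)` divides a row count by the sum of the day values, so moving `na.rm` alone would remove the NA but would not, as far as I can tell, give the per-day shares in the desired table.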

Answers

Adding an answer based on the comments above from @nicola, @akrun, and myself:

library(dplyr) 

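# nicola: drop NA rows, count rows per Service/day, then turn counts into a share within each Service
df %>%
    filter(!is.na(Container_Pick_Day)) %>%
    group_by(Service, Container_Pick_Day) %>%
    summarise(Percentage = n()) %>%
    group_by(Service) %>%
    mutate(Percentage = Percentage / sum(Percentage) * 100)

# akrun: same idea, with count() doing the grouping and tallying in one step
df %>%
    filter(complete.cases(Container_Pick_Day)) %>%
    count(Service, Container_Pick_Day) %>%
    group_by(Service) %>%
    transmute(Container_Pick_Day, Percentage = n / sum(n) * 100)

# Sotos: na.omit() drops rows with NA in any column; prop.table() converts counts to proportions
df %>%
    na.omit() %>%
    group_by_all() %>%
    summarise(ptg = n()) %>%
    group_by(Service) %>%
    mutate(ptg = prop.table(ptg) * 100)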

All three produce the following (the third names the column ptg instead of Percentage):

  Service Container_Pick_Day Percentage
   <fctr>              <int>      <dbl>
1     ABC                  0   33.33333
2     ABC                  1   50.00000
3     ABC                  2   16.66667
4     DEF                  0   16.66667
5     DEF                  1   66.66667
6     DEF                  2   16.66667
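For anyone who wants to check these numbers without dplyr, here is a small base R sketch. It is a cross-check, not one of the three answers above: it rebuilds the question's data frame by hand and uses `table()` with `prop.table()`. The names `counts` and `pct` are illustrative, and the result is a wide table (one row per Service, one column per day) rather than the long format shown above.

```r
# Rebuild the question's data frame (values copied from the question)
df <- data.frame(
  Service = rep(c("ABC", "DEF"), each = 7),
  Container_Pick_Day = c(0, 1, 1, 2, NA, 0, 1,
                         NA, 0, 1, 1, 1, 2, 1)
)

# table() leaves out NA values by default, so they are ignored in the counts
counts <- table(df$Service, df$Container_Pick_Day)

# margin = 1 converts each row (one Service) into proportions that sum to 1
pct <- round(prop.table(counts, margin = 1) * 100, 2)
print(pct)
```

Because the percentages are computed within each row, every Service sums to 100, which is a quick sanity check for any of the dplyr versions as well.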