2017-04-17 123 views
3

Add a new column with the name of the max column in a data frame

So I have this example df:

df <- structure(list(art_id = c(1L, 1L, 3L, 4L), 
scr1 = c(52L, 58L, 40L, 62L), scr2 = c(25L, 23L, 55L, 26L), 
scr3 = c(36L, 60L, 19L, 22L)), .Names = c("art_id", 
"scr1", "scr2", "scr3"), row.names = c(NA, -4L), class = "data.frame") 

> df 
    art_id scr1 scr2 scr3 
1  1 52 25 36 
2  1 58 23 60 
3  3 40 55 19 
4  4 62 26 22 

I then use dplyr to sum the scores by art_id:

df %>% 
    group_by(art_id) %>% 
    summarise_each(funs(sum)) 

    art_id scr1 scr2 scr3 
    <int> <int> <int> <int> 
1  1 110 48 96 
2  3 40 55 19 
3  4 62 26 22 
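
In recent dplyr releases `summarise_each()` and `funs()` are deprecated. A minimal sketch of the same grouped sum, assuming dplyr 1.0.0 or later (where `across()` is available); the name `df_sum` is only illustrative and does not appear in the question:

```r
library(dplyr)

# Same example data as in the question, built directly with data.frame()
df <- data.frame(
  art_id = c(1L, 1L, 3L, 4L),
  scr1   = c(52L, 58L, 40L, 62L),
  scr2   = c(25L, 23L, 55L, 26L),
  scr3   = c(36L, 60L, 19L, 22L)
)

# Sum every score column within each art_id (assumes dplyr >= 1.0.0)
df_sum <- df %>%
  group_by(art_id) %>%
  summarise(across(scr1:scr3, sum))

df_sum
```

This should give the same three rows as the `summarise_each()` call, without the deprecation warnings.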

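My question in summary: how do I add another column named top_r that contains the name of the column with the largest value among scr1:scr3? The resulting df would look like this: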

art_id scr1 scr2 scr3 top_r 
    <int> <int> <int> <int> <chr> 
1  1 110 48 96 scr1 
2  3 40 55 19 scr2 
3  4 62 26 22 scr1 

I'm used to dplyr, so an answer using that library would be great!

Answers

2

This will work:

df %>% 
    group_by(art_id) %>% 
    summarise_each(funs(sum)) %>% 
    mutate(top_r=apply(.[,2:4], 1, function(x) names(x)[which.max(x)])) 

# A tibble: 3 × 5 
    art_id scr1 scr2 scr3 top_r 
    <int> <int> <int> <int> <chr> 
1  1 110 48 96 scr1 
2  3 40 55 19 scr2 
3  4 62 26 22 scr1 
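
The `apply()` call works because each row reaches the function as a named vector, so `which.max()` gives the position of the first maximum and `names()` turns that position into a column name. A sketch of the same idea that selects the score columns by name instead of by position `2:4`; `df_sum` and `score_cols` are hypothetical names introduced here, and the summed values are typed in directly so the example stands alone:

```r
library(dplyr)

# Summarised scores from the question, entered directly
df_sum <- data.frame(
  art_id = c(1L, 3L, 4L),
  scr1   = c(110L, 40L, 62L),
  scr2   = c(48L, 55L, 26L),
  scr3   = c(96L, 19L, 22L)
)

# Hypothetical helper: name the score columns rather than relying on positions
score_cols <- c("scr1", "scr2", "scr3")

df_top <- df_sum %>%
  mutate(top_r = apply(.[, score_cols], 1,
                       # which.max() returns the first maximum, so ties go to the leftmost column
                       function(x) names(x)[which.max(x)]))

df_top
```

Selecting by name keeps the result stable if columns are later added or reordered.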
+0

It _did_ work! Thanks. – jmb277

+0

What if you want to add the maximum value instead of the column name? – zsad512

+1

@zsad512 The last two lines would be 'summarise_all(funs(sum)) %>% mutate(max = apply(.[, 2:4], 1, max))' –
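
For the row-wise maximum value, base R's `pmax()` is a vectorised alternative to the `apply()` call in the comment above. This is a swapped-in technique rather than what the commenter wrote; `df_sum` and `df_max` are hypothetical names, with the summed values entered directly:

```r
library(dplyr)

# Summarised scores from the question, entered directly
df_sum <- data.frame(
  art_id = c(1L, 3L, 4L),
  scr1   = c(110L, 40L, 62L),
  scr2   = c(48L, 55L, 26L),
  scr3   = c(96L, 19L, 22L)
)

# pmax() compares the columns element by element and keeps the largest value per row
df_max <- df_sum %>%
  mutate(max = pmax(scr1, scr2, scr3))

df_max
```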

0
library(dplyr) 
library(tidyr) 

df2 <- df %>% 
    group_by(art_id) %>% 
    summarise_each(funs(sum)) 

df3 <- df2 %>% 
    gather(top_r, Value, -art_id) %>% 
    arrange(art_id, desc(Value)) %>% 
    group_by(art_id) %>% 
    slice(1) %>% 
    select(-Value) 

df_final <- df2 %>% 
    left_join(df3, by = "art_id") 

df_final 
# A tibble: 3 × 5 
    art_id scr1 scr2 scr3 top_r 
    <int> <int> <int> <int> <chr> 
1  1 110 48 96 scr1 
2  3 40 55 19 scr2 
3  4 62 26 22 scr1 
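
`gather()` still works but has been superseded in tidyr. A sketch of the same reshape-and-join approach, assuming tidyr 1.0.0 or later for `pivot_longer()` and dplyr 1.0.0 or later for `slice_max()`; the names `df_sum` and `top` are illustrative and the summed values are entered directly:

```r
library(dplyr)
library(tidyr)

# Summarised scores from the question, entered directly
df_sum <- data.frame(
  art_id = c(1L, 3L, 4L),
  scr1   = c(110L, 40L, 62L),
  scr2   = c(48L, 55L, 26L),
  scr3   = c(96L, 19L, 22L)
)

# Long format: one row per (art_id, score column), then keep the largest value per art_id
top <- df_sum %>%
  pivot_longer(-art_id, names_to = "top_r", values_to = "Value") %>%
  group_by(art_id) %>%
  slice_max(Value, n = 1, with_ties = FALSE) %>%  # exactly one row per group, even on ties
  ungroup() %>%
  select(art_id, top_r)

# Attach the winning column name back onto the wide table
df_final <- left_join(df_sum, top, by = "art_id")

df_final
```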
3

It's just this simple in base R, using max.col:

df$top_r <- names(df)[-1][max.col(df[-1])] 
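
Two things are worth keeping in mind with this one-liner: it labels whatever rows `df` currently has, so it needs the summarised table to match the desired output, and `max.col()` breaks ties at random by default. A sketch under those assumptions, with the summed values entered directly and `df_sum` and `score_cols` as hypothetical names:

```r
# Summarised scores from the question, entered directly
df_sum <- data.frame(
  art_id = c(1L, 3L, 4L),
  scr1   = c(110L, 40L, 62L),
  scr2   = c(48L, 55L, 26L),
  scr3   = c(96L, 19L, 22L)
)

# Every column except art_id is treated as a score column
score_cols <- names(df_sum)[-1]

# ties.method = "first" makes the result deterministic (the default is "random")
df_sum$top_r <- score_cols[max.col(df_sum[score_cols], ties.method = "first")]

df_sum
```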
+1

Whoa - that _is_ simple. Thanks! – jmb277