2016-11-16

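Using tidyr's separate to split a string into multiple columns, then creating a new column with counts

I have the basic data frame below, which contains long strings separated by commas. I use tidyr's separate to create the new columns.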

How can I add another new column that counts, for each person, how many of the new columns contain an answer (that is, are not NA)?

I suppose the count could be taken either before or after separating, by counting how many comma-separated elements each string has?

Any help would be appreciated. I would like to stay within the tidyverse and dplyr.

library(dplyr)
library(tidyr)

Name <- c("John", "Chris", "Andy")
Goal <- c("Go back to school,Learn to drive,Learn to cook",
          "Go back to school,Get a job,Learn a new Skill,Learn to cook",
          "Learn to drive,Learn to Cook")

df <- tibble(Name, Goal)
df <- df %>% separate(Goal, c("Goal1", "Goal2", "Goal3", "Goal4"), sep = ",")

Answer

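We can try str_count from stringr: count the commas in the original Goal string and add 1 to get the number of elements.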

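library(dplyr)
library(tidyr)
library(stringr)

# df here is the original data frame, before Goal was split
df %>%
    separate(Goal, paste0("Goal", 1:4), sep = ",", remove = FALSE) %>%  # keep Goal for counting
    mutate(Count = str_count(Goal, ",") + 1) %>%                        # elements = commas + 1
    select(-Goal)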
#   Name  Goal1             Goal2          Goal3             Goal4         Count
#   <chr> <chr>             <chr>          <chr>             <chr>         <dbl>
# 1 John  Go back to school Learn to drive Learn to cook     <NA>              3
# 2 Chris Go back to school Get a job      Learn a new Skill Learn to cook     4
# 3 Andy  Learn to drive    Learn to Cook  <NA>              <NA>              2
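
The question also asks whether the count can be taken after the split. The following is a minimal sketch of that variant, not part of the original answer: it counts the non-NA Goal columns in each row with rowSums and is.na. The fill = "right" argument and the name result are assumptions added here; fill = "right" only pads short rows with NA without a warning.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- tibble(
  Name = c("John", "Chris", "Andy"),
  Goal = c(
    "Go back to school,Learn to drive,Learn to cook",
    "Go back to school,Get a job,Learn a new Skill,Learn to cook",
    "Learn to drive,Learn to Cook"
  )
)

result <- df %>%
  # Split first; rows with fewer than four goals are padded with NA
  separate(Goal, paste0("Goal", 1:4), sep = ",", fill = "right") %>%
  # Then count the non-NA Goal columns per row
  mutate(Count = rowSums(!is.na(select(., starts_with("Goal")))))

print(result)
```

On this data the Count column should match the str_count approach (3, 4, 2). Counting after the split has the advantage that it still works if the original Goal column has already been dropped.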