2017-10-19 99 views
0

我有它的整個條目傳播幾個月甚至幾年的字符串列:查找月份和年份內串

df <- data.frame(STRINGS = c("January 2017 Blah Blah", 
         "February Blah Blah", 
         "2016 Yeah Yeah", 
         "March Bleck", 
         "Stuff")) 

> df 
       STRINGS 
1 January 2017 Blah Blah 
2  February Blah Blah 
3   2016 Yeah Yeah 
4   March Bleck 
5     Stuff 

所有年份的範圍從2015年到2017年

我想輸出如下:

    STRINGS   MONTH   YEAR 
1 January 2017 Blah Blah   January   2017 
2  February Blah Blah  February   NA 
3   2016 Yeah Yeah    NA   2016 
4   March Bleck   March   NA 
5     Stuff    NA   NA 

這樣做的最簡單方法是什麼?

首先,我有

months <- c("January", "February", "March", "April", "May", "June", 
       "July", "August", "September", "October", "November", "December") 
years <- c(2015, 2016, 2017) 

回答

3

使用dplyrrebus,並stringr溶液。請注意,它假定每行只有1個匹配的月份和年份。

library(dplyr) 
library(rebus) 
library(stringr) 

df2 <- df %>% 
    mutate(STRINGS = as.character(STRINGS)) %>% 
    mutate(MONTH = str_extract(STRINGS, or1(months)), 
     YEAR = str_extract(STRINGS, or1(years))) 
df2 
       STRINGS MONTH YEAR 
1 January 2017 Blah Blah January 2017 
2  February Blah Blah February <NA> 
3   2016 Yeah Yeah  <NA> 2016 
4   March Bleck March <NA> 
5     Stuff  <NA> <NA>