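So, I tried to put together a workable solution for your needs. There may well be better ways to do this, possibly using packages such as data.table and/or stringr. In any case, this snippet can serve as a working starting point. Oh, and I modified the Ad_title data so that the species name appears in each title.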
# Re-create data
Ad_title <- c("1 year old Ball Python", "Young Red Blood Python. - For Sale",
              "1 Year Old Male Bearded Dragon - For Sale")
df2 <- data.frame(Latin_name = c("Python regius", "Python brongersmai", "Pogona barbata"),
                  Common_name = c("E: Ball Python, Royal Python G: Königspython",
                                  "E: Red Blood Python, Malaysian Blood Python",
                                  "E: Eastern Bearded Dragon, Bearded Dragon"),
                  stringsAsFactors = FALSE)
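# Aggregate common names: join all rows, then split on the "E:"/"G:" tags and on commas
Common_name <- paste(df2$Common_name, collapse = ", ")
Common_name <- unlist(strsplit(Common_name, "(E:)|(G:)|(,)"))
# Trim the whitespace left around each piece, then drop the empty pieces
Common_name <- trimws(Common_name)
Common_name <- Common_name[Common_name != ""]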
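# Data frame: Latin names vs. common names
# For each common name, take the first row of df2 whose Common_name contains it
idx <- vapply(Common_name, function(nm) grep(nm, df2$Common_name, fixed = TRUE)[1], integer(1))
df3 <- data.frame(Common_name, Latin_name = df2$Latin_name[idx],
                  row.names = NULL, stringsAsFactors = FALSE)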
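# Data frame: ad titles vs. common names
# Named vector: names are the matched common names, values are indices into Ad_title
Ad_Common_name <- unlist(sapply(Common_name, grep, Ad_title))
# Assumes each ad title matches exactly one common name
df4 <- data.frame(Ad_title,
                  Common_name = sapply(seq_along(Ad_title),
                                       function(i) names(Ad_Common_name[Ad_Common_name == i])),
                  stringsAsFactors = FALSE)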
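If what you ultimately want is the Latin name for each ad, the lookup can be done in one pass over the titles. The following is only a minimal base R sketch, not a definitive implementation: the lookup table is typed out by hand in the same shape as df3, match_latin() is a hypothetical helper name I made up for illustration, and matching is a plain case-sensitive substring test.

```r
# Self-contained sketch: find the Latin name for each ad title (base R only).
Ad_title <- c("1 year old Ball Python",
              "Young Red Blood Python. - For Sale",
              "1 Year Old Male Bearded Dragon - For Sale")

# Common name -> Latin name lookup, written out by hand in the shape of df3.
lookup <- data.frame(
  Common_name = c("Ball Python", "Royal Python", "Königspython",
                  "Red Blood Python", "Malaysian Blood Python",
                  "Eastern Bearded Dragon", "Bearded Dragon"),
  Latin_name  = c("Python regius", "Python regius", "Python regius",
                  "Python brongersmai", "Python brongersmai",
                  "Pogona barbata", "Pogona barbata"),
  stringsAsFactors = FALSE
)

# match_latin() is a hypothetical helper, not part of any package.
# It returns the Latin name(s) whose common name occurs in the title,
# joined with "; " if there are several, or NA when nothing matches.
match_latin <- function(title, lookup) {
  hit <- vapply(lookup$Common_name,
                function(nm) grepl(nm, title, fixed = TRUE),
                logical(1))
  latin <- unique(lookup$Latin_name[hit])
  if (length(latin) == 0) NA_character_ else paste(latin, collapse = "; ")
}

result <- data.frame(
  Ad_title,
  Latin_name = vapply(Ad_title, match_latin, character(1),
                      lookup = lookup, USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(result)
```

Unlike the sapply() version above, this does not break when a title matches no common name or more than one, which is likely to happen with real ad data.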
Your input files are all strings, right? Have you tried modifying the second data frame so that it becomes a list/vector of all the common names? – zyurnaidi