2017-07-14 92 views

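Merging two data frames in R on a column whose values differ only in case

I have two data frames, but the problem is that the values in the merge "by" columns are written in different cases, for example:

sn1capx1e0001 vs SN1CAPX1E0001.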

authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))

books <- data.frame(
    name = I(c("tukey", "venables", "tierney",
               "tipley", "ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation",
              "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA,
                     "Venables & Smith"))

m1 <- merge(authors, books, by.x = "surname", by.y = "name")

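Only the one row whose key is spelled identically in both data frames is returned:

  surname nationality deceased                     title other.author
1  McNeil   Australia       no Interactive Data Analysis         <NA>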

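So I want to merge them case-insensitively. I could not get this to work with merge or join.

I have seen that the values can be matched with regular expressions inside a loop.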

Answers

Why not convert them to the same form?

library(stringr) 

authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")), 
    nationality = c("US", "Australia", "US", "UK", "Australia"), 
    deceased = c("yes", rep("no", 4))) 

books <- data.frame(
    name = I(c("tukey", "venables", "tierney", 
      "tipley", "ripley", "McNeil", "R Core")), 
    title = c("Exploratory Data Analysis", 
      "Modern Applied Statistics ...", 
      "LISP-STAT", 
      "Spatial Statistics", "Stochastic Simulation", 
      "Interactive Data Analysis", 
      "An Introduction to R"), 
    other.author = c(NA, "Ripley", NA, NA, NA, NA, 
        "Venables & Smith")) 

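# Normalise both key columns to title case so that the keys compare equal.
# Note: this overwrites the stored values (e.g. "McNeil" becomes "Mcneil").
authors$surname <- str_to_title(authors$surname)
books$name <- str_to_title(books$name)

m1 <- merge(authors, books, by.x = "surname", by.y = "name")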

   surname nationality deceased                         title other.author
1   Mcneil   Australia       no     Interactive Data Analysis         <NA>
2   Ripley          UK       no         Stochastic Simulation         <NA>
3  Tierney          US       no                     LISP-STAT         <NA>
4    Tukey          US      yes     Exploratory Data Analysis         <NA>
5 Venables   Australia       no Modern Applied Statistics ...       Ripley
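
If the original spelling should survive the merge (the conversion above turns "McNeil" into "Mcneil"), one possible variation is to merge on a lower-cased copy of the key instead of overwriting it. This is only a sketch in base R: the helper column name `key` and the result name `m2` are made up for illustration, and the data frames are trimmed copies of the ones in the question.

```r
# Trimmed copies of the question's data (deceased / other.author left out).
authors <- data.frame(
  surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  stringsAsFactors = FALSE)

books <- data.frame(
  name = c("tukey", "venables", "tierney", "tipley", "ripley", "McNeil", "R Core"),
  title = c("Exploratory Data Analysis", "Modern Applied Statistics ...",
            "LISP-STAT", "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis", "An Introduction to R"),
  stringsAsFactors = FALSE)

# "key" is a hypothetical helper column: a lower-cased copy used only for matching.
authors$key <- tolower(authors$surname)
books$key <- tolower(books$name)

# Merge on the helper column, then drop it again.
m2 <- merge(authors, books, by = "key")
m2$key <- NULL

print(m2)
```

The result keeps both `surname` and `name` exactly as they were entered, so nothing is lost if the original capitalisation matters later.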

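I found this to be very simple.

The secret is to use toupper() on both key columns:

authors$surname <- toupper(authors$surname)
books$name <- toupper(books$name)

Simple.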
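
After normalising the case it can be worth checking which keys still have no partner, since typos such as "tipley" are not fixed by changing case. A small sketch using base R's match(), with the key values copied from the question; the variable names `idx` and `unmatched` are just for illustration:

```r
surname <- c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")
name <- c("tukey", "venables", "tierney", "tipley", "ripley", "McNeil", "R Core")

# Case-insensitive lookup: position of each book name in the surname vector,
# NA where there is no match.
idx <- match(toupper(name), toupper(surname))

# Names that have no partner even when case is ignored.
unmatched <- name[is.na(idx)]
print(unmatched)
```

This also shows that no loop over regular expressions is needed for the matching itself, because match() works on whole vectors.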
