2017-10-15 89 views
1

Conditional JOIN of two data frames in R

Suppose there are two data frames like the following (taken from this post):

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3))) 
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1))) 

df1 
# CustomerId Product 
#   1 Toaster 
#   2 Toaster 
#   3 Toaster 
#   4 Radio 
#   5 Radio 
#   6 Radio 

df2 
# CustomerId State 
#   2 Alabama 
#   4 Alabama 
#   6 Ohio 

The question is: how can I run the following SQL query in R?

SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId 

I know I can use merge(df1, df2, by = "CustomerId") to do an inner join, but that only matches on equality, so it does not satisfy the join condition above.

+2

`library(sqldf); sqldf("SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId")` –

+0

@G.Grothendieck So it cannot be done with the `merge` function? – OmG

+0

See this [link](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-leftright) – L30n1d45

Answers

0

As I learned from G. Grothendieck's comment, a simple solution is to use the sqldf package and write the join exactly as it would be written in SQL:

library(sqldf) 
sqldf("SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId") 
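
On the question of whether `merge` can do this: it has no argument for an inequality condition, but a rough base R sketch is to build the Cartesian product with `by = NULL` and then filter the rows. The names `crossed` and `result` are made up for this illustration, and the cross join materialises every pair of rows, so it is only reasonable for small inputs.

```r
df1 <- data.frame(CustomerId = 1:6, Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# by = NULL makes merge() return the Cartesian product of the two data frames.
# The shared column name is disambiguated with the default suffixes:
# CustomerId.x comes from df1, CustomerId.y from df2.
crossed <- merge(df1, df2, by = NULL)

# Keep only the pairs that satisfy the join condition.
result <- crossed[crossed$CustomerId.x <= crossed$CustomerId.y, ]
result
```

The result has the same rows as the SQL query, although the row order and column names may differ from what sqldf returns.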
0

This is a rather convoluted way to do it, but it works:

library(tidyverse) 
df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3))) 
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1))) 

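# For each row of df1, keep the df2 rows whose CustomerId is at least as large,
# then bind the per-row results into a single data frame.
map2_df(
  df1$CustomerId, df1$Product,
  .f = ~ {
    temp <- df2 %>% filter(.x <= CustomerId)
    tibble(CustomerId.x = .x, Product = .y,
           CustomerId.y = temp$CustomerId, State = temp$State)
  }
)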
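
A shorter alternative, assuming dplyr 1.1.0 or later is installed (`join_by()` was introduced in that release), is to state the inequality directly in the join specification. This is a minimal sketch rather than part of the answer above, and the name `result` is chosen only for illustration.

```r
library(dplyr)

df1 <- data.frame(CustomerId = 1:6, Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# In join_by(), the left-hand side of the condition refers to df1 and the
# right-hand side to df2, so this reads as df1.CustomerId <= df2.CustomerId.
result <- inner_join(df1, df2, by = join_by(CustomerId <= CustomerId))
result
```

Because the condition is an inequality, both key columns are kept and receive the usual `.x` and `.y` suffixes.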