2013-03-02 75 views
13

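In the data.table FAQ, the nomatch = NA argument is described as being akin to an outer join. However, I haven't been able to get data.table to do a full outer join, only a right outer join. How do I do a full outer join with data.table?

For example: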

a <- data.table("dog" = c(8:12), "cat" = c(15:19)) 

    dog cat 
1: 8 15 
2: 9 16 
3: 10 17 
4: 11 18 
5: 12 19 

b <- data.table("dog" = 1:10, "bullfrog" = 11:20) 

    dog bullfrog 
1: 1  11 
2: 2  12 
3: 3  13 
4: 4  14 
5: 5  15 
6: 6  16 
7: 7  17 
8: 8  18 
9: 9  19 
10: 10  20 

setkey(a, dog) 
setkey(b, dog) 

a[b, nomatch = NA] 

    dog cat bullfrog 
1: 1 NA  11 
2: 2 NA  12 
3: 3 NA  13 
4: 4 NA  14 
5: 5 NA  15 
6: 6 NA  16 
7: 7 NA  17 
8: 8 15  18 
9: 9 16  19 
10: 10 17  20 

So nomatch = NA (the default) produces a right outer join. What should I do if I need a full outer join? For example:

merge(a, b, by = "dog", all = TRUE) 
# Or with plyr: 
join(a, b, by = "dog", type = "full") 

    dog cat bullfrog 
1: 1 NA  11 
2: 2 NA  12 
3: 3 NA  13 
4: 4 NA  14 
5: 5 NA  15 
6: 6 NA  16 
7: 7 NA  17 
8: 8 15  18 
9: 9 16  19 
10: 10 17  20 
11: 11 18  NA 
12: 12 19  NA 

Is this possible with data.table?

+0

For the various kinds of joins with data.table, see the last answer in [this post][1] [1]: http://stackoverflow.com/questions/14076065/data-table-inner-outer-join-with-na-in-join-column-of-type-double-bug?rq=1 – statquant 2013-03-03 22:45:58

Answers

19

You're actually already there. Use merge.data.table, which is exactly what you are doing when you call

merge(a, b, by = "dog", all = TRUE) 

Because a is a data.table, merge(a, b, ...) dispatches to merge.data.table(a, b, ...).
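To make the dispatch point concrete, here is a minimal sketch using the tables from the question. The names full, left and right are only illustrative, and the all.x / all.y variants are added for comparison; they were not part of the original question.

```r
library(data.table)

a <- data.table(dog = 8:12, cat = 15:19)
b <- data.table(dog = 1:10, bullfrog = 11:20)

# merge() is an S3 generic; because `a` is a data.table, this call is
# handled by data.table's own merge method rather than merge.data.frame.
full <- merge(a, b, by = "dog", all = TRUE)      # full outer join

# For comparison: keep all rows of one side only.
left  <- merge(a, b, by = "dog", all.x = TRUE)   # left outer join
right <- merge(a, b, by = "dog", all.y = TRUE)   # right outer join

print(full)
```

The full result should contain every dog value that appears in either table, with NA wherever the other table has no matching row.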

+0

Ah, of course. I should have known that. Thanks. – 2013-03-03 05:40:34

0
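library(data.table)

x <- data.table(a = 1:5, b = 11:15)
y <- data.table(a = c(1:4, 6), c = c(101:104, 106))

setkey(x, a)
setkey(y, a)

# Collect every key value that appears in either table
unique_keys <- unique(c(x[, a], y[, a]))
# Expand x to all key values, then join y onto the result: full outer join
y[x[.(unique_keys), on = "a"]]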
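As a rough sanity check, the union-of-keys approach can be compared against merge(..., all = TRUE). This is only a sketch under a few assumptions: the key columns are written as integers in both tables (6L and 106L) so that no type coercion is involved, on = "a" is passed to both joins instead of relying on setkey, and the names via_keys and via_merge are made up for illustration.

```r
library(data.table)

x <- data.table(a = 1:5, b = 11:15)
y <- data.table(a = c(1:4, 6L), c = c(101:104, 106L))

# Union-of-keys approach: expand x to every key value, then join y onto it.
unique_keys <- unique(c(x[, a], y[, a]))
via_keys <- y[x[.(unique_keys), on = "a"], on = "a"]

# The same full outer join via merge().
via_merge <- merge(x, y, by = "a", all = TRUE)

# Bring both results into the same column and row order before comparing.
setcolorder(via_keys, c("a", "b", "c"))
setorder(via_keys, a)

print(via_keys)
print(via_merge)
```

Both tables should hold one row per distinct key, with NA in b for keys found only in y and NA in c for keys found only in x.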