Filter a data.table using two variables, an elegant and fast way

I would like to ask whether there is a way to filter on a combination of more than one variable. More specifically:

library(dplyr)       # provides %>%
library(data.table)

data <- iris %>% cbind(group = rep(c("a", "b", "c"), length.out = nrow(iris))) %>% as.data.table()

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species group
1:          5.1         3.5          1.4         0.2  setosa     a
2:          4.9         3.0          1.4         0.2  setosa     b
3:          4.7         3.2          1.3         0.2  setosa     c
4:          4.6         3.1          1.5         0.2  setosa     a
5:          5.0         3.6          1.4         0.2  setosa     b
6:          5.4         3.9          1.7         0.4  setosa     c

and I would like to filter it based on the data.table below:

filter <- data.table(Species = c("setosa", "versicolor", 'setosa'), group = c('a', "b", 'c')) 
      Species group      filter1
1:     setosa     a     setosa a
2: versicolor     b versicolor b
3:     setosa     c     setosa c

I can do it this way:

data[paste(Species, group) %in% filter[, filter1 := paste(Species, group)]$filter1]
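A side note on the line above: `filter[, filter1 := ...]` adds the `filter1` column to `filter` by reference, so `names(filter)` has three entries afterwards. A minimal sketch of the same paste-based filter that leaves `filter` untouched is below. The `keys` and `res` names and the `"|"` separator are illustrative assumptions; the separator only guards against ambiguous concatenations and presumes no value contains `"|"`.

```r
library(data.table)

# Rebuild the question's data: iris plus a cycling group label (150 rows)
data <- as.data.table(iris)
data[, group := rep(c("a", "b", "c"), length.out = .N)]

filter <- data.table(Species = c("setosa", "versicolor", "setosa"),
                     group   = c("a", "b", "c"))

# Compute the lookup keys in j without `:=`, so `filter` keeps its two columns
keys <- filter[, paste(Species, group, sep = "|")]

# Keep the rows of `data` whose (Species, group) pair appears in `keys`
res <- data[paste(Species, group, sep = "|") %in% keys]

names(filter)  # still "Species" "group"
```

Keeping `filter` at two columns matters if its column names are reused later, for example in an `on =` argument.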

However, I would like to know whether there is a more efficient, faster, or easier way to do it. Perhaps something like this:

data[.(Species, group) %in% filter] # does not work

Source

2017-10-11 George Sotiropoulos

@Jaap I think that link is for more complicated filtering operations, like 'on = .(x = x, y != y)'. Here, I think 'data[filter, on = names(filter), nomatch = 0]' might be the target, or possibly https://stackoverflow.com/questions/18969420/perform-a-semi-join-with-data-table – Frank

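Yes, indeed, @Frank answered my question, and it is what I was hoping for. As I said, I was looking for a more elegant, simpler way to do this. Frank's suggestion is sufficient; if you write it up as an answer, I can accept it.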

In this case, you can do

data[filter, on=names(filter), nomatch=0]

See "Perform a semi-join with data.table" for similar filtering joins.
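To make the difference between the join and a semi-join concrete, here is a minimal self-contained sketch. It rebuilds the question's data under the assumption that `group` is meant to cycle over the 150 rows of `iris`. The names `cols`, `by_join`, `by_semi`, `w`, and `by_paste` are illustrative and not from the original posts. The join columns are spelled out rather than taken from `names(filter)`, because the question's `filter[, filter1 := ...]` leaves an extra `filter1` column in `filter`.

```r
library(data.table)

data <- as.data.table(iris)
data[, group := rep(c("a", "b", "c"), length.out = .N)]

filter <- data.table(Species = c("setosa", "versicolor", "setosa"),
                     group   = c("a", "b", "c"))

cols <- c("Species", "group")

# Inner join: matching rows of `data`, returned in the order of `filter`
by_join <- data[filter, on = cols, nomatch = 0]

# Semi-join: get the matching row numbers of `data`, then subset,
# which keeps the original row order and never duplicates rows
w <- data[filter, on = cols, which = TRUE, nomatch = 0]
by_semi <- data[sort(unique(w))]

# The question's paste-based filter, for comparison
by_paste <- data[paste(Species, group) %in% filter[, paste(Species, group)]]

# All three approaches select the same number of rows
stopifnot(nrow(by_join) == nrow(by_paste),
          nrow(by_semi) == nrow(by_paste))
```

If `filter` contained duplicated rows, the inner join would repeat the matching rows of `data`, whereas the `which = TRUE` variant with `unique()` would not.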

Source

2017-10-12 11:27:43 Frank
