2014-07-15 36 views
0

Find all possible combinations of vector intersections?

I have a set of four vectors that look like this:

[1] PRI2CO  HEISCO  PRI2CO  DIALGU  DIALGU  ALSEBL  
Levels: ALSEBL  DIALGU  HEISCO  PRI2CO 

[1] PRI2CO  TET2PA  ALSEBL  PRI2CO  ALSEBL  TET2PA  
[7] HEISCO  TET2PA  
Levels: ALSEBL  HEISCO  PRI2CO  TET2PA 

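I want to generate a vector containing all the values that match between every possible combination of the four vectors. For the two above, it would contain ALSEBL, HEISCO and PRI2CO. So far I have done all the combinations by hand, but it is tedious and I think there must be a better way. I tried writing a loop for it, but I am very new to R and it hasn't worked yet. This is what I have been doing: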

trees.species.P234<-intersect(intersect(trees.species.P2,trees.species.P3),trees.species.P4) 
> trees.species.P234 
[1] "PRI2CO  " "ALSEBL  " 

I was thinking that a for loop involving a factor might do it, but I can't get it to work.

+0

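Is something like this useful: http://stackoverflow.com/questions/22624284/r-intersecting-strings/22624311 ? It sounds like what you want to do, but I'm not entirely sure. I.e. `Reduce(intersect, list(one, two))` works for your example and extends to 3 or more vectors. – thelatemail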

+0

Looks promising! I'll give it a try tomorrow and report back – brandonEm

+0

You could also use a function from `library(MergeGUI)` – akrun

Answers

1

Here you go, using the same vectors as suggested by gadzooks:

v1 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL") 
v2 <- c("PRI2CO", "TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA") 
v3 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL") 
v4 <- c("PRI2CO", "TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA") 

veclist <- list(v1,v2,v3,v4) 
combos <- Reduce(c,lapply(2:length(veclist), 
      function(x) combn(1:length(veclist),x,simplify=FALSE))) 

lapply(combos, function(x) Reduce(intersect,veclist[x])) 

#[[1]] 
#[1] "PRI2CO" "HEISCO" "ALSEBL" 
# 
#[[2]] 
#[1] "PRI2CO" "HEISCO" "DIALGU" "ALSEBL" 
# 
#[[3]] 
#[1] "PRI2CO" "HEISCO" "ALSEBL" 
#etc etc 
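
To see which vectors each intersection came from, the list can be labelled by combination. This is only a sketch and not part of the original answer: `intersections` and the dash-separated labels are names chosen here for illustration, and `veclist` and `combos` are rebuilt so the snippet runs on its own.

```r
v1 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL")
v2 <- c("PRI2CO","TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA")
veclist <- list(v1, v2, v1, v2)  # v3 and v4 repeat v1 and v2, as above

# All index combinations of size 2, 3 and 4, flattened into one list
combos <- unlist(lapply(2:length(veclist),
                        function(k) combn(seq_along(veclist), k, simplify = FALSE)),
                 recursive = FALSE)

# Intersect the vectors picked out by each combination
intersections <- lapply(combos, function(idx) Reduce(intersect, veclist[idx]))

# Label each result with the positions it came from, e.g. "1-2" or "1-3-4"
names(intersections) <- vapply(combos, paste, character(1), collapse = "-")

intersections[["1-3"]]
```

With the labels in place, a single combination can be pulled out by name instead of by position.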
+0

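Works great! Thank you very much. I assigned the lapply list to `intersects` and added `table(unlist(intersects))` to get what I was specifically looking for: a count of the unique IDs based on the combinations each ID appears in. – brandonEm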
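
The tally described in the comment above could look roughly like this. It is a sketch that assumes `intersects` holds the list returned by the `lapply` call; a short hand-written list stands in for it here so the snippet runs on its own, and `counts` is a name made up for the example.

```r
# Hypothetical, shortened stand-in for the list of intersections
intersects <- list(c("PRI2CO","HEISCO","ALSEBL"),
                   c("PRI2CO","HEISCO","DIALGU","ALSEBL"),
                   c("PRI2CO","HEISCO","ALSEBL"))

# Flatten the list and count how many combinations each ID appears in
counts <- table(unlist(intersects))
counts
```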

0

First you have to list all the combinations. For that, use the `combn` function.

> combn(1:4,2) 
    [,1] [,2] [,3] [,4] [,5] [,6] 
[1,] 1 1 1 2 2 3 
[2,] 2 3 4 3 4 4 

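Now we can use the `apply` function to find the intersections between your vectors. Before that, put the vectors in a list. To make this reproducible, I created the list below.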

idx <- combn(1:4,2)  # each column is one pair of list positions; named idx to avoid masking base c()
l <- list(c("a","b"),c("b","c"),c("c","d"),c("d","e")) 
Result <- apply(idx,2,function(x){intersect(l[[x[1]]],l[[x[2]]])}) 

Here the result is a list. If you want it as a vector, you can use `do.call`:

do.call("c",Result) 
[1] "b" "c" "d" 

This also works for large lists. To keep only the unique elements:

unique(do.call("c",Result)) 
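
One caveat with `apply` here: when every intersection happens to have the same length, `apply` simplifies its result to a vector or matrix rather than a list, and `do.call` then fails because it needs a list. A sketch that sidesteps this using only base R is to pass the function straight to `combn` with `simplify = FALSE` (the name `pair_hits` is made up for this example):

```r
l <- list(c("a","b"), c("b","c"), c("c","d"), c("d","e"))

# combn applies FUN to each pair of positions; simplify = FALSE always returns a list
pair_hits <- combn(seq_along(l), 2,
                   FUN = function(idx) intersect(l[[idx[1]]], l[[idx[2]]]),
                   simplify = FALSE)

# Flatten and drop duplicates
unique(unlist(pair_hits))
```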

0
v1 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL") 
v2 <- c("PRI2CO", "TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA") 
v3 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL") 
v4 <- c("PRI2CO", "TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA") 

vall <- unique(c(v1,v2,v3,v4)) 
# Print every value that occurs in all four vectors
for(x in vall){ 
    if((x %in% v1) && (x %in% v2) && (x %in% v3) && (x %in% v4)){ 
        print(x)
    } 
}
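
Note that this loop reports only the values present in all four vectors, not the intersection for each combination. The same check can be written without a loop. Below is a short sketch using `Reduce` with `intersect`, with the vectors repeated so it runs on its own (`common_all` is a name chosen for illustration):

```r
v1 <- c("PRI2CO","HEISCO","PRI2CO","DIALGU","DIALGU","ALSEBL")
v2 <- c("PRI2CO","TET2PA","ALSEBL","PRI2CO","ALSEBL","TET2PA","HEISCO","TET2PA")
v3 <- v1  # same contents as v1 in the example above
v4 <- v2  # same contents as v2 in the example above

# Fold intersect over the list: ((v1 ∩ v2) ∩ v3) ∩ v4
common_all <- Reduce(intersect, list(v1, v2, v3, v4))
print(common_all)
```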