Creating combinations of a binary vector

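I want to create all possible combinations of a binary vector made up of a fixed number of 0s and 1s. For example: dim(v) = 5x1; N1 = 3; N0 = 2. In this case, I would like to get something like this: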

    1,1,1,0,0
    1,1,0,1,0 
    1,1,0,0,1 
    1,0,1,1,0 
    1,0,1,0,1 
    1,0,0,1,1 
    0,1,1,1,0 
    0,1,1,0,1 
    0,1,0,1,1 
    0,0,1,1,1

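I found some help in this post, Create all possible combiations of 0,1, or 2 "1"s of a binary vector of length n, but I would like to generate only the combinations I need and avoid wasting space (I think the problem grows exponentially with n).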


2015-02-06 Matteo Cinelli

A not-so-efficient way would be 'x <- expand.grid(rep(list(0L:1L), 5L)); x[rowSums(x) == 3L, ]', but I think you want something faster than that. – 2015-02-06 14:43:47
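The brute-force idea in this comment can be sketched as follows: enumerate all 2^n vectors, then keep only the rows with the required number of ones. The names n, m, all_vectors and kept are illustrative assumptions, not from the original comment. The row counts show how much is discarded, which is the waste the question wants to avoid.

```r
# Brute-force sketch: build every 0/1 vector of length n, then filter.
n <- 5L  # vector length
m <- 3L  # required number of ones

# expand.grid() enumerates all 2^n combinations, one per row.
all_vectors <- expand.grid(rep(list(0L:1L), n))

# Keep only the rows with exactly m ones.
kept <- all_vectors[rowSums(all_vectors) == m, ]

nrow(all_vectors)  # 2^5 = 32 rows are materialised
nrow(kept)         # only choose(5, 3) = 10 rows survive
```

For n = 16 and m = 8 the same filter would build 65536 rows to keep 12870 of them.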

The following might help: http://stackoverflow.com/questions/17292091/rbinary-matrix-for-all-possible-unique-results – 2015-02-06 14:45:01

A slightly faster version of Marat's answer:

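f.roland <- function(n, m) {
    # each column of ind holds the positions of the ones for one combination
    ind <- combn(seq_len(n), m)
    # shift row i by (i - 1) * n to get positions in a flat, row-wise vector
    ind <- t(ind) + (seq_len(ncol(ind)) - 1) * n
    res <- rep(0, nrow(ind) * n)
    res[ind] <- 1
    # read the flat vector back as one combination per row
    matrix(res, ncol = n, nrow = nrow(ind), byrow = TRUE)
}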

all.equal(f.2(16, 8), f.roland(16, 8)) 
#[1] TRUE 
library(rbenchmark) 
benchmark(f(16,8),f.2(16,8),f.roland(16,8)) 

#             test replications elapsed relative user.self sys.self user.child sys.child
#2      f.2(16, 8)          100   5.693    1.931     5.670    0.020          0         0
#3 f.roland(16, 8)          100   2.948    1.000     2.929    0.017          0         0
#1        f(16, 8)          100   8.287    2.811     8.214    0.066          0         0
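To make the indexing trick in f.roland easier to follow, here is a small worked sketch for n = 4 and m = 2. The names idx, offsets, lin, flat and res are illustrative assumptions; the logic mirrors the function above, where each combination's column positions are shifted by (row - 1) * n so that they address a flat vector that is then read back row by row.

```r
n <- 4L
m <- 2L

# Each column of idx holds the positions of the ones for one combination.
idx <- combn(seq_len(n), m)   # m x choose(n, m) matrix

# Row i of the result starts right after flat position (i - 1) * n.
offsets <- (seq_len(ncol(idx)) - 1L) * n

# t(idx) has one combination per row; the offsets vector is recycled down
# the rows, so every entry in row i is shifted by offsets[i].
lin <- t(idx) + offsets

# Fill a flat vector at those positions, then fold it into rows of length n.
flat <- integer(ncol(idx) * n)
flat[lin] <- 1L
res <- matrix(flat, ncol = n, byrow = TRUE)
res
```

This does all the work in a few vectorised calls instead of calling a function once per combination, which is a plausible reason for the speed difference reported in the benchmark.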


2015-02-06 15:17:33 Roland

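For some reason, I can't reproduce your benchmark results: in my benchmark, 'f.2' and 'f.roland' perform about the same (within 1%). Could you repeat the benchmark a few times to make sure the results are consistent? – 2015-02-06 15:24:19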

Also, for completeness, could you include the other functions in the benchmark? – 2015-02-06 15:25:12

@MaratTalipov I re-ran the benchmark and got the same results. Since I don't want to install Bioconductor, I can't include akrun's function. – Roland 2015-02-06 15:31:39

You can try this approach:

f <- function(n=5,m=3) 
t(apply(combn(1:n,m=m),2,function(cm) replace(rep(0,n),cm,1))) 

f(5,3) 
#       [,1] [,2] [,3] [,4] [,5]
#  [1,]    1    1    1    0    0
#  [2,]    1    1    0    1    0
#  [3,]    1    1    0    0    1
#  [4,]    1    0    1    1    0
#  [5,]    1    0    1    0    1
#  [6,]    1    0    0    1    1
#  [7,]    0    1    1    1    0
#  [8,]    0    1    1    0    1
#  [9,]    0    1    0    1    1
# [10,]    0    0    1    1    1

The idea is to generate all combinations of indices for the 1s and then use them to produce the final result.
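The two steps described above can be looked at separately for a small case. This is only a sketch; the names positions, one_row and result are illustrative assumptions.

```r
n <- 5
m <- 3

# Step 1: all ways to choose which m of the n positions hold a 1.
positions <- combn(1:n, m)   # one combination per column
positions[, 1]               # first combination: 1 2 3

# Step 2: turn one combination into a 0/1 vector.
one_row <- replace(rep(0, n), positions[, 1], 1)
one_row                      # 1 1 1 0 0

# Doing step 2 for every column and transposing gives one vector per row.
result <- t(apply(positions, 2, function(cm) replace(rep(0, n), cm, 1)))
dim(result)                  # choose(5, 3) = 10 rows, 5 columns
```

Only choose(n, m) vectors are ever built, so nothing has to be generated and thrown away.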

Another flavor of the same approach:

f.2 <- function(n=5,m=3) 
    t(combn(1:n,m,FUN=function(cm) replace(rep(0,n),cm,1)))

The second approach is about twice as fast:

library(rbenchmark) 
benchmark(f(16,8),f.2(16,8)) 
#         test replications elapsed relative user.self sys.self user.child sys.child
# 2 f.2(16, 8)          100   5.706    1.000     5.688    0.017          0         0
# 1   f(16, 8)          100  10.802    1.893    10.715    0.082          0         0

Benchmark

# needs library(gRbase) for combnPrim() and library(data.table) for setDT()/set()
f.akrun <- function(n=5,m=3) {
    indx <- combnPrim(1:n,m)
    DT <- setDT(as.data.frame(matrix(0, ncol(indx),n)))
    for(i in seq_len(nrow(DT))){
        set(DT, i=i, j=indx[,i],value=1)
    }
    DT
}

benchmark(f(16,8),f.2(16,8),f.akrun(16,8)) 
#             test replications elapsed relative user.self sys.self user.child sys.child
# 2     f.2(16, 8)          100   5.464    1.097     5.435    0.028          0         0
# 3 f.akrun(16, 8)          100   4.979    1.000     4.938    0.037          0         0
# 1       f(16, 8)          100  10.854    2.180    10.689    0.129          0         0

@akrun's solution (f.akrun) is ~10% faster than f.2.

[Edit] Another approach, which is even faster and simpler:

f.3 <- function(n=5,m=3) t(combn(n,m,tabulate,nbins=n))
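A short sketch of why f.3 works: tabulate() counts how often each integer from 1 to nbins occurs, and because a combination never repeats a position, those counts are exactly the 0/1 vector. The comparison uses f.2 as defined earlier in this answer; the name positions is an illustrative assumption.

```r
# tabulate() counts occurrences of 1..nbins; distinct positions give 0/1 counts.
positions <- c(1, 3, 4)
tabulate(positions, nbins = 5)   # 1 0 1 1 0

# f.3 applies tabulate() to every combination produced by combn().
f.3 <- function(n = 5, m = 3) t(combn(n, m, tabulate, nbins = n))

# f.2 as defined earlier, for comparison.
f.2 <- function(n = 5, m = 3)
  t(combn(1:n, m, FUN = function(cm) replace(rep(0, n), cm, 1)))

# Same values; f.3 returns integers while f.2 returns doubles,
# so all.equal() is used here rather than identical().
all.equal(f.3(5, 3), f.2(5, 3))
```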


2015-02-06 14:48:02

I really appreciate your help! – 2015-02-06 15:33:18

'f.3' is the best one and is not highlighted enough, imo ;-) – Cath 2016-07-06 09:10:12

You can try combnPrim from gRbase together with set from data.table (which may be faster):

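# biocLite() has been retired; BiocManager is the current Bioconductor installer
# (originally: source("http://bioconductor.org/biocLite.R"); biocLite("gRbase"))
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("gRbase")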
library(gRbase) 
library(data.table) 
n <- 5
indx <- combnPrim(1:n,3) 

DT <- setDT(as.data.frame(matrix(0, ncol(indx),n))) 
for(i in seq_len(nrow(DT))){ 
    set(DT, i=i, j=indx[,i],value=1) 
} 
DT 
#     V1 V2 V3 V4 V5
#  1:  1  1  1  0  0
#  2:  1  1  0  1  0
#  3:  1  0  1  1  0
#  4:  0  1  1  1  0
#  5:  1  1  0  0  1
#  6:  1  0  1  0  1
#  7:  0  1  1  0  1
#  8:  1  0  0  1  1
#  9:  0  1  0  1  1
# 10:  0  0  1  1  1


2015-02-06 15:04:08 akrun

Here is another approach:

func <- function(n, m) t(combn(n, m, function(a) {z=integer(n);z[a]=1;z})) 

func(n = 5, m = 2) 

#       [,1] [,2] [,3] [,4] [,5]
#  [1,]    1    1    0    0    0
#  [2,]    1    0    1    0    0
#  [3,]    1    0    0    1    0
#  [4,]    1    0    0    0    1
#  [5,]    0    1    1    0    0
#  [6,]    0    1    0    1    0
#  [7,]    0    1    0    0    1
#  [8,]    0    0    1    1    0
#  [9,]    0    0    1    0    1
# [10,]    0    0    0    1    1
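Whichever function is used, its result can be checked against three properties: it has choose(n, m) rows, every row contains exactly m ones, and no row is repeated. The sketch below assumes base R only; check_binary_combos is a hypothetical helper introduced for illustration and is not part of the original answer.

```r
# Function from the answer above.
func <- function(n, m) t(combn(n, m, function(a) {z=integer(n);z[a]=1;z}))

# Hypothetical helper (not from the original thread): TRUE when res has
# choose(n, m) distinct rows of length n, each containing exactly m ones.
check_binary_combos <- function(res, n, m) {
  res <- as.matrix(res)
  nrow(res) == choose(n, m) &&
    ncol(res) == n &&
    all(rowSums(res) == m) &&
    !anyDuplicated(res)
}

check_binary_combos(func(5, 2), n = 5, m = 2)
```

The same check should apply to the output of any of the other functions in this thread, since it only looks at the resulting matrix.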


2017-04-12 14:31:57 989
