Comparing boolean vectors

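I have a data frame with four logical vectors, V1, V2, V3 and V4, each TRUE or FALSE. I need to classify each row of the data frame based on the combination of the boolean vectors (e.g. "None", "V1 only", "V1 and V3", "All", etc.). I would like to do this without taking subsets of the data frame or nesting ifelse statements. Any suggestions on the best way to do this? Thanks!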

2011-12-14 Boom Shakalaka

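Looks like I've arrived rather late to this party. Still, I may as well share what I brought!

This works by treating the FALSE/TRUE possibilities as bits and operating on them to assign each combination of v1, v2 and v3 a unique integer between 1 and 8 (much like the way chmod represents permission bits on *NIX systems). That integer is then used as an index to select the appropriate element of a vector of text descriptors.

(For the demo I've used just three columns, but the approach scales nicely.)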

# CONSTRUCT VECTOR OF DESCRIPTIONS 
description <- c("None", "v1", "v2", "v1 and v2", 
       "v3", "v1 and v3", "v2 and v3", "All") 

# DEFINE DESCRIPTION FUNCTION 
getDescription <- function(X) { 
    index <- 1 + sum(X*c(1,2,4)) 
    description[index] 
} 

# TRY IT OUT ON ALL COMBOS OF v1, v2, and v3 
df <- expand.grid(v1=c(FALSE, TRUE), 
        v2=c(FALSE, TRUE), 
        v3=c(FALSE, TRUE)) 
df$description <- apply(df, 1, getDescription) 

# YEP, IT WORKS. 
df 
#      v1    v2    v3 description
# 1 FALSE FALSE FALSE        None
# 2  TRUE FALSE FALSE          v1
# 3 FALSE  TRUE FALSE          v2
# 4  TRUE  TRUE FALSE   v1 and v2
# 5 FALSE FALSE  TRUE          v3
# 6  TRUE FALSE  TRUE   v1 and v3
# 7 FALSE  TRUE  TRUE   v2 and v3
# 8  TRUE  TRUE  TRUE         All
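
To illustrate the claim that this scales, here is a minimal sketch of the same bit-weighting idea written without apply(), so that the number of columns is not hard-coded. The helper name `encode_rows` is my own invention for illustration and does not come from the answer above; it assumes every column passed in is logical.

```r
description <- c("None", "v1", "v2", "v1 and v2",
                 "v3", "v1 and v3", "v2 and v3", "All")

# Hypothetical helper (not from the original answer):
# returns one integer code per row, in 1 .. 2^ncol(d)
encode_rows <- function(d) {
  m <- 1 * as.matrix(d)                  # TRUE/FALSE -> 1/0
  weights <- 2^(seq_len(ncol(m)) - 1)    # 1, 2, 4, ... one weight per column
  as.vector(m %*% weights) + 1           # weighted row sums, shifted to start at 1
}

df <- expand.grid(v1 = c(FALSE, TRUE),
                  v2 = c(FALSE, TRUE),
                  v3 = c(FALSE, TRUE))
codes <- encode_rows(df)
df$description <- description[codes]
```

With a fourth column the weights would simply become 1, 2, 4, 8, and the description vector would need 16 entries in the matching order.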

2011-12-14 04:52:34

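Here's an approach that relies on the fact that TRUE/FALSE can be represented as 1 and 0. You can multiply the booleans by their column index and then paste all the values together. This tells you which columns are TRUE in each row. Here's an example: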

set.seed(1) 
dat <- data.frame(v1 = sample(c(T,F), 10, TRUE), 
        v2 = sample(c(T,F), 10, TRUE), 
        v3 = sample(c(T,F), 10, TRUE), 
        v4 = sample(c(T,F), 10, TRUE) 
       ) 
#End fake data 
#Multiply T/F by the column index 
dat <- dat * rep(seq_len(ncol(dat)), each = nrow(dat)) 
#Paste together in a new column 
dat$v5 <- apply(dat, 1, function(x) paste(x, collapse = "")) 

> dat 
   v1 v2 v3 v4   v5
1   0  0  3  4 0034
2   0  2  0  4 0204
...

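Edit: incorporating the helpful comments below and the follow-up question

I'd use expand.grid() to create a lookup table, then write whatever text labels you see fit to represent the combinations. Here's an example with two columns: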

set.seed(1) 
dat <- data.frame(v1 = sample(c(T,F), 10, TRUE), 
        v2 = sample(c(T,F), 10, TRUE) 
     ) 

#Thanks @Joshua 
dat$comp <- as.character(apply(1 * dat, 1, paste, collapse="")) 

#Look up table 
lookup <- data.frame(comp = apply(expand.grid(0:1, 0:1), 1, paste, collapse = ""), 
        text = c("none", "v1 only", "v2 only", "all"), 
        stringsAsFactors = FALSE 
) 

#Use merge to join the look up table to your data. Note the consistent naming of the comp column 
> merge(dat, lookup) 
   comp    v1    v2    text
1    00 FALSE FALSE    none
2    00 FALSE FALSE    none
3    01 FALSE  TRUE v2 only
....
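
One thing to be aware of is that merge() sorts the result by the key column by default, so the original row order is not preserved. As a hedged alternative sketch (not part of the answer above), the lookup can be held in a named character vector and indexed directly, which keeps the rows where they were. The small `dat` below is made-up illustration data.

```r
# Made-up illustration data, two logical columns
dat <- data.frame(v1 = c(TRUE, FALSE, FALSE, TRUE),
                  v2 = c(TRUE, FALSE, TRUE, FALSE))

# Build the 0/1 key string per row (same idea as the comp column above)
dat$comp <- do.call(paste, c(1 * dat, sep = ""))

# Named vector as lookup table: names are keys, values are labels
lookup <- c("00" = "none", "10" = "v1 only", "01" = "v2 only", "11" = "all")

# Index by key; unname() drops the key names from the result
dat$text <- unname(lookup[dat$comp])
```

Because the labels are selected by position in `dat$comp`, row 1 of the result still corresponds to row 1 of the input.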

2011-12-14 02:43:38 Chase

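+1 Nicely done. Another option using the 0/1 representation is to multiply each column by a power of 10 and add them up; this can be done with matrix multiplication, like this: `as.matrix(dat) %*% 10^rev(seq_len(ncol(dat)) - 1)`. (Or use powers of 2 if you prefer to think in binary.) – Aaron 2011-12-14 02:55:34

+1, but I don't see the need for the "column index", since it is already defined by the position of the "1" in the string. Alternatives are `apply(1 * dat, 1, paste, collapse = "")` or `do.call(paste, c(1 * dat, sep = ""))`. – 2011-12-14 03:15:03

Thanks. So based on your answer, I came up with the following: `v1 <- ifelse(v1 == TRUE, 1000, 0)` `v2 <- ifelse(v1 == TRUE, 100, 0)` `v3 <- ifelse(v1 == TRUE, 10, 0)` `v4 <- ifelse(v1 == TRUE, 1, 0)` `dat$v5 <- sum(v1, v2, v3, v4)` Should I then create a list of values to look up the labels (e.g. 1111 == "All"), or is there a better way? – 2011-12-14 03:59:51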
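
As a rough sketch of how the powers-of-10 suggestion and a label lookup could fit together (my own illustration, not code posted in the thread): the matrix product gives one number per row, which is then formatted as a zero-padded key and looked up in a named vector, with anything unlisted falling back to "Other". The data, the label set and the width of 4 are assumptions made for the example.

```r
# Made-up illustration data with four logical columns
dat <- data.frame(v1 = c(TRUE, TRUE, FALSE, FALSE),
                  v2 = c(TRUE, FALSE, FALSE, TRUE),
                  v3 = c(TRUE, FALSE, FALSE, TRUE),
                  v4 = c(TRUE, TRUE, FALSE, FALSE))

# Row-wise weighted sum: v1*1000 + v2*100 + v3*10 + v4*1
code <- as.vector((1 * as.matrix(dat)) %*% 10^rev(seq_len(ncol(dat)) - 1))

# Zero-padded string key, e.g. 110 -> "0110" (width 4 assumes four columns)
key <- sprintf("%04.0f", code)

# Only the combinations of interest get a label; the rest become "Other"
labels <- c("0000" = "None", "1001" = "v1 and v4", "1111" = "All")
dat$v5 <- unname(labels[key])
dat$v5[is.na(dat$v5)] <- "Other"
```

Note that sum(v1, v2, v3, v4) would collapse everything into a single number, whereas the matrix product (or simply v1 + v2 + v3 + v4 on the weighted vectors) keeps one value per row.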

set.seed(123) 
> dat <- data.frame(v1 = sample(c(T,F), 10, TRUE), 
+     v2 = sample(c(T,F), 10, TRUE), 
+     v3 = sample(c(T,F), 10, TRUE), 
+     v4 = sample(c(T,F), 10, TRUE) 
+     ) 
> dat

The first strategy uses the patterns of interest to index into a character vector of labels, with a default index of 1 for "Other":

> dat$bcateg <- c("Other", "v2 only", "v1 and v3", "All")[1+ 
+ with(dat, 1*(v2 & !v1 &!v3 &!v4)) 
+ +with(dat, 2*(v1&v3))+ 
+ with(dat, v1&v2&v3&v4)] 
> dat 
      v1    v2    v3    v4    bcateg
1   TRUE FALSE FALSE FALSE     Other
2  FALSE  TRUE FALSE FALSE   v2 only
3   TRUE FALSE FALSE FALSE     Other
4  FALSE FALSE FALSE FALSE     Other
5  FALSE  TRUE FALSE  TRUE     Other
6   TRUE FALSE FALSE  TRUE     Other
7  FALSE  TRUE FALSE FALSE   v2 only
8  FALSE  TRUE FALSE  TRUE     Other
9  FALSE  TRUE  TRUE  TRUE     Other
10  TRUE FALSE  TRUE  TRUE v1 and v3

The second strategy pastes together the names of the columns that are TRUE, using "," as the separator:

> dat$bcateg2 <-paste(c("","v1")[dat[["v1"]]+1 ], c("","v2")[dat[["v2"]]+1 ], c("","v3")[dat[["v3"]]+1 ], c("","v4")[dat[["v4"]]+1 ], sep = ",") 
> dat 
      v1    v2    v3    v4    bcateg   bcateg2
1   TRUE FALSE FALSE FALSE     Other     v1,,,
2  FALSE  TRUE FALSE FALSE   v2 only     ,v2,,
3   TRUE FALSE FALSE FALSE     Other     v1,,,
4  FALSE FALSE FALSE FALSE     Other       ,,,
5  FALSE  TRUE FALSE  TRUE     Other   ,v2,,v4
6   TRUE FALSE FALSE  TRUE     Other   v1,,,v4
7  FALSE  TRUE FALSE FALSE   v2 only     ,v2,,
8  FALSE  TRUE FALSE  TRUE     Other   ,v2,,v4
9  FALSE  TRUE  TRUE  TRUE     Other ,v2,v3,v4
10  TRUE FALSE  TRUE  TRUE v1 and v3 v1,,v3,v4

2011-12-14 04:17:18

Let me throw my hat into the ring as well:

plyr::adply(dat, 1, function(x) paste(names(Filter(isTRUE, x)), collapse = " and ")) 

      v1    v2    v3    v4               V1
1   TRUE  TRUE FALSE  TRUE v1 and v2 and v4
2   TRUE  TRUE  TRUE FALSE v1 and v2 and v3
3  FALSE FALSE FALSE  TRUE               v4
4  FALSE  TRUE  TRUE  TRUE v2 and v3 and v4
5   TRUE FALSE  TRUE FALSE        v1 and v3
6  FALSE  TRUE  TRUE FALSE        v2 and v3
7  FALSE FALSE  TRUE FALSE               v3
8  FALSE FALSE  TRUE  TRUE        v3 and v4
9  FALSE  TRUE FALSE FALSE               v2
10  TRUE FALSE  TRUE  TRUE v1 and v3 and v4
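
For anyone who would rather not depend on plyr, the same idea can probably be sketched in base R with apply(). This is an assumption-laden illustration rather than the answer's own code: the data is made up, the column name `label` is my choice, and I have added a "None" label for rows where every column is FALSE, a case that would otherwise yield an empty string.

```r
# Made-up illustration data
dat <- data.frame(v1 = c(TRUE, FALSE, FALSE),
                  v2 = c(TRUE, FALSE, TRUE),
                  v3 = c(FALSE, FALSE, TRUE),
                  v4 = c(TRUE, FALSE, FALSE))

cols <- c("v1", "v2", "v3", "v4")

# For each row, keep the names of the TRUE columns and join them
dat$label <- apply(dat[cols], 1, function(r) {
  on <- cols[r]                          # r is a logical vector for one row
  if (length(on) == 0) "None" else paste(on, collapse = " and ")
})
```

Restricting apply() to `dat[cols]` matters here: if a character column were included, the row would be coerced to character and could no longer be used as a logical index.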

2011-12-14 05:22:56 Ramnath
