2017-07-04

R: dichotomize a data frame on multiple unique dimensions

I have a data frame like this:

originalDF <- data.frame(A1=c(1, 1, 2, 3, 4, 5, 6, 6, 6, 6, 6), 
         A2=c(12.2, 12.2, 15.0, 34.123, 2.0, 66.0, 7.0, 7.0, 7.0, 7.0, 7.0), 
         A3=c('T1', 'T2', 'T1', 'T1', 'T2', 'T1', 'T1', 'T1', 'T1', 'T1', 'T1'), 
         A4=c('1234', '1234', '1234', '1234', '4321', '4321', '4321', '4321', '4321', '4321', '4321'), 
         A5=c('0245', '0245', '0500', '0500', '0600', '0600', '0600','0800','0700','0900', '0900')) 

   A1     A2 A3   A4   A5
1   1 12.200 T1 1234 0245
2   1 12.200 T2 1234 0245
3   2 15.000 T1 1234 0500
4   3 34.123 T1 1234 0500
5   4  2.000 T2 4321 0600
6   5 66.000 T1 4321 0600
7   6  7.000 T1 4321 0600
8   6  7.000 T1 4321 0800
9   6  7.000 T1 4321 0700
10  6  7.000 T1 4321 0900
11  6  7.000 T1 4321 0900

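I now want to dichotomize this data frame on A5, so that each unique combination of A1 to A4 becomes one row and each value of A5 becomes a 0/1 column. The result should look like this: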

uniqueoriginalDF <- unique(subset(originalDF, select=c(A1, A2, A3, A4))) 
wantedDF <- cbind.data.frame(uniqueoriginalDF, 
          A5_0245=c(1, 1, 0, 0, 0, 0, 0), 
          A5_0500=c(0, 0, 1, 1, 0, 0, 0), 
          A5_0600=c(0, 0, 0, 0, 1, 1, 1), 
          A5_0800=c(0, 0, 0, 0, 0, 0, 1), 
          A5_0700=c(0, 0, 0, 0, 0, 0, 1), 
          A5_0900=c(0, 0, 0, 0, 0, 0, 1)) 

  A1     A2 A3   A4 A5_0245 A5_0500 A5_0600 A5_0800 A5_0700 A5_0900
1  1 12.200 T1 1234       1       0       0       0       0       0
2  1 12.200 T2 1234       1       0       0       0       0       0
3  2 15.000 T1 1234       0       1       0       0       0       0
4  3 34.123 T1 1234       0       1       0       0       0       0
5  4  2.000 T2 4321       0       0       1       0       0       0
6  5 66.000 T1 4321       0       0       1       0       0       0
7  6  7.000 T1 4321       0       0       1       1       1       1

How can I do this? (Base R solutions preferred!) Thanks in advance!

This is much easier with 'dcast', i.e. 'dcast(originalDF, ... ~ A5, length)', or, to get only 0s and 1s, 'dcast(originalDF, ... ~ A5, function(x) as.integer(length(x) > 0))' – akrun

Answers

We can use reshape from base R:

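# Add a constant indicator column A5N, then spread it over the values of A5
d1 <- reshape(transform(originalDF, A5N = 1), idvar = names(originalDF)[1:4],
              timevar = 'A5', direction = 'wide')
# Combinations that never occur come back as NA; set them to 0
d1[is.na(d1)] <- 0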
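The reshape call names its new columns A5N.0245, A5N.0500 and so on, and it warns about the two identical 0900 rows. A minimal sketch of one way to tidy both up, assuming that dropping exact duplicate rows beforehand is acceptable and that the A5_ prefix from the question is wanted (dedupDF is a name introduced here for illustration, and stringsAsFactors = FALSE is set explicitly, which the question did not do):

```r
originalDF <- data.frame(
  A1 = c(1, 1, 2, 3, 4, 5, 6, 6, 6, 6, 6),
  A2 = c(12.2, 12.2, 15.0, 34.123, 2.0, 66.0, 7.0, 7.0, 7.0, 7.0, 7.0),
  A3 = c('T1', 'T2', 'T1', 'T1', 'T2', 'T1', 'T1', 'T1', 'T1', 'T1', 'T1'),
  A4 = c('1234', '1234', '1234', '1234', '4321', '4321', '4321', '4321', '4321', '4321', '4321'),
  A5 = c('0245', '0245', '0500', '0500', '0600', '0600', '0600', '0800', '0700', '0900', '0900'),
  stringsAsFactors = FALSE
)

# Assumption: exact duplicate rows carry no extra information, so drop them
# first; reshape() then sees at most one row per id/A5 combination.
dedupDF <- unique(originalDF)

# Same reshape call as in the answer, applied to the de-duplicated data.
d1 <- reshape(transform(dedupDF, A5N = 1),
              idvar = names(originalDF)[1:4],
              timevar = 'A5', direction = 'wide')
d1[is.na(d1)] <- 0

# Rename A5N.0245 -> A5_0245 etc. to match the layout asked for.
names(d1) <- sub('^A5N\\.', 'A5_', names(d1))
```

The indicator columns keep the order in which the A5 values first appear in the data, which is the column order used in wantedDF.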

But it is easier with dcast:

library(data.table) 
dcast(setDT(originalDF), ...~ paste0("A5_", A5), function(x) as.integer(length(x) > 0)) 
#   A1     A2 A3   A4 A5_0245 A5_0500 A5_0600 A5_0700 A5_0800 A5_0900
#1:  1 12.200 T1 1234       1       0       0       0       0       0
#2:  1 12.200 T2 1234       1       0       0       0       0       0
#3:  2 15.000 T1 1234       0       1       0       0       0       0
#4:  3 34.123 T1 1234       0       1       0       0       0       0
#5:  4  2.000 T2 4321       0       0       1       0       0       0
#6:  5 66.000 T1 4321       0       0       1       0       0       0
#7:  6  7.000 T1 4321       0       0       1       1       1       1
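Since the question prefers base R, the same "count, then test for greater than zero" idea can also be sketched with table(). This is a different technique from the dcast call, not a drop-in replacement. The names id_cols, make_key, ids, tab, ind and result are introduced here for illustration, and the sketch assumes the "|" separator never occurs inside the id values:

```r
originalDF <- data.frame(
  A1 = c(1, 1, 2, 3, 4, 5, 6, 6, 6, 6, 6),
  A2 = c(12.2, 12.2, 15.0, 34.123, 2.0, 66.0, 7.0, 7.0, 7.0, 7.0, 7.0),
  A3 = c('T1', 'T2', 'T1', 'T1', 'T2', 'T1', 'T1', 'T1', 'T1', 'T1', 'T1'),
  A4 = c('1234', '1234', '1234', '1234', '4321', '4321', '4321', '4321', '4321', '4321', '4321'),
  A5 = c('0245', '0245', '0500', '0500', '0600', '0600', '0600', '0800', '0700', '0900', '0900'),
  stringsAsFactors = FALSE
)

id_cols <- c("A1", "A2", "A3", "A4")

# One row per unique id combination, in order of first appearance.
ids <- unique(originalDF[, id_cols])

# Collapse the id columns into a single string key per row.
# Assumption: "|" does not appear in any of the id values.
make_key <- function(df) do.call(paste, c(df[id_cols], sep = "|"))

# Cross-tabulate key against A5; fixing the factor levels keeps the rows
# of the table in the same order as the rows of `ids`.
tab <- table(factor(make_key(originalDF), levels = make_key(ids)),
             originalDF$A5)

# Turn counts into 0/1 indicators and give the columns an A5_ prefix.
ind <- matrix(as.integer(tab > 0), nrow = nrow(tab),
              dimnames = list(NULL, paste0("A5_", colnames(tab))))

result <- cbind(ids, ind)
```

As with dcast, the indicator columns come out sorted by A5 value (0700 before 0800), not in the order of first appearance used in wantedDF.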