2016-08-19 82 views

R: reshape a data frame using dcast, melt and concatenation

I have a data frame as shown below:

mydf <- data.frame(Term = c('dog','cat','lion','tiger','pigeon','vulture'), Category = c('pet','pet','wild','wild','pet','wild'), 
    Count = c(12,14,19,7,11,10), Rate = c(0.4,0.7,0.3,0.6,0.1,0.8), Brand = c('GS','GS','MN','MN','PG','MN') ) 

which gives the data frame:

     Term Category Count Rate Brand
1     dog      pet    12  0.4    GS
2     cat      pet    14  0.7    GS
3    lion     wild    19  0.3    MN
4   tiger     wild     7  0.6    MN
5  pigeon      pet    11  0.1    PG
6 vulture     wild    10  0.8    MN

I would like to transform this data frame into the following resultDF:

Category         pet             wild
Term             dog,cat,pigeon  lion,tiger,vulture
Countlessthan13  dog,pigeon      tiger,vulture
Ratemorethan0.5  cat             tiger,vulture
Brand            GS,PG           MN

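The row headers describe operations: for example, Countlessthan13 means the condition Count < 13 is applied to the terms, which are then grouped by category. Also note that the brand names are unique within each category and are not repeated.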

I tried dcast and melt, but did not get the desired result.

Answer


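We can use data.table to do this. Convert the 'data.frame' to a 'data.table' (setDT(mydf)) and, grouped by 'Category', create summary columns by pasting together the unique values of 'Term', the unique values of 'Term' where 'Count' is less than 13, the unique values of 'Term' where 'Rate' is greater than 0.5, and the unique elements of 'Brand'.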

library(data.table)
# setDT() converts mydf to a data.table by reference; summarise per Category
dt <- setDT(mydf)[, .(Term = paste(unique(Term), collapse = ","),
                      Countlessthan13 = paste(unique(Term[Count < 13]), collapse = ","),
                      Ratemorethan0.5 = paste(unique(Term[Rate > 0.5]), collapse = ","),
                      Brand = paste(unique(Brand), collapse = ",")),
                  by = Category]

From the summarised dataset ('dt'), we melt to 'long' format by specifying 'id.vars' as 'Category', and then dcast it back to 'wide' format.

# long: one row per (Category, summary column); wide: one column per Category
dcast(melt(dt, id.vars = "Category", variable.name = "category"),
      category ~ Category, value.var = "value")
#           category            pet               wild
#1:             Term dog,cat,pigeon lion,tiger,vulture
#2:  Countlessthan13     dog,pigeon      tiger,vulture
#3:  Ratemorethan0.5            cat      tiger,vulture
#4:            Brand          GS,PG                 MN
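
For anyone who wants to check the two steps separately, here is a minimal self-contained sketch that rebuilds the example data, runs the grouped summary, and keeps the intermediate long table before reshaping. The names summary_dt, long and wide are illustrative and not from the answer above. Using stringsAsFactors = FALSE and as.data.table() (a copy, instead of setDT() by reference) are assumptions made here so that the original mydf is left untouched and the columns are character on any R version.

```r
library(data.table)

# Example data from the question. stringsAsFactors = FALSE is an assumption
# so that Term, Category and Brand are character vectors on any R version.
mydf <- data.frame(
  Term     = c("dog", "cat", "lion", "tiger", "pigeon", "vulture"),
  Category = c("pet", "pet", "wild", "wild", "pet", "wild"),
  Count    = c(12, 14, 19, 7, 11, 10),
  Rate     = c(0.4, 0.7, 0.3, 0.6, 0.1, 0.8),
  Brand    = c("GS", "GS", "MN", "MN", "PG", "MN"),
  stringsAsFactors = FALSE
)

# Step 1: one row per Category. Inside j, Count and Rate refer to the rows
# of the current group, so Term[Count < 13] filters within that group.
summary_dt <- as.data.table(mydf)[, .(
  Term            = paste(unique(Term), collapse = ","),
  Countlessthan13 = paste(unique(Term[Count < 13]), collapse = ","),
  Ratemorethan0.5 = paste(unique(Term[Rate > 0.5]), collapse = ","),
  Brand           = paste(unique(Brand), collapse = ",")
), by = Category]

# Step 2: melt to long (Category, category, value), then dcast to wide
# with one column per Category value.
long <- melt(summary_dt, id.vars = "Category", variable.name = "category")
wide <- dcast(long, category ~ Category, value.var = "value")
print(wide)
```

Printing summary_dt and long before the final dcast is a quick way to see which step is responsible if the result looks wrong.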
+1

Awesome. Thanks, Akrun. You made my day! – Tarak