2016-11-01

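R: Reshape a data frame by creating columns for a specific element (control treatment)

Consider a data frame that shows the results of a control and two experimental treatments for males and females, along with the size used for each treatment: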

library(tidyverse) 
mydf <- tibble(treatment = c('ctrl','low','high','ctrl','low','high'), # tibble() replaces the deprecated data_frame()
       gender = c('male','male','male','female','female','female'), 
       size = c(10,20,30,10,20,30), 
       result = c(0.11, 0.32, 0.25, 0.15, 0.38, 0.55)) 

treatment gender size result 
ctrl  male  10 0.11 
low  male  20 0.32 
high  male  30 0.25 
ctrl  female 10 0.15 
low  female 20 0.38 
high  female 30 0.55 

To compare the experimental treatments side by side with the control, I want to reshape the data frame as follows:

treatment gender ctrl_size size ctrl_result result 
    low  female  10  20  0.15  0.38 
    high  female  10  30  0.15  0.55 
    low  male   10  20  0.11  0.32 
    high  male   10  30  0.11  0.25 

My attempt below works, but it seems cumbersome to me because it creates two helper data frames and only merges them at the end:

mydf_result <- mydf %>% 
    select(-size) %>% 
    spread(treatment, result) %>% 
    gather(treatment, result, c(low, high)) %>% 
    rename(ctrl_result = ctrl) 

mydf_size <- mydf %>% 
    select(-result) %>% 
    spread(treatment, size) %>% 
    gather(treatment, size, c(low, high)) %>% 
    rename(ctrl_size = ctrl) 

mydf_final <- 
    full_join(mydf_result, mydf_size, by = c('treatment', 'gender')) %>% 
    select(treatment, gender, ctrl_size, size, ctrl_result, result) %>% 
    arrange(gender) 

# A tibble: 4 × 6 
    treatment gender ctrl_size size ctrl_result result 
     <chr> <chr>  <dbl> <dbl>  <dbl> <dbl> 
1  low female  10 20  0.15 0.38 
2  high female  10 30  0.15 0.55 
3  low male  10 20  0.11 0.32 
4  high male  10 30  0.11 0.25 

Can the above be achieved in a single pipeline?


Wouldn't we need to know which observations in `mydf` come from the same individual to be able to do this? – Joe


Thanks, @Joe. I see that a `treatment` column is missing and will correct it. – Irakli

Answers

2

While I'm not sure the desired result is the tidiest arrangement, you can rearrange the data like so:

library(tidyverse) 

mydf %>% gather(var, val, size, result) %>% # gather all numbers into one column 
    spread(treatment, val) %>% # spread treatment so ctrl can be separated 
    gather(treatment, ttmt, high, low) %>% # regather high and low separately 
    gather(ct_tm, val, ctrl, ttmt) %>% # regather numbers, now with ctrl/ttmt separated 
    unite(var, ct_tm, var) %>% # join column labels 
    spread(var, val) # spread to wide 

## # A tibble: 4 × 6 
## gender treatment ctrl_result ctrl_size ttmt_result ttmt_size 
## * <chr>  <chr>  <dbl>  <dbl>  <dbl>  <dbl> 
## 1 female  high  0.15  10  0.55  30 
## 2 female  low  0.15  10  0.38  20 
## 3 male  high  0.11  10  0.25  30 
## 4 male  low  0.11  10  0.32  20 
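
The same sequence of reshaping steps can also be sketched with `pivot_longer()` and `pivot_wider()`, which supersede `gather()` and `spread()` in tidyr 1.0.0 and later. This is only an illustration, not part of the original answer: it assumes tidyr >= 1.0.0 and the `mydf` from the question, and the name `wide` is made up here.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
mydf <- tibble(treatment = c('ctrl','low','high','ctrl','low','high'),
               gender = c('male','male','male','female','female','female'),
               size = c(10,20,30,10,20,30),
               result = c(0.11, 0.32, 0.25, 0.15, 0.38, 0.55))

wide <- mydf %>%
    # stack size and result into one value column
    pivot_longer(c(size, result), names_to = "var", values_to = "val") %>%
    # one column per treatment, so ctrl sits next to low and high
    pivot_wider(names_from = treatment, values_from = val) %>%
    # restack only low and high, leaving ctrl as its own column
    pivot_longer(c(low, high), names_to = "treatment", values_to = "ttmt") %>%
    # spread both value columns at once; names become ctrl_size, ttmt_size, ...
    pivot_wider(names_from = var, values_from = c(ctrl, ttmt))

print(wide)
```

Passing two columns to `values_from` lets the last step build the combined column names itself, so the `unite()` call is not needed in this variant.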
+1

This is elegant and short. Thank you! – Irakli

1

This can be done with a join in data.table:

library(data.table) 
setnames(setDT(mydf)[treatment!="ctrl"][mydf[treatment=="ctrl"], 
    on = "gender"], c("i.size", "i.result"), c("ctrl_size", "ctrl_result"))[, 
        i.treatment := NULL][] 
# treatment gender size result ctrl_size ctrl_result 
#1:  low male 20 0.32  10  0.11 
#2:  high male 30 0.25  10  0.11 
#3:  low female 20 0.38  10  0.15 
#4:  high female 30 0.55  10  0.15
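
For readers who prefer to stay in dplyr, the same self-join idea (treatment rows joined to the control rows by `gender`) might look roughly like the sketch below. It is an illustration under assumptions rather than part of the answer above: it assumes the `mydf` from the question with exactly one control row per gender, and the name `mydf_joined` is made up here.

```r
library(dplyr)

# Same data as in the question
mydf <- tibble(treatment = c('ctrl','low','high','ctrl','low','high'),
               gender = c('male','male','male','female','female','female'),
               size = c(10,20,30,10,20,30),
               result = c(0.11, 0.32, 0.25, 0.15, 0.38, 0.55))

mydf_joined <- mydf %>%
    filter(treatment != "ctrl") %>%                     # keep only the experimental rows
    inner_join(
        mydf %>%
            filter(treatment == "ctrl") %>%             # control rows only
            select(gender, ctrl_size = size, ctrl_result = result),
        by = "gender"                                   # attach the control of the same gender
    ) %>%
    select(treatment, gender, ctrl_size, size, ctrl_result, result) %>%
    arrange(gender)

print(mydf_joined)
```

Because the control rows are renamed before the join, no suffix clean-up is needed afterwards, and the whole thing stays in one pipeline.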