2016-07-05 68 views

Calculating differences between subsets of a data frame

I have a data frame with two factors:

Eyecolour Haircolour Points 
    <fctr> <fctr> <dbl> 
1 brown blond   4 
2 brown brunette  -8 
3 blue blond   2 
4 blue brunette  3 
5 green blond   -5 
6 green brunette  9 

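I would like the difference in Points between blond and brunette for each Eyecolour, or simply to subtract the brunette value from the blond value for each Eyecolour.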

I tried using the dplyr package, but I am struggling to get the code right. Also, diff() does not seem to like the negative values.
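On the diff() remark: base R's diff() accepts negative values without trouble. What tends to surprise is its direction, because it returns each element minus the one before it. A minimal sketch, assuming the rows for one Eyecolour are ordered blond then brunette as in the table above (the vector name `points_brown` is only illustrative):

```r
# Points for the brown-eyed rows, in the order blond, brunette
points_brown <- c(blond = 4, brunette = -8)

# diff() returns x[i + 1] - x[i], which is brunette - blond here
brunette_minus_blond <- diff(points_brown)

# Negate the result to get blond - brunette
blond_minus_brunette <- -brunette_minus_blond
```

If the rows were ordered the other way round, the sign would flip, so an approach that names the two hair colours explicitly is less fragile.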

Answers


Using your data

df <- read.table(text = "
Eyecolour Haircolour Points
brown blond 4
brown brunette -8
blue blond 2
blue brunette 3
green blond -5
green brunette 9", header = TRUE)

you can try

library(dplyr) 
library(tidyr) 
df %>% 
    tidyr::spread(Haircolour, Points) %>% 
    dplyr::mutate(diff = blond - brunette) 

Result

Eyecolour blond brunette diff 
1  blue  2  3 -1 
2  brown  4  -8 12 
3  green -5  9 -14 
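
In current releases of tidyr, spread() is superseded by pivot_wider(). A hedged sketch of the same reshaping, assuming tidyr 1.0.0 or later; the data are rebuilt here so the block runs on its own, and the name `wide` is only illustrative:

```r
library(dplyr)
library(tidyr)

df <- read.table(text = "
Eyecolour Haircolour Points
brown blond 4
brown brunette -8
blue blond 2
blue brunette 3
green blond -5
green brunette 9", header = TRUE)

wide <- df %>%
  # one column per Haircolour value, filled with the matching Points
  pivot_wider(names_from = Haircolour, values_from = Points) %>%
  # subtract brunette from blond within each Eyecolour row
  mutate(diff = blond - brunette)
```

The row order of the result may differ from the spread() output shown above, so look rows up by Eyecolour instead of by position.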

Works great, and it's convenient! Cheers –


We can use

library(dplyr) 
df %>% 
    mutate(Haircolour = as.character(Haircolour)) %>% 
    group_by(Eyecolour) %>% 
    summarise(Diff = Points[Haircolour=="blond"] - Points[Haircolour =="brunette"]) 
# Eyecolour Diff 
#  <fctr> <int> 
#1  blue -1 
#2  brown 12 
#3  green -14 
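
The summarise() call relies on every Eyecolour having exactly one blond and one brunette row. Under that same assumption, a base R sketch that needs no extra packages could look like this (the names `wide` and `Diff` are only illustrative, and the data are rebuilt so the block runs on its own):

```r
df <- read.table(text = "
Eyecolour Haircolour Points
brown blond 4
brown brunette -8
blue blond 2
blue brunette 3
green blond -5
green brunette 9", header = TRUE)

# Eyecolour x Haircolour matrix of Points; with one value per cell,
# sum() simply returns that value
wide <- tapply(df$Points, list(df$Eyecolour, df$Haircolour), sum)

# Column-wise subtraction gives one difference per Eyecolour
Diff <- wide[, "blond"] - wide[, "brunette"]
```

If a combination occurred more than once, sum() would add the duplicates together, so check the data first when that is not what you want.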

Or using data.table

library(data.table) 
dcast(setDT(df), Eyecolour~Haircolour, value.var="Points")[, Diff:= blond-brunette][] 
# Eyecolour blond brunette Diff 
#1:  blue  2  3 -1 
#2:  brown  4  -8 12 
#3:  green -5  9 -14 

@ZheyuanLi Yes, when I got your reply, you were not working on it. – akrun