R: aggregate based on multiple columns, then merge back into the data frame?

I have a data frame that looks like this:

id <- c(1, 1, 1, 3, 3)
date <- c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08")
type <- c("A", "B", "A", "B", "B")
df <- data.frame(id, date, type)
# Parse the day-month-year strings into Date objects
df$date <- as.Date(as.character(df$date), format = "%d-%m-%y")

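What I want is to add a new column containing the earliest date for each id, for each type. This first attempt works, but it aggregates and merges on id alone.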

d = aggregate(df$date, by=list(df$id), min) 
df2 = merge(df, d, by.x="id", by.y="Group.1")

What I would like, though, is to also group by type and get a result like this:

data.frame(df2, desired=c("2007-11-30","2007-11-01", "2007-11-30","2007-12-17","2007-12-17"))

I have tried a lot of possibilities. I really think this can be done with lists, but I am at a loss as to how.

d = aggregate(df$date, by=list(df$id, df$type), min) 

# And merge the result of aggregate with the original data frame 
df2 = merge(df,d,by.x=list("id","type"),by.y=list("Group.1","Group.2"))

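For this simple example I could just split each type into its own data frame, build the new column, and then combine the two resulting data frames. In reality, though, there are many types, and a third column that has to be filtered in a similar way, so that would not be practical...

Thanks!

Source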

2017-01-10 Soran

You have a typo mismatch between 'date1' and 'date' in 'df'. – thelatemail

@thelatemail You're right. I went back and forth on what to call that date column... – Soran

We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)), group by 'id' and 'type' (or by 'id' alone), order by 'date', and assign (:=) the first 'date' as the "earliest" column.

library(data.table) 
setDT(df)[order(date), earliestdateid := date[1], by = id 
    ][order(date), earliestdateidtype := date[1], by = .(id, type)] 
df 
#    id       date type earliestdateid earliestdateidtype
# 1:  1 2008-01-23    A     2007-11-01         2007-11-30
# 2:  1 2007-11-01    B     2007-11-01         2007-11-01
# 3:  1 2007-11-30    A     2007-11-01         2007-11-30
# 4:  3 2007-12-17    B     2007-12-17         2007-12-17
# 5:  3 2008-12-12    B     2007-12-17         2007-12-17

A similar approach with dplyr is

library(dplyr) 
df %>% 
    group_by(id) %>% 
    arrange(date) %>% 
    mutate(earliestdateid = first(date)) %>% 
    group_by(type, .add = TRUE) %>%  # dplyr >= 1.0.0; older versions used add = TRUE
    mutate(earliestdateidtype = first(date))

Note: this avoids doing it in two steps, i.e. computing the summarised output and then joining it back.
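For comparison, the two-step route that the note refers to might look like the sketch below, in base R. It assumes the `by.x`/`by.y` arguments of the question's second attempt are replaced with a single character vector of column names; the column name `earliest` is made up here for illustration.

```r
# Rebuild the example data from the question
id <- c(1, 1, 1, 3, 3)
date <- as.Date(c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08"),
                format = "%d-%m-%y")
type <- c("A", "B", "A", "B", "B")
df <- data.frame(id, date, type)

# Step 1: earliest date per (id, type); naming the list elements
# gives the grouping columns the same names as in df
d <- aggregate(df$date, by = list(id = df$id, type = df$type), FUN = min)
names(d)[names(d) == "x"] <- "earliest"  # 'earliest' is an illustrative name

# Step 2: merge back on both keys, passed as a character vector
df2 <- merge(df, d, by = c("id", "type"))
df2
```

Note that `merge()` sorts the result by the key columns, so the row order of `df2` differs from that of `df`.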

Source

2017-01-10 02:59:34 akrun

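Wow, this is why I love R. A complicated bunch of operations taken care of in 1 line. I thought 2 lines would be great. If I run into something similar, but on a numeric column instead of dates, do I just change order(date) to order(numbers), or something to that effect for the data.table way? – Soran

@Soran If you just want 'mean(numbers)', then there is no need for 'order', i.e. 'setDT(df)[, Mean := mean(numbers), .(id, type)]' – akrun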
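As a rough illustration of the comment above, here is a small sketch with a numeric column. The `numbers` column and its values are hypothetical (they are not in the question's data), and the `Min` column is added only to show that an order-independent summary such as `min()` needs no `order()` either.

```r
library(data.table)

# Hypothetical numeric data; 'numbers' is an assumed column name
dt <- data.table(id = c(1, 1, 1, 3, 3),
                 type = c("A", "B", "A", "B", "B"),
                 numbers = c(10, 20, 30, 40, 50))

# Group-wise summaries assigned by reference; row order is unchanged
dt[, Mean := mean(numbers), by = .(id, type)]
dt[, Min := min(numbers), by = .(id, type)]
dt
```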

You can get both minimums, by the different groupings, with ave instead:

df$minid <- with(df, ave(date, id, FUN=min, drop=TRUE)) 
df$minidtype <- with(df, ave(date, list(id,type), FUN=min, drop=TRUE)) 
df 

#   id       date type      minid  minidtype
# 1  1 2008-01-23    A 2007-11-01 2007-11-30
# 2  1 2007-11-01    B 2007-11-01 2007-11-01
# 3  1 2007-11-30    A 2007-11-01 2007-11-30
# 4  3 2007-12-17    B 2007-12-17 2007-12-17
# 5  3 2008-12-12    B 2007-12-17 2007-12-17
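As a quick sanity check, the sketch below compares the ave result against the `desired` column from the question. It rebuilds the data so that it runs on its own. The comment about `drop = TRUE` reflects my understanding that ave forwards its extra arguments to `interaction()`, so treat that as an assumption rather than documented behaviour.

```r
# Rebuild the example data from the question
id <- c(1, 1, 1, 3, 3)
date <- as.Date(c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08"),
                format = "%d-%m-%y")
type <- c("A", "B", "A", "B", "B")
df <- data.frame(id, date, type)

# Earliest date per (id, type), kept in the original row order.
# drop = TRUE is assumed to reach interaction(), which then skips
# empty id/type combinations such as id 3 with type "A".
df$minidtype <- with(df, ave(date, list(id, type), FUN = min, drop = TRUE))

# The values the question asked for
desired <- as.Date(c("2007-11-30", "2007-11-01", "2007-11-30",
                     "2007-12-17", "2007-12-17"))
all(df$minidtype == desired)
```

Unlike the merge-based route, ave leaves the rows in their original order, which is why a direct element-wise comparison is possible here.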

If you want to be tricky, you can do it all in one call too:

df[c("minid", "minidtype")] <- lapply(list("id", c("id","type")), 
            FUN=function(x) ave(df$date, df[x], FUN=min, drop=TRUE))

Source

2017-01-10 03:04:57 thelatemail
