As @Arun suggested in the comments, reshape will do this for you.
d<-read.table(text='City Date Revenue Costs
"New York" "Feb 1" 2000 200
"San Fran" "Feb 3" 1200 300
Boston "Feb 1" 1500 400', header=TRUE)
reshape(d[! names(d) %in% 'Costs'], idvar='Date', timevar='City', direction='wide')
#    Date Revenue.New York Revenue.San Fran Revenue.Boston
# 1 Feb 1             2000               NA           1500
# 2 Feb 3               NA             1200             NA
If there are multiple entries for the same city and date that you want to combine first, you can use aggregate.
d<-read.table(text='City Date Revenue Costs
"New York" "Feb 1" 2000 200
"New York" "Feb 1" 1000 100
"San Fran" "Feb 3" 1200 300
Boston "Feb 1" 1500 400', header=TRUE)
dd<-with(d, aggregate(Revenue, by=list(City=City, Date=Date), sum))
#       City  Date    x
# 1   Boston Feb 1 1500
# 2 New York Feb 1 3000
# 3 San Fran Feb 3 1200
ddd<-reshape(dd, idvar='Date', timevar='City', direction='wide')
#    Date x.Boston x.New York x.San Fran
# 1 Feb 1     1500       3000         NA
# 3 Feb 3       NA         NA       1200
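The x. prefix on the column names comes from the unnamed value column that aggregate produces. If plain city names are preferred, the prefix can be stripped afterwards. This is a minimal sketch rather than part of the original answer: it rebuilds the aggregated data by hand so it runs on its own, and the name wide is hypothetical, chosen so that ddd above is left untouched.

```r
# Rebuild the aggregated data from above so this snippet runs on its own.
dd <- data.frame(City = c("Boston", "New York", "San Fran"),
                 Date = c("Feb 1", "Feb 1", "Feb 3"),
                 x    = c(1500, 3000, 1200))

# `wide` is a hypothetical name, used so that ddd above is left untouched.
wide <- reshape(dd, idvar = "Date", timevar = "City", direction = "wide")

# Drop the leading "x." that reshape() puts in front of each city name.
names(wide) <- sub("^x\\.", "", names(wide))
names(wide)
```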
Then replace the NAs with 0.
ddd[is.na(ddd)] <- 0
#    Date x.Boston x.New York x.San Fran
# 1 Feb 1     1500       3000          0
# 3 Feb 3        0          0       1200
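As a side note, the sum, the wide layout and the zero fill can probably also be obtained in one call with xtabs from base R. This is a different technique from the aggregate plus reshape approach in this answer, shown only as a sketch for comparison. The names tab and wide0 are hypothetical, and the result keeps the dates as row names instead of a Date column.

```r
# Same sample data as above, including the duplicate New York / Feb 1 entry.
d <- read.table(text = 'City Date Revenue Costs
"New York" "Feb 1" 2000 200
"New York" "Feb 1" 1000 100
"San Fran" "Feb 3" 1200 300
Boston "Feb 1" 1500 400', header = TRUE)

# xtabs() sums Revenue within each Date/City cell; empty cells come out as 0.
tab <- xtabs(Revenue ~ Date + City, data = d)

# Convert the contingency table to a data frame with one column per city.
# The dates end up as row names here, not as a Date column.
wide0 <- as.data.frame.matrix(tab)
wide0
```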
To address the point @Arun raises below, you can use the merge function to fill in missing dates before the previous step (the NA replacement).
missing.Dates <- c('Feb 2')
ddd<-merge(ddd, data.frame(Date=missing.Dates), by='Date', all=TRUE)
#    Date x.Boston x.New York x.San Fran
# 1 Feb 1     1500       3000         NA
# 2 Feb 3       NA         NA       1200
# 3 Feb 2       NA         NA         NA
ddd[is.na(ddd)] <- 0
#    Date x.Boston x.New York x.San Fran
# 1 Feb 1     1500       3000          0
# 2 Feb 3        0          0       1200
# 3 Feb 2        0          0          0
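In the output above the added date lands at the bottom, so the rows are no longer in calendar order. One way to reorder them is to parse the labels as dates and sort on that. This is only a sketch under two assumptions that are not in the original: the year 2013 is made up so that the labels can be parsed, and %b expects English month abbreviations, which depends on the locale. The data frame is rebuilt by hand (filled is a hypothetical name) so the snippet runs on its own.

```r
# Rebuild the merged, zero-filled result from above.
# check.names = FALSE keeps the spaces in the column names.
filled <- data.frame(Date         = c("Feb 1", "Feb 3", "Feb 2"),
                     `x.Boston`   = c(1500, 0, 0),
                     `x.New York` = c(3000, 0, 0),
                     `x.San Fran` = c(0, 1200, 0),
                     check.names  = FALSE)

# Assumption: the year 2013 is added only so the labels parse as dates.
# Assumption: %b matches English month abbreviations in the current locale.
parsed <- as.Date(paste(filled$Date, 2013), format = "%b %d %Y")

# Reorder the rows chronologically.
sorted <- filled[order(parsed), ]
sorted
```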
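Not just reshape; aggregate by date and city as well. – 2013-03-22 13:51:03
Thanks, Arun. The two simple steps of aggregate plus reshape saved me the trouble of writing a very long loop function. – 2013-03-22 14:11:44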