I'd like to show an alternative way of doing this kind of thing, even though I often get the feeling it isn't appreciated: using SQL.
library(sqldf)

# For each row, subtract the Score of the closest preceding ID.
sqldf(paste("SELECT a.ID,a.Score"
," , a.Score - (SELECT b.Score"
," FROM df b"
," WHERE b.ID < a.ID"
," ORDER BY b.ID DESC"
," LIMIT 1"
," ) diff"
," FROM df a"
)
)
The code looks complicated, but it isn't, and it has some advantages, as you can see in the result:
ID Score diff
1 1 40 <NA>
2 2 36 -4.0
3 3 32 -4.0
4 4 28 -4.0
5 5 24 -4.0
6 6 20 -4.0
7 7 16 -4.0
8 8 12 -4.0
9 9 8 -4.0
10 10 4 -4.0
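One advantage is that you work with the original data frame (no conversion to another class) and you get a data frame back (just assign it with res <- ....). Another advantage is that you keep all the rows: the first row has no predecessor, so its diff is NA rather than being dropped. A third advantage is that you can easily take grouping factors into account. For example: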
df2 <- data.frame(ID=1:10,grp=rep(c("v","w"), each=5),Score=4*10:1)
# Same idea, but the previous row must belong to the same grp.
sqldf(paste("SELECT a.ID,a.grp,a.Score"
," , a.Score - (SELECT b.Score"
," FROM df2 b"
," WHERE b.ID < a.ID"
," AND a.grp = b.grp"
," ORDER BY b.ID DESC"
," LIMIT 1"
," ) diff"
," FROM df2 a"
)
)
ID grp Score diff
1 1 v 40 <NA>
2 2 v 36 -4.0
3 3 v 32 -4.0
4 4 v 28 -4.0
5 5 v 24 -4.0
6 6 w 20 <NA>
7 7 w 16 -4.0
8 8 w 12 -4.0
9 9 w 8 -4.0
10 10 w 4 -4.0
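To make the three advantages concrete, here is a minimal sketch that stores both results and cross-checks them against base R. It rests on some assumptions: df is rebuilt here from the printed output above (the original df comes from the question), the names res, res2 and to_num are only illustrative, and the diff column is coerced to numeric because the printed <NA> and -4.0 suggest it may come back as text, depending on the installed driver version.

```r
library(sqldf)

# Assumed input, reconstructed from the printed result above.
df  <- data.frame(ID = 1:10, Score = 4 * 10:1)
df2 <- data.frame(ID = 1:10, grp = rep(c("v", "w"), each = 5), Score = 4 * 10:1)

# Ungrouped: difference to the closest preceding ID.
res <- sqldf("
  SELECT a.ID, a.Score,
         a.Score - (SELECT b.Score FROM df b
                    WHERE b.ID < a.ID
                    ORDER BY b.ID DESC LIMIT 1) AS diff
  FROM df a
  ORDER BY a.ID
")

# Grouped: the preceding row must share the same grp.
res2 <- sqldf("
  SELECT a.ID, a.grp, a.Score,
         a.Score - (SELECT b.Score FROM df2 b
                    WHERE b.ID < a.ID AND a.grp = b.grp
                    ORDER BY b.ID DESC LIMIT 1) AS diff
  FROM df2 a
  ORDER BY a.ID
")

# Hypothetical helper: coerce diff to numeric whatever type the driver returned.
to_num <- function(x) as.numeric(as.character(x))

# Advantages 1 and 2: still a data frame, and no rows were dropped.
stopifnot(is.data.frame(res), nrow(res) == nrow(df))

# Cross-check against base R: diff() for the ungrouped case, ave() per group.
stopifnot(isTRUE(all.equal(to_num(res$diff), c(NA, diff(df$Score)))))
expected2 <- ave(df2$Score, df2$grp, FUN = function(x) c(NA, diff(x)))
stopifnot(isTRUE(all.equal(to_num(res2$diff), expected2)))
```

The outer ORDER BY a.ID is only there so the rows line up with the base R vectors; it does not change the computed differences.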
+1 Nice trick with head and tail. – 2013-04-25 10:30:23
+1 Imaginative. I would never have thought that head with '-1' would return everything except the first row. Clever – 2013-04-25 10:36:54