2017-04-03

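I have a data frame with dates, and I use seq() to get yearly breakpoints between the minimum and maximum dates. How can I get the seq() interval number for each row without a for loop?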

Data frame:

 daysOfStop dateConsult 
1:   NA 2002-11-17 
2:   NA 2003-11-03 
3:   NA 2004-12-16 
4:   NA 2006-01-31 
5:   NA 2006-01-31 
6:   NA 2003-02-05 
7:   NA 2003-09-29 
8:   NA 2005-08-01 
9:   NA 2005-08-01 
10:   NA 2005-08-01 

Result of seq():

"2002-11-17" "2003-11-17" "2004-11-17" "2005-11-17" 

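What I want is to create a new column whose value is defined by the seq() intervals, without using a for loop (I have more than 120,000 rows, so a loop takes a lot of time).

So: a date between "2002-11-17" and "2003-11-17" gets year number 1 (first interval); a date between "2003-11-17" and "2004-11-17" gets year number 2 (second interval); and so on.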

Expected result:

 daysOfStop dateConsult numYear 
1:   NA 2002-11-17 1 
2:   NA 2003-11-03 1 
3:   NA 2004-12-16 3 
4:   NA 2006-01-31 4 
5:   NA 2006-01-31 4 
6:   NA 2003-02-05 1 
7:   NA 2003-09-29 1 
8:   NA 2005-08-01 3 
9:   NA 2005-08-01 3 
10:   NA 2005-08-01 3 

Data:

library(data.table)

# dput() output; the unparseable .internal.selfref pointer is dropped
# and setDT() rebuilds the data.table from a plain data.frame
dt1 <- setDT(structure(list(daysOfStop = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), 
    dateConsult = structure(c(12008, 12359, 12768, 13179, 13179, 
    12088, 12324, 12996, 12996, 12996), class = "Date")), .Names = c("daysOfStop", 
"dateConsult"), class = "data.frame", row.names = c(NA, 
-10L)))

Answer

3

We can use findInterval:

dt1[, numYear := findInterval(dateConsult, seq(min(dateConsult), 
         max(dateConsult), "1 year"))] 
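
To see why this produces the expected numbering, here is a minimal base R sketch on a few of the dates from the question. The names `dates`, `breaks`, and `num_year` are illustrative only and do not appear in the original post; the dates are converted to numbers explicitly just to make the comparison visible.

```r
# Sketch: how findInterval() maps dates onto yearly intervals.
# `dates`, `breaks` and `num_year` are hypothetical names for illustration.
dates <- as.Date(c("2002-11-17", "2003-11-03", "2004-12-16",
                   "2006-01-31", "2003-02-05"))

# Yearly breakpoints starting at the earliest date
breaks <- seq(min(dates), max(dates), by = "1 year")

# For each x, findInterval() returns the index i such that
# breaks[i] <= x < breaks[i + 1]; values past the last break get
# the index of the last break.
num_year <- findInterval(as.numeric(dates), as.numeric(breaks))

print(data.frame(dates, num_year))
```

Because findInterval() is vectorized over its first argument, the whole column is handled in one call, which is what removes the need for a row-by-row loop.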
+1

This is the function I was looking for. Very powerful. Thanks!