2017-02-13

How to find a pattern in a data.table column

I have a data.table like this:

ID    Time Event 
1: 1 2016-09-25 14:47:52  1 
2: 1 2016-10-03 19:35:04  1 
3: 1 2016-10-03 21:11:00 -1 
4: 1 2016-10-04 14:25:56  1 
5: 1 2016-11-05 01:40:13  1 
6: 1 2016-11-27 04:40:21  1 
7: 1 2016-12-04 02:36:37  1 
8: 1 2017-01-12 13:48:01  1 
9: 1 2017-01-15 03:32:35  1 
10: 1 2017-02-05 01:35:07  1 
11: 1 2017-02-05 02:29:31  1 
12: 1 2017-02-05 02:34:33  1 
13: 2 2016-07-15 08:14:11  1 
14: 2 2016-07-22 22:15:44  1 
15: 2 2016-07-23 12:00:00 -1 
16: 2 2016-11-30 18:21:51  1 
17: 2 2016-12-03 07:00:31  1 
18: 2 2016-12-06 06:30:34  1 
19: 2 2016-12-16 10:00:50  1 
20: 2 2017-01-16 08:33:16  1 

I am trying to check whether, within each ID, a positive event occurs after the negative (-1) event. My ideal output, as a data.table, is:

ID Outcome 
1 TRUE 
2 TRUE 

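I don't know how to formulate a filter condition that takes both the Time column and the Event column into account. For a given ID, I want to know whether there is a row with Event = 1 whose Time is later than the Time of the row with Event = -1, but I can't express this in code. Can anyone help?

Here is a demo dataset: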

fakedata <- structure(list(ID = c(1L, 1L, 1L, 
        1L, 1L, 1L, 1L, 1L, 
        1L, 1L, 1L, 1L, 2L, 
        2L, 2L, 2L, 2L, 2L, 
        2L, 2L), Time = c("2016-09-25 14:47:52", "2016-10-03 19:35:04", 
                 "2016-10-03 21:11:00", "2016-10-04 14:25:56", "2016-11-05 01:40:13", 
                 "2016-11-27 04:40:21", "2016-12-04 02:36:37", "2017-01-12 13:48:01", 
                 "2017-01-15 03:32:35", "2017-02-05 01:35:07", "2017-02-05 02:29:31", 
                 "2017-02-05 02:34:33", "2016-07-15 08:14:11", "2016-07-22 22:15:44", 
                 "2016-07-23 12:00:00", "2016-11-30 18:21:51", "2016-12-03 07:00:31", 
                 "2016-12-06 06:30:34", "2016-12-16 10:00:50", "2017-01-16 08:33:16" 
        ), Event = c(1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 
           1, 1, 1, 1)), .Names = c("ID", "Time", "Event"), class = c("data.table", 
                          "data.frame"), row.names = c(NA, -20L)) 
+0

'sapply(split(fakedata, fakedata$ID), function(x) is.na(which(diff(x$Event) == 2)) == FALSE)' –

+0

Thanks, I see it works, but it doesn't look like a proper data.table solution – user299791

+1

You could do 'fakedata[order(as.POSIXct(Time)), any(Event - shift(Event, fill = 0) == 2), keyby = ID]' –

Answers

1

Here is a data.table approach that uses the base R functions any and which along with the && operator.

fakedata[order(ID, as.POSIXct(Time)), 
     .(outcome=any(Event == -1) && Event[which(Event == -1)+1] > 0), by=ID] 
    ID outcome 
1: 1 TRUE 
2: 2 TRUE 

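As David Arenburg points out in the comments, it is a good idea to make sure the dataset is ordered correctly before the calculation. With data.table, we can do this in the i argument: following his comment, I ordered by ID and then by as.POSIXct(Time).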
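If the table is reused for several queries, the ordering can also be done once up front instead of in every i argument. The following is only a sketch under assumptions: the three-row table dt is hypothetical, and tz = "UTC" is a guess, since the question does not state a time zone.

```r
library(data.table)

# Hypothetical three-row table with rows deliberately out of time order.
dt <- data.table(
  ID    = c(1L, 1L, 1L),
  Time  = c("2016-10-03 21:11:00", "2016-09-25 14:47:52", "2016-10-04 14:25:56"),
  Event = c(-1, 1, 1)
)

# Parse the character timestamps once (tz = "UTC" is an assumption).
dt[, Time := as.POSIXct(Time, tz = "UTC")]

# Sort by ID, then by Time; setorder() reorders the table by reference.
setorder(dt, ID, Time)
```

After this, grouped expressions can rely on the row order within each ID without calling order() again.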

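In the j argument, .(outcome=any(Event == -1) && Event[which(Event == -1)+1] > 0), any(Event == -1) checks whether a -1 is present and, if so, Event[which(Event == -1)+1] > 0 checks whether the Event value immediately following each -1 is positive. If the first check fails, && short-circuits and FALSE is returned.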
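The expression above fits the demo data, where each ID has exactly one -1 and it is not the last row. With several -1 values in one ID, the right-hand side of && has length greater than one, which is an error in R 4.3.0 and later. A -1 in the last row indexes past the end of the group and yields NA. As a hedged sketch of a more defensive variant, the code below asks the question's literal condition instead (does any positive event occur after the first -1?) rather than checking only the immediately following row. The table toy and the helper name first_neg are made up for illustration.

```r
library(data.table)

# Hypothetical toy data (not the question's dataset):
# ID 3 has two -1 events, ID 4 ends on a -1, ID 5 has no -1 at all.
toy <- data.table(
  ID    = c(3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L),
  Time  = c("2017-01-01 10:00:00", "2017-01-02 10:00:00",
            "2017-01-03 10:00:00", "2017-01-04 10:00:00",
            "2017-01-01 10:00:00", "2017-01-02 10:00:00",
            "2017-01-01 10:00:00", "2017-01-02 10:00:00"),
  Event = c(-1, 1, -1, 1, 1, -1, 1, 1)
)

# TRUE when at least one positive event comes later than the first -1 in the ID.
res <- toy[order(ID, as.POSIXct(Time, tz = "UTC")),   # tz is an assumption
           .(outcome = {
             first_neg <- match(-1, Event)   # position of first -1, NA if none
             !is.na(first_neg) && any(Event[seq_len(.N) > first_neg] > 0)
           }),
           by = ID]
```

Both sides of && are length-one here, so the expression always returns a single TRUE or FALSE per ID, never NA.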

+1

You also need to make sure the 'Time' column is in order, I guess. –

+0

Good call. I'll add that. – lmo

+0

My data.table is already ordered, as in fakedata[order(ID, Time)] – user299791