2016-08-01 43 views
-2

How can I find the last log entry before an event in R?

My table looks like this (input):

user_id event  timestamp 
Rob  business 111111 
Rob  business 222222 
Mike  progress 111111 
Mike  progress 222222 
Rob  progress 000001 
Mike  business 333333 
Mike  progress 444444 
Lee  progress 111111 
Lee  progress 222222 

dput

dput(input) 
structure(list(user_id = structure(c(3L, 3L, 2L, 2L, 3L, 2L, 
2L, 1L, 1L), .Label = c("Lee", "Mike", "Rob"), class = "factor"), 
    event = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L), .Label = c("business", 
    "progress"), class = "factor"), timestamp = c(111111, 222222, 
    111111, 222222, 1, 333333, 444444, 111111, 222222)), .Names = c("user_id", 
"event", "timestamp"), row.names = c(NA, -9L), class = "data.frame") 

For each user, I want to find the last progress event that occurred before that user's first business event (output):

user_id event  timestamp 
    Mike  progress 222222 
    Rob  progress 000001 

Thanks for your help!


I think you need to explain this better. – Frank

Answers

2

We can try data.table (here df1 is the question's data frame):
library(data.table) 
setDT(df1)[df1[order(as.numeric(timestamp)), if(any(event == "business")) 
     .I[tail(which(cumsum(event == "business")==0),1)], user_id]$V1] 
# user_id event timestamp 
#1:  Rob progress 000001 
#2: Mike progress 222222 
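The core of the one-liner above is the cumsum(event == "business") == 0 test. Here is a minimal base R sketch of that step in isolation, assuming a single user's events are already sorted by timestamp. The variable names (seen_business, before_first, last_before) are illustrative and not part of the original answer.

```r
# One user's events, already sorted by timestamp (Mike's rows from the question).
event <- c("progress", "progress", "business", "progress")

# Running count of business events seen so far: 0 0 1 1
seen_business <- cumsum(event == "business")

# Positions where the count is still 0 come before the first business event.
before_first <- which(seen_business == 0)

# The last of those positions is the row the question asks for.
last_before <- tail(before_first, 1)
last_before
```

In the data.table version, this position is mapped back to a row of the full table through .I, and the if(any(event == "business")) guard skips users such as Lee who have no business event.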

Works great!!! Thanks – Smasell

1

I'm not sure I fully understand what you are trying to do. Using which, you can get the indices of all non-business events (your data is called input):

indexes <- which(input$event != "business") 

Then you can filter this index vector to keep only the non-business events whose row position comes before the last business event:

indexes <- indexes[indexes < max(which(input$event == "business"))] 

Looking at the remaining rows, we have:

> input[indexes,] 
    user_id event timestamp 
3 Mike progress 111111 
4 Mike progress 222222 
5  Rob progress   1 
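Note that the filter above works on row positions across the whole table, so it does not group by user or sort by timestamp. If the goal is the question's expected output, a per-user variant in base R might look like the sketch below. The helper name last_before_business is hypothetical, and the data frame is rebuilt by hand from the question's dput output with timestamps as plain numbers.

```r
# Rebuild the question's data (timestamps are numeric, as in the dput output).
input <- data.frame(
  user_id = c("Rob", "Rob", "Mike", "Mike", "Rob", "Mike", "Mike", "Lee", "Lee"),
  event = c("business", "business", "progress", "progress", "progress",
            "business", "progress", "progress", "progress"),
  timestamp = c(111111, 222222, 111111, 222222, 1, 333333, 444444, 111111, 222222),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one user's rows, return the last row that
# precedes that user's first business event, or zero rows if there is none.
last_before_business <- function(d) {
  d <- d[order(d$timestamp), ]                       # sort this user's events by time
  first_business <- which(d$event == "business")[1]  # NA if the user has no business event
  if (is.na(first_business) || first_business == 1) return(d[0, ])
  d[first_business - 1, ]
}

# Apply the helper to each user and stack the results.
result <- do.call(rbind, lapply(split(input, input$user_id), last_before_business))
rownames(result) <- NULL
result
```

On the sample data this keeps one row each for Mike (timestamp 222222) and Rob (timestamp 1), and drops Lee, who has no business event.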