2017-08-30 83 views

I have a data frame like the one below. I want to count the types and combine them into a comma-separated string using data.table.

ID <- c("ID001","ID001","ID001","ID002","ID002","ID002") 
ToolID <- c("SWP","SWP","SWP","ISP","ISP","ISP") 
Type <- c("A","B","C","D","E","A") 
WHEN <- c("2017-08-15 12:44:11","2017-08-15 12:44:11","2017-08-14 19:07:11", 
      "2017-08-17 11:24:15","2017-08-17 11:24:15","2017-08-17 11:24:15") 

df <- data.frame(ID,ToolID,Type,WHEN) 
df$WHEN <- as.POSIXct(df$WHEN,format="%Y-%m-%d %H:%M:%S") 

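I am trying to put all the Type values into a single comma-separated column and also count them, grouping by (ToolID & ID), while keeping only the rows at max(WHEN), i.e. the most recent timestamp for the corresponding ID.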

Desired output

    ID    ToolID Type  Type_count WHEN
    ID001 SWP    A,B   2          2017-08-15 12:44:11
    ID002 ISP    D,E,A 3          2017-08-17 11:24:15

I tried using data.table and got this far:

library(data.table) 
setDT(df)[, WHEN := as.POSIXct(WHEN)] 
df1 <- df[, max(WHEN), by = list(ID,ToolID)] 
colnames(df1)[which(names(df1) == "V1")] <- "WHEN" 

How can I add the Type values and the Type count to df1 to get my desired output? Can someone point me in the right direction?

Answer


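We can create a row index based on a logical condition (the rows where WHEN equals the group maximum), then pass that index in i, group by the same columns, and compute the summary: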

i1 <- setDT(df)[, .I[WHEN == max(WHEN)], .(ID, ToolID)]$V1 
df[i1, .(Type = toString(unique(Type)), Type_count = uniqueN(Type), 
     WHEN = WHEN[1]), .(ID, ToolID)] 
#      ID ToolID    Type Type_count                WHEN
#1: ID001    SWP    A, B          2 2017-08-15 12:44:11
#2: ID002    ISP D, E, A          3 2017-08-17 11:24:15
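
Since the question asks how to continue from df1, here is a minimal sketch of a join-based alternative to the index approach above. It is not from the original answer. It assumes the sample data is built directly as a data.table with character columns, and it sets tz = "UTC" only to make the timestamps reproducible; the name res is made up for illustration.

```r
library(data.table)

# Same sample data as in the question, built directly as a data.table.
# tz = "UTC" is an assumption made here for reproducibility.
df <- data.table(
  ID     = c("ID001", "ID001", "ID001", "ID002", "ID002", "ID002"),
  ToolID = c("SWP", "SWP", "SWP", "ISP", "ISP", "ISP"),
  Type   = c("A", "B", "C", "D", "E", "A"),
  WHEN   = as.POSIXct(
    c("2017-08-15 12:44:11", "2017-08-15 12:44:11", "2017-08-14 19:07:11",
      "2017-08-17 11:24:15", "2017-08-17 11:24:15", "2017-08-17 11:24:15"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  )
)

# Latest timestamp per (ID, ToolID); naming the column here avoids
# the separate colnames() rename used in the question.
df1 <- df[, .(WHEN = max(WHEN)), by = .(ID, ToolID)]

# Join df to df1 so that only rows at the latest timestamp remain,
# then summarise the Type values within each group.
res <- df[df1, on = .(ID, ToolID, WHEN)][
  , .(Type = toString(unique(Type)), Type_count = uniqueN(Type)),
  by = .(ID, ToolID, WHEN)
]

# Reorder the columns to match the desired output layout.
setcolorder(res, c("ID", "ToolID", "Type", "Type_count", "WHEN"))
res
```

toString() separates values with a comma and a space. If the exact A,B format from the desired output is needed, paste(unique(Type), collapse = ",") can be used in its place.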