Flag rows by the number of corresponding events in an R data.table

I am new to R and have been struggling with a condition that I want to apply to a data.table.

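My data.table is ordered by Order_id and Date and looks like this.

What I need is to create a new column, holding a flagger variable, based on these conditions: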

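If there are more than 3 consecutive 0s in the hours_delta column, flag those rows and the row before them with flag_1
If there are fewer than 3 and more than 1 consecutive 0s in hours_delta, flag those rows and the row before them with flag_2
If there is only a single 0 between two values greater than 0, as at row index [8], flag those rows with flag_3
Flag all remaining rows with flag_4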

This is how I would like the table to look with the new column added.

Any help would be greatly appreciated.

Thanks!

Source

2017-05-24 oikonang

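What if there are exactly 3 zeros between non-zero values? – amonk

Could you also clarify what *less than 3 and more than 1* means? In interval terms, is it [1,3], (1,3], [1,3) or (1,3)? – amonk

I think something like this might work for what you are trying to accomplish.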

library(dplyr)

# Create test data frame
index <- 0:19
Order_id <- c(rep(001, 8), rep(002, 3), rep(003, 4), rep(004, 3), rep(005, 2))
hours_delta <- c(720, 552, rep(0, 5), 432, 0, 72, 96, 121, 0, 0, 0, 33, 0, 0, 77, 0)

df <- data.frame(index, Order_id, hours_delta)

# Start dplyr modifications
df <- df %>%
  # Group data by Order_id
  group_by(Order_id) %>%
  # Get the number of repetitions of 0 in the hours_delta field for that Order_id.
  # Note: this assumes at most one run of zeros per Order_id; with several runs
  # the run lengths are recycled across the zero rows and can end up misassigned.
  mutate(zero_run = ifelse(hours_delta == 0,
                           rle(hours_delta)[[1]][rle(hours_delta)[[2]] == 0],
                           NA),
         # Set the row above a zero sequence to the number of repetitions
         zero_run = ifelse(is.na(zero_run), lead(zero_run), zero_run)) %>%
  # Ungroup the data
  ungroup() %>%
  # Set the flags based on the number of repetitions
  # (a run of exactly 3 zeros is treated as flag_2 here)
  mutate(flagger = case_when(
    is.na(zero_run)              ~ "flag_4",
    zero_run == 1                ~ "flag_3",
    zero_run <= 3 & zero_run > 1 ~ "flag_2",
    zero_run > 3                 ~ "flag_1"
  )) %>%
  # Remove the temporary run-length column
  select(-zero_run)
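
To make the run-length step in the code above easier to follow, here is a minimal base R sketch of what rle() returns for the first Order_id of the test data, and of one way to expand the run lengths back to one value per row. The names runs, run_len and zero_run are illustrative only. Expanding with rep() is an assumption about how one might keep the lengths aligned with the rows when a group contains more than one run.

```r
# hours_delta values of the first Order_id in the test data
hours_delta <- c(720, 552, 0, 0, 0, 0, 0, 432)

# rle() collapses consecutive identical values into runs
runs <- rle(hours_delta)
runs$lengths  # 1 1 5 1
runs$values   # 720 552 0 432

# Repeat each run's length once per row in that run, so the result
# has the same length as hours_delta
run_len <- rep(runs$lengths, times = runs$lengths)

# Keep the run length only on rows where hours_delta is 0
zero_run <- ifelse(hours_delta == 0, run_len, NA)
zero_run      # NA NA 5 5 5 5 5 NA
```

Because every row carries the length of its own run, nothing has to be recycled, which matters once a group holds several zero runs of different lengths.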

Source

2017-05-24 16:07:21

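This is exactly what I was looking for! Thank you very much! – oikonang

Is it possible to apply the same functionality to data.tables? I mean, what would this look like in data.table form, without using pipes and the rle() function? The problem is that it works when I apply it to the example data frame, but when I apply the code to my main data.table I get completely different results. Would it be sensible to convert the whole data.table to a data.frame just for the step above and then convert it back to a data.table? – oikonang

I figured out what the problem is. Try applying the same method with Order_id replaced by `Order_id <- c(rep(001,8))`. If there is more than one run of consecutive 0s within the same Order_id, the counts get mixed up. Is there a way around this? – oikonang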
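
One possible way to address the last two comments is sketched below with data.table only, using rleid() so that every run of zeros gets its own length even when one Order_id contains several runs. This is a sketch under assumptions, not a tested solution for the real data: flag_zero_runs is a hypothetical helper name, the test data is copied from the answer above, a run of exactly 3 zeros is treated as flag_2 as in that answer, and the row before a zero run is flagged only when it belongs to the same Order_id.

```r
library(data.table)

# flag_zero_runs() is a hypothetical helper, not part of data.table
flag_zero_runs <- function(dt) {
  dt <- copy(dt)  # work on a copy so the caller's table is not changed by reference

  # Give every run of identical consecutive hours_delta values within an Order_id its own id
  dt[, run_id := rleid(Order_id, hours_delta)]

  # Store the run length on the rows of each zero run; all other rows stay NA
  dt[, zero_run := NA_integer_]
  dt[hours_delta == 0, zero_run := .N, by = run_id]

  # Copy the run length onto the row directly before a zero run, within the same Order_id
  dt[, zero_run := ifelse(is.na(zero_run), shift(zero_run, type = "lead"), zero_run),
     by = Order_id]

  # Assign flags; NA comparisons in i are treated as FALSE, so those rows keep flag_4
  dt[, flagger := "flag_4"]
  dt[zero_run == 1L, flagger := "flag_3"]
  dt[zero_run > 1L & zero_run <= 3L, flagger := "flag_2"]
  dt[zero_run > 3L, flagger := "flag_1"]

  # Drop the helper columns
  dt[, c("run_id", "zero_run") := NULL]
  dt[]
}

# Test data from the answer above
dt <- data.table(
  index       = 0:19,
  Order_id    = c(rep(1, 8), rep(2, 3), rep(3, 4), rep(4, 3), rep(5, 2)),
  hours_delta = c(720, 552, rep(0, 5), 432, 0, 72, 96, 121, 0, 0, 0, 33, 0, 0, 77, 0)
)

res <- flag_zero_runs(dt)

# The case from the comment: a single Order_id that contains several zero runs
dt_single  <- copy(dt)[, Order_id := 1]
res_single <- flag_zero_runs(dt_single)
```

Grouping by run_id rather than by Order_id is what keeps the run lengths apart, so no conversion to a data.frame and back should be needed.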
