2016-11-04 55 views
-8

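Creating a cumulative plot of events in R

I have a series of events and their times. I can plot a histogram of them using hist, but I don't know how to make a cumulative plot of them.

This is the kind of data I am starting with. (Assume it is already in POSIXct format.)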

> events$time 

[1] 2015-10-05 16:58:41.986797 2015-10-05 16:59:23.389583 
[3] 2015-10-05 16:59:44.99402 2015-10-05 16:59:53.225178 
[5] 2015-10-05 16:59:59.594524 2015-10-05 17:00:05.555564 
[7] 2015-10-05 17:00:44.173783 2015-10-05 17:00:46.289552 
[9] 2015-10-05 17:00:56.772485 2015-10-05 17:01:18.937458 
[11] 2015-10-05 17:02:04.661378 

and so on for ~8000 values 

For example, in my histogram I have this:

2015-10-05 4:00: 20 events 
2015-10-05 4:15: 30 events 
2015-10-05 4:30: 11 events 

I would like to get a tally like this:

2015-10-05 4:00: 20 events 
2015-10-05 4:15: 50 events 
2015-10-05 4:30: 61 events 

How can I do this?

+1

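`plot(ecdf(events$time))` [PS: as 宋哲元 mentioned, users of the R tag find it helpful when data is added with `dput`, which removes ambiguity about the data. So in your case, you could edit your question with `dput(events$time[1:10])`. Cheers] – user20650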
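The `ecdf` suggestion in the comment above can be sketched roughly as follows. This is only an illustration on made-up data: the `times` vector, the seed, and the UTC time zone are assumptions, not the asker's data. `ecdf` returns proportions, so they are multiplied by the number of events to get a running count.

```r
# Hypothetical example data: 100 distinct event times within one hour
set.seed(1)
times <- sort(as.POSIXct("2015-10-05 16:00:00", tz = "UTC") + sample(0:3599, 100))

# Empirical CDF of the event times (as seconds since the epoch)
Fn <- ecdf(as.numeric(times))

# ecdf gives the proportion of events at or before each time;
# multiplying by the number of events turns it into a cumulative count
cum_count <- Fn(as.numeric(times)) * length(times)

# Step plot of the cumulative number of events over time
plot(times, cum_count, type = "s",
     xlab = "time", ylab = "cumulative number of events")
```

This needs no binning at all: the curve rises by one at every event, which may be enough if the 15-minute tally itself is not required.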

+0

...or an example, such as `timez <- sample(Sys.time() + 1:1000, 100)`. So first you want to aggregate the data into 15-minute periods, count the events, and then plot? – user20650

+0

With hist I used an arbitrary number of breaks, like 100, just to try it out –

Answers

1

One possible solution:

library(lubridate) 

# example time data 
time = c(
    "2015-10-05 15:44:41.986797", "2015-10-05 15:59:23.389583", "2015-10-05 16:59:44.99402", 
    "2015-10-05 16:59:44.99402", "2015-10-05 16:59:44.99402", "2015-10-05 16:59:44.99402", 
    "2015-10-05 17:59:59.594524", "2015-10-05 17:59:59.594524", "2015-10-05 18:00:05.555564" 
) 

# transform time strings to POSIXct objects for count 
time <- ymd_hms(time) 

# count by second 
event <- data.frame(table(time)) 

# transform time factors to POSIXct objects for df 
event$time <- ymd_hms(event$time) 

# find start and end time for 15min sequence 
start <- round(min(event$time), "mins") 
if (min(event$time) < start) { 
    minute(start) <- minute(start) - 1 
} 
while (minute(start) %% 15 != 0) { 
    minute(start) <- minute(start) - 1 
} 

end <- round(max(event$time), "mins") 
if (max(event$time) > end) { 
    minute(end) <- minute(end) + 1 
} 
while (minute(end) %% 15 != 0) { 
    minute(end) <- minute(end) + 1 
} 

# create sequence and result data.frame 
ft.seq <- seq(start, end, "15 mins") 

ft.event <- data.frame(
    start = ft.seq[1:(length(ft.seq)-1)], 
    end = ft.seq[2:(length(ft.seq))], 
    sum = 0 
) 

# nested loop assigning each per-second count to its 15-minute slice
# (the slice start is inclusive, so an event exactly on a boundary is not dropped)
for (p1 in seq_len(nrow(ft.event))) {
    for (p2 in seq_len(nrow(event))) {
        if (event$time[p2] >= ft.event$start[p1] &&
            event$time[p2] < ft.event$end[p1]) {
            ft.event$sum[p1] <- ft.event$sum[p1] + event$Freq[p2]
        }
    }
}

# cumsum 
ft.event$cumsum <- cumsum(ft.event$sum) 

# example plot 
library(ggplot2) 

ggplot(ft.event) + 
    geom_line(aes(x = end, y = cumsum)) 
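
The nested loop in the answer could probably be replaced by a vectorized lookup with base R's `findInterval`. The sketch below is not the answerer's code: `event_time`, `freq`, and `bin_sum` are hypothetical names, the times are the answer's example data truncated to whole seconds, and UTC is assumed.

```r
# Per-second counts, as produced by table() in the answer (values assumed)
event_time <- as.POSIXct(c("2015-10-05 15:44:41", "2015-10-05 15:59:23",
                           "2015-10-05 16:59:44", "2015-10-05 17:59:59",
                           "2015-10-05 18:00:05"), tz = "UTC")
freq <- c(1, 1, 4, 2, 1)

# 15-minute slice boundaries covering the data
ft.seq <- seq(as.POSIXct("2015-10-05 15:30:00", tz = "UTC"),
              as.POSIXct("2015-10-05 18:15:00", tz = "UTC"), by = "15 min")

# Index of the slice each event falls into (slice start inclusive)
idx <- findInterval(as.numeric(event_time), as.numeric(ft.seq))

# Sum the counts per slice; slices without events stay at 0
bin_sum <- numeric(length(ft.seq) - 1)
agg <- tapply(freq, idx, sum)
bin_sum[as.integer(names(agg))] <- agg

cum <- cumsum(bin_sum)
```
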
+0

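Your answer is clear, but I don't have that "event" column, only one with the times. I.e., these events have no "value"; the only relevant data is the fact that they occurred at a particular time. –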

+0

It seems the OP needs to aggregate the number of events for each 15-minute timestamp – agenis
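
Aggregating a bare vector of event times into 15-minute counts, as described in the comment above, might look like the sketch below. It uses only base R (`cut`, `table`, `cumsum`). The `times` vector is the answer's example data truncated to whole seconds, and the names `width`, `breaks`, and `tally`, as well as the UTC time zone, are assumptions made for illustration.

```r
# Event times only, no value column (example data, assumed UTC)
times <- as.POSIXct(c(
    "2015-10-05 15:44:41", "2015-10-05 15:59:23", "2015-10-05 16:59:44",
    "2015-10-05 16:59:44", "2015-10-05 16:59:44", "2015-10-05 16:59:44",
    "2015-10-05 17:59:59", "2015-10-05 17:59:59", "2015-10-05 18:00:05"
), tz = "UTC")

# Quarter-hour boundaries from just below the first event to just above the last
width <- 15 * 60
secs <- as.numeric(times)
lo <- floor(min(secs) / width) * width
hi <- (floor(max(secs) / width) + 1) * width
breaks <- as.POSIXct(seq(lo, hi, by = width), origin = "1970-01-01", tz = "UTC")

# Assign each event to a bin [start, end) and count; empty bins are kept as 0
bin <- cut(times, breaks = breaks, right = FALSE)
counts <- as.vector(table(bin))

# Per-bin counts and the running total
tally <- data.frame(
    start = breaks[-length(breaks)],
    events = counts,
    cumulative = cumsum(counts)
)

plot(tally$start, tally$cumulative, type = "s",
     xlab = "time", ylab = "cumulative number of events")
```

Building the breaks explicitly, rather than passing `breaks = "15 min"` to `cut`, keeps the bins aligned to quarter-hours instead of to the first event's minute.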

+0

@JonathanAllard Does this work? – nevrome