Resampling data does not work at all desired sampling rates

I have time series data sampled at a 1-minute rate:

library(xts) 
# create timestamps at a 1-minute sampling rate
timerange <- seq(as.POSIXct("2016-06-09"),as.POSIXct("2016-06-22 23:59:59"), by = "1 min") 
# create xts object 
data_xts <- xts(rnorm(length(timerange),200,5),timerange)

Now I want to resample (downsample) this data to a 50-minute rate. So I created a custom function:

resample_data_minutely_daywise <- function(data_xts,xminutes) { 
    day_data <- split.xts(data_xts,"days",k=1) # divide data daywise 
    # Now resample data according to parameter xminutes 
    day_list <- lapply(day_data, function(x) {
        ds_data <- period.apply(x, INDEX = endpoints(index(x), on = "minutes", k = xminutes), FUN = mean)
        align_data <- align.time(ds_data, xminutes * 60) # align to multiples of xminutes * 60 seconds
        return(align_data)
    })
    return(day_list) 
}

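This function takes the time series data and the desired sampling rate (in minutes) as input. It splits the data by day, and then resamples each day by averaging.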

Now, whenever I call this function

p <- resample_data_minutely_daywise(data_xts,50) 
sapply(p,length) # check no. of observations in each day

the output is:

sapply(p,length) # check no. of observations in each day 
[1] 30 30 30 29 29 30 30 30 29 29 30 30 30 29

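This shows that not every day contains the same number of readings: some days contain 29 observations and others contain 30. What could be the reason for this unexpected behavior? Note that whenever I resample at 10, 20, 30, or 60 minutes, every day contains the same number of readings. The problem only occurs when I try 50 minutes.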

Asked 2017-06-19 by Haroon Rashid

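50 does not divide evenly into a day, so some observations land in a window that starts on the next day, and you end up with one fewer.

@StevenMortimer But I split the data by day first, and only then try to aggregate at the 50-minute rate.

Look at 'p'. Not every one of your days starts at 00:00:00. The first observation of the next day is sometimes included as well. – CPak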
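The divisibility point raised in the comments can be checked with a line of arithmetic. The sketch below is only an illustration (the variable names `rates`, `minutes_per_day`, and `leftover` are made up here, not taken from the question): it computes how many minutes are left over when a 1440-minute day is cut into windows of each rate mentioned in the question.

```r
# Rates (in minutes) mentioned in the question
rates <- c(10, 20, 30, 50, 60)
minutes_per_day <- 24 * 60

# Minutes left over after filling the day with whole windows of each size
leftover <- minutes_per_day %% rates
names(leftover) <- rates
print(leftover)
# 10 20 30 50 60
#  0  0  0 40  0
```

Only 50 leaves a remainder, which is consistent with the question's observation that 10, 20, 30, and 60 minutes behave as expected.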

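Your problem is that period.apply() uses endpoints() to find where the breaks are, and the output of endpoints() is always offset from the UNIX epoch/origin (1970-01-01 00:00:00). But you want the breaks to be offset from midnight of the current day.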
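To make the epoch offset concrete, here is a base-R sketch of the idea. It is not the actual implementation of endpoints(); it simply assigns each 1-minute timestamp to a 50-minute bucket counted from the epoch and counts the distinct buckets per day. It assumes UTC timestamps (the question uses the local time zone), and the names `bucket_secs`, `first_day`, and `counts` are illustrative.

```r
bucket_secs <- 50 * 60  # 50-minute buckets, measured in seconds

# Assumption: UTC, so midnight is a whole number of days after the epoch
first_day <- as.numeric(as.POSIXct("2016-06-09", tz = "UTC"))

# For each of the 14 days, count how many epoch-anchored buckets the day touches
counts <- sapply(0:13, function(d) {
  secs <- first_day + d * 86400 + seq(0, by = 60, length.out = 1440)
  length(unique(secs %/% bucket_secs))
})
counts
# [1] 30 30 30 29 29 30 30 30 29 29 30 30 30 29
```

Because 86400 seconds is not a multiple of 3000 seconds, midnight falls at a different position inside the epoch-anchored grid each day, so the number of buckets a day touches alternates between 29 and 30.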

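You can still do this with period.apply(), but you need to calculate custom breakpoints. In your case, you can do that by finding the observations where the number of seconds since the start of the day is a multiple of xminutes.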

resample_data_minutely_daywise <- function(data_xts,xminutes) { 
    day_data <- split.xts(data_xts,"days",k=1) # divide data daywise 
    # Now resample data according to parameter xminutes 
    day_list <- lapply(day_data, function(x) {
        # seconds elapsed since the first observation of the day
        timeT <- .index(x) - .index(x)[1]
        # when does timeT cross a multiple of xminutes?
        ep <- which(timeT %% (xminutes * 60) <= 0)
        # endpoints must start with zero and end with nrow
        ep <- c(0, ep, nrow(x))
        # ...and be unique
        ep <- unique(ep)
        ds_data <- period.apply(x, INDEX = ep, FUN = mean)
        align_data <- align.time(ds_data, xminutes * 60) # align to multiples of xminutes * 60 seconds
        return(align_data)
    })
    return(day_list) 
} 
p <- resample_data_minutely_daywise(data_xts,50) 
sapply(p,length) 
# [1] 30 30 30 30 30 30 30 30 30 30 30 30 30 30
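To see how these custom endpoints partition a single day, the following base-R sketch recomputes `ep` without xts. It assumes one complete day of 1-minute data, so the vector `timeT` below stands in for `.index(x) - .index(x)[1]`; `group_sizes` is a name introduced here for illustration.

```r
xminutes <- 50

# Stand-in for .index(x) - .index(x)[1]: seconds since the day's first observation
timeT <- seq(0, by = 60, length.out = 1440)

# Same endpoint logic as in the function above
ep <- which(timeT %% (xminutes * 60) <= 0)
ep <- unique(c(0, ep, length(timeT)))

# period.apply() aggregates rows (ep[i] + 1):ep[i + 1], so diff() gives the group sizes
group_sizes <- diff(ep)
length(group_sizes)
# [1] 30
table(group_sizes)
```

Under these assumptions the first group holds only the observation at the start of the day, followed by 28 groups of 50 observations and a final group of 39, which is where the count of 30 per day comes from.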

Answered 2017-06-19 22:18:33

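After applying 'period.apply', I lose the timestamps. I tried converting the 'ep' indices to a time index, but then 'period.apply' fails with "Error in '[.xts'(x, (INDEX[y] + 1):INDEX[y + 1]) : subscript out of bounds". I need the timestamps in 'align_data'.

@HaroonRashid: See my edit. If that does not solve your problem, please provide an example that reproduces what you describe, because I have no problem with the example in your question.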
