2012-02-15 92 views
12

R: xts and data.table

I can convert a data.table to an xts object, just as I do with a data.frame:

> library(xts)
> library(data.table)
> df = data.frame(x = c("a", "b", "c", "d"), v = rnorm(4))
> dt = data.table(x = c("a", "b", "c", "d"), v = rnorm(4))
> xts(df, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v
2011-01-01 15:30:00 "a" "-1.2232283"
2011-01-02 15:30:00 "b" "-0.1654551"
2011-01-03 15:50:50 "c" "-0.4456202"
2011-01-04 15:30:00 "d" "-0.9416562"
> xts(dt, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v
2011-01-01 15:30:00 "a" " 1.3089579"
2011-01-02 15:30:00 "b" "-1.7681071"
2011-01-03 15:50:50 "c" "-1.4375100"
2011-01-04 15:30:00 "d" "-0.2467274"

Is there any issue with using data.table with xts?

+3

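There is no problem, but the fact that it was a data.table is lost: the data is converted to a matrix (inside the xts object). In your example it is even a character matrix. – 2012-02-15 14:41:51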

+0

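I thought xts kept a data.frame object in its internal implementation and simply added the time index as an attribute. So I am running indexed queries natively on the xts object rather than data.frame or data.table queries? – 2012-02-15 14:49:39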

+0

@RobertKubrick: Like its parent class (zoo), xts uses a matrix with an index attribute (not a data frame). – 2012-02-15 16:29:48
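
To make the point from the comments concrete, here is a minimal sketch. The sample values and the variable names (`idx`, `mixed`, `numeric_only`) are made up for illustration. It checks that the data held by an xts object is a plain matrix, and that one character column is enough to turn the whole matrix into character:

```r
library(xts)

# Illustrative data only: one character column and one numeric column.
idx <- as.POSIXct(c("2011-01-01 15:30:00", "2011-01-02 15:30:00"), tz = "UTC")
df <- data.frame(x = c("a", "b"), v = c(1.5, 2.5))

# Mixed column types: the core data is coerced to a character matrix.
mixed <- xts(df, order.by = idx)
is.matrix(coredata(mixed))
typeof(coredata(mixed))

# Numeric columns only: the core data stays a double matrix.
numeric_only <- xts(df["v"], order.by = idx)
typeof(coredata(numeric_only))
```

Dropping non-numeric columns before building the xts object is one way to keep the values usable for arithmetic.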

Answers

17

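Just to close an open question: as Vincent pointed out in the comments, there is no issue with this.

Conversion in both directions is included in data.table 1.9.5. The functions below are similar to what was added: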

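library(data.table) # provides setDT, setnames, setkeyv

as.data.table.xts <- function(x, keep.rownames = TRUE){
    # every condition must hold, so pass them as separate arguments (not joined with ||)
    stopifnot(requireNamespace("xts", quietly = TRUE), !missing(x), xts::is.xts(x))
    # as.data.frame() stores the index as row names, so it arrives as character
    r = setDT(as.data.frame(x), keep.rownames = keep.rownames)
    if(!keep.rownames) return(r[])
    setnames(r,"rn","index")
    setkeyv(r,"index")[]
}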

as.xts.data.table <- function(x){
    # the first column must hold the time index (POSIXct or Date)
    stopifnot(requireNamespace("xts", quietly = TRUE), !missing(x), is.data.table(x),
              any(class(x[[1]]) %in% c("POSIXct","Date")))
    colsNumeric = sapply(x, is.numeric)[-1] # exclude first col, xts index
    if(any(!colsNumeric)){
        warning(paste("Following columns are not numeric and will be omitted:",paste(names(colsNumeric)[!colsNumeric],collapse=", ")))
    }
    # keep only the numeric columns and use the index column as row names
    r = setDF(x[,.SD,.SDcols=names(colsNumeric)[colsNumeric]])
    rownames(r) <- x[[1]]
    xts::as.xts(r)
}
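
As a usage illustration, here is a minimal round-trip sketch. It assumes an installed data.table version that already ships these conversion methods, so it calls the built-in `as.data.table()` and `as.xts()` rather than the definitions above. The sample values and object names (`x`, `dt`, `x2`) are made up:

```r
library(xts)
library(data.table)

# Illustrative numeric xts object with two columns.
idx <- as.POSIXct(c("2011-01-01 15:30:00", "2011-01-02 15:30:00",
                    "2011-01-03 15:50:50"), tz = "UTC")
x <- xts(cbind(a = c(1, 2, 3), b = c(10, 20, 30)), order.by = idx)

# xts -> data.table: the time index becomes the first column, named "index".
dt <- as.data.table(x)
names(dt)
class(dt$index)

# data.table -> xts: the first column is used as the time index again.
x2 <- as.xts(dt)
all.equal(coredata(x2), coredata(x), check.attributes = FALSE)
```

Non-numeric columns do not survive the trip back to xts, so it is worth inspecting them before converting.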
+2

Nice, +1'd. Maybe Matt and Arun could pull this into data.table itself? – 2014-12-08 13:21:38

+0

The `as.data.table.xts` function converts the index to `character`, and the `as.xts.data.table` function does not allow xts objects that are not `numeric` (e.g. an all-`character` xts). – GSee 2014-12-08 13:46:11

+0

@DirkEddelbuettel Not sure, but I have added a reference back to here in the [#882](https://github.com/Rdatatable/data.table/issues/882) discussion... – 2014-12-08 22:54:12

7

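Because of quantmod, it is very common to have xts objects with the symbol embedded in every column name (e.g. "SPY.Open", "SPY.High", etc.). So here is an alternative to Jan's as.data.table.xts that puts the symbol in a separate column, which is more natural in data.tables (since you will likely combine several of them, e.g. with rbind, before doing any analysis).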

library(xts)        # index(), coredata(), tclass()
library(data.table)

as.data.table.xts <- function(x, ...) {
    cn <- colnames(x)
    sscn <- strsplit(cn, "\\.")
    tclass(x) <- c('POSIXct', 'POSIXt') # coerce index to POSIXct (indexClass<- in older xts)
    DT <- data.table(time=index(x), coredata(x))
    #DT <- data.table(IDateTime(index(x)), coredata(x))

    ## If there is a Symbol embedded in the colnames, strip it out and make it a
    ## column
    if (all(sapply(sscn, "[", 1) == sscn[[1]][1])) {
        Symbol <- sscn[[1]][1]
        # fixed = TRUE so the "." is matched literally, not as a regex wildcard
        setnames(DT, names(DT)[-1], sub(paste0(Symbol, "."), "", cn, fixed = TRUE))
        DT <- DT[, Symbol:=Symbol]
        setkey(DT, Symbol, time)[]
    } else {
        setkey(DT, time)[]
    }
}

library(quantmod) 
getSymbols("SPY") 
as.data.table(SPY) 
            time   Open   High    Low  Close   Volume Adjusted Symbol
   1: 2007-01-03 142.25 142.86 140.57 141.37 94807600   120.36    SPY
   2: 2007-01-04 141.23 142.05 140.61 141.67 69620600   120.61    SPY
   3: 2007-01-05 141.33 141.40 140.38 140.54 76645300   119.65    SPY
   4: 2007-01-08 140.82 141.41 140.25 141.19 71655000   120.20    SPY
   5: 2007-01-09 141.31 141.60 140.40 141.07 75680100   120.10    SPY
  ---
1993: 2014-12-01 206.30 206.60 205.38 205.64 12670100   205.64    SPY
1994: 2014-12-02 205.81 207.34 205.78 207.09 72105500   207.09    SPY
1995: 2014-12-03 207.30 208.15 207.10 207.89 69450000   207.89    SPY
1996: 2014-12-04 207.54 208.27 206.70 207.66 89928200   207.66    SPY
1997: 2014-12-05 207.87 208.47 207.55 208.00 85031000   208.00    SPY
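
The reason a separate Symbol column is convenient shows up once several instruments are stacked into one table. The sketch below avoids any download by using two small made-up xts objects with quantmod-style column names. The helper `xts_to_long` and all values are hypothetical and only mirror the idea of the function above:

```r
library(xts)
library(data.table)

# Hypothetical helper: one quantmod-style xts -> data.table with a Symbol column.
xts_to_long <- function(x) {
  # Take the symbol from the prefix of the first column name, e.g. "SPY.Open" -> "SPY".
  symbol <- strsplit(colnames(x)[1], ".", fixed = TRUE)[[1]][1]
  dt <- data.table(time = index(x), coredata(x))
  # Strip the "SPY." style prefix; fixed = TRUE treats the dot literally.
  setnames(dt, colnames(x),
           sub(paste0(symbol, "."), "", colnames(x), fixed = TRUE))
  dt[, Symbol := symbol]
  dt[]
}

# Made-up prices for two symbols over the same three days.
days <- as.Date("2014-12-01") + 0:2
spy <- xts(cbind(SPY.Open = c(206, 205, 207), SPY.Close = c(205, 207, 208)),
           order.by = days)
qqq <- xts(cbind(QQQ.Open = c(105, 104, 105), QQQ.Close = c(104, 105, 106)),
           order.by = days)

# Stack both tables; the shared column names line up because the prefix is gone.
prices <- rbindlist(lapply(list(spy, qqq), xts_to_long))
setkey(prices, Symbol, time)

# Grouped analysis across symbols becomes a single expression.
prices[, .(mean_close = mean(Close)), by = Symbol]
```

With the symbol left inside the column names, the two tables would have different columns and could not be stacked this directly.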