2017-09-01 177 views
5

Converting a netCDF time variable to an R date object

I have a time-series netCDF file whose time variable has the following typical metadata:

double time(time) ; 
      time:standard_name = "time" ; 
      time:bounds = "time_bnds" ; 
      time:units = "days since 1979-1-1 00:00:00" ; 
      time:calendar = "standard" ; 
      time:axis = "T" ; 

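Inside R I want to convert the time into an R date object. At the moment I do this in a hard-wired way: I read the units attribute, split the string, and use the third entry as my origin (thus assuming that the interval is "days", that the time is 00:00, and so on):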

require("ncdf4") 
f1<-nc_open("file.nc") 
time<-ncvar_get(f1,"time") 
tunits<-ncatt_get(f1,"time",attname="units") 
tustr<-strsplit(tunits$value, " ") 
dates<-as.Date(time,origin=unlist(tustr)[3]) 

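This hard-wired solution works for my specific example, but I was hoping there might be an R package that handles the UNIDATA netCDF conventions for time units properly and converts them safely to an R date object?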
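The hard-wired approach described above can be loosened a little in base R. The following is only a minimal sketch, assuming a units string of the form "<unit> since <date> [<time>]", a fixed-length unit, and a standard calendar; `parse_cf_time()` is a hypothetical helper name, not part of ncdf4.

```r
# Sketch: parse a CF-style units string and convert offsets to POSIXct.
# parse_cf_time() is a hypothetical helper, not a function from ncdf4.
parse_cf_time <- function(offsets, units_string, tz = "UTC") {
    parts <- strsplit(trimws(units_string), "\\s+")[[1]]
    if (length(parts) < 3 || tolower(parts[2]) != "since")
        stop("Expected units of the form '<unit> since <date> [<time>]'")
    # Seconds per unit; only fixed-length units are supported in this sketch
    secs_per_unit <- c(seconds = 1, second = 1, minutes = 60, minute = 60,
                       hours = 3600, hour = 3600, days = 86400, day = 86400)
    unit <- tolower(parts[1])
    if (!unit %in% names(secs_per_unit))
        stop("Unsupported time unit: ", parts[1])
    # Use the time of day if present, otherwise assume midnight
    clock <- if (length(parts) >= 4) parts[4] else "00:00:00"
    origin <- as.POSIXct(paste(parts[3], clock), tz = tz,
                         format = "%Y-%m-%d %H:%M:%S")
    if (is.na(origin)) stop("Could not parse origin: ", parts[3], " ", clock)
    origin + offsets * secs_per_unit[[unit]]
}

dates <- parse_cf_time(c(0, 1, 31), "days since 1979-1-1 00:00:00")
format(dates, "%Y-%m-%d")
```

With ncdf4, `offsets` would be the result of `ncvar_get(f1, "time")` and `units_string` the `value` element returned by `ncatt_get(f1, "time", attname="units")`. The sketch ignores the calendar attribute entirely.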

+0

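Note that the newly proposed, currently in development and awesome 'stars' package will handle dates automatically; see the example in the first blog post: http://r-spatial.org/r/2017/11/23/stars1.html – AF7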

+0

Ah, I forgot to add that the 'units' package seems to handle dates gracefully. Worth a try. – AF7

+0

See the edited example in my answer – AF7

Answers

2

Not that I know of. I have this handy function using lubridate, which does basically the same thing as yours.

getNcTime <- function(nc) {
    require(ncdf4)
    require(lubridate)
    ncdims <- names(nc$dim) # get netCDF dimension names
    timevar <- ncdims[which(ncdims %in% c("time", "Time", "datetime", "Datetime", "date", "Date"))[1]] # find time variable
    if (is.na(timevar)) stop("ERROR! Could not identify the correct time variable") # check before reading
    times <- ncvar_get(nc, timevar)
    timeatt <- ncatt_get(nc, timevar) # get attributes
    timedef <- strsplit(timeatt$units, " ")[[1]] # e.g. "days" "since" "1979-1-1" "00:00:00"
    timeunit <- timedef[1]
    tz <- timedef[5]
    timestart <- suppressWarnings(as.numeric(strsplit(timedef[4], ":")[[1]])) # compare as numbers, not strings
    if (length(timestart) != 3 || anyNA(timestart) || any(timestart < 0) ||
        timestart[1] > 24 || timestart[2] > 60 || timestart[3] > 60) {
     warning(paste(timedef[4], "is not a valid start time. Assuming 00:00:00"))
     timedef[4] <- "00:00:00"
    }
    if (! tz %in% OlsonNames()) {
     warning(paste(tz, "is not a valid timezone. Assuming UTC"))
     tz <- "UTC"
    }
    timestart <- ymd_hms(paste(timedef[3], timedef[4]), tz=tz)
    f <- switch(tolower(timeunit), # find the correct lubridate time function based on the unit
     seconds=seconds, second=seconds, sec=seconds,
     minutes=minutes, minute=minutes, min=minutes,
     hours=hours,  hour=hours,  h=hours,
     days=days,  day=days,  d=days,
     months=months, month=months, m=months,
     years=years,  year=years,  yr=years,
     NULL # unknown unit
    )
    if (is.null(f)) stop("Could not understand the time unit format")
    timestart + f(times)
}

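EDIT: you might also want to take a look at ncdf4.helpers::nc.get.time.series

EDIT2: note that the newly proposed, currently in development and awesome stars package will handle dates automatically; see the first blog post for an example.

EDIT3: another way is to use the units package directly, which is what stars uses. One could do something like this (it still does not handle calendars correctly, and I am not sure units can):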

getNcTime <- function(nc) { ## NEW VERSION, with the units package
    require(units)
    require(ncdf4)
    options(warn=1) # print warnings as they occur
    if (is.character(nc)) nc <- nc_open(nc) # accept a file path as well as an open file
    ncdims <- names(nc$dim) # get netCDF dimension names
    timevar <- ncdims[which(ncdims %in% c("time", "Time", "datetime", "Datetime", "date", "Date"))] # find time variable(s)
    if (length(timevar) > 1) {
     warning(paste("Found more than one time var. Using the first:", timevar[1]))
     timevar <- timevar[1]
    }
    if (length(timevar)!=1) stop("ERROR! Could not identify the correct time variable")
    times <- ncvar_get(nc, timevar) # get time data
    timeatt <- ncatt_get(nc, timevar) # get attributes
    timeunit <- timeatt$units # e.g. "days since 1979-1-1 00:00:00"
    units(times) <- as_units(timeunit) # as_units() supersedes make_unit() in newer versions of units
    as.POSIXct(times)
}
+1

Note: neither AF7's function nor SnowFrog's function handles the 'calendar = 365_day' attribute correctly, whereas 'ncdf4.helpers::nc.get.time.series' does work with 365-day calendars! – tbc
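To illustrate why the calendar attribute matters, here is a minimal base-R sketch, assuming "days since" units, whole-day offsets, and a 365_day (noleap) calendar in which every year has exactly 365 days. `noleap_dates()` is a hypothetical helper written for this illustration; it is not how ncdf4.helpers does the conversion.

```r
# Sketch: convert "days since <origin>" offsets under a 365_day (noleap) calendar.
# noleap_dates() is a hypothetical helper; it returns character labels because
# a noleap time axis does not map one-to-one onto real calendar days.
noleap_dates <- function(offsets, origin = "1979-01-01") {
    month_len <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
    month_start <- cumsum(c(0, month_len[-12]))   # days before each month
    o <- as.integer(strsplit(origin, "-")[[1]])   # year, month, day
    origin_doy <- month_start[o[2]] + o[3] - 1    # zero-based day of year
    total <- origin_doy + floor(offsets)          # whole days only in this sketch
    year <- o[1] + total %/% 365
    doy <- total %% 365
    month <- findInterval(doy, month_start)
    day <- doy - month_start[month] + 1
    sprintf("%04d-%02d-%02d", as.integer(year), as.integer(month), as.integer(day))
}

offsets <- c(0, 59, 365 * 2)
noleap <- noleap_dates(offsets)                               # 365_day interpretation
standard <- format(as.Date(offsets, origin = "1979-01-01"))   # real calendar
data.frame(offsets, noleap, standard)
```

The two columns agree until the first leap day (29 February 1980) and then drift apart by one day per leap year, which is the error a calendar-unaware conversion would introduce.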

2

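I couldn't get @AF7's function to work with my file, so I wrote my own. The function below creates a vector of POSIXct dates, reading the start date, time interval, unit and length from the nc file. It works with nc files of many (but probably not every...) shapes and forms.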

ncdate <- function(nc) {
    ncdims <- names(nc$dim) # Extract dimension names
    timevar <- ncdims[which(ncdims %in% c("time", "Time", "datetime", "Datetime",
              "date", "Date"))[1]] # Pick the time dimension
    ntstep <- nc$dim[[timevar]]$len
    t <- ncvar_get(nc, timevar) # Extract the time values
    tunits <- ncatt_get(nc, timevar, "units") # Extract the units string, e.g. "days since 1979-1-1 00:00:00"
    tspace <- t[2] - t[1] # Calculate time period between two timesteps, for the "by" argument
    tstr <- strsplit(tunits$value, " ") # Extract string components of the time unit
    a <- unlist(tstr[1]) # Isolate the unit, i.e. seconds, hours, days etc.
    uname <- a[which(a %in% c("seconds","hours","days"))[1]] # Check unit
    if (is.na(uname)) stop("Unsupported time unit in: ", tunits$value)
    ## Extract the start/origin date (parsed in the session's local time zone)
    startd <- as.POSIXct(gsub(paste(uname,'since '),'',tunits$value),format="%Y-%m-%d %H:%M:%S")
    tmulti <- 3600 # Declare hourly multiplier for date
    if (uname == "days") tmulti <- 86400 # Declare daily multiplier for date
    ## Rename "seconds" to "secs" for "by" argument and change the multiplier.
    if (uname == "seconds") {
     uname <- "secs"
     tmulti <- 1 }
    byt <- paste(tspace,uname) # Define the "by" argument
    if (byt == "0.0416666679084301 days") { ## If the unit is "days" but the "by" interval is in hours
     byt <- "1 hour" }     ## R won't understand "by < 1" so change "by" to one hour.
    datev <- seq(from=startd+t[1]*tmulti, by=byt, length.out=ntstep)
    datev # Return the date vector visibly
}
+0

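Thanks very much. I borrowed some of AF7's code ideas and merged them into my R script. I wonder whether a function like this could be contributed to the ncdf4 package itself? It would be great to have something like this built in as standard. –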

+0

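Note that this only works for regularly spaced times, which is not necessarily the case for every NetCDF file. Why didn't my function work for you? I'll try to make it more general. – AF7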
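The point about regular spacing can be seen in a small base-R sketch. The time values below are hypothetical "days since 1979-01-01" offsets with gaps; converting each value individually keeps the gaps, whereas a sequence built from the first step does not.

```r
# Sketch: convert each offset individually instead of building a regular sequence.
origin <- as.POSIXct("1979-01-01 00:00:00", tz = "UTC")
t <- c(0, 1, 2, 5, 10)           # hypothetical, irregularly spaced "days since" values
tmulti <- 86400                  # seconds per day
datev <- origin + t * tmulti     # one date per time value, gaps preserved
format(datev, "%Y-%m-%d")

# A regular sequence built from the first step mislabels the later steps:
regular <- seq(from = origin, by = "1 day", length.out = length(t))
format(regular, "%Y-%m-%d")
```

Adding the scaled offsets to the origin also avoids the need to special-case fractional steps such as one hour expressed in days.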

+0

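@AF7 OK, my times do repeat at regular intervals. I got an error message towards the end (for 'f', I think). I'll post the error message when I'm back at my computer. – SnowFrog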