2013-04-11

I want to group all home sale dates in San Francisco by year. With the code below, I get an error from the `cut` function:

geo_big$month <- as.Date(paste0(strftime(geo_big$date, format = "%Y-%m"), "-01")) 

geo_big$date_r <- cut(geo_big$month, breaks = as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01", "2007-01-01", "2008-11-01")), include.lowest = TRUE, labels = as.Date(c("2003-01 - 2004-12", "2004-01 - 2004-12", "2005-01 - 2005-12", "2006-01 - 2006-12", "2007-01 - 2007-12", "2008-01 - 2008-11"))) 

and I get this message:

Error in charToDate(x) : 
    character string is not in a standard unambiguous format 

Does anyone know what is going on here?

What format is `geo_big$date` stored in? – mnel 2013-04-11 00:19:41

`as.Date(strptime(geo_big$date, "%Y-%m-%d"))` – sue143 2013-04-11 00:29:39

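One thing that looks suspicious is the `labels` argument. It should be a character vector rather than dates. Another thing that looks problematic (after reading `help(cut.Date)`) is the `breaks` argument. Testing with a sequence of Date values returns an error for me. – 2013-04-11 00:53:51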

Answer

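The error you are given should indicate that the problem is not `cut` but `as.Date`. (It is complaining that it cannot determine the format of the date.)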
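To see that `cut` is not involved, the failing piece can be run on its own. This is a minimal sketch that passes one of the label strings from the question to `as.Date` and captures the error message; the variable names are only illustrative.

```r
# The label text is a range description, not a date, so as.Date() cannot parse it.
bad_label <- "2003-01 - 2004-12"

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  as.Date(bad_label),
  error = function(e) conditionMessage(e)
)
print(msg)

# A string in a standard format such as YYYY-MM-DD parses without complaint.
ok <- as.Date("2003-04-01")
print(ok)
```

The captured message should be the same one reported in the question, which points at the `as.Date(...)` wrapper rather than at `cut` itself.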

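More specifically, it is the labels that are tripping you up. There is no need to wrap them in `as.Date`.

The labels should be a character vector; `c(.)` and quotes are enough.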
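As a rough illustration of that fix, the sketch below passes plain character labels to `cut`. The `month` vector is hypothetical toy data standing in for `geo_big$month`. One further assumption to flag: `cut` expects one label per interval, so with six labels the breaks need seven entries; the `2008-01-01` break here was added for that reason and is not in the question's code.

```r
# Hypothetical toy data standing in for geo_big$month (first-of-month dates).
month <- as.Date(c("2003-04-01", "2004-06-01", "2005-01-01",
                   "2006-12-01", "2007-03-01", "2008-11-01"))

# Seven breaks define six intervals ("2008-01-01" is an assumed addition).
bks <- as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01",
                 "2007-01-01", "2008-01-01", "2008-11-01"))

# Labels are plain character strings, one per interval; no as.Date() wrapper.
lbls <- c("2003-04 - 2003-12", "2004-01 - 2004-12", "2005-01 - 2005-12",
          "2006-01 - 2006-12", "2007-01 - 2007-12", "2008-01 - 2008-11")

date_r <- cut(month, breaks = bks, include.lowest = TRUE, labels = lbls)
print(table(date_r))
```

For dates, `cut` uses left-closed intervals by default, so `include.lowest = TRUE` is what lets the final break date itself fall into the last group.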


As a bit of an aside, the code above could be cleaned up in a few places.
Also, the lubridate package may be very useful to you.

# instead of: 
geo_big$month <- as.Date(paste0(strftime(geo_big$date, format = "%Y-%m"), "-01")) 

# you can use `floor_date`: 
library(lubridate) 
geo_big$month <- floor_date(geo_big$date, "month") # from the `lubridate` pkg 


# instead of: 
# ... a giant cut statement ...

# use variables for ease of reading and debugging 

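# dmin and dmax are not defined in this snippet; assumed here to be the
# earliest and latest months in the data
dmin <- min(geo_big$month)
dmax <- max(geo_big$month)

# bks <- as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01", "2007-01-01", "2008-11-01")) 
# or: 
bks <- c(dmin, seq.Date(ceiling_date(dmin, "year"), floor_date(dmax, "year"), by="year"), dmax) # still using library(lubridate) 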

# basing your labels on your breaks helps guard against human error & typos 
lbls <- head(floor_date(bks, "year"), -1) # dropping the last one, and adding dmax 
lbls <- paste(substr(lbls, 1, 7), substr(c(lbls[-1] - 1, dmax), 1, 7), sep=" - ") 

# a cleaner, more readable `cut` statement 
cut(geo_big$month, breaks=bks, include.lowest=TRUE, labels=lbls)
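For readers without lubridate, the same idea (derive the breaks from the data, then derive the labels from the breaks) can be sketched in base R. This is an alternative under stated assumptions, not the code above: `dates` is hypothetical toy data standing in for `geo_big$date`, `dmin` and `dmax` are taken to be the earliest and latest months, and the first label starts at the first observed month rather than at January of that year.

```r
# Hypothetical toy data standing in for geo_big$date.
dates <- as.Date(c("2003-04-15", "2004-06-30", "2005-01-01",
                   "2006-12-24", "2007-03-08", "2008-11-20"))

# First day of each date's month (base-R counterpart of floor_date(x, "month")).
month <- as.Date(format(dates, "%Y-%m-01"))

# Assumed meaning of dmin / dmax: earliest and latest month in the data.
dmin <- min(month)
dmax <- max(month)

# Yearly breaks: every 1 January strictly between dmin and dmax, plus both ends.
years <- as.integer(format(c(dmin, dmax), "%Y"))
jan1  <- as.Date(paste0(seq(years[1], years[2]), "-01-01"))
bks   <- sort(unique(c(dmin, jan1[jan1 > dmin & jan1 < dmax], dmax)))

# Labels built from the breaks: each interval runs from its start month to the
# month before the next break; the final interval ends at dmax.
starts <- head(bks, -1)
ends   <- c(bks[-c(1, length(bks))] - 1, dmax)
lbls   <- paste(format(starts, "%Y-%m"), format(ends, "%Y-%m"), sep = " - ")

date_r <- cut(month, breaks = bks, include.lowest = TRUE, labels = lbls)
print(table(date_r))
```

Because the labels are computed from `bks`, they always match the number of intervals, which avoids the kind of hand-typed mismatch seen in the question. The sketch assumes the data span more than one month; a single-month dataset would collapse to one break and needs separate handling.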