2015-11-07 99 views

How do I read a nested JSON structure in R?

I have some JSON that looks like this:

{"total_rows":141,"offset":0,"rows":[
{"id":"1","key":"a","value":{"SP$Sale_Price":"240000","CONTRACTDATE$Contract_Date":"2006-10-26T05:00:00"}}, 
{"id":"2","key":"b","value":{"SP$Sale_Price":"2000000","CONTRACTDATE$Contract_Date":"2006-08-22T05:00:00"}}, 
{"id":"3","key":"c","value":{"SP$Sale_Price":"780000","CONTRACTDATE$Contract_Date":"2007-01-18T06:00:00"}}, 
... 

In R, what would be the easiest way to produce a scatter plot of SP$Sale_Price against CONTRACTDATE$Contract_Date?

This is as far as I have got:

install.packages("rjson") 
library("rjson") 
json_file <- "http://localhost:5984/testdb/_design/sold/_view/sold?limit=100" 
json_data <- fromJSON(file=json_file) 
install.packages("plyr") 
library(plyr) 
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame)) 

But now I'm stuck...

> plot(CONTRACTDATE$Contract_Date, SP$Sale_Price) 
Error in plot(CONTRACTDATE$Contract_Date, SP$Sale_Price) : 
    object 'CONTRACTDATE' not found 

How can I make this work?


Error messages don't get much clearer than "object 'CONTRACTDATE' not found" – rawr

Answers

2

Suppose you have the following JSON:

txt <- '{"total_rows":141,"offset":0,"rows":[
    {"id":"1","key":"a","value":{"SP$Sale_Price":"240000","CONTRACTDATE$Contract_Date":"2006-10-26T05:00:00"}},
    {"id":"2","key":"b","value":{"SP$Sale_Price":"2000000","CONTRACTDATE$Contract_Date":"2006-08-22T05:00:00"}},
    {"id":"3","key":"c","value":{"SP$Sale_Price":"780000","CONTRACTDATE$Contract_Date":"2007-01-18T06:00:00"}}]}'

Then you can read it with the jsonlite package as follows:

library(jsonlite)
json_data <- fromJSON(txt, flatten = TRUE)

# get the needed dataframe
dat <- json_data$rows
# set convenient names for the columns
# this step is optional, it just gives you nicer column names
names(dat) <- c("id","key","sale_price","contract_date")
# the prices are read as strings, so convert them to numbers
dat$sale_price <- as.numeric(dat$sale_price)
# convert the 'contract_date' column to a datetime format
dat$contract_date <- strptime(dat$contract_date, format="%Y-%m-%dT%H:%M:%S", tz="GMT")

Now you can plot:

plot(dat$contract_date, dat$sale_price)

which gives:

[plot of sale price against contract date]
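The date conversion in this answer can be checked on its own before plotting. The following is a minimal sketch in base R, assuming two timestamps copied from the sample data; the variable names `stamps` and `parsed` are made up for illustration. Wrapping the `strptime` result in `as.POSIXct` is an optional extra step, not part of the answer, that gives a compact date-time vector which is convenient to store in a data frame column.

```r
# two timestamps copied from the sample JSON (illustrative subset)
stamps <- c("2006-10-26T05:00:00", "2006-08-22T05:00:00")

# parse with the same format string as in the answer;
# as.POSIXct() turns the POSIXlt result into a plain date-time vector
parsed <- as.POSIXct(strptime(stamps, format = "%Y-%m-%dT%H:%M:%S", tz = "GMT"))

# a string that does not match the format yields NA, so this is a quick sanity check
all_parsed <- !anyNA(parsed)
print(parsed)
```

If `all_parsed` is FALSE on real data, some timestamps do not match the format string and would be silently dropped from the plot.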


If you choose not to flatten the JSON, you can do this:

json_data <- fromJSON(txt) 

dat <- json_data$rows$value 

sp <- strtoi(dat$`SP$Sale_Price`) 
cd <- strptime(dat$`CONTRACTDATE$Contract_Date`, format="%Y-%m-%dT%H:%M:%S", tz="GMT") 
plot(cd,sp) 

which gives the same plot:

[plot of sale price against contract date]


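Great, thanks! I'm a little disappointed that flattening discards the field names. I have 700 attributes in my real JSON data. Is there a way to preserve the attribute names? –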


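@AlexR Flattening does not discard the field names. Look at the structure of 'json_data'. You will see that 'value.' is prepended to the field names. The step that sets the names of 'dat' is not necessary and is therefore optional. See the update. – Jaap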
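The point made in this comment can be sketched as follows, assuming the jsonlite package is installed and using a shortened two-row copy of the sample JSON. The names `flat`, `flat_names` and `price` are made up for illustration. It shows the column names that `flatten = TRUE` produces and how a column can be used without renaming it.

```r
library(jsonlite)

# shortened copy of the sample data from the question (two rows only)
txt <- '{"total_rows":141,"offset":0,"rows":[
  {"id":"1","key":"a","value":{"SP$Sale_Price":"240000","CONTRACTDATE$Contract_Date":"2006-10-26T05:00:00"}},
  {"id":"2","key":"b","value":{"SP$Sale_Price":"2000000","CONTRACTDATE$Contract_Date":"2006-08-22T05:00:00"}}]}'

flat <- fromJSON(txt, flatten = TRUE)$rows

# the nested fields keep their names, prefixed with "value."
flat_names <- names(flat)
print(flat_names)

# a column can be used without renaming; [[ ]] sidesteps the "$" in the name
price <- as.numeric(flat[["value.SP$Sale_Price"]])
```

With many attributes, keeping the generated names and indexing with `[[ ]]` avoids having to maintain a long vector of replacement names by hand.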

0

I found a way that does not discard the field names:

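install.packages("jsonlite")
install.packages("curl")   # used by jsonlite to fetch URLs
library(jsonlite)
# json_file is the view URL defined in the question
json <- fromJSON(json_file)
r <- json$rows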

At this point r looks like this:

> class(r) 
[1] "data.frame" 
> colnames(r) 
[1] "id" "key" "value" 

After some more googling and trial and error I landed on this:

f <- r$value 
sp <- strtoi(f[["SP$Sale_Price"]]) 
cd <- strptime(f[["CONTRACTDATE$Contract_Date"]], format="%Y-%m-%dT%H:%M:%S", tz="GMT") 
plot(cd,sp) 
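The `[[ ]]` indexing matters here because the field names contain `$`. Written without quoting, `CONTRACTDATE$Contract_Date` is parsed by R as extracting `Contract_Date` from an object called `CONTRACTDATE`, which is why the plot call in the question reported that the object was not found. A minimal base R sketch, assuming a hand-built data frame that mimics `r$value` (the names `by_brackets` and `by_backticks` are made up for illustration):

```r
# a small data frame whose column names contain "$", mimicking r$value
# (values copied from the first two sample rows)
f <- data.frame(
  "SP$Sale_Price" = c("240000", "2000000"),
  "CONTRACTDATE$Contract_Date" = c("2006-10-26T05:00:00", "2006-08-22T05:00:00"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# two equivalent ways to reach a column whose name contains "$"
by_brackets  <- f[["SP$Sale_Price"]]
by_backticks <- f$`SP$Sale_Price`

# convert the strings to integers, as in the answer
sp <- strtoi(by_brackets)
```

Both forms return the same character vector; which one to use is a matter of taste, although `[[ ]]` also works when the column name is held in a variable.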

And the result on my whole data set...

[plot of sale price against contract date for the whole data set]