2017-06-06 56 views

Parsing JSON to CSV using Python: AttributeError: 'unicode' object has no attribute 'keys'

I have a nested JSON dataset containing multiple entries that look like this:

{ 
"coordinates": null, 
"acoustic_features": { 
    "instrumentalness": "0.00479", 
    "liveness": "0.18", 
    "speechiness": "0.0294", 
    "danceability": "0.634", 
    "valence": "0.342", 
    "loudness": "-8.345", 
    "tempo": "125.044", 
    "acousticness": "0.00035", 
    "energy": "0.697", 
    "mode": "1", 
    "key": "6" 
}, 
"artist_id": "b2980c722a1ace7a30303718ce5491d8", 
"place": null, 
"geo": null, 
"tweet_lang": "en", 
"source": "Share.Radionomy.com", 
"track_title": "8eeZ", 
"track_id": "cd52b3e5b51da29e5893dba82a418a4b", 
"artist_name": "Dominion", 
"entities": { 
    "hashtags": [{ 
     "text": "nowplaying", 
     "indices": [0, 11] 
    }, { 
     "text": "goth", 
     "indices": [51, 56] 
    }, { 
     "text": "deathrock", 
     "indices": [57, 67] 
    }, { 
     "text": "postpunk", 
     "indices": [68, 77] 
    }], 
    "symbols": [], 
    "user_mentions": [], 
    "urls": [{ 
     "indices": [28, 50], 
     "expanded_url": "cathedral13.com/blog13", 
     "display_url": "cathedral13.com/blog13", 
     "url": "t.co/Tatf4hEVkv" 
    }] 
}, 
"created_at": "2014-01-01 05:54:21", 
"text": "#nowplaying Dominion - 8eeZ Tatf4hEVkv #goth #deathrock #postpunk", 
"user": { 
    "location": "middle of nowhere", 
    "lang": "en", 
    "time_zone": "Central Time (US & Canada)", 
    "name": "Cathedral 13", 
    "entities": null, 
    "id": 81496937, 
    "description": "I\u2019m a music junkie who is currently responsible for Cathedral 13 internet radio (goth, deathrock, post-punk)which has been online since 06/20/02." 
}, 
"id": 418243774842929150 
} 

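I want to convert it to a CSV file with multiple columns that contain the corresponding entries of each JSON object. Here is the Python code I wrote to do this: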

import json 
import csv 
from pprint import pprint 
data = [] 
with open('data_subset.json') as data_file:
    for line in data_file:
        data.append(json.loads(line))

# open a file for writing 
data_csv = open('Data_csv.csv', 'w') 
# create the csv writer object 
csvwriter = csv.writer(data_csv) 

for i in range(1,10):
    count = 0
    for dat in data[i]:
        if count == 0:
            header = dat.keys()
            csvwriter.writerow(header)
            count += 1
        csvwriter.writerow(emp.values())
data_csv.close() 

When I run the code above, I get the error: AttributeError: 'unicode' object has no attribute 'keys'. What could be the problem?

Answers


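The error occurs because data[i] is a dict, and iterating over a dict yields its keys, which are strings (unicode in Python 2), so dat.keys() fails. You can read the whole JSON file at once like this: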

with open('a.txt') as data_file:  
    data = json.load(data_file) 

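Now the parsed JSON is available in data (a dict for a single object, or a list if the file holds a JSON array of entries).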
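
The reading loop in the question parses one JSON object per line, which suggests the file is line-delimited rather than a single JSON document; in that case json.load on the whole file would raise an error. A minimal sketch of a reader for that layout, assuming one object per line (the function name load_entries is hypothetical):

```python
import json


def load_entries(path):
    """Read a file with one JSON object per line into a list of dicts.

    Assumes a line-delimited layout; this is an illustration, not part
    of the original answer.
    """
    entries = []
    with open(path) as data_file:
        for line in data_file:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries
```

Each element of the returned list is a dict, so entry['artist_id'] works directly. Iterating over an entry itself yields its keys as strings, which is what triggered the AttributeError in the question.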

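Since you want only specific entries from the JSON in the CSV (for example, entities is not saved to the CSV), you can keep a custom list of column headers and then loop over all the data, writing the specific keys to the CSV: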

# Example to save the artist_id and user id; can be extended for the actual data 
header = ['artist_id', 'id'] 

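# open a file for writing ('wb' is the Python 2 csv convention;
# on Python 3 use open('Data_csv.csv', 'w', newline='') instead)
data_csv = open('Data_csv.csv', 'wb')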

# create the csv writer object 
csvwriter = csv.writer(data_csv) 

# write the csv header 
csvwriter.writerow(header) 

for entry in data: 
    csvwriter.writerow([entry['artist_id'], entry['user']['id']]) 

data_csv.close() 
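
To get one column per field, including the nested acoustic_features and user values, each entry can be flattened into a single-level dict before writing. The following is a sketch for Python 3, not the answer's own code: flatten_entry and write_csv are hypothetical helpers, and both the user_ prefix (used to avoid a clash with the top-level id) and joining the hashtag texts with ';' are assumptions about the desired layout.

```python
import csv


def flatten_entry(entry):
    """Flatten one record into a single-level dict (assumed layout)."""
    row = {}
    for key, value in entry.items():
        if key == "acoustic_features":
            # Promote the acoustic features to top-level columns.
            row.update(value or {})
        elif key == "user":
            # Prefix user fields so user['id'] does not overwrite 'id'.
            for sub_key, sub_value in (value or {}).items():
                row["user_" + sub_key] = sub_value
        elif key == "entities":
            # A variable number of hashtags collapses into one cell.
            hashtags = (value or {}).get("hashtags", [])
            row["hashtags"] = ";".join(tag["text"] for tag in hashtags)
        else:
            row[key] = value
    return row


def write_csv(entries, path):
    """Write flattened entries to path, one row per entry."""
    rows = [flatten_entry(entry) for entry in entries]
    # Collect column names in first-seen order across all rows.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    # newline='' is what the csv module expects for files on Python 3.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

Using csv.DictWriter keeps the columns aligned even when an entry lacks a field: the missing cell is written as an empty string via restval.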

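The actual JSON file has 10,000 entries in the format given above, so I think I need to iterate over the JSON objects and store them in an array. I want the CSV file to have the following columns: {coordinates, instrumentalness, liveness, speechiness, danceability, valence, loudness, tempo, acousticness, energy, mode, key, artist_id, place, geo, tweet_lang, source, track_title, track_id, artist_name, created_at, text, location, lang, time_zone, name, entities, id, description}. In addition, entities, which consists of hashtags, can have a variable number of text and indices fields. –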


@AsmitaPoddar, I have updated the answer based on your input. You can add the other fields from the JSON to write them to the CSV as well. – yeniv


Thank you very much for your help. It worked. –
