2017-08-10 120 views

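Getting keys and values from a nested array in JSON

First post! I am converting JSON data (dictionaries) from a server into a csv file. All keys and values come through fine except for the nested "Astronauts" entry, which is an array. Basically, each individual JSON string is one datum that can contain anywhere from 0 to an unlimited number of astronauts, and I would like to extract their characteristics as independent values. For example, something like this: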

  • Astronaut1_Spaceships_First: Katabom
  • Astronaut1_Spaceships_Second: The Kraken
  • Astronaut1_Name: Jebeddiah
  • (...)
  • Astronaut2_Gender: Hopefully female

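And so on. The problem is that the nested part is an array rather than a dictionary, so I don't know how to handle it. I have tried the dpath library as well as flattening the nested structure, but nothing changed. Any ideas?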

import json 
import os 
import csv 
import datetime 
import dpath.util #Dpath library needs to be installed first 

datum = {"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]} 

#Parsing process
parsed = json.loads(datum) #datum is the JSON string retrieved from the server

def flattenjson(parsed, delim):
    val = {}
    for i in parsed.keys():
        if isinstance(parsed[i], dict):
            get = flattenjson(parsed[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = parsed[i]

    return val
flattened = flattenjson(parsed,"__")

#process of creating csv file
keys=['Astronaut1_Spaceship_First','Astronaut2_Spaceship_Second', 'Astronaut1_Name'] #reduced to 3 keys for this example

writer = csv.DictWriter(OD, keys ,restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect= "excel")
writer.writerow(flattened)

#JSON DATA FROM SERVER
{
    "Mission": "Make Earth Greater Again",
    "Objective": "Prove Earth is flat",
    "Astronauts": [
        {
            "Spaceships": {
                "First": "Katabom",
                "Second": "The Kraken"
            },
            "Name": "Jebeddiah",
            "Gender": "Hopefully male",
            "Age": 35,
            "Prefered colleages": [],
            "Following missions": [
                {
                    "Payment_status": "TO BE CONFIRMED"
                }
            ]
        },
        {
            "Spaceships": {
                "First": "The Kraken",
                "Second": "Minnus I"
            },
            "Name": "Bob",
            "Gender": "Hopefully female",
            "Age": 23,
            "Prefered colleages": [],
            "Following missions": [
                {
                    "Payment_status": "TO BE CONFIRMED"
                }
            ]
        }
    ]
}

Answer

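First of all, the data defined in your program is not the data as it arrives from the server. Data from the server would be a string, whereas the datum in your program is a dict that has already been parsed. So, assuming the data is: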

datum = '{"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}' 
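To illustrate the difference between the two forms, here is a minimal sketch. The names `already_parsed` and `raw_text` are made up for this example, and `json.dumps` is only used as a stand-in for whatever the server actually returns:

```python
import json

# A dict literal in Python source is already parsed; json.loads expects text.
already_parsed = {"Mission": "Make Earth Greater Again", "Astronauts": []}

try:
    json.loads(already_parsed)
    error_name = None
except TypeError as exc:
    # json.loads only accepts str, bytes or bytearray
    error_name = type(exc).__name__

# Text (as a server would send it) is what json.loads turns into a dict.
raw_text = json.dumps(already_parsed)  # stand-in for the server response
parsed = json.loads(raw_text)

print(error_name, type(raw_text).__name__, type(parsed).__name__)
```

So either keep the datum as a string and call `json.loads` on it, or keep the dict and skip `json.loads` entirely.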

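You don't need the dpath library. The problem here is that your JSON flattener does not handle embedded lists. Try the one below. Assuming you want a single row in the csv file: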

import csv
import json

def flattenjson(data, delim, topname=''):
    """JSON flattener that can handle embedded lists and dictionaries"""
    flattened = {}

    def internalflat(int_data, name=topname):
        if isinstance(int_data, dict):
            for key in int_data:
                internalflat(int_data[key], name + key + delim)
        elif isinstance(int_data, list):
            # Number list elements from 1: Astronaut1, Astronaut2, ...
            for i, elem in enumerate(int_data, start=1):
                internalflat(elem, name + str(i) + delim)
        else:
            # Leaf value: strip the trailing delimiter from the accumulated name
            flattened[name[:-len(delim)]] = int_data

    internalflat(data)
    return flattened

#If you don't want mission or objective in csv file
flattened_astronauts = flattenjson(json.loads(datum)["Astronauts"], "__", "Astronaut")
keys = sorted(flattened_astronauts)  # keys().sort() does not return a sorted list
# OD is the output file object from the question (its definition is not shown there)
writer = csv.DictWriter(OD, keys, restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect="excel")
writer.writerow(flattened_astronauts)
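To see which column names a list-aware flattener of this kind produces, here is a self-contained sketch. `flatten` is a compact restatement written for this example (not the exact function above), and `sample` is a shortened copy of the astronaut list:

```python
def flatten(data, delim, name=""):
    """Flatten nested dicts and lists; list items are numbered from 1."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = ((str(i), elem) for i, elem in enumerate(data, start=1))
    else:
        # Leaf value: drop the trailing delimiter from the accumulated name
        return {name[:-len(delim)]: data}
    flat = {}
    for key, value in items:
        flat.update(flatten(value, delim, name + key + delim))
    return flat

sample = [
    {"Spaceships": {"First": "Katabom", "Second": "The Kraken"},
     "Name": "Jebeddiah",
     "Prefered colleages": [],
     "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]},
    {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"},
     "Name": "Bob"},
]

flat = flatten(sample, "__", "Astronaut")
for column in sorted(flat):
    print(column, "=", flat[column])
```

Note that an empty list such as "Prefered colleages" produces no column at all, and that a nested list adds its own index to the name (for example `Astronaut1__Following missions__1__Payment_status`). If you prefer single underscores as in the question, pass "_" as the delimiter.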

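After several tries I only get the same error: flattened_astronauts = flattenjson({json.loads(datum)["Astronauts"]}) TypeError: unhashable type: 'list'. Basically "Astronauts" is not encoded as a dictionary, and nothing in that function changes that... – Saphiron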

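My bad. I just edited the function to better fit your requirements and removed the curly braces (they are not required; the output is already a dictionary). –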
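For anyone wondering where that TypeError came from: curly braces around a single value build a set, and a list cannot be a set element. A small sketch, with a made-up `astronauts` list:

```python
astronauts = [{"Name": "Jebeddiah"}, {"Name": "Bob"}]

try:
    wrapped = {astronauts}  # braces around one value create a set, not a dict
    message = None
except TypeError as exc:
    # Lists are mutable and therefore unhashable, so they cannot go into a set
    message = str(exc)

print(message)
```

The exact wording of the message may vary between Python versions, but it should mention the unhashable list. Passing the list directly to the flattener avoids the problem.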

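Works! Thanks a lot! The problem now is that the number of astronauts depends on the datum, so a new header is generated (function: writer.writeheader()) whenever that number changes. Is there any way to set a fixed header (the maximum number of astronauts in the data is 25) and write to the csv file accordingly? – Saphiron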
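One possible way to get a fixed header, sketched under a few assumptions: the per-astronaut column list `FIELDS` is guessed from the sample data, the `rows` are made-up outputs of the flattener, and `io.StringIO` stands in for the real output file. The idea is to build all 25 sets of column names up front and let `csv.DictWriter` fill the gaps with `restval`:

```python
import csv
import io

MAX_ASTRONAUTS = 25  # upper bound mentioned in the comment above
# Assumed per-astronaut columns, taken from the sample data; adjust as needed.
FIELDS = ["Spaceships__First", "Spaceships__Second", "Name", "Gender", "Age"]

header = [f"Astronaut{n}__{field}"
          for n in range(1, MAX_ASTRONAUTS + 1)
          for field in FIELDS]

# Two already-flattened rows with different numbers of astronauts.
rows = [
    {"Astronaut1__Name": "Jebeddiah", "Astronaut1__Age": 35,
     "Astronaut2__Name": "Bob", "Astronaut2__Age": 23},
    {"Astronaut1__Name": "Bob",
     "Astronaut1__Following missions__1__Payment_status": "TO BE CONFIRMED"},
]

out = io.StringIO()  # stand-in for the real output file
writer = csv.DictWriter(out, fieldnames=header, restval="Null",
                        extrasaction="ignore", quoting=csv.QUOTE_ALL)
writer.writeheader()    # written once, independent of the data
writer.writerows(rows)  # columns missing from a row are filled with restval

lines = out.getvalue().splitlines()
print(len(header), "columns,", len(lines), "lines")
```

With `extrasaction="ignore"`, keys that are not in the fixed header (such as the payment status here) are silently dropped instead of raising an error. Leave that argument out if you would rather be told about unexpected columns.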