2016-11-06
2

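Convert JSON to SQL table

I'm very new to Python, JSON, and coding in general. I'm trying to learn how to load JSON in the following format into a SQL table. I used Python pandas, which converts the nested JSON nodes into dictionaries. Could someone please help me understand how to do this?

Sample JSON: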

{ 
    "Volumes": [ 
     { 
      "AvailabilityZone": "us-east-1a", 
      "Attachments": [ 
       { 
        "AttachTime": "2013-12-18T22:35:00.000Z", 
        "InstanceId": "i-1234567890abcdef0", 
        "VolumeId": "vol-049df61146c4d7901", 
        "State": "attached", 
        "DeleteOnTermination": true, 
        "Device": "/dev/sda1" 
       } 
      ], 
      "Tags": [ 
       {
        "Value": "DBJanitor-Private",
        "Key": "Name"
       },
       {
        "Value": "DBJanitor",
        "Key": "Owner"
       },
       {
        "Value": "Database",
        "Key": "Product"
       },
       {
        "Value": "DB Janitor",
        "Key": "Portfolio"
       },
       {
        "Value": "DB Service",
        "Key": "Service"
       }
      ],
      "VolumeType": "standard", 
      "VolumeId": "vol-049df61146c4d7901", 
      "State": "in-use", 
      "SnapshotId": "snap-1234567890abcdef0", 
      "CreateTime": "2013-12-18T22:35:00.084Z", 
      "Size": 8 
     }, 
     { 
      "AvailabilityZone": "us-east-1a", 
      "Attachments": [], 
      "VolumeType": "io1", 
      "VolumeId": "vol-1234567890abcdef0", 
      "State": "available", 
      "Iops": 1000, 
      "SnapshotId": null, 
      "CreateTime": "2014-02-27T00:02:41.791Z", 
      "Size": 100 
     } 
    ] 
} 

So far, this is what I have tried in Python:

asg_list_json_Tags=asg_list_json["AutoScalingGroups"] 
Tags=pandas.DataFrame(asg_list_json_Tags) 
n = [] 
for i in Tags.columns: 
    n.append(i) 
print n 

engine = create_engine("mysql+mysqldb://user:"+'pwd'+"@mysqlserver/dbname") 
Tags.to_sql(name='TableName', con=engine, if_exists='append', index=True) 

Any help is deeply appreciated. Thanks!

What seems to be the problem? Why doesn't that code work? –

I get an error saying that a dict cannot be inserted as a string. – DataJanitor

@DataJanitor, do you want to store __flattened__ data? – MaxU
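
The error mentioned in the comments most likely comes from columns whose cells still hold Python lists or dicts (here `Attachments` and `Tags`), which a SQL driver cannot bind as a plain value. A minimal sketch for spotting such columns before calling `to_sql`, assuming a trimmed inline copy of the sample data; the helper name `nested_columns` is hypothetical and not part of pandas:

```python
import pandas as pd

# Trimmed inline sample mirroring the structure of the question's JSON.
volumes = [
    {
        "VolumeId": "vol-049df61146c4d7901",
        "Size": 8,
        "Attachments": [{"InstanceId": "i-1234567890abcdef0", "Device": "/dev/sda1"}],
        "Tags": [{"Key": "Name", "Value": "DBJanitor-Private"}],
    },
    # This record has no "Tags" key at all, as in the sample JSON.
    {"VolumeId": "vol-1234567890abcdef0", "Size": 100, "Attachments": []},
]

df = pd.DataFrame(volumes)


def nested_columns(frame):
    """Return the names of columns in which any cell is a list or a dict."""
    return [
        col
        for col in frame.columns
        if frame[col].map(lambda x: isinstance(x, (list, dict))).any()
    ]


nested = nested_columns(df)
print(nested)
```

Columns reported by this check need to be dropped, flattened, or moved into their own tables before the frame is written to SQL.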

Answers

3

I would do it this way:

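import json

import pandas as pd
from sqlalchemy import create_engine

fn = r'D:\temp\.data\40450591.json'

with open(fn) as f:
    data = json.load(f)

# some of your records seem NOT to have a `Tags` key, hence `KeyError: 'Tags'`
# let's fix it by giving those records an empty list
for r in data['Volumes']:
    if 'Tags' not in r:
        r['Tags'] = []

# parent table: one row per volume, without the nested list columns,
# indexed by `VolumeId` as in the output shown for `v`
v = pd.DataFrame(data['Volumes']).drop(columns=['Attachments', 'Tags']).set_index('VolumeId')
# child tables: one row per attachment / tag, linked back via `parent_VolumeId`
# (pandas >= 1.0; older versions exposed this as pd.io.json.json_normalize)
a = pd.json_normalize(data['Volumes'], 'Attachments', ['VolumeId'], meta_prefix='parent_')
t = pd.json_normalize(data['Volumes'], 'Tags', ['VolumeId'], meta_prefix='parent_')

# connection string taken from the question
engine = create_engine("mysql+mysqldb://user:pwd@mysqlserver/dbname")

v.to_sql('volume', engine)
a.to_sql('attachment', engine)
t.to_sql('tag', engine)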

Output:

In [179]: v
Out[179]:
                       AvailabilityZone                CreateTime    Iops  Size              SnapshotId      State  VolumeType
VolumeId
vol-049df61146c4d7901        us-east-1a  2013-12-18T22:35:00.084Z     NaN     8  snap-1234567890abcdef0     in-use    standard
vol-1234567890abcdef0        us-east-1a  2014-02-27T00:02:41.791Z  1000.0   100                    None  available         io1

In [180]: a
Out[180]:
                 AttachTime  DeleteOnTermination     Device           InstanceId     State               VolumeId        parent_VolumeId
0  2013-12-18T22:35:00.000Z                 True  /dev/sda1  i-1234567890abcdef0  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
1  2013-12-18T22:35:11.000Z                 True  /dev/sda1  i-1234567890abcdef1  attached  vol-049df61146c4d7111  vol-049df61146c4d7901

In [217]: t
Out[217]:
         Key              Value        parent_VolumeId
0       Name  DBJanitor-Private  vol-049df61146c4d7901
1      Owner          DBJanitor  vol-049df61146c4d7901
2    Product           Database  vol-049df61146c4d7901
3  Portfolio         DB Janitor  vol-049df61146c4d7901
4    Service         DB Service  vol-049df61146c4d7901
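
For anyone who wants to try the same flow without a MySQL server, here is a rough sketch that swaps in an in-memory SQLite database. It assumes a trimmed inline copy of the data rather than the JSON file, and passes a plain `sqlite3` connection to `to_sql` instead of a SQLAlchemy engine; the table names follow the code above.

```python
import sqlite3

import pandas as pd

# Trimmed inline stand-in for the JSON file (assumption: same structure).
data = {
    "Volumes": [
        {
            "VolumeId": "vol-049df61146c4d7901",
            "Size": 8,
            "State": "in-use",
            "Attachments": [
                {
                    "InstanceId": "i-1234567890abcdef0",
                    "VolumeId": "vol-049df61146c4d7901",
                    "Device": "/dev/sda1",
                }
            ],
            "Tags": [
                {"Key": "Name", "Value": "DBJanitor-Private"},
                {"Key": "Owner", "Value": "DBJanitor"},
            ],
        },
        # No "Tags" key here, and an empty attachment list.
        {"VolumeId": "vol-1234567890abcdef0", "Size": 100, "State": "available",
         "Iops": 1000, "Attachments": []},
    ]
}

# Make sure every record has a "Tags" list so json_normalize does not fail.
for r in data["Volumes"]:
    r.setdefault("Tags", [])

v = pd.DataFrame(data["Volumes"]).drop(columns=["Attachments", "Tags"])
a = pd.json_normalize(data["Volumes"], record_path="Attachments",
                      meta=["VolumeId"], meta_prefix="parent_")
t = pd.json_normalize(data["Volumes"], record_path="Tags",
                      meta=["VolumeId"], meta_prefix="parent_")

conn = sqlite3.connect(":memory:")
v.to_sql("volume", conn, index=False)
a.to_sql("attachment", conn, index=False)
t.to_sql("tag", conn, index=False)

# Read the row counts back to confirm that all three tables were written.
counts = {
    name: int(pd.read_sql(f"SELECT COUNT(*) AS n FROM {name}", conn)["n"].iloc[0])
    for name in ("volume", "attachment", "tag")
}
print(counts)
```

Once this works locally, the `sqlite3` connection can be replaced by the MySQL engine from the question.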

Test JSON file:

{ 
    "Volumes": [ 
     { 
      "AvailabilityZone": "us-east-1a", 
      "Attachments": [ 
       { 
        "AttachTime": "2013-12-18T22:35:00.000Z", 
        "InstanceId": "i-1234567890abcdef0", 
        "VolumeId": "vol-049df61146c4d7901", 
        "State": "attached", 
        "DeleteOnTermination": true, 
        "Device": "/dev/sda1" 
       }, 
       { 
        "AttachTime": "2013-12-18T22:35:11.000Z", 
        "InstanceId": "i-1234567890abcdef1", 
        "VolumeId": "vol-049df61146c4d7111", 
        "State": "attached", 
        "DeleteOnTermination": true, 
        "Device": "/dev/sda1" 
       } 
      ], 
      "Tags": [ 
       { 
        "Value": "DBJanitor-Private", 
        "Key": "Name" 
       }, 
       { 
        "Value": "DBJanitor", 
        "Key": "Owner" 
       }, 
       { 
        "Value": "Database", 
        "Key": "Product" 
       }, 
       { 
        "Value": "DB Janitor", 
        "Key": "Portfolio" 
       }, 
       { 
        "Value": "DB Service", 
        "Key": "Service" 
       } 
      ], 
      "VolumeType": "standard", 
      "VolumeId": "vol-049df61146c4d7901", 
      "State": "in-use", 
      "SnapshotId": "snap-1234567890abcdef0", 
      "CreateTime": "2013-12-18T22:35:00.084Z", 
      "Size": 8 
     }, 
     { 
      "AvailabilityZone": "us-east-1a", 
      "Attachments": [], 
      "VolumeType": "io1", 
      "VolumeId": "vol-1234567890abcdef0", 
      "State": "available", 
      "Iops": 1000, 
      "SnapshotId": null, 
      "CreateTime": "2014-02-27T00:02:41.791Z", 
      "Size": 100 
     } 
    ] 
} 
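
The `parent_VolumeId` column is what ties each child row back to its volume, so the tables can be recombined with an ordinary join. A small sketch of that, again assuming an in-memory SQLite database and a cut-down version of the `volume` and `tag` tables rather than the real MySQL schema:

```python
import sqlite3

import pandas as pd

# Cut-down versions of the parent and child tables (illustrative data only).
v = pd.DataFrame(
    {"VolumeId": ["vol-049df61146c4d7901", "vol-1234567890abcdef0"], "Size": [8, 100]}
)
t = pd.DataFrame(
    {
        "Key": ["Name", "Owner"],
        "Value": ["DBJanitor-Private", "DBJanitor"],
        "parent_VolumeId": ["vol-049df61146c4d7901"] * 2,
    }
)

conn = sqlite3.connect(":memory:")
v.to_sql("volume", conn, index=False)
t.to_sql("tag", conn, index=False)

# LEFT JOIN keeps volumes that have no tags; their Key/Value come back as NULL.
query = """
SELECT v.VolumeId, v.Size, t."Key", t."Value"
FROM volume AS v
LEFT JOIN tag AS t ON t.parent_VolumeId = v.VolumeId
ORDER BY v.VolumeId, t."Key"
"""
joined = pd.read_sql(query, conn)
print(joined)
```

An inner join would drop the untagged volume instead; which one fits depends on whether untagged volumes should appear in the report.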

Thanks @MaxU, I really appreciate it! I'll plug this in and work through any issues that come up. – DataJanitor

Beat me to it! Nice way to get one-to-many database normalization using the matching IDs. – Parfait

@Parfait, thank you! – MaxU