2017-09-25

Python json.loads to pandas DataFrame

I have a URL that returns JSON data like the following:

{
    u'fields': [{
        u'keyField': False,
        u'name': u'_blockid',
        u'fieldType': u'long'
    }, {
        u'keyField': False,
        u'name': u'_collector',
        u'fieldType': u'string'
    }, {
        u'keyField': False,
        u'name': u'_collectorid',
        u'fieldType': u'long'
    }, {
        u'keyField': False,
        u'name': u'_messageid',
        u'fieldType': u'long'
    }],
    u'messages': [{
        u'map': {
            u'_messageid': u'-9223368783568280026',
            u'_collectorid': u'135927517',
            u'_blockid': u'-9223372036519990555',
            u'_collector': u'collector1',
        }
    }, {
        u'map': {
            u'_messageid': u'-92233645345280026',
            u'_collectorid': u'13545342517',
            u'_blockid': u'-92234254242343219990555',
            u'_collector': u'collector2',
        }
    }]
}

This is a snippet. The real JSON contains thousands of values in ['messages']['map'].

I have a script that runs the following:

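import json
import numpy as np
import pandas as pd
import requests

# JsonURL, username and password are defined elsewhere (not shown)
rJSON = requests.get(JsonURL, auth=(username, password))
DATA = json.loads(rJSON.text)
for x in DATA[u'messages']:
    print type(x[u'map'])
    for i in x[u'map']:
        print np.isscalar(x[u'map'][i])

    # This is the line that raises the ValueError shown below
    df = pd.DataFrame.from_dict(x[u'map'])
    break ### TESTING ###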

This outputs the following:

<type 'dict'> 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-151-1b71c28d4d83> in <module>() 
    11  for i in x[u'map']: 
    12   print np.isscalar(q[i]) 
---> 13  df = pd.DataFrame.from_dict(x[u'map']) 
    14 
    15  #if isinstance(msgData, pd.DataFrame): # If the variable is a dataframe, append to it... 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in from_dict(cls, data, orient, dtype) 
    849    raise ValueError('only recognize index or columns for orient') 
    850 
--> 851   return cls(data, index=index, columns=columns, dtype=dtype) 
    852 
    853  def to_dict(self, orient='dict'): 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in __init__(self, data, index, columns, dtype, copy) 
    273         dtype=dtype, copy=copy) 
    274   elif isinstance(data, dict): 
--> 275    mgr = self._init_dict(data, index, columns, dtype=dtype) 
    276   elif isinstance(data, ma.MaskedArray): 
    277    import numpy.ma.mrecords as mrecords 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _init_dict(self, data, index, columns, dtype) 
    409    arrays = [data[k] for k in keys] 
    410 
--> 411   return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype) 
    412 
    413  def _init_ndarray(self, values, index, columns, dtype=None, copy=False): 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _arrays_to_mgr(arrays, arr_names, index, columns, dtype) 
    5494  # figure out the index, if necessary 
    5495  if index is None: 
-> 5496   index = extract_index(arrays) 
    5497  else: 
    5498   index = _ensure_index(index) 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in extract_index(data) 
    5533 
    5534   if not indexes and not raw_lengths: 
-> 5535    raise ValueError('If using all scalar values, you must pass' 
    5536        ' an index') 
    5537 

ValueError: If using all scalar values, you must pass an index 

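I understand it is complaining because the dictionary contains scalar values, but I can't figure out why json.loads() loads them into the dictionary as scalars, or how to convert them from scalars to strings.

My end goal is to take all of the ['messages']['map'] data and pd.concat it in a loop into one big DataFrame that I can analyze.

Is it possible to stop json.loads from loading them as scalars? Or is there a way to convert them from scalars into something else that can be loaded into a DataFrame?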


Try the orient='index' argument? – ako
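For reference, a minimal sketch of what that suggestion does for a single map, assuming the values are the strings shown in the question. The variable names single_map, tall and wide are made up for illustration. With orient='index' the keys become row labels and the values land in one column, so a transpose is needed to get one row per message:

```python
import pandas as pd

# One 'map' dict, values copied from the question's snippet.
single_map = {
    "_messageid": "-9223368783568280026",
    "_collectorid": "135927517",
    "_blockid": "-9223372036519990555",
    "_collector": "collector1",
}

# Keys become the index, values go into a single column.
tall = pd.DataFrame.from_dict(single_map, orient="index")

# Transpose to get one row with one column per key.
wide = tall.T

print(tall.shape)  # 4 rows, 1 column
print(wide.shape)  # 1 row, 4 columns
```

This avoids the error but still handles one message at a time, so the loop and pd.concat would still be needed.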

Answer

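messages is a list of dictionaries inside the data dict. You can load it with DataFrame.from_records, then use apply(pd.Series) on the map column to turn each inner dictionary into a row of the final DataFrame: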

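# Select the column with ['map'] rather than .map, because .map is also the name of a pandas method
pd.DataFrame.from_records(data['messages'])['map'].apply(pd.Series)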

#                    _blockid  _collector _collectorid            _messageid
# 0      -9223372036519990555  collector1    135927517  -9223368783568280026
# 1  -92234254242343219990555  collector2  13545342517    -92233645345280026
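As a rough illustration of why the original call failed and of one alternative, here is a self-contained sketch. It assumes the JSON has the shape shown in the question and uses the sample values from that snippet in place of a real request. The names data, error_message and df are illustrative only. A dict of plain strings gives pandas no row index to work with, whereas a list of such dicts is read as one row per dict, so no loop or pd.concat should be needed:

```python
import pandas as pd

# Stand-in for json.loads(rJSON.text); values copied from the question.
data = {
    "messages": [
        {"map": {"_messageid": "-9223368783568280026",
                 "_collectorid": "135927517",
                 "_blockid": "-9223372036519990555",
                 "_collector": "collector1"}},
        {"map": {"_messageid": "-92233645345280026",
                 "_collectorid": "13545342517",
                 "_blockid": "-92234254242343219990555",
                 "_collector": "collector2"}},
    ]
}

# A single dict of scalars has no row labels, so pandas refuses it.
error_message = None
try:
    pd.DataFrame(data["messages"][0]["map"])
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# A list of dicts is treated as one row per dict.
df = pd.DataFrame([message["map"] for message in data["messages"]])

# The values are still strings; convert a column only if it fits a numeric type.
df["_collectorid"] = pd.to_numeric(df["_collectorid"])
print(df)
```

The second _blockid in the sample is outside the 64-bit integer range, so that column is left as strings here.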

Thank you!!!! That did it! – user3246693