How do I convert a nested dictionary into a DataFrame?

I have a nested dictionary holding some NASDAQ-style data. It looks like this:

{'CLSN':  
Date  Open High Low Close Volume Adj Close             
2015-12-31 1.92 1.99 1.87 1.92 79600  1.92 
2016-01-04 1.93 1.99 1.87 1.93 39700  1.93 
2016-01-05 1.89 1.94 1.85 1.90 50200  1.90, 
'CCC':  
Date   Open  High  Low  Close Volume Adj Close                
2015-12-31 17.270000 17.389999 17.120001 17.250000 177200 16.965361 
2016-01-04 17.000000 17.219999 16.600000 17.180000 371600 16.896516 
2016-01-05 17.190001 17.530001 17.059999 17.450001 417500 17.162061, 
}

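To make this clear: each key is a ticker symbol, and each value is a DataFrame.

Before asking, I tried pd.Panel(nas)['CLSN'], so I am sure each value is a DataFrame. But pd.Panel(nas).to_frame().reset_index() did not help: it returns an empty DataFrame with thousands of columns named after the stocks.

What I am stuck on is getting a DataFrame like this: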

index Date  Open  High  Low  Close  Volume  Adj Close
CLSN 2015-12-31 1.92  1.99  1.87  1.92  79600.0 1.92 
CLSN 2016-01-01 NaN  NaN  NaN  NaN  NaN  NaN 
CLSN 2016-01-04 1.93  1.99  1.87  1.93  39700.0 1.93 
CCC 2015-12-31 17.270000 17.389999 17.120001 17.250000 177200.0 16.965361 
CCC 2016-01-04 17.000000 17.219999 16.600000 17.180000 371600.0 16.896516 
CCC 2016-01-05 17.190001 17.530001 17.059999 17.450001 417500.0 17.162061

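Of course, I could use a for loop to get each stock's DataFrame, but joining them all together is killing me.

Do you have a better idea? I would love to know!

To MaxU: running print(nas['CLSN'].head()) prints the following: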

  Open High Low Close Volume Adj Close 
Date             
2015-12-31 1.92 1.99 1.87 1.92 79600  1.92 
2016-01-04 1.93 1.99 1.87 1.93 39700  1.93 
2016-01-05 1.89 1.94 1.85 1.90 50200  1.90 
2016-01-06 1.86 1.89 1.77 1.78 62100  1.78 
2016-01-07 1.75 1.80 1.75 1.77 117000  1.77

Source

2017-04-09 Pan Kevin

UPDATE:

Assuming that Date is an index (rather than a regular column):

Source dictionary:

In [70]: d2 
Out[70]: 
{'CCC':     Open  High  Low  Close Volume Adj Close 
Date 
2015-12-31 17.270000 17.389999 17.120001 17.250000 177200 16.965361 
2016-01-04 17.000000 17.219999 16.600000 17.180000 371600 16.896516 
2016-01-05 17.190001 17.530001 17.059999 17.450001 417500 17.162061, 
'CLSN':    Open High Low Close Volume Adj Close 
Date 
2015-12-31 1.92 1.99 1.87 1.92 79600  1.92 
2016-01-04 1.93 1.99 1.87 1.93 39700  1.93 
2016-01-05 1.89 1.94 1.85 1.90 50200  1.90}

Solution:

In [73]: pd.Panel(d2).swapaxes(0, 2).to_frame().reset_index(level=0).sort_index() 
Out[73]: 
      Date  Open  High  Low  Close Volume Adj Close 
minor 
CCC 2015-12-31 17.270000 17.389999 17.120001 17.250000 177200.0 16.965361 
CCC 2016-01-04 17.000000 17.219999 16.600000 17.180000 371600.0 16.896516 
CCC 2016-01-05 17.190001 17.530001 17.059999 17.450001 417500.0 17.162061 
CLSN 2015-12-31 1.920000 1.990000 1.870000 1.920000 79600.0 1.920000 
CLSN 2016-01-04 1.930000 1.990000 1.870000 1.930000 39700.0 1.930000 
CLSN 2016-01-05 1.890000 1.940000 1.850000 1.900000 50200.0 1.900000

Or you can leave Date as part of the index:

In [74]: pd.Panel(d2).swapaxes(0, 2).to_frame().sort_index() 
Out[74]: 
         Open  High  Low  Close Volume Adj Close 
Date  minor 
2015-12-31 CCC 17.270000 17.389999 17.120001 17.250000 177200.0 16.965361 
      CLSN 1.920000 1.990000 1.870000 1.920000 79600.0 1.920000 
2016-01-04 CCC 17.000000 17.219999 16.600000 17.180000 371600.0 16.896516 
      CLSN 1.930000 1.990000 1.870000 1.930000 39700.0 1.930000 
2016-01-05 CCC 17.190001 17.530001 17.059999 17.450001 417500.0 17.162061 
      CLSN 1.890000 1.940000 1.850000 1.900000 50200.0 1.900000
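
On newer pandas versions Panel is no longer available (it was deprecated in 0.20 and removed in 0.25), so the two calls above will not run there. A similar layout can probably be reached with pd.concat. This is only a minimal sketch: the small d2 below is a made-up stand-in with two price columns, and the level name 'Ticker' is my own choice, not something from the question.

```python
import pandas as pd

# Hypothetical stand-in for d2: ticker -> DataFrame indexed by Date.
idx = pd.DatetimeIndex(['2015-12-31', '2016-01-04', '2016-01-05'], name='Date')
d2 = {
    'CCC': pd.DataFrame({'Open': [17.27, 17.00, 17.19],
                         'Close': [17.25, 17.18, 17.45]}, index=idx),
    'CLSN': pd.DataFrame({'Open': [1.92, 1.93, 1.89],
                          'Close': [1.92, 1.93, 1.90]}, index=idx),
}

# The dict keys become the outer index level, the Date index the inner one.
stacked = pd.concat(d2, names=['Ticker', 'Date'])

# Ticker as the index and Date as a regular column (like the first output).
by_ticker = stacked.reset_index(level='Date')

# Date first and ticker second in the index (like the second output).
by_date = stacked.swaplevel().sort_index()
```

Unlike Panel, concat does not align the frames on a common set of dates, so a date that is missing for one ticker is simply absent instead of showing up as a NaN row.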

OLD answer (it assumes that Date is a regular column, not the index). Try this:

In [59]: pd.Panel(d).swapaxes(0, 2).to_frame().reset_index('major', drop=True).sort_index() 
Out[59]: 
      Date Open High Low Close Volume Adj Close 
minor 
CCC 2015-12-31 17.27 17.39 17.12 17.25 177200 16.9654 
CCC 2016-01-04  17 17.22 16.6 17.18 371600 16.8965 
CCC 2016-01-05 17.19 17.53 17.06 17.45 417500 17.1621 
CLSN 2015-12-31 1.92 1.99 1.87 1.92 79600  1.92 
CLSN 2016-01-04 1.93 1.99 1.87 1.93 39700  1.93 
CLSN 2016-01-05 1.89 1.94 1.85 1.9 50200  1.9

where d is your nested dictionary:

In [60]: d 
Out[60]: 
{'CCC':   Date  Open  High  Low  Close Volume Adj Close 
0 2015-12-31 17.270000 17.389999 17.120001 17.250000 177200 16.965361 
1 2016-01-04 17.000000 17.219999 16.600000 17.180000 371600 16.896516 
2 2016-01-05 17.190001 17.530001 17.059999 17.450001 417500 17.162061, 
'CLSN':   Date Open High Low Close Volume Adj Close 
0 2015-12-31 1.92 1.99 1.87 1.92 79600  1.92 
1 2016-01-04 1.93 1.99 1.87 1.93 39700  1.93 
2 2016-01-05 1.89 1.94 1.85 1.90 50200  1.90}
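
If Panel is not available in your pandas version, the regular-column case can probably be handled with pd.concat as well. A rough sketch, assuming a cut-down, made-up version of d with only Date and Close columns; the name 'Ticker' is an assumption of mine:

```python
import pandas as pd

# Hypothetical stand-in for d: Date is an ordinary column here and each
# frame has a default 0..n-1 integer index.
d = {
    'CCC': pd.DataFrame({'Date': ['2015-12-31', '2016-01-04'],
                         'Close': [17.25, 17.18]}),
    'CLSN': pd.DataFrame({'Date': ['2015-12-31', '2016-01-04'],
                          'Close': [1.92, 1.93]}),
}

# concat puts the dict keys in the outer index level; the inner level is the
# old integer index, which carries no information, so it is dropped.
flat = pd.concat(d, names=['Ticker']).reset_index(level=1, drop=True)
```

The result has the ticker as its index and keeps Date as a normal column.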

Source

2017-04-09 11:46:54 MaxU

I did what you suggested, but it raises an error: 'KeyError: 'Level major not found''. I find it hard to understand what 'swapaxes(0, 2)' does in your code, and it looks like nothing called 'major' is defined on my side.

@PanKevin, you can also use 'reset_index(level=0, drop=True)'. But this is strange, because I would expect a 'major' column after doing 'reset_index()'... – MaxU

Great! The output now has minor, but the Date column is missing. It looks like 'minor Open High Low Close Volume Adj Close A 41.900002 42.349998 41.720001 41.810001 1449300.0 41.357005 A 37.369999 37.950001 37.000000 37.689999 2666200.0 37.281641 A 37.400002 38.029999 37.400002 37.610001 1831200.0 37.202510 A 40.240002 40.990002 40.049999 40.730000 2103600.0 40.288705'. Can you explain?

Maybe pandas.concat is what you are looking for:

In [8]: data = dict(A=pd.DataFrame([[1,2], [3,4]], columns=['X', 'Y']), 
        B=pd.DataFrame([[1,2], [3,4]], columns=['X', 'Y']),) 

In [9]: data 
Out[9]: 
{'A': X Y 
0 1 2 
1 3 4, 
'B': X Y 
0 1 2 
1 3 4} 

In [10]: pd.concat(data) 
Out[10]: 
    X Y 
A 0 1 2 
    1 3 4 
B 0 1 2 
    1 3 4
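
If you then want the keys as an ordinary column rather than an index level, something along these lines should work. It is a sketch built on the same toy data; the level names 'key' and 'row' are arbitrary labels chosen for illustration.

```python
import pandas as pd

data = dict(A=pd.DataFrame([[1, 2], [3, 4]], columns=['X', 'Y']),
            B=pd.DataFrame([[1, 2], [3, 4]], columns=['X', 'Y']))

# Name both index levels so they get readable column names later.
combined = pd.concat(data, names=['key', 'row'])

# Move both index levels into regular columns.
flat = combined.reset_index()

# A single original frame can still be pulled back out by its key.
only_a = combined.loc['A']
```

Here flat has the columns key, row, X and Y, with one line per original row.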

Source

2017-04-09 10:40:30 Garrett

Thank you, but Python takes far too long to produce the result. I used this: 'df = pd.concat(pd.Panel(nas)[k] for k in nas.keys())', and it runs forever.
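
One likely reason for the slowness: that expression evaluates pd.Panel(nas) again for every key, so the whole Panel is rebuilt once per ticker. The generator also throws the keys away, so the result no longer says which rows belong to which stock. Since the values in nas are already DataFrames, they can be handed to concat directly. A sketch, assuming a tiny made-up nas; the level names are my own:

```python
import pandas as pd

# Hypothetical stand-in for nas: ticker -> DataFrame indexed by Date.
idx = pd.DatetimeIndex(['2015-12-31', '2016-01-04'], name='Date')
nas = {
    'CLSN': pd.DataFrame({'Close': [1.92, 1.93]}, index=idx),
    'CCC': pd.DataFrame({'Close': [17.25, 17.18]}, index=idx),
}

# Pass the frames themselves and label each block with its ticker via keys=.
df = pd.concat(list(nas.values()), keys=list(nas.keys()),
               names=['Ticker', 'Date'])
```

Passing the dict itself, as in pd.concat(nas), should give the same labelling with less typing.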
