How do I calculate the column means in a given data structure?

I have the following data structure ds:

{('AD', 'TYPE_B', 'TYPE_D'): [array([84.0, 85.0, 115.0], dtype=object), array([31.0, 23.0, 599.0], dtype=object), array([75.0, 21.0, nan], dtype=object), array([59.0, 52.0, 29.0], dtype=object)],('AD', 'TYPE_A', 'TYPE_N'): [array([84.0, 85.0, 115.0], dtype=object), array([31.0, 23.0, 599.0], dtype=object), array([75.0, 21.0, 300.0], dtype=object), array([59.0, 52.0, 29.0], dtype=object)]}

For each key (i.e. ('AD', 'TYPE_B', 'TYPE_D') and ('AD', 'TYPE_A', 'TYPE_N')), I need to compute the mean of the first, second, and third columns.

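Some arrays contain nan, like array([75.0, 21.0, nan]); I want to replace it with 0.

For example, for the key ('AD', 'TYPE_B', 'TYPE_D') the following result should be obtained (explained step by step):

Step 1: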

84.0 85.0 115.0 
31.0 23.0 599.0 
75.0 21.0 nan 
59.0 52.0 29.0

Step 2:

84.0 85.0 115.0 
31.0 23.0 599.0 
75.0 21.0 0 
59.0 52.0 29.0

Step 3 (final result):

('AD', 'TYPE_B', 'TYPE_D'): [62.25, 45.25, 185.75]

Source

2017-02-17 Dinosaurius

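You don't really need two steps, but your approach seems reasonable. What have you tried, and where are you stuck? – zwer

Use the built-in functions from numpy.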

import numpy as np 

ds = {('AD', 'TYPE_B', 'TYPE_D'): [np.array([84.0, 85.0, 115.0], dtype=object), 
            np.array([31.0, 23.0, 599.0], dtype=object), 
            np.array([75.0, 21.0, np.nan], dtype=object), 
            np.array([59.0, 52.0, 29.0], dtype=object)], 
     ('AD', 'TYPE_A', 'TYPE_N'): [np.array([84.0, 85.0, 115.0], dtype=object), 
            np.array([31.0, 23.0, 599.0], dtype=object), 
            np.array([75.0, 21.0, 300.0], dtype=object), 
            np.array([59.0, 52.0, 29.0], dtype=object)]} 

for key in ds:
    # Stack the object arrays into one 2-D float array, then replace NaN with 0
    item = np.nan_to_num(np.asarray(ds[key], dtype=np.float64))
    # Calculate the mean of each column
    mean = np.mean(item, axis=0)
    # Store it back in the dictionary, overwriting the original rows
    ds[key] = mean

print(ds)
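If you would rather keep ds intact and get plain lists back, as in the expected output in the question, the same idea can be wrapped in a small function. This is only a sketch: the name column_means is made up for illustration, and it assumes every key holds rows of equal length.

```python
import numpy as np


def column_means(ds):
    """Return a new dict of per-column means, counting NaN as 0 (hypothetical helper)."""
    result = {}
    for key, rows in ds.items():
        # Stack the 1-D object arrays into one 2-D float array (rows x columns).
        table = np.asarray(rows, dtype=np.float64)
        # Replace NaN with 0, then average down each column.
        result[key] = np.nan_to_num(table).mean(axis=0).tolist()
    return result


ds = {
    ('AD', 'TYPE_B', 'TYPE_D'): [
        np.array([84.0, 85.0, 115.0], dtype=object),
        np.array([31.0, 23.0, 599.0], dtype=object),
        np.array([75.0, 21.0, np.nan], dtype=object),
        np.array([59.0, 52.0, 29.0], dtype=object),
    ],
    ('AD', 'TYPE_A', 'TYPE_N'): [
        np.array([84.0, 85.0, 115.0], dtype=object),
        np.array([31.0, 23.0, 599.0], dtype=object),
        np.array([75.0, 21.0, 300.0], dtype=object),
        np.array([59.0, 52.0, 29.0], dtype=object),
    ],
}

means = column_means(ds)
print(means[('AD', 'TYPE_B', 'TYPE_D')])  # [62.25, 45.25, 185.75]
```

Because the result goes into a new dictionary, ds still holds the original rows afterwards.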

Source

2017-02-17 13:47:24 Marco

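Converting the individual 'object' arrays into a single 2d 'float' array is the key step. The 'nan' replacement does not work while the elements are 'object'. – hpaulj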
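A quick way to see the point about 'object' arrays is to run nan_to_num on one of the question's rows before and after the cast. This is a minimal sketch; the variable names are made up for illustration, and the behaviour shown is what I would expect from np.nan_to_num, which leaves non-floating-point dtypes alone.

```python
import numpy as np

row = np.array([75.0, 21.0, np.nan], dtype=object)

# On an object array, nan_to_num hands the values back unchanged, so the NaN survives.
untouched = np.nan_to_num(row)
print(untouched[2] != untouched[2])  # True: NaN is the only value not equal to itself

# After casting to float64, the same call replaces NaN with 0.0.
cleaned = np.nan_to_num(row.astype(np.float64))
print(cleaned.tolist())  # [75.0, 21.0, 0.0]
```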
