How to copy unique keys and values from another dictionary in Python

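I have a DataFrame df in which the values in column Col can repeat. I use a Counter, dictionary1, to count the frequency of each Col value, and then I want to run a for loop over subsets of the data and compute a value pit. I want to create a new dictionary dict1 whose keys are the keys of dictionary1 and whose values are the pit values. Here is my code so far: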

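from collections import Counter, defaultdict

dictionary1 = Counter(df['Col'])
dict1 = defaultdict(int)
keys = list(dictionary1)  # dict.keys() cannot be indexed on Python 3

for i in range(len(keys)):
    temp = df[df['Col'] == keys[i]]
    b = temp['IsBuy'].sum()    # number of buys for this key
    n = temp['IsBuy'].count()  # number of rows for this key
    pit = b / n
    dict1[keys[i]] = pit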

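My question is how to assign the keys and values of dict1 based on the keys of dictionary1 and the values computed as pit. In other words, what is the correct way to write the last line of code in the script above?

Thanks.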

2014-11-23 roland

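Since you're using pandas, I should point out that the problem you're facing is common enough that there is a built-in way to do it. Collecting "similar" data into groups and then performing operations on them is called a groupby operation. It's probably worthwhile to read the tutorial section on the groupby split-apply-combine idiom: there are lots of neat things you can do!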
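
To make the split-apply-combine idea concrete, here is a minimal sketch that spells out the three steps by hand. The tiny frame is made up for illustration; only the column names `Col` and `IsBuy` come from the question, and `manual` is a hypothetical name. Iterating over a groupby object yields one `(key, sub-frame)` pair per unique value.

```python
import pandas as pd

# Tiny made-up data set; only the column names come from the question.
df = pd.DataFrame({
    "Col":   ["a", "a", "b", "b", "b"],
    "IsBuy": [True, False, True, True, False],
})

manual = {}
for key, group in df.groupby("Col"):     # split: one sub-frame per unique Col value
    manual[key] = group["IsBuy"].mean()  # apply: fraction of True values in the group
# combine: `manual` now maps each unique Col value to its pit value
```

In practice you rarely need to write this loop yourself, since groupby performs all three steps for you.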

The pandorable way to compute the pit values would be something like

df.groupby("Col")["IsBuy"].mean()

For example:

>>> import numpy as np
>>> import pandas as pd
>>> # make dummy data
>>> N = 10**4
>>> df = pd.DataFrame({"Col": np.random.randint(1, 10, N), "IsBuy": np.random.choice([True, False], N)})
>>> df.head()
   Col  IsBuy
0    3  False
1    6   True
2    6   True
3    1   True
4    5   True
>>> df.groupby("Col")["IsBuy"].mean()
Col
1    0.511709
2    0.495697
3    0.489796
4    0.510658
5    0.507491
6    0.513183
7    0.522936
8    0.488688
9    0.490498
Name: IsBuy, dtype: float64

which, if you insist, you can turn from a Series into a dictionary:

>>> df.groupby("Col")["IsBuy"].mean().to_dict() 
{1: 0.51170858629661753, 2: 0.49569707401032703, 3: 0.48979591836734693, 4: 0.51065801668211308, 5: 0.50749063670411987, 6: 0.51318267419962338, 7: 0.52293577981651373, 8: 0.48868778280542985, 9: 0.49049773755656106}
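
If you would rather keep the explicit loop from the question, a minimal sketch follows. It assumes Python 3, where `dict.keys()` returns a view that cannot be indexed, so iterating over the keys directly is the simplest way to write the assignment. The sample data and the name `via_groupby` are made up for illustration.

```python
from collections import Counter

import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "Col":   [1, 1, 2, 2, 2],
    "IsBuy": [True, False, True, True, False],
})

dictionary1 = Counter(df["Col"])  # frequency of each unique Col value
dict1 = {}

for key in dictionary1:                # iterate over the unique keys directly
    temp = df[df["Col"] == key]        # rows belonging to this key
    dict1[key] = temp["IsBuy"].mean()  # equivalent to sum() / count()

# The groupby one-liner should give the same mapping.
via_groupby = df.groupby("Col")["IsBuy"].mean().to_dict()
```

Both dictionaries should hold the same keys and values; the groupby version just avoids filtering the frame once per key.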

2014-11-23 00:22:56 DSM

Thanks @DSM! This works perfectly, and there is no need for a for loop. – roland 2014-11-23 01:43:52
