2013-05-06 97 views
2

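Count of unique values in a Python dictionary

I have a problem with counting the distinct values for each key in Python.

I have a dictionary d like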

[{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 

I need to print the number of distinct values for each key separately.

That means I would want to print

abc 3 
xyz 1 
pqr 4 

Please help.

Thanks

+2

Do you mean you have a list of dictionaries? Or was it copied incorrectly? – thegrinner 2013-05-06 20:05:39

+1

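This isn't a dictionary; at best it's a list of dictionaries (each containing only one key/value pair, really?). What kind of data structure is this? I guess it's actually '[{"abc": "movies"}, ...', right? – 2013-05-06 20:05:55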

+0

@TimPietzcker Exactly. Sorry for the misrepresentation. – user1189851 2013-05-06 20:06:55

Answers

7

Use a collections.Counter() instance, together with some chaining:

from collections import Counter 
from itertools import chain 

counts = Counter(chain.from_iterable(e.keys() for e in d)) 

This ensures that dictionaries with more than one key in your input list are counted correctly.

Demo:

>>> from collections import Counter 
>>> from itertools import chain 
>>> d = [{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 
>>> Counter(chain.from_iterable(e.keys() for e in d))
Counter({'pqr': 5, 'abc': 3, 'xyz': 1})

Or with multiple keys in the input dictionaries:

>>> d = [{"abc":"movies", 'xyz': 'music', 'pqr': 'music'}, {"abc": "sports", 'pqr': 'movies'}, {"abc": "music", 'pqr': 'sports'}, {"pqr":"news"}, {"pqr":"sports"}] 
>>> Counter(chain.from_iterable(e.keys() for e in d))
Counter({'pqr': 5, 'abc': 3, 'xyz': 1})

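Counter() has additional, helpful functionality, such as the .most_common() method, which lists the elements and their counts in reverse sorted order (highest count first):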

for key, count in counts.most_common(): 
    print('{}: {}'.format(key, count))

# prints
# pqr: 5
# abc: 3
# xyz: 1
+0

Note that the [Counter](http://docs.python.org/2/library/collections.html#collections.Counter) class was introduced in Python 2.7. There is a [backport](http://code.activestate.com/recipes/576611-counter-class/). I think you [know about this](http://stackoverflow.com/a/13311111/566644), Martijn. – 2013-05-06 20:14:50

+0

@LauritzV.Thaulow: Otherwise it is available as a [backport for 2.5 and 2.6](http://code.activestate.com/recipes/576611-counter-class/). – 2013-05-06 20:17:20

+0

...or you could use a 'defaultdict' with int. – 2013-05-06 20:18:23
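
A minimal sketch of the defaultdict idea from the comment above, assuming d is the list of dictionaries from the question. Like the Counter version, it counts how often each key appears, not how many distinct values each key has (so pqr comes out as 5):

```python
from collections import defaultdict

# Sample data copied from the question
d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

counts = defaultdict(int)  # missing keys start at 0
for entry in d:
    for key in entry:      # iterating a dict yields its keys
        counts[key] += 1

for key in sorted(counts):
    print('{}: {}'.format(key, counts[key]))

# prints
# abc: 3
# pqr: 5
# xyz: 1
```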

2
>>> d = [{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, 
... {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, 
... {"pqr":"sports"}] 
>>> from collections import Counter 
>>> counts = Counter(key for dic in d for key in dic.keys()) 
>>> counts 
Counter({'pqr': 5, 'abc': 3, 'xyz': 1}) 
>>> for key in counts: 
...  print (key, counts[key]) 
... 
xyz 1 
abc 3 
pqr 5 
3

What you're describing, a list with multiple values per key, would be better visualized by something like this:

{'abc': ['movies', 'sports', 'music'], 
'xyz': ['music'], 
'pqr': ['music', 'movies', 'sports', 'news'] 
} 

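In that case, you have to do a bit more work to insert:

  1. Look up the key to see whether it already exists
    • If it doesn't exist, create a new key with the value [] (an empty list)
  2. Retrieve the value (the list associated with the key)
  3. Use if value in to check whether the value being inserted already exists in the list
  4. If the new value isn't in the list, .append() it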
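
The steps above could be sketched roughly as follows. The helper name add_value and the variable my_dict are made up for illustration, and d is assumed to be the list of dictionaries from the question:

```python
def add_value(my_dict, key, value):
    """Hypothetical helper: append value to my_dict[key] unless already present."""
    if key not in my_dict:      # step 1: look up the key
        my_dict[key] = []       #   not there yet: create it with an empty list
    values = my_dict[key]       # step 2: retrieve the list for this key
    if value not in values:     # step 3: is the value already stored?
        values.append(value)    # step 4: append only new values


# Sample data copied from the question
d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

my_dict = {}
for entry in d:
    for key, value in entry.items():
        add_value(my_dict, key, value)

print(my_dict['pqr'])
# prints ['music', 'movies', 'sports', 'news']
```

The list check in step 3 is linear in the number of stored values, which is fine for small inputs like this one.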

This also leads to an easy way to count the number of elements stored for each key:

# Pseudo-code: myDict maps each key to its list of distinct values
for myKey in myDict.keys():
    print("{0}: {1}".format(myKey, len(myDict[myKey])))
1

Use collections.Counter. Assuming you have a list of single-item dictionaries...

from collections import Counter 
listOfDictionaries = [{'abc':'movies'}, {'abc':'sports'}, {'abc':'music'}, 
    {'xyz':'music'}, {'pqr':'music'}, {'pqr':'movies'}, 
    {'pqr':'sports'}, {'pqr':'news'}, {'pqr':'sports'}] 
# list(item)[0] is the only key of each single-item dictionary
Counter(list(item)[0] for item in listOfDictionaries)
4

There's no need to use Counter. You can do it this way:

# input: list of dictionaries
d = [{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}]

# fetch keys
b = [j[0] for i in d for j in i.items()]

# print output
for k in set(b):
    print("{0}: {1}".format(k, b.count(k)))
+0

This is faster than using Counter. – akashdeep 2013-05-08 08:49:36

+0

Yes, Counter has some performance issues: http://stackoverflow.com/questions/27801945/surprising-results-with-python-timeit-counter-vs-defaultdict-vs-dict – sashab 2015-07-09 10:45:10

1

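Building on @akashdeep's solution, which uses a set but gives the wrong result because it does not account for the "distinct" requirement mentioned in the question (pqr should be 4, not 5):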

# list of dictionaries
d=[{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}]

# merged dictionary: key -> list of all values seen for that key
c = {}
for i in d:
    for k, v in i.items():
        try:
            c[k].append(v)
        except KeyError:
            c[k] = [v]

# counting and printing: set(v) drops duplicate values
for k, v in c.items():
    print("{0}: {1}".format(k, len(set(v))))

This gives the correct result:

xyz: 1 
abc: 3 
pqr: 4
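
A slightly shorter variant of the same idea, offered only as a sketch: collect the values straight into sets with collections.defaultdict, so duplicates are dropped while merging. The name distinct is made up for illustration, and d is assumed to be the list of dictionaries from the question:

```python
from collections import defaultdict

# Sample data copied from the question
d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

distinct = defaultdict(set)  # key -> set of distinct values
for entry in d:
    for key, value in entry.items():
        distinct[key].add(value)  # adding a duplicate is a no-op

for key in sorted(distinct):
    print('{}: {}'.format(key, len(distinct[key])))

# prints
# abc: 3
# pqr: 4
# xyz: 1
```

This assumes the values are hashable (strings here); for unhashable values, the list-based merge is the safer choice.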