2013-05-02 163 views
2

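This question may be too trivial, but I still can't figure out how to do it properly.
Counting identical elements in an array and creating a dictionary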

I have a given array [0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3] (arbitrary elements from 0 to 5), and I want to count how often runs of consecutive zeros of each length occur.

1 times 6 zeros in a row 
1 times 4 zeros in a row 
2 times 1 zero in a row 

=> (2,0,0,1,0,1) 

So the dictionary uses the run length (n zeros in a row) as the key and the number of such runs as the value.

The final array consists of more than 5 million unsorted values like the ones above.


=>(2,0,0,1,0,1)??? How does this relate to runs of 6, 4, and 1 consecutive zeros? – dansalmo 2013-05-02 15:18:37

Answers

2

This should get you what you want:

import numpy as np 

a = [0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3] 

# Find indexes of all zeroes 
index_zeroes = np.where(np.array(a) == 0)[0] 

# Find discontinuities in indexes, denoting separated groups of zeroes 
# Note: Adding True at the end because otherwise the last zero is ignored 
index_zeroes_disc = np.where(np.hstack((np.diff(index_zeroes) != 1, True)))[0] 

# Count the number of zeroes in each group 
# Note: Adding 0 at the start so first group of zeroes is counted 
count_zeroes = np.diff(np.hstack((0, index_zeroes_disc + 1))) 

# Count the number of groups with the same number of zeroes 
groups_of_n_zeroes = {} 
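# .tolist() yields plain Python ints, so the keys print as 1, 4, 6
for count in count_zeroes.tolist():
    # dict.has_key() was removed in Python 3; the `in` operator works everywhere
    if count in groups_of_n_zeroes:
        groups_of_n_zeroes[count] += 1
    else:
        groups_of_n_zeroes[count] = 1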

groups_of_n_zeroes then holds:

{1: 2, 4: 1, 6: 1} 
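
If the dense form from the question, (2,0,0,1,0,1), is wanted instead of a dictionary, the result can be expanded so that position n-1 holds the number of runs of length n. This is only a minimal sketch; `to_dense` is a hypothetical helper name, not part of the answer, and the dictionary is hard-coded here to keep the example self-contained.

```python
# Run-length histogram as produced above (hard-coded for this sketch).
groups_of_n_zeroes = {1: 2, 4: 1, 6: 1}

def to_dense(groups):
    """Return [runs of length 1, runs of length 2, ...] (hypothetical helper)."""
    if not groups:
        return []  # no zeros at all: nothing to report
    # Run lengths that never occurred get a count of 0.
    return [groups.get(n, 0) for n in range(1, max(groups) + 1)]

dense = to_dense(groups_of_n_zeroes)
print(dense)  # [2, 0, 0, 1, 0, 1]
```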

Thank you very much! It really helped a lot! – nit 2013-05-06 08:48:59

0

This looks overly complicated, but I can't seem to find anything better:

>>> l = [0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 0, 0, 0, 0, 1, 0, 1, 2, 1, 0, 2, 3] 

>>> import itertools 
>>> seq = [len(list(j)) for i, j in itertools.groupby(l) if i == 0] 
>>> seq 
[6, 4, 1, 1] 

>>> import collections 
>>> counter = collections.Counter(seq) 
>>> [counter.get(i, 0) for i in range(1, max(counter) + 1)] 
[2, 0, 0, 1, 0, 1] 
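
For the array of 5 million values mentioned in the question, it may be worth avoiding the temporary list built for every run. The following is a sketch of the same groupby idea that counts each run lazily; `zero_run_histogram` is a hypothetical name, and whether this is measurably faster on real data is an assumption that would need timing.

```python
import collections
import itertools

def zero_run_histogram(values):
    """Map run length -> number of runs of zeros with that length (hypothetical helper)."""
    return collections.Counter(
        sum(1 for _ in run)  # length of the run, without building a list
        for key, run in itertools.groupby(values)
        if key == 0
    )

l = [0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 0, 0, 0, 0, 1, 0, 1, 2, 1, 0, 2, 3]
hist = zero_run_histogram(l)
print(sorted(hist.items()))  # [(1, 2), (4, 1), (6, 1)]
```

Since `values` can be any iterable, the data could also be streamed from a file instead of being held in memory as a list.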
1

Similar to @fgb's, but with simpler handling of the occurrence counts:

items = np.array([0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3]) 
group_end_idx = np.concatenate(([-1], 
           np.nonzero(np.diff(items == 0))[0], 
           [len(items)-1])) 
group_len = np.diff(group_end_idx) 
zero_lens = group_len[::2] if items[0] == 0 else group_len[1::2] 
counts = np.bincount(zero_lens) 

>>> counts[1:] 
array([2, 0, 0, 1, 0, 1], dtype=int64) 
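
Note that the snippet above indexes items[0], so it assumes a non-empty array. A variant that sidesteps this, sketched here under the assumption that padding the zero mask is acceptable, marks the start and end of every run explicitly. This is a different technique from the one in the answer, and `zero_run_counts` is a hypothetical name.

```python
import numpy as np

def zero_run_counts(items):
    """counts[n] = number of runs of exactly n zeros (hypothetical helper)."""
    is_zero = np.asarray(items) == 0
    # Pad with False on both sides so runs touching either end of the
    # array still get a start edge and an end edge.
    padded = np.concatenate(([False], is_zero, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # index where each run begins
    ends = np.flatnonzero(edges == -1)    # index one past each run's last zero
    return np.bincount(ends - starts)

counts = zero_run_counts([0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3])
print(counts[1:].tolist())  # [2, 0, 0, 1, 0, 1]
```

For an input with no zeros (or an empty input) this returns an empty array rather than raising.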

I like this approach that uses only numpy functions. Thank you very much for your input! – nit 2013-05-06 08:48:19