2017-08-23 2660 views
-1

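Python TypeError: 'numpy.int32' object is not iterable

I want to compute the entropy of my k-means result DataFrame, and I get the error: TypeError: 'numpy.int32' object is not iterable. I don't understand why.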

from collections import Counter 
def calcEntropy(x): 
    p, lens = Counter(x), np.float(len(x)) 
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values()) 
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

Then I get this error message:

<ipython-input-26-d375ecf00330> in <module>() 
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

<ipython-input-26-d375ecf00330> in <listcomp>(.0) 
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

<ipython-input-23-f5508ea8782c> in calcEntropy(x) 
     1 from collections import Counter 
     2 def calcEntropy(x): 
----> 3  p, lens = Counter(x), np.float(len(x)) 
     4  return -np.sum(count/lens*np.log2(count/lens) for count in p.values()) 

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds) 
    535    raise TypeError('expected at most 1 arguments, got %d' % len(args)) 
    536   super(Counter, self).__init__() 
--> 537   self.update(*args, **kwds) 
    538 
    539  def __missing__(self, key): 

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds) 
    622      super(Counter, self).update(iterable) # fast path when counter is empty 
    623    else: 
--> 624     _count_elements(self, iterable) 
    625   if kwds: 
    626    self.update(kwds) 

TypeError: 'numpy.int32' object is not iterable 

k_means_sp.head() 

     credit debit cluster 
0 9.207673 8.198884 1 
1 4.248495 8.202181 0 
2 8.149668 7.735145 2 
3 5.138677 7.859741 0 
4 8.058163 7.918614 2 
+1

Assuming 'k_means_sp' holds 'numpy.int32' values, you are passing a 'numpy.int32' to 'Counter'. 'Counter' expects an iterable. –

+0

What does that mean? Should I turn the cluster column into cluster = [0,1,2] and y = iter(cluster), or am I going about this completely wrong? Thanks! – bananablue1

+0

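@bananablue1 It means you cannot pass an integer to 'calcEntropy' as it is currently written. The right fix depends on your goal. If you want 'calcEntropy' to work on integers (does that even make sense?), then you should change the function; if you want to pass something else to 'calcEntropy', then pass something else, and so on. – Goyo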
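
To make the comments above concrete, here is a minimal sketch of why the error occurs. The small frame is a made-up stand-in for 'k_means_sp' (only the labels from the head() output are reused), and depending on the pandas version the scalar may be a plain int rather than a numpy.int32, so the exact wording of the message can differ.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Hypothetical stand-in for k_means_sp, using the labels shown by head().
k_means_sp = pd.DataFrame({"cluster": np.array([1, 0, 2, 0, 2], dtype=np.int32)})

# Iterating over a column yields one scalar label at a time, not a sequence.
first = next(iter(k_means_sp["cluster"]))
is_scalar = np.isscalar(first)

# Counter(scalar) fails because a single integer cannot be iterated.
try:
    Counter(first)
    raised = False
except TypeError:
    raised = True

# Passing the whole column works: Counter tallies how often each label occurs.
counts = Counter(k_means_sp["cluster"])
```

So the list comprehension in the question hands 'calcEntropy' one label per call, which is what 'Counter' rejects.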

Answers

0

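OK, here is a first attempt. It looks like your DataFrame stores the cluster index in the 'cluster' column. So what you need to do is select the rows of each cluster by its index and then pass that cluster to your calcEntropy function, like this: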

# Python 3: range replaces xrange, and .loc replaces the removed .ix indexer
for i in range(len(k_means_sp['cluster'].unique())):  # loop through cluster indices
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)

The second line keeps only the rows that belong to the same cluster. Does this help?
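
If the goal is the entropy of the cluster assignments themselves, one possible sketch is to pass the whole 'cluster' column instead of a single label. This is an assumption about what the asker wants, not something the thread confirms. The name calc_entropy and the sample frame below are illustrative only; the values are copied from the head() output in the question.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Hypothetical data mirroring the five rows shown by k_means_sp.head().
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": [1, 0, 2, 0, 2],
})

# Entropy of the label distribution: pass the column, not one label at a time.
label_entropy = calc_entropy(k_means_sp["cluster"])
```

This version also avoids np.float, which recent NumPy releases no longer provide, and builds an array before summing instead of handing np.sum a generator.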

+0

Yes, thank you very much! – bananablue1
