2017-06-01 83 views

RuntimeError: maximum recursion depth exceeded in cmp: K-means clustering

I am implementing the K-means clustering algorithm. This is what I have so far:

import copy 
import csv 
import math 
import random 
import sys 


class Centroid():
    def __init__(self, coordinates, _id):
        self.id = _id
        self.coordinates = coordinates
        self.elements = []

    def __repr__(self):
        return 'Centroid: ' + str(self.id)

    @property
    def count(self):
        return len(self.elements)

    def recalculate_coordinates(self):
        x = [sum(y)/len(y) for y in zip(*self.elements)]
        self.coordinates = x

    def reset_elements(self):
        self.previous_elements = []
        for el in self.elements:
            self.previous_elements.append(el)
        self.elements = []

class Kmeans():
    def __init__(self):
        self.k = int(sys.argv[2])
        self.prepare_data()
        self.iterations = 0

    def prepare_data(self):
        filename = sys.argv[1]
        self.dataset = []
        with open(filename, 'rb') as csvfile:
            reader = csv.reader(csvfile, delimiter=' ')
            for row in reader:
                tuplified = tuple(map(float, row))
                self.dataset.append(tuplified)
        self.create_centroids()

    def create_centroids(self):
        self.centroids = []
        for i in xrange(self.k):
            chosen = random.choice(self.dataset)
            cent = Centroid(chosen, i+1)
            self.centroids.append(cent)

def main():
    k = Kmeans()
    def iterate(k):
        k.iterations += 1
        for item in k.dataset:
            candidates = []
            for centroid in k.centroids:
                z = zip(item, centroid.coordinates)
                squares = map(lambda x: (x[0]-x[1])**2, z)
                added = sum(squares)
                edistance = math.sqrt(added)
                candidates.append((centroid, edistance))
            winner = min(candidates, key=lambda x: x[1])
            winner[0].add_element(item)
        for centroid in k.centroids:
            centroid.reset_elements()
            centroid.recalculate_coordinates()

        status_list = []
        for centroid in k.centroids:
            boole = sorted(centroid.elements) == sorted(centroid.previous_elements)
            status_list.append(boole)

        if False in status_list:
            iterate(k)
        print k.centroids
        print k.iterations
    iterate(k)


if __name__ == '__main__': 
    main() 

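However, I keep getting the error RuntimeError: maximum recursion depth exceeded in cmp. I have tried refactoring several times without success. Can anyone tell me what the problem might be? Thanks in advance.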

Some of the indentation is wrong, and some relevant code is missing. I don't see anything recursive in what you've shown us. –

It's on the third line in 'def iterate'. – theFarkle

On which line does the exception occur? – Billy

Answers


If the error is on this line:

boole = sorted(centroid.elements) == sorted(centroid.previous_elements) 

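What is most likely happening is that you have circular references inside centroid.elements and centroid.previous_elements, so the comparison operations (performed by both the sorted calls and the ==) keep looping through each list without ever finishing.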
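If you want to confirm that this is what is going on, you can walk the nested containers and track which ones are currently on the path. The sketch below is only an illustration under assumptions: `has_cycle` is a hypothetical helper name (it is not in the question's code), it only looks inside lists and tuples, and it is written for Python 3.

```python
def has_cycle(obj, _path=None):
    """Return True if obj (a list or tuple) contains itself at any depth.

    Hypothetical helper for illustration; only lists and tuples are inspected.
    """
    if _path is None:
        _path = set()
    if not isinstance(obj, (list, tuple)):
        return False
    if id(obj) in _path:
        # We reached a container that is already on the current path.
        return True
    _path.add(id(obj))
    try:
        return any(has_cycle(item, _path) for item in obj)
    finally:
        # Leave the path clean so shared (non-circular) sublists are allowed.
        _path.discard(id(obj))


x = []
y = [x]
x.append(y)  # x contains y, which contains x

print(has_cycle(x))               # True
print(has_cycle([(1.0, 2.0)]))    # False
```

Calling something like this on centroid.elements just before the comparison should tell you whether a list has ended up inside itself.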

A simple demonstration of this behavior (in Python 3):

>>> x = [] 
>>> y = [x] 
>>> x.append(y) 
>>> x == y 
Traceback (most recent call last):
    .... 
    x == y  
RecursionError: maximum recursion depth exceeded in comparison 
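The same thing can be reproduced in a script, including the sorted call from the line in question. This is a rough Python 3 sketch, not the asker's code: `safe_equal` is a made-up helper name, and catching RecursionError here is only a way to observe the failure, not a fix for the underlying circular reference.

```python
x = []
y = [x]
x.append(y)  # x now contains y, which contains x


def safe_equal(a, b):
    """Compare a and b; return None if the comparison never bottoms out."""
    try:
        return a == b
    except RecursionError:
        return None


print(safe_equal(x, y))            # None
print(safe_equal([1, 2], [1, 2]))  # True

# Sorting needs to compare the two lists as well, so it fails the same way.
try:
    sorted([x, y])
except RecursionError as exc:
    print("sorted failed:", exc)
```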

Thanks, that was the problem. I can't upvote it though, because of my rep. – theFarkle

But you can accept it :) – Billy
