k-means algorithm not working

I'm trying to implement the k-means algorithm in Python 3 using NumPy. My input data matrix is a simple N×2 matrix of points:

[[1, 2], 
[3, 4], 
    ... 
[7, 13]]

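For some reason, at every iteration step, none of my labels stay the same. Every single label is different from the previous iteration. Does anyone see anything obviously wrong in what I'm doing? I've added some comments to my code so that people can follow the various steps.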

import numpy as np

def kmeans(X, k):

    # Initialize by choosing k random data points as centroids
    num_features = X.shape[1]
    centroids = X[np.random.randint(X.shape[0], size=k), :]  # find k centroids
    iterations = 0
    old_labels, labels = [], []

    while not should_stop(old_labels, labels, iterations):
        iterations += 1

        # Start each cluster with its own centroid as the first member
        clusters = [[] for i in range(0, k)]
        for i in range(k):
            clusters[i].append(centroids[i])

        # Label points
        old_labels = labels
        labels = []
        for point in X:
            distances = [np.linalg.norm(point - centroid) for centroid in centroids]
            max_centroid = np.argmax(distances)
            labels.append(max_centroid)
            clusters[max_centroid].append(point)

        # Compute new centroids
        centroids = np.empty(shape=(0, num_features))
        for cluster in clusters:
            avgs = sum(cluster) / len(cluster)
            centroids = np.append(centroids, [avgs], axis=0)

    return labels

def should_stop(old_labels, labels, iterations):
    # Count how many labels changed since the previous iteration
    count = 0
    if len(old_labels) == 0:
        return False
    for i in range(len(labels)):
        count += (old_labels[i] != labels[i])
    print(count)
    if old_labels == labels or iterations == 2000:
        return True
    return False

Source

2016-11-25 Apollo

max_centroid = np.argmax(distances)

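You want to find the centroid that minimizes the distance, not the one that maximizes it. Use np.argmin instead of np.argmax here.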
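
To see the difference on a single point, here is a small sketch. The sample point and the two centroids are made up for illustration; only the distance computation is taken from the question's code.

```python
import numpy as np

# Illustrative values, not taken from the question's data set
point = np.array([3.0, 4.0])
centroids = np.array([[1.0, 2.0], [7.0, 13.0]])

# Same distance computation as in the question
distances = [np.linalg.norm(point - centroid) for centroid in centroids]

nearest = int(np.argmin(distances))   # index of the closest centroid
farthest = int(np.argmax(distances))  # index of the most distant centroid
print(nearest, farthest)
```

With np.argmax every point is assigned to the centroid it is farthest from, so the assignments are unlikely to settle from one iteration to the next.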

Source

2016-11-25 23:20:16 broncoAbierto

Ugh, thank you so much. – Apollo
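
With the argmin fix applied, the whole loop could be sketched roughly as follows. This is one possible vectorized rewrite under stated assumptions, not the asker's final code: the `max_iterations` and `seed` parameters are hypothetical additions, the initial centroids are sampled without replacement (the question's np.random.randint can pick the same point twice), and a cluster that ends up empty simply keeps its previous centroid.

```python
import numpy as np

def kmeans(X, k, max_iterations=2000, seed=None):
    """Sketch of k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)  # `seed` is an assumed convenience parameter
    X = np.asarray(X, dtype=float)

    # Assumption: pick k distinct data points as the initial centroids
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)]
    labels = None

    for _ in range(max_iterations):
        # distances[i, j] = Euclidean distance from point i to centroid j
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = np.argmin(distances, axis=1)  # nearest centroid, not farthest

        # Stop once the assignment no longer changes
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels

        # Move each centroid to the mean of its assigned points
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)

    return labels, centroids

# Two well-separated groups of three points each (illustrative data)
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centroids = kmeans(X, k=2, seed=0)
print(labels)
```

Which group gets label 0 depends on the random initialization, so only the grouping itself is meaningful.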
