k-means algorithm not working

I want to implement the k-means algorithm in Python 3 using NumPy. My input data is a simple N×2 matrix of points:
[[1, 2],
[3, 4],
...
[7, 13]]
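For some reason, at every step of the iteration, none of my labels match the labels from the previous step. Every single label is different. Does anyone see something obviously wrong in what I'm doing? I have added some comments to my code so that people can follow the various steps.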
import numpy as np

def kmeans(X, k):
    # Initialize by choosing k random data points as centroids
    num_features = X.shape[1]
    centroids = X[np.random.randint(X.shape[0], size=k), :]  # find k centroids
    iterations = 0
    old_labels, labels = [], []
    while not should_stop(old_labels, labels, iterations):
        iterations += 1
        clusters = [[] for i in range(0, k)]
        for i in range(k):
            clusters[i].append(centroids[i])
        # Label points
        old_labels = labels
        labels = []
        for point in X:
            distances = [np.linalg.norm(point - centroid) for centroid in centroids]
            max_centroid = np.argmax(distances)
            labels.append(max_centroid)
            clusters[max_centroid].append(point)
        # Compute new centroids
        centroids = np.empty(shape=(0, num_features))
        for cluster in clusters:
            avgs = sum(cluster) / len(cluster)
            centroids = np.append(centroids, [avgs], axis=0)
    return labels

def should_stop(old_labels, labels, iterations):
    count = 0
    if len(old_labels) == 0:
        return False
    # Count how many labels changed since the previous iteration
    for i in range(len(labels)):
        count += (old_labels[i] != labels[i])
    print(count)
    if old_labels == labels or iterations == 2000:
        return True
    return False
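One thing that stands out in the code above is `np.argmax(distances)`: it assigns each point to its farthest centroid, whereas k-means assigns each point to its nearest one, which would explain labels that keep flipping. Below is a minimal sketch of the same loop using `np.argmin`. The names `kmeans_sketch`, `max_iter`, and `seed` are hypothetical and not part of the original code, and the empty-cluster handling (leave the centroid where it is) is just one assumption among several possible choices.

```python
import numpy as np


def kmeans_sketch(X, k, max_iter=2000, seed=None):
    """Lloyd-style k-means sketch; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Pick k distinct data points as the initial centroids (fancy indexing copies).
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Distance from every point to every centroid, shape (N, k).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Nearest centroid for each point: argmin, not argmax.
        new_labels = np.argmin(distances, axis=1)
        # Stop once the assignment no longer changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Move each centroid to the mean of its members; skip empty clusters.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```

Calling `kmeans_sketch(X, 2)` on an N×2 array returns one integer label per row plus the final centroids. Unlike the original loop, the centroids themselves are not appended to the clusters before averaging, so each mean is taken over the data points only.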
Ugh, thank you very much. – Apollo