將顧客包裝成水桶

我有25位顧客。每個客戶都有我們系統的許多用戶，例如客戶1有45個用戶，客戶2有46個用戶...客戶25有1000個用戶。將顧客包裝成水桶

我想將每個客戶分成一個存儲桶，其中每個存儲桶包含大致相同數量的用戶。我知道我總共需要5桶。

（這裏的桶代表服務器，我想分配我的客戶端到不同的服務器，每個服務器的用戶總數大致相等，以防止服務器過載.1客戶端必須在同一臺服務器上即不能拆分1個客戶端超過2個服務器）

任何想法分配客戶桶適當的方法？我認爲一些聚類方法可能工作（我嘗試使用R的kmeans），但我似乎不能找到方法規定每個羣集中的用戶總數大致相同

這是我的R代碼，作爲我迄今爲止所做的一個示例：

#Create dataset 
r <- data.frame(users=c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500, 420, 400, 390, 300, 210, 200, 160, 80, 70, 50, 49, 48, 47, 46, 45)) 
#Try kmeans clustering 
fit <- kmeans(r, 5) 
#get cluster means 
aggregate(r, by=list(fit$cluster),FUN = mean) 
#append cluster assignment 
r <- data.frame(r,fit$cluster) 

#Plot cluster 
library(cluster) 
clusplot(r, fit$cluster, color=TRUE, shade=TRUE, labels=2, lines=0) 
library(fpc) 
plotcluster(r, fit$cluster)

這會將我的客戶集中到存儲區中，但每個存儲區中的用戶數不會大致相等。

我這個標記作爲的R問題，但如果有其他一些包裝簡單的解決方案，我:-)

來源

2016-11-10 B_Real

我不知道推薦的解決方案對於這種「恆定採樣」是。這裏是我的注意事項 - 對項目進行排序，將其轉換爲矩陣，其中每列表示一個樣本，每隔一行反向。

下面的代碼：

set.seed(1024) 
r <- data.frame(users=c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500, 420, 400, 390, 300, 210, 200, 160, 80, 70, 50, 49, 48, 47, 46, 45)) 

a<- r$users #runif(n = 25, 100,400) #rnorm(25,100,100) # 1:25 
#hist(a) 
df<- data.frame(id=1:25,x=a) 

# sort 
x<- df$id[order(df$x)] 
# convert to matrix 
#each column of this matrix represetns one sample 
xm<-matrix(x,ncol=5,byrow = T); xm 
oldsum<-apply(matrix(df$x,ncol=5,byrow = T), 2,sum) 

#flip alternate rows of this sorted matrix 
i= 1:nrow(xm) 
im=i[c(F,T)] 
xm[im,] 
xm[im,]<- rev(xm[im,]) 

# new matrix of indeices 
xm 

#hence the new matrix of values 
xm2<- matrix(a[c(xm)],ncol = 5, byrow = F) 
xm 
xm2 

newsum<- (apply(xm2, 2,sum)) 

# improvement 
rbind(oldsum,newsum) 
barplot(rbind(oldsum,newsum)[1,]) 
barplot(rbind(oldsum,newsum)[2,]) 

# each column of following matrix represents one sample 
#(values are indices in original vector a) 
xm

來源

2016-11-10 23:06:47

我認爲這可以做這份工作！我向你傾訴我的帽子。非常感謝你的努力:-) –

很高興知道它的幫助。感謝您的評論。 –

所有的耳朵而不是試圖集羣（它解決了一個非常diffefent問題，即把類似值轉換爲簇）您在這裏有一個經典的bin packing problem變體。

這些往往是NP難，所以找到最佳的解決方案是非常昂貴的。相反，嘗試一個貪婪的策略：估計最小桶大小爲總/桶。處理大小遞減的元素，並始終將其放入可用空間最多的存儲區中。爲了獲得更好的結果，如果這改善了結果，則添加一個優化函數，該函數可在成對桶之間交換元素。如果你有很多小的價值，這樣的策略可能會工作得很好。

來源

2016-11-10 20:56:14

沒錯，這看起來像一個裝箱問題好嗎！感謝您的迴應。 @ R.S的迴應看起來好像會起作用！很高興:-) –

將顧客包裝成水桶

回答

相關問題