2014-09-02 53 views
0

sapply with multiple arguments in R: I want to compute a weighted mean by factor with the code below

factor <- factor(cut(var1, quantile(var1, seq(0,1,0.1)))) 
var2_split = split(vat2, factor) 
weight_split = split(weight, factor) 
sapply(var2_split, weighted.mean, weight_split) 

I get the following error:

Error in FUN(X[[1L]], ...) : 'x' and 'w' must have the same length 

How should I structure my vector and weights for sapply?

As an example:

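Suppose I have a matrix M with 3 columns x, y, z, where x is a set of target values, y is a set of weights, and z is a set of values on which I want to bucket weighted.mean(x, y). Specifically, I want weighted.mean(x, y) computed within each quartile of z.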

# Code that doesn't work 

x <- c(1,2,3,4,5,6) 
y <- c(6,3,4,2,3,4) 
z <- c(1,1,2,3,3,4) 
m <- matrix(c(x,y,z), nrow=6, ncol=3)
# bucket z by quartile. 
z.factor <- cut(z, quantile(z, seq(0,1,0.25)), include.lowest=TRUE) 
x.split = split(x, z.factor) 
y.split = split(y, z.factor) 
# want to bucket weighted.mean(x,y) on quartiles of z 
sapply(x.split, weighted.mean, y.split) 
+2

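sapply can only iterate over one vector/list at a time. If you want to iterate along var2_split and weight_split simultaneously, try `mapply` or `Map`. If you include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with your question, it will be easier to give a more specific answer. – MrFlick 2014-09-02 19:21:00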

+0

Does the example above work with mapply? – user196711 2014-09-03 18:50:10

+1

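The example above is not [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Provide some sample input data (i.e. 'var1', 'vat2', 'weight', etc.) so that it can be run and tested with data that looks like your actual input. – MrFlick 2014-09-03 18:52:31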

Answer

0

With your specific sample, try

#first, note the include.lowest=TRUE to get all values 
z.factor <- factor(cut(z, quantile(z, seq(0,1,0.25)), include.lowest=TRUE)) 

#same 
x.split = split(x, z.factor) 
y.split = split(y, z.factor) 

# here we use mapply 
mapply(weighted.mean, x.split, y.split) 

This gives

[1,1.25] (1.25,2.5] (2.5,3]  (3,4] 
1.333333 3.000000 4.600000 6.000000 

This appears to be correct for your sample input.
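For context on why the original call failed: sapply passes any extra arguments unchanged to every call, so each group of x was paired with the whole list of weight groups rather than its own weights, hence the length mismatch. mapply walks both lists in parallel. As a rough sketch of an alternative that is not part of the original suggestion, you can also split the row indices once and index both vectors inside the function; the names `idx`, `by.index`, and `by.mapply` below are only illustrative.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(6, 3, 4, 2, 3, 4)
z <- c(1, 1, 2, 3, 3, 4)

# bucket z by quartile, keeping the lowest value in the first bucket
z.factor <- cut(z, quantile(z, seq(0, 1, 0.25)), include.lowest = TRUE)

# split the positions once; each element of idx holds the row numbers of one bucket
idx <- split(seq_along(z), z.factor)

# a single sapply over the index groups, picking matching x and y values
by.index <- sapply(idx, function(i) weighted.mean(x[i], y[i]))

# the mapply version, for comparison
by.mapply <- mapply(weighted.mean, split(x, z.factor), split(y, z.factor))
```

Both results should agree. The index-based form can be convenient when more than two vectors have to be grouped together, since only one split is needed.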

+0

Great, thanks. – user196711 2014-09-03 20:13:26