2017-06-06 87 views
0

如何在mgcv::gam中預測何時適合可能含有隨機效應的模型?你是如何預測gam?帶有可重複的示例

本網站上的 「排除」 招另一個線程不適合我(https://stats.stackexchange.com/questions/131106/predicting-with-random-effects-in-mgcv-gam


ya <- rnorm(100, 0, 1) 
yb <- rnorm(100,0,1.5) 
yc <- rnorm(100, 0, 2) 
yd <- rnorm(100, 0, 2.5) 

yy <- c(ya,yb,yc,yd) #so, now we've got data from 4 different groups. 
xx <- c(rep("a", 100), rep("b",100), rep("c",100),rep("d",100)) #groups 
zz <- rnorm(400,0,1) #some other covariate 

model <- gam(yy ~ zz + s(xx, bs = "re")) #the model 

predictdata <- data.frame(zz = 5) #new data 
predict(model, newdata = predictdata, exclude = "s(xx)") #prediction 

工作,這將產生錯誤

Error in model.frame.default(ff, data = newdata, na.action = na.act) : 
    variable lengths differ (found for 'xx') 
In addition: Warning messages: 
1: In predict.gam(model, newdata = predictdata, exclude = "s(xx)") : 
    not all required variables have been supplied in newdata! 

2: 'newdata' had 1 row but variables found have 400 rows 

我mgcv包最新的。

編輯:

如果更改predictdata到

predictdata <- data.frame(zz = 5, xx = "f") 

然後它說

Error in predict.gam(model, newdata = predictdata, exclude = "s(xx)") : 
    f not in original fit 

回答

0

我嘗試用你的榜樣,它似乎是 '排除' 語句不工作,甚至儘管您必須在新數據值中指定包含在用於擬合模型的原始數據集中的隨機效果。但是,這讓我有點不安。另一個警告是,'排除'似乎沒有工作在一個模型的方差結構,這是一個單獨的組估計(我試着用另一個數據集),即像s(xx,s =「re」,by =組)。您可能希望發佈帖子或將問題移至交叉驗證的位置,以便其他統計人員/分析人員可以看到它,或許可以提供更好的答案。

以下是我的代碼。請注意,我改變了組a和d的方法,但整體平均值應該在零附近。

ya <- rnorm(100, 1, 1) 
yb <- rnorm(100, 0,1.5) 
yc <- rnorm(100, 0, 2) 
yd <- rnorm(100, -1, 2.5) 

yy <- c(ya,yb,yc,yd) #so, now we've got data from 4 different groups. 
xx <- c(rep("a", 100), rep("b",100), rep("c",100),rep("d",100)) #groups 
zz <- rnorm(400,0,1) #some other covariate 

some.data= data.frame(yy,xx,zz) 
model <- gam(yy ~ zz + s(xx, bs = "re"),data=some.data) #the model 


# the intercept is the overall mean when zz is zero 
summary(model) 

predictdata <- data.frame(zz = c(0,0,0,0), xx =c("a","b","c","d")) #new data 

#excluding random effects. Estimate should be the same for all and should be the intercept 
predict(model, newdata = predictdata, exclude = "s(xx)") 

#including random effects. Estimates should differ by group with 'a' larger and 'd' smaller 
predict(model, newdata = predictdata)