Prediction with Poisson regression

I am currently working with a data set to which I have fitted a Poisson regression model:

glm1 <- glm(FALL ~ GRP + AGE + SEX + offset(log(FU)), family=poisson, data=dat)

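Now I need to predict the number of falls over one year for women in the control group.

I need to use the predict function, but I don't know how. I tried several things, and my last attempt was this: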

levels(dat$GRP) 
levels(dat$SEX) 
SEX="FEMALE" 
GRP="CONTROL" 
FU="12" 
y<- predict(glm1, type = 'response') 
plot(x=dat$AGE[order(dat$AGE)],y=y[order(dat$FALL)],type='l')

But this only gives me a strange-looking plot. What do I need to do?

Edit: data added on request, for reproducibility

dat <- structure(list(FALL = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 
2L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 
3L, 0L, 1L, 1L, 0L, 0L, 2L, 3L, 0L, 0L, 3L, 1L, 0L, 0L, 2L, 1L, 
2L, 2L, 1L, 1L, 0L, 0L, 0L, 4L, 1L, 0L, 0L, 0L, 0L, 2L, 3L, 1L, 
0L, 1L, 2L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
3L, 4L, 0L, 1L, 0L, 0L, 1L, 1L, 2L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 
1L, 0L, 1L, 0L, 0L, 3L, 0L, 0L, 2L, 0L, 0L, 2L, 0L, 3L, 1L, 0L, 
0L, 1L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 1L, 0L), GRP = structure(c(1L, 
2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 
2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 
2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 
2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 1L), .Label = c("CONTROL", "TAI CHI"), class = "factor"), 
FU = c(18, 12, 17, 4, 23, 16, 22, 24, 23, 11, 22, 9, 23, 
8, 20, 17, 23, 17, 15, 17, 19, 21, 22, 16, 14, 21, 20, 21, 
7, 22, 19, 12, 15, 21, 24, 11, 23, 21, 10, 15, 19, 19, 16, 
24, 17, 23, 16, 17, 18, 18, 20, 8, 21, 16, 15, 19, 23, 14, 
13, 6, 16, 18, 9, 7, 16, 14, 16, 18, 13, 12, 15, 22, 17, 
17, 20, 21, 11, 24, 9, 13, 24, 12, 21, 20, 19, 17, 21, 15, 
17, 11, 24, 10, 18, 9, 16, 19, 6, 13, 22, 18, 10, 15, 14, 
21, 21, 5, 24, 21, 11, 23, 21, 16, 22, 6, 24, 18, 21), AGE = c(71, 
81, 71, 79, 77, 79, 76, 86, 75, 75, 76, 83, 71, 80, 77, 79, 
77, 74, 83, 81, 83, 79, 74, 79, 78, 85, 82, 71, 81, 78, 82, 
74, 73, 75, 83, 78, 83, 83, 65, 75, 75, 75, 75, 78, 80, 69, 
80, 73, 74, 79, 76, 78, 70, 77, 77, 76, 84, 71, 73, 76, 80, 
77, 74, 78, 68, 76, 77, 76, 72, 72, 76, 82, 72, 80, 78, 83, 
80, 73, 79, 75, 79, 75, 80, 77, 81, 78, 74, 79, 78, 74, 79, 
77, 77, 85, 79, 73, 78, 73, 70, 68, 74, 82, 75, 77, 77, 73, 
73, 83, 74, 87, 76, 81, 77, 78, 66, 79, 82), SEX = structure(c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 
2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 
1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L), .Label = c("FEMALE", 
"MALE"), class = "factor")), .Names = c("FALL", "GRP", "FU", 
"AGE", "SEX"), class = "data.frame", row.names = c(NA, -117L))

Kind regards.

Edit: on confidence intervals

I have one more question. I created the confidence intervals like this:

prs <- predict(glm1, newdata = newdat, type = "response", se.fit=TRUE) 
newdat$pred <- prs[[1]] 
newdat$se <- prs[[2]] 
newdat$lo <- newdat$pred - 1.96 * newdat$se 
newdat$up <- newdat$pred + 1.96 * newdat$se

But is it possible to plot this in the same graph?

Source

2016-12-05 Tessa Dierks

Thank you for the suggestion, I will add it :)

I used this: glm1 <- (glm(FALL ~ GRP + AGE + SEX + offset(log(FU)), family = poisson, data = dat))

OK, thank you very much! I will try it and see whether it works :)

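When you use predict, you need to set newdata. Calling predict without newdata simply returns the fitted values for the data the model was fitted on, so your predict call essentially gives you glm1$fitted.values.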
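A minimal sketch of that behaviour. It uses a small simulated data frame `sim` as a stand-in for `dat` (the simulation, the seed and the names `sim`, `fit` and `p` are assumptions for illustration only, not the data from the question):

```r
## simulated stand-in for `dat` (an assumption, not the real data)
set.seed(1)
n <- 100
sim <- data.frame(
  GRP = factor(sample(c("CONTROL", "TAI CHI"), n, replace = TRUE)),
  SEX = factor(sample(c("FEMALE", "MALE"), n, replace = TRUE)),
  AGE = sample(65:87, n, replace = TRUE),
  FU  = sample(4:24, n, replace = TRUE)
)
sim$FALL <- rpois(n, lambda = 0.05 * sim$FU)

fit <- glm(FALL ~ GRP + AGE + SEX + offset(log(FU)),
           family = poisson, data = sim)

## without `newdata`, `predict` returns one value per row of the fitting data
p <- predict(fit, type = "response")
length(p) == nrow(sim)

## and those values are just the fitted values stored in the model
all.equal(p, fitted(fit))
```

Because the result has one value per observed row, plotting it against sorted `AGE` mixes together different groups, sexes and follow-up times, which is probably why the plot looked odd.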

You want predictions for SEX == "FEMALE" in GRP == "CONTROL" with FU == 12. Use

## I use `AGE = 65:87` because this is what `range(dat$AGE)` gives 
## we must provide all covariates used in model formula to make `predict` work 
## recycling rule is applied here. 
## `GRP`, `SEX` and `FU` are given a single value, while `AGE` has length 23 
## they will be recycled 23 times 
newdat <- data.frame(AGE = 65:87, GRP = "CONTROL", SEX = "FEMALE", FU = 12) 
pred <- predict(glm1, newdata = newdat, type = "response") 
plot(newdat$AGE, pred, type = "l")
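Since `FU` enters the model as `offset(log(FU))`, the value of `FU` in `newdata` sets the exposure time the predicted count refers to. The sketch below illustrates this on simulated stand-in data (the data frame `sim`, the seed, and the names `fit`, `base`, `p1` and `p12` are hypothetical; it assumes `FU` is measured in months, as the question implies):

```r
## simulated stand-in for `dat` (an assumption, not the real data)
set.seed(2)
n <- 100
sim <- data.frame(
  GRP = factor(sample(c("CONTROL", "TAI CHI"), n, replace = TRUE)),
  SEX = factor(sample(c("FEMALE", "MALE"), n, replace = TRUE)),
  AGE = sample(65:87, n, replace = TRUE),
  FU  = sample(4:24, n, replace = TRUE)
)
sim$FALL <- rpois(n, lambda = 0.05 * sim$FU)

fit <- glm(FALL ~ GRP + AGE + SEX + offset(log(FU)),
           family = poisson, data = sim)

## one covariate profile, predicted for two different follow-up times
base <- data.frame(AGE = 75, GRP = "CONTROL", SEX = "FEMALE")
p1  <- predict(fit, newdata = cbind(base, FU = 1),  type = "response")
p12 <- predict(fit, newdata = cbind(base, FU = 12), type = "response")

## the offset is added on the log scale, so the expected count
## scales linearly with follow-up time
all.equal(unname(p12), 12 * unname(p1))
```

So `FU = 12` gives the expected number of falls over 12 months of follow-up, while `FU = 1` gives a monthly rate; a vector such as `0:12` would instead request a separate prediction for each follow-up length, and `FU = 0` makes `log(FU)` equal to `-Inf`.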

At first, I suggested:

newdat <- subset(dat, GRP == "CONTROL" & SEX == "FEMALE" & FU == 12)

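But that is a bad idea. It gives you an empty data frame, because no rows in your dat match the selection criteria.

Follow-up (which actually deserves an answer more than the question above)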

I have one more question. I created the confidence intervals like this:

prs <- predict(glm1, newdata = newdat, type = "response", se.fit=TRUE) 
newdat$pred <- prs[[1]] 
newdat$se <- prs[[2]] 
newdat$lo <- newdat$pred - 1.96 * newdat$se 
newdat$up <- newdat$pred + 1.96 * newdat$se

But is it possible to plot this in the same graph?

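Your confidence interval is not computed correctly. The response is not normally distributed, so you should not apply ± 1.96 standard errors on the response scale. The linear predictor is asymptotically normal, so you need to construct the confidence interval for the linear predictor and then map it to the response scale with the inverse link function.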

ginv <- glm1$family$linkinv ## inverse link function 
prs <- predict(glm1, newdata = newdat, type = "link", se.fit=TRUE) 
newdat$pred <- ginv(prs[[1]]) 
newdat$lo <- ginv(prs[[1]] - 1.96 * prs[[2]]) 
newdat$up <- ginv(prs[[1]] + 1.96 * prs[[2]])
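To see how the two constructions differ, here is a rough comparison on simulated stand-in data (the data frame `sim`, the seed, and the names `fit`, `newd`, `naive_lo`, `link_lo` and `link_up` are assumptions for illustration, not part of the original question):

```r
## simulated stand-in for `dat` (an assumption, not the real data)
set.seed(3)
n <- 100
sim <- data.frame(
  GRP = factor(sample(c("CONTROL", "TAI CHI"), n, replace = TRUE)),
  SEX = factor(sample(c("FEMALE", "MALE"), n, replace = TRUE)),
  AGE = sample(65:87, n, replace = TRUE),
  FU  = sample(4:24, n, replace = TRUE)
)
sim$FALL <- rpois(n, lambda = 0.05 * sim$FU)

fit <- glm(FALL ~ GRP + AGE + SEX + offset(log(FU)),
           family = poisson, data = sim)
newd <- data.frame(AGE = 65:87, GRP = "CONTROL", SEX = "FEMALE", FU = 12)

## naive lower bound: symmetric around the prediction on the response scale
r <- predict(fit, newdata = newd, type = "response", se.fit = TRUE)
naive_lo <- r$fit - 1.96 * r$se.fit

## interval built on the link (log) scale, then back-transformed
l <- predict(fit, newdata = newd, type = "link", se.fit = TRUE)
ginv <- fit$family$linkinv
link_lo <- ginv(l$fit - 1.96 * l$se.fit)
link_up <- ginv(l$fit + 1.96 * l$se.fit)

## the back-transformed bounds are positive and bracket the prediction
all(link_lo > 0)
all(link_lo < r$fit & r$fit < link_up)

## side-by-side view of the two lower bounds
head(cbind(AGE = newd$AGE, naive_lo, link_lo))
```

The back-transformed interval is asymmetric around the prediction and can never include negative counts, whereas nothing stops the naive lower bound from dropping below zero when the standard error is large relative to the prediction.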

To draw them in the same plot, you can use plot + lines:

with(newdat, plot(AGE, pred, type = "l", ylim = c(min(lo), max(up)))) 
with(newdat, lines(AGE, lo, lty = 2)) 
with(newdat, lines(AGE, up, lty = 2))

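Alternatively, you can use matplot:

## pass `AGE` as x so the horizontal axis shows age rather than the row index
matplot(newdat$AGE, newdat[c("pred", "lo", "up")], type = "l", col = 1, lty = c(1, 2, 2), xlab = "AGE", ylab = "pred")

Source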

2016-12-06 06:59:29

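Ah, thank you very much! This really explains a lot! Thanks!!

Thank you so much! You really made my day! :) Thanks!!

I was wondering about the 12-month follow-up. Do I set FU = 12 or something else? I tried 0:12, but that did not work, and 4:12 (4 being the minimum) did not work either. Do you perhaps have any tips?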
