2017-08-10 216 views

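How do I obtain and plot percentiles of the Poisson distribution in R? Basically, I want to create a plot with the year on the x-axis and the 50th percentile (median) of the Poisson distribution drawn as a line on the y-axis. My sample data and script are below.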

dt<-structure(list(yr = 1979:2008, cn = c(9, 15, 17, 11, 9, 10, 8, 
10, 18, 12, 11, 14, 12, 16, 10, 13, 9, 9, 11, 11, 14, 14, 10, 
11, 14, 15, 14, 12, 9, 12), `inn` = c(1.12666666666667, 1.35666666666667, 
-0.0533333333333333, -0.166666666666667, 0.213333333333333, -0.0533333333333333, 
-1.32, 0.0633333333333333, -0.22, 0.01, -0.456666666666667, -1.01, 
-0.326666666666667, 0.0233333333333334, -0.496666666666667, -1.24, 
0.2, -0.46, 0.32, 0.63, 0.466666666666667, -0.0233333333333333, 
0.33, 0.503333333333333, 0.0566666666666667, -0.396666666666667, 
0.58, -0.596666666666667, 0.98, 1.01666666666667)), .Names = c("yr", 
"cn", "inn"), row.names = c(NA, -30L), class = "data.frame")
# stored as a plain data.frame: the grouped_df attributes written by an old dplyr version are not needed here

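# Poisson regression of the yearly counts on inn (log link)
model <- glm(cn ~ inn, data = dt, family = poisson(link = "log"))
summary(model)
fitted(model)  # fitted Poisson means, one per year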

## create the plot
library(ggplot2)
P <- ggplot(dt, aes(x = yr)) +
    geom_point(aes(y = cn))
P

How do you want to plot them? As points in a different color, or as a line connecting the means? –


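The glm function does not estimate medians. You may need to use quantile regression. It is also unclear what you intend to do with the model, because your plot only has the "Y values" and not the independent variable from the model. Do you intend to use the predictions at the median of the "X values"? –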
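To illustrate the distinction raised in this comment: a Poisson GLM estimates the conditional mean, and the median of a Poisson distribution with a given mean can then be read off with `qpois`. This is a minimal sketch; the `lambda` values are made up for illustration and are not taken from the model in the question.

```r
# Hypothetical Poisson means (assumed values, not fitted from the data above)
lambda <- c(9.5, 12.2, 14.8)

# Median of a Poisson distribution with each of these means
med <- qpois(0.5, lambda)

# Side by side: the median is an integer and need not equal the mean
data.frame(mean = lambda, median = med)
```

This gives the median implied by the fitted Poisson mean, which is not the same thing as a median estimated directly by quantile regression.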

Answer

1

I think the following is the code you need:

library(ggplot2)

# prepare a single data frame containing all the information
mu <- fitted(model)  # fitted Poisson means, one per year
dataPlot <- data.frame(x = dt$yr                # x values
                     , y = dt$cn                # observed counts
                     , q875 = qpois(0.875, mu)  # upper bound of the central 75% interval
                     , q625 = qpois(0.625, mu)  # upper bound of the central 25% interval
                     , q50  = qpois(0.50, mu)   # median
                     , q375 = qpois(0.375, mu)  # lower bound of the central 25% interval
                     , q125 = qpois(0.125, mu)  # lower bound of the central 75% interval
                       )

# create the plot object
P <- ggplot(dataPlot, aes(x = x)) +  # add data and set x-axis
      # shade the central 75% interval (the area between `ymin` and `ymax`, see ?geom_ribbon)
      geom_ribbon(aes(ymin = q125, ymax = q875), fill = "gray") +
      # shade the central 25% interval
      geom_ribbon(aes(ymin = q375, ymax = q625), fill = "lightgray") +
      geom_point(aes(y = y)) +  # add the points
      geom_line(aes(y = q50))   # add the median line
# and plot it
P
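As a side note on what the band columns contain: `qpois(p, lambda)` returns the smallest integer `x` for which `ppois(x, lambda) >= p`, so every quantile is a whole number. The sketch below checks that property for a single mean; `lambda = 12` is an arbitrary value chosen for illustration.

```r
lambda <- 12  # hypothetical Poisson mean, for illustration only
p <- c(0.125, 0.375, 0.5, 0.625, 0.875)

# Quantiles used for the bands and the median line
q <- qpois(p, lambda)

# Defining property: the CDF at q reaches p, while the CDF at q - 1 does not
reaches_p  <- ppois(q, lambda) >= p
falls_short <- ppois(q - 1, lambda) < p

data.frame(p = p, quantile = q, reaches_p, falls_short)
```

Because the quantiles are integers, the ribbons drawn from them move in steps rather than smoothly.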

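With your data the bands come out as two straight rectangles (qpois() returns integer quantiles, so small changes in the fitted mean do not move them). The following normally distributed random data will let you see a plot that looks more like the one in your picture: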

# simulated example: normal quantiles with random means and standard deviations
m <- rexp(NROW(dt), rate = 2)  # generate random means
s <- rexp(NROW(dt), rate = 5)  # generate random standard deviations
dataPlot <- data.frame(x = dt$yr                              # your x-values
                     , y = rnorm(NROW(dt), mean = m, sd = s)  # random y-values
                     , q875 = qnorm(0.875, mean = m, sd = s)  # from now on, see previous comments
                     , q625 = qnorm(0.625, mean = m, sd = s)
                     , q50  = qnorm(0.50, mean = m, sd = s)
                     , q375 = qnorm(0.375, mean = m, sd = s)
                     , q125 = qnorm(0.125, mean = m, sd = s)
                       )
# re-run the ggplot code above on this dataPlot to draw the bands
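Both data frames repeat the same five quantile columns, so the pattern could be factored into a small helper. This is only a sketch: `quantile_bands` is a hypothetical name that does not come from the answer or from any package, and the inputs below are illustrative values.

```r
# Hypothetical helper (not part of the original answer): builds the five
# quantile columns for any quantile function such as qpois or qnorm.
# Extra arguments in ... are passed on to the quantile function.
quantile_bands <- function(qfun, ...) {
  probs <- c(q125 = 0.125, q375 = 0.375, q50 = 0.5, q625 = 0.625, q875 = 0.875)
  as.data.frame(lapply(probs, qfun, ...))
}

# Illustrative inputs, chosen only for this sketch
pois_bands <- quantile_bands(qpois, lambda = c(10, 12, 15))
norm_bands <- quantile_bands(qnorm, mean = c(0, 1, 2), sd = c(1, 0.5, 2))

pois_bands
norm_bands
```

With the objects from the answer, something like `cbind(data.frame(x = dt$yr, y = dt$cn), quantile_bands(qpois, lambda = fitted(model)))` should produce the same columns as `dataPlot`.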

Thanks, this is what I was looking for :) – Cirrus