2017-03-15 79 views
4

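Using the prophet package to forecast by group in a data frame in R

I am using Prophet, the new package released by Facebook. It does time series forecasting, and I want to apply it by group.

Scroll down to the R section.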

https://facebookincubator.github.io/prophet/docs/quick_start.html

Here is my attempt:

grouped_output = df %>% group_by(group) %>% 
    do(m = prophet(df[,c(1,3)])) %>% 
    do(future = make_future_dataframe(m, period = 7)) %>% 
    do(forecast = prophet:::predict.prophet(m, future)) 

grouped_output[[1]] 

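I then need to extract the results for each group from the resulting list, which is where I am having trouble.

Here is my original data frame, without groups: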

ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04', 
        '2016-11-05','2016-11-06','2016-11-07','2016-11-08', 
        '2016-11-09','2016-11-10','2016-11-11','2016-11-12', 
        '2016-11-13','2016-11-14','2016-11-15','2016-11-16', 
        '2016-11-17','2016-11-18','2016-11-19','2016-11-20', 
        '2016-11-21','2016-11-22','2016-11-23','2016-11-24', 
        '2016-11-25','2016-11-26','2016-11-27','2016-11-28', 
        '2016-11-29','2016-11-30')) 
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90) 
y<-as.numeric(y) 
df <- data.frame(ds, y) 

df 

      ds y 
1 2016-11-01 15 
2 2016-11-02 17 
3 2016-11-03 18 
4 2016-11-04 19 
5 2016-11-05 20 
6 2016-11-06 54 
7 2016-11-07 67 
8 2016-11-08 23 
9 2016-11-09 12 
10 2016-11-10 34 
11 2016-11-11 12 
12 2016-11-12 78 
13 2016-11-13 34 
14 2016-11-14 12 
15 2016-11-15 3 
16 2016-11-16 45 
17 2016-11-17 67 
18 2016-11-18 89 
19 2016-11-19 12 
20 2016-11-20 111 
21 2016-11-21 123 
22 2016-11-22 112 
23 2016-11-23 14 
24 2016-11-24 566 
25 2016-11-25 345 
26 2016-11-26 123 
27 2016-11-27 567 
28 2016-11-28 56 
29 2016-11-29 87 
30 2016-11-30 90 

The function currently works when I run it for a single group, as follows:

#install.packages('prophet') 
library(prophet) 
m<-prophet(df) 
future <- make_future_dataframe(m, period = 7) 
forecast <- prophet:::predict.prophet(m, future) 

forecast$yhat 
[1] -2.649032 -29.762095 128.169781 59.573684 -11.623727 107.473617 -29.949730 -42.862455 -62.378408 104.797639 46.868610 
[12] -12.502864 119.282058 -4.914921 -4.402638 -10.643570 169.309505 123.321261 74.734746 215.856347 99.290218 105.508059 
[23] 102.882915 284.245984 237.401258 185.688202 321.466962 197.451536 194.280518 180.535663 349.304365 288.684031 222.337210 
[34] 342.968499 203.648851 185.377165 

I now want to change this so that the prophet:::predict function is applied to each group. The new data frame, with groups, looks like this:

ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04', 
      '2016-11-05','2016-11-06','2016-11-07','2016-11-08', 
      '2016-11-09','2016-11-10','2016-11-11','2016-11-12', 
      '2016-11-13','2016-11-14','2016-11-15','2016-11-16', 
      '2016-11-17','2016-11-18','2016-11-19','2016-11-20', 
      '2016-11-21','2016-11-22','2016-11-23','2016-11-24', 
      '2016-11-25','2016-11-26','2016-11-27','2016-11-28', 
      '2016-11-29','2016-11-30', 


      '2016-11-01','2016-11-02','2016-11-03','2016-11-04', 
      '2016-11-05','2016-11-06','2016-11-07','2016-11-08', 
      '2016-11-09','2016-11-10','2016-11-11','2016-11-12', 
      '2016-11-13','2016-11-14','2016-11-15','2016-11-16', 
      '2016-11-17','2016-11-18','2016-11-19','2016-11-20', 
      '2016-11-21','2016-11-22','2016-11-23','2016-11-24', 
      '2016-11-25','2016-11-26','2016-11-27','2016-11-28', 
      '2016-11-29','2016-11-30')) 
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90, 
    45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21) 
y<-as.numeric(y) 

group<-c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A", 
    "A","A","A","A","A","A","A","A","A","A","A","A","A","A","A", 
    "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B", 
    "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B") 
df <- data.frame(ds,group, y) 

df 

      ds group y 
1 2016-11-01  A 15 
2 2016-11-02  A 17 
3 2016-11-03  A 18 
4 2016-11-04  A 19 
5 2016-11-05  A 20 
6 2016-11-06  A 54 
7 2016-11-07  A 67 
8 2016-11-08  A 23 
9 2016-11-09  A 12 
10 2016-11-10  A 34 
11 2016-11-11  A 12 
12 2016-11-12  A 78 
13 2016-11-13  A 34 
14 2016-11-14  A 12 
15 2016-11-15  A 3 
16 2016-11-16  A 45 
17 2016-11-17  A 67 
18 2016-11-18  A 89 
19 2016-11-19  A 12 
20 2016-11-20  A 111 
21 2016-11-21  A 123 
22 2016-11-22  A 112 
23 2016-11-23  A 14 
24 2016-11-24  A 566 
25 2016-11-25  A 345 
26 2016-11-26  A 123 
27 2016-11-27  A 567 
28 2016-11-28  A 56 
29 2016-11-29  A 87 
30 2016-11-30  A 90 
31 2016-11-01  B 45 
32 2016-11-02  B 23 
33 2016-11-03  B 12 
34 2016-11-04  B 10 
35 2016-11-05  B 21 
36 2016-11-06  B 34 
37 2016-11-07  B 12 
38 2016-11-08  B 45 
39 2016-11-09  B 12 
40 2016-11-10  B 44 
41 2016-11-11  B 87 
42 2016-11-12  B 45 
43 2016-11-13  B 32 
44 2016-11-14  B 67 
45 2016-11-15  B 1 
46 2016-11-16  B 57 
47 2016-11-17  B 87 
48 2016-11-18  B 99 
49 2016-11-19  B 33 
50 2016-11-20  B 234 
51 2016-11-21  B 456 
52 2016-11-22  B 123 
53 2016-11-23  B 89 
54 2016-11-24  B 333 
55 2016-11-25  B 411 
56 2016-11-26  B 232 
57 2016-11-27  B 455 
58 2016-11-28  B 55 
59 2016-11-29  B 90 
60 2016-11-30  B 21 

How can I use the prophet package to forecast y-hat by group rather than in total?

Answers

4

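Here is a solution that uses tidyr::nest to nest the data by group, purrr::map to fit a model to each group, and then retrieves y-hat as requested. I took your code but folded it into mutate calls, which compute the new columns with purrr::map.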

library(prophet) 
library(dplyr) 
library(purrr) 
library(tidyr) 

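d1 <- df %>% 
    nest(-group) %>%                                                # one row per group, data in a list-column
    mutate(m = map(data, prophet)) %>%                              # fit one model per group
    mutate(future = map(m, make_future_dataframe, periods = 7)) %>% # future dates for each model
    mutate(forecast = map2(m, future, predict))                     # forecast each group with its own model

Here is the output at this point: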

d1 
# A tibble: 2 × 5 
    group    data   m    future 
    <fctr>   <list>  <list>    <list> 
1  A <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]> 
2  B <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]> 
# ... with 1 more variables: forecast <list> 

I then use unnest() on the forecast column to retrieve the data and select the requested y-hat values.

d <- d1 %>% 
    unnest(forecast) %>% 
    select(ds, group, yhat) 

Here is the output for the newly forecast values:

d %>% group_by(group) %>% 
    top_n(7, ds) 
Source: local data frame [14 x 3] 
Groups: group [2] 

      ds group  yhat 
     <date> <fctr>  <dbl> 
1 2016-11-30  A 180.53422 
2 2016-12-01  A 349.30277 
3 2016-12-02  A 288.68215 
4 2016-12-03  A 222.33501 
5 2016-12-04  A 342.96654 
6 2016-12-05  A 203.64625 
7 2016-12-06  A 185.37395 
8 2016-11-30  B 131.07827 
9 2016-12-01  B 222.83703 
10 2016-12-02  B 236.33555 
11 2016-12-03  B 145.41001 
12 2016-12-04  B 228.59687 
13 2016-12-05  B 162.49244 
14 2016-12-06  B 68.44477 
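
For readers on tidyr 1.0.0 or later, where `nest(-group)` emits a deprecation warning, the same pipeline can be sketched with the named form `nest(data = -group)`. This is a minimal sketch rather than the answerer's original code: the compact construction of `df` simply rebuilds the question's data, and dropping the other list-columns before `unnest()` is a choice made here for tidiness.

```r
library(prophet)
library(dplyr)
library(purrr)
library(tidyr)

# Rebuild the question's grouped data frame (same dates and y values)
ds <- rep(seq(as.Date("2016-11-01"), by = "day", length.out = 30), times = 2)
group <- rep(c("A", "B"), each = 30)
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90,
       45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21)
df <- data.frame(ds, group, y)

d <- df %>%
  nest(data = -group) %>%                                      # one row per group
  mutate(m = map(data, prophet),                               # fit one model per group
         future = map(m, make_future_dataframe, periods = 7),  # future dates per model
         forecast = map2(m, future, predict)) %>%              # pair each model with its own dates
  select(group, forecast) %>%                                  # keep only what will be unnested
  unnest(forecast) %>%
  select(group, ds, yhat)
```

The exact number of forecast rows per group may differ between prophet versions, so no expected output is shown.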
+0

I am not sure whether I should use `map(m, ~predict(.x, future))` or `map2(m, future, predict(.x, .y))`? They seem to give the same output here. – FlorianGD

+0

I should use `map2`; the same result was obtained only because I had a variable called future in my session – FlorianGD
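
The difference between the two calls can be seen without prophet. In the sketch below, plain numbers stand in for the fitted models and the future data frames (an assumption made purely for illustration): `map()` with a formula looks up `future` in the calling environment and reuses that one object for every element, whereas `map2()` pairs the i-th model with the i-th future frame.

```r
library(purrr)

models  <- list(A = 1, B = 2)    # stand-ins for the per-group fitted models
futures <- list(A = 10, B = 20)  # stand-ins for the per-group future data frames

# A leftover global, like the `future` object from the single-group run
future <- 100

# map(): every model is combined with the same global `future`
shared <- map(models, ~ .x + future)

# map2(): each model is combined with its own element of `futures`
paired <- map2(models, futures, ~ .x + .y)

shared  # list(A = 101, B = 102)
paired  # list(A = 11, B = 22)
```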

+0

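This is great, thank you. One note for anyone else reading this: it did not work for me as written. predict did not work for me, so I replaced `predict` with `prophet:::predict.prophet` –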
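
The workaround in this comment relies on how S3 dispatch works in R: `predict()` is a generic that picks a method from the class of its first argument, while `pkg:::method` calls the method function directly. Why plain `predict` failed for the commenter is not stated. The toy class below (the names `toy_model` and `predict.toy_model` are hypothetical and not part of prophet) only illustrates that both call styles reach the same function when dispatch succeeds.

```r
# Toy S3 object standing in for a fitted model (hypothetical, not from prophet)
fit <- structure(list(slope = 2), class = "toy_model")

# S3 method: the generic stats::predict() dispatches here for class "toy_model"
predict.toy_model <- function(object, newdata, ...) {
  object$slope * newdata$x
}

new_x <- data.frame(x = 1:3)

via_generic <- predict(fit, new_x)            # dispatches on class(fit)
via_method  <- predict.toy_model(fit, new_x)  # direct call, bypasses dispatch

identical(via_generic, via_method)  # TRUE
```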

1

I was looking for a solution to the same problem. I came up with the following code, which is somewhat simpler than the accepted answer.

library(tidyr) 
library(dplyr) 
library(prophet) 

data = df %>% 
     group_by(group) %>% 
     do(predict(prophet(.), make_future_dataframe(prophet(.), periods = 7))) %>% 
     select(ds, group, yhat) 

Here are the predicted values:

data %>% group_by(group) %>% 
     top_n(7, ds) 

# A tibble: 14 x 3 
# Groups: group [2] 
      ds group  yhat 
     <date> <fctr> <dbl> 
1 2016-12-01  A 316.9709 
2 2016-12-02  A 258.2153 
3 2016-12-03  A 196.6835 
4 2016-12-04  A 346.2338 
5 2016-12-05  A 208.9083 
6 2016-12-06  A 216.5847 
7 2016-12-07  A 206.3642 
8 2016-12-01  B 230.0424 
9 2016-12-02  B 268.5359 
10 2016-12-03  B 190.2903 
11 2016-12-04  B 312.9019 
12 2016-12-05  B 266.5584 
13 2016-12-06  B 189.3556 
14 2016-12-07  B 168.9791
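
The `do()` call in this answer fits the model twice per group, once for `predict()` and once for `make_future_dataframe()`. A variant that fits each group only once can be sketched with `dplyr::group_modify()`, assuming a dplyr version that provides it (0.8.1 or later). The construction of `df` below rebuilds the question's data so that the block runs on its own.

```r
library(prophet)
library(dplyr)

# Rebuild the question's grouped data frame (same dates and y values)
ds <- rep(seq(as.Date("2016-11-01"), by = "day", length.out = 30), times = 2)
group <- rep(c("A", "B"), each = 30)
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90,
       45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21)
df <- data.frame(ds, group, y)

fc <- df %>%
  group_by(group) %>%
  group_modify(~ {
    m <- prophet(.x)                                  # fit once; .x holds ds and y for one group
    predict(m, make_future_dataframe(m, periods = 7)) # reuse the same fitted model
  }) %>%
  ungroup() %>%
  select(group, ds, yhat)
```

`group_modify()` prepends the `group` column to each returned forecast, so the result has one row per group and date.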