2017-06-05 74 views
1

我正在尋求兩個特定的幫助點 1)如何創建一個列表給定我的數據庫(all.df)的列表 2)如何矢量化這個列表上的功能如何創建列表並向其執行矢量化函數

我試圖在使用Prophet庫的客戶/產品級別生成預測。我正在努力引導操作。 我目前運行一個for循環,我想避免並加速我的計算。

數據進行了分析

set.seed(1123) 
df1 <- data.frame(
    Date  = seq(dmy("01/01/2017"), by = "day", length.out = 365*2), 

    Customer = "a", 
    Product = "xxx", 
    Revenue = sample(1:100, 365*2, replace=TRUE)) 


df2 <- data.frame(
    Date  = seq(dmy("01/01/2017"), by = "day", length.out = 365*2), 

    Customer = "a", 
    Product = "yyy", 
    Revenue = sample(25:200, 365*2, replace=TRUE)) 


df3 <- data.frame( 
    Date  = seq(dmy("01/01/2017"), by = "day", length.out = 365*2), 

    Customer = "b", 
    Product = "xxx", 
    Revenue = sample(1:100, 365*2, replace=TRUE)) 



df4 <- data.frame( 
    Date  = seq(dmy("01/01/2017"), by = "day", length.out = 365*2), 

    Customer = "b", 
    Product = "yyy", 
    Revenue = sample(25:200, 365*2, replace=TRUE)) 

all.df <- rbind(df1, df2, df3, df4) 

這是我的預測功能

daily_forecast <- function(df, forecast.days = 365){ 


# fit actuals into prophet 
m <- prophet(df, 
      yearly.seasonality = TRUE, 
      weekly.seasonality = TRUE, 
      changepoint.prior.scale = 0.55) # default value is 0.05 

# create dummy data frame to hold prodictions 
future <- make_future_dataframe(m, periods = forecast.days, freq = "day") 

# run the prediction 
forecast <- predict(m, future) 

### Select the date and forecast from the model and then merge with actuals 
daily_fcast  <- forecast %>% select(ds, yhat) %>% dplyr::rename(Date = ds, fcast.daily = yhat) 
actual.to.merge <- df %>% dplyr::rename(Date = ds, Actual.Revenue = y) 
daily_fcast  <- merge(actual.to.merge, daily_fcast, all = TRUE) 

} 

目前,我在同一時間通過一個客戶/產品配合使用for循環

x <- df1 %>% select(-c(Customer, Product)) %>% 
    dplyr::rename(ds = Date, y = Revenue) %>% 
    daily_forecast() 

我想不是,vectorise整個操作:

1 - 創建名單列表,即通過拆分all.df:

一)產品,然後

二)客戶

2,然後請有上述

在列表1中創建的列表中daily_forecast功能圖)我非常喜歡使用功能出purrr

回答

3

這裏是我會怎麼做你與purrr問什麼:

library(tidyverse) 
library(lubridate) 
library(prophet) 

res <- 
    all.df %>% 
    split(.$Customer) %>% 
    map(~ split(.x, .x$Product)) %>% 
    at_depth(2, select, ds = Date, y = Revenue) %>% 
    at_depth(2, daily_forecast) 
str(res) 
# List of 2 
# $ a:List of 2 
# ..$ xxx:'data.frame': 1095 obs. of 3 variables: 
# .. ..$ Date   : Date[1:1095], format: "2017-01-01" ... 
# .. ..$ Actual.Revenue: int [1:1095] 76 87 87 56 83 17 19 72 92 35 ... 
# .. ..$ fcast.daily : num [1:1095] 55.9 57.9 51.9 51.9 54 ... 
# ..$ yyy:'data.frame': 1095 obs. of 3 variables: 
# .. ..$ Date   : Date[1:1095], format: "2017-01-01" ... 
# .. ..$ Actual.Revenue: int [1:1095] 62 87 175 186 168 190 30 192 119 170 ... 
# .. ..$ fcast.daily : num [1:1095] 121 121 119 119 116 ... 
# $ b:List of 2 
# ..$ xxx:'data.frame': 1095 obs. of 3 variables: 
# .. ..$ Date   : Date[1:1095], format: "2017-01-01" ... 
# .. ..$ Actual.Revenue: int [1:1095] 71 94 81 32 85 59 59 55 50 50 ... 
# .. ..$ fcast.daily : num [1:1095] 51.9 54.2 54.5 53.1 51.9 ... 
# ..$ yyy:'data.frame': 1095 obs. of 3 variables: 
# .. ..$ Date   : Date[1:1095], format: "2017-01-01" ... 
# .. ..$ Actual.Revenue: int [1:1095] 105 46 153 136 59 59 34 72 70 85 ... 
# .. ..$ fcast.daily : num [1:1095] 103.3 103.3 103.1 103.1 91.5 ... 

但是下面會更自然的我(在數據幀藏在心裏):

res_2 <- 
    all.df %>% 
    rename(ds = Date, y = Revenue) %>% 
    nest(ds, y) %>% 
    transmute(Customer, Product, res = map(data, daily_forecast)) %>% 
    unnest() 
# # A tibble: 4,380 × 5 
# Customer Product  Date Actual.Revenue fcast.daily 
#  <fctr> <fctr>  <date>   <int>  <dbl> 
# 1   a  xxx 2017-01-01    76 55.93109 
# 2   a  xxx 2017-01-02    87 57.92577 
# 3   a  xxx 2017-01-03    87 51.92263 
# 4   a  xxx 2017-01-04    56 51.86267 
# 5   a  xxx 2017-01-05    83 54.04588 
# 6   a  xxx 2017-01-06    17 52.75289 
# 7   a  xxx 2017-01-07    19 52.35083 
# 8   a  xxx 2017-01-08    72 53.91887 
# 9   a  xxx 2017-01-09    92 55.81202 
# 10  a  xxx 2017-01-10    35 49.78302 
# # ... with 4,370 more rows 
+0

謝謝!愛你的第二個解決方案,它非常優雅,並有自然流動。我希望我可以多次投票答覆這個答案! – cephalopod