Quick Start Guide

The finnts package, commonly referred to as “Finn”, is a standardized time series forecast framework developed by Microsoft Finance. It’s the result of years of effort to perfect a centralized forecasting practice that everyone in finance could leverage. Even though it was built for finance-like forecasts, it can easily be extended to any type of time series forecast.

Finn takes years of hard work and thousands of lines of code, and simplifies the forecasting process down to one line of code. A single function, “forecast_time_series”, takes in historical data and applies dozens of models to produce a state-of-the-art forecast. While simplifying the forecasting process down to a single function call might seem limiting, Finn actually allows for a lot of flexibility under the hood. To get the most out of Finn, please check out the other vignettes within the package.

library(finnts)

browseVignettes("finnts")

Getting started with Finn is as simple as 1, 2, 3.

1. Bring Data

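Data used in Finn needs to follow a few requirements. As the example below shows, the data has a date column named “Date”, one or more columns that identify each individual time series, and a numeric column holding the values to forecast.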

A good way to produce your first Finn forecast is to use one of the example data sets from the timetk package. Let’s take a monthly data set and trim it down to speed up the run time of your first Finn forecast.

library(finnts)

hist_data <- timetk::m4_monthly %>%
  dplyr::filter(date >= "2013-01-01") %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

print(hist_data)
#> # A tibble: 120 x 3
#>    id    Date       value
#>    <chr> <date>     <dbl>
#>  1 M1    2013-01-01  9120
#>  2 M1    2013-02-01  8280
#>  3 M1    2013-03-01  7860
#>  4 M1    2013-04-01  7150
#>  5 M1    2013-05-01  8110
#>  6 M1    2013-06-01 10860
#>  7 M1    2013-07-01 10730
#>  8 M1    2013-08-01  9610
#>  9 M1    2013-09-01  8270
#> 10 M1    2013-10-01  9200
#> # i 110 more rows

print(unique(hist_data$id))
#> [1] "M1"    "M2"    "M750"  "M1000"

The above data set contains 4 individual time series, identified using the “id” column.
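
Before forecasting, it can help to confirm that the data has the shape shown above. The following is a minimal sketch in base R, not part of finnts: `check_finn_input()` is a hypothetical helper, and the checks it makes (a Date-class “Date” column, a numeric target, one row per combination and date) are assumptions inferred from this example rather than an official list of requirements. The toy values are made up.

```r
# Hypothetical helper (NOT a finnts function): basic structural checks on input data.
check_finn_input <- function(df, combo_cols, target_col, date_col = "Date") {
  needed <- c(combo_cols, target_col, date_col)
  missing_cols <- setdiff(needed, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!inherits(df[[date_col]], "Date")) {
    stop("Column '", date_col, "' must be of class Date")
  }
  if (!is.numeric(df[[target_col]])) {
    stop("Column '", target_col, "' must be numeric")
  }
  # Assumption: each combination has at most one row per date.
  key <- df[, c(combo_cols, date_col), drop = FALSE]
  if (anyDuplicated(key) > 0) {
    stop("Duplicate rows found for the same combination and date")
  }
  invisible(TRUE)
}

# Toy data with the same column layout as hist_data (values are made up).
toy_data <- data.frame(
  id = rep(c("A", "B"), each = 3),
  Date = rep(seq(as.Date("2013-01-01"), by = "month", length.out = 3), times = 2),
  value = c(10, 12, 11, 20, 19, 21),
  stringsAsFactors = FALSE
)

check_result <- check_finn_input(toy_data, combo_cols = "id", target_col = "value")
```

The same call should work on the real data, for example `check_finn_input(hist_data, combo_cols = "id", target_col = "value")`, and stops with an error message if a check fails.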

2. Create Finn Forecast

Before we call the Finn forecast function, let’s first set up some run information using set_run_info(). This helps log all components of our Finn forecast successfully.


run_info <- set_run_info(
  experiment_name = "finn_forecast", 
  run_name = "test_run"
)

Calling the “forecast_time_series” function is the easiest part. In this example we will be running just two models.


# no need to assign it to a variable, since all of the outputs are written to disk :)
forecast_time_series(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3,
  back_test_scenarios = 6, 
  models_to_run = c("arima", "ets"), 
  return_data = FALSE
)
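
To make `forecast_horizon = 3` with `date_type = "month"` concrete, the sketch below computes the three monthly dates that follow the last historical date. This is plain date arithmetic in base R, not a call into Finn, and the last historical date is assumed for illustration; with real data it could be taken from `max(hist_data$Date)`.

```r
# Assumed last date in the historical data (illustrative only).
last_hist_date <- as.Date("2015-06-01")
forecast_horizon <- 3

# Build horizon + 1 monthly dates starting at the last historical date,
# then drop the first element so only future periods remain.
future_dates <- seq(last_hist_date, by = "month", length.out = forecast_horizon + 1)[-1]

print(future_dates)
#> [1] "2015-07-01" "2015-08-01" "2015-09-01"
```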

3. Get Forecast Outputs

Initial Finn Outputs


finn_output_tbl <- get_forecast_data(run_info = run_info)

print(finn_output_tbl)

Future Forecast


future_forecast_tbl <- finn_output_tbl %>%
  dplyr::filter(Run_Type == "Future_Forecast")

print(future_forecast_tbl)

Back Test Results

back_test_tbl <- finn_output_tbl %>%
  dplyr::filter(Run_Type == "Back_Test")

print(back_test_tbl)
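
A common next step with back-test results is to compare accuracy by model. The sketch below computes mean absolute percentage error (MAPE) per combination and model on a toy table in base R. The column names `Target` and `Forecast` are assumptions not confirmed by this guide, the `Model_ID` values are placeholders, and MAPE is used here only as a familiar metric, not as a statement of how Finn selects its best model.

```r
# Toy stand-in for back-test output (made-up values; Target and Forecast
# are ASSUMED column names, check names(back_test_tbl) on a real run).
toy_back_test <- data.frame(
  Combo = c("M1", "M1", "M1", "M1"),
  Model_ID = c("arima", "arima", "ets", "ets"),
  Target = c(100, 200, 100, 200),
  Forecast = c(110, 180, 95, 210),
  stringsAsFactors = FALSE
)

# Absolute percentage error for each back-test row.
toy_back_test$abs_pct_error <-
  abs(toy_back_test$Target - toy_back_test$Forecast) / abs(toy_back_test$Target) * 100

# Average the row-level errors within each combination and model.
mape_tbl <- aggregate(abs_pct_error ~ Combo + Model_ID, data = toy_back_test, FUN = mean)
names(mape_tbl)[names(mape_tbl) == "abs_pct_error"] <- "MAPE"

print(mape_tbl)
```

In this toy table the “arima” rows are off by 10% each and the “ets” rows by 5% each, so the MAPE values are 10 and 5.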

Back Test Best Model per Time Series

best_model_tbl <- finn_output_tbl %>%
  dplyr::filter(Best_Model == "Yes") %>%
  dplyr::select(Combo, Model_ID, Model_Name, Model_Type, Recipe_ID) %>%
  dplyr::distinct()

print(best_model_tbl)

Note: the best model for the “M1” combination is a simple average of “arima” and “ets” models.
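
To illustrate what a simple average of two models means, the sketch below averages two forecast vectors period by period in base R. The numbers are made up and are not Finn output; how Finn builds its averages internally is not shown here.

```r
# Made-up three-month forecasts from two models (illustrative only).
arima_fcst <- c(100, 104, 108)
ets_fcst <- c(96, 100, 110)

# Simple average: the mean of the two forecasts for each period.
simple_avg_fcst <- rowMeans(cbind(arima_fcst, ets_fcst))

print(simple_avg_fcst)
#> [1]  98 102 109
```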

Trained Models

trained_model_tbl <- get_trained_models(run_info = run_info)

print(trained_model_tbl)

Initial Prepped Data

R1_prepped_data_tbl <- get_prepped_data(run_info = run_info, 
                                        recipe = "R1")

print(R1_prepped_data_tbl)

R2_prepped_data_tbl <- get_prepped_data(run_info = run_info, 
                                        recipe = "R2")

print(R2_prepped_data_tbl)

Run Info Metadata

run_info_tbl <- get_run_info(experiment_name = "finn_forecast")

print(run_info_tbl)