--- title: "Exogenous Variables" output: rmarkdown::html_vignette: toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{Exogenous Variables} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} library(httptest2) .mockPaths("../tests/mocks") start_vignette(dir = "../tests/mocks") original_options <- options("NIXTLA_API_KEY"="dummy_api_key", digits=7) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4 ) ``` ```{r} library(nixtlar) ``` ## 1. Exogenous variables Exogenous variables are external factors that provide additional information about the behavior of the target variable in time series forecasting. These variables, which are correlated with the target, can significantly improve predictions. Examples of exogenous variables include weather data, economic indicators, holiday markers, and promotional sales. `TimeGPT` allows you to include exogenous variables when generating a forecast. This vignette will show you how to include them. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/anomaly-detection.html) vignette first. ## 2. Load data For this vignette, we will use the electricity consumption dataset with exogenous variables included in `nixtlar`. This dataset contains hourly prices from five different electricity markets, along with two exogenous variables related to the prices and binary variables indicating the day of the week. ````{r} df_exo_vars <- nixtlar::electricity_exo_vars head(df_exo_vars) ```` When using exogenous variables, `nixtlar` differentiates between historical and future exogenous variables: - **Historical Exogenous Variables**: These should be included in the input data immediately following the `id_col`, `ds`, and `y` columns. If your dataset contains additional columns that are not exogenous variables, you must remove them before using any core functions of `nixtlar`. - **Future Exogenous Variables**: These correspond to the `X_df` parameter and should cover the entire forecast horizon. This dataset should include columns with the appropriate timestamps and, if available, unique identifiers, formatted as explained in previous sections. ````{r} future_exo_vars <- nixtlar::electricity_future_exo_vars head(future_exo_vars) ```` ## 3. Forecast with exogenous variables To generate a forecast with exogenous variables, use the `nixtla_client_forecast` function as you would for forecasts without them. The only difference is that you must add the future exogenous variables using the `X_df` argument. ````{r} fcst_exo_vars <- nixtla_client_forecast(df_exo_vars, h = 24, X_df = future_exo_vars) head(fcst_exo_vars) ```` For comparison, we will also generate a forecast without the exogenous variables. ````{r} df <- nixtlar::electricity # same dataset but without the exogenous variables fcst <- nixtla_client_forecast(df, h = 24) head(fcst) ```` ## 4. Plot TimeGPT forecast `nixtlar` includes a function to plot the historical data and any output from `nixtla_client_forecast`, `nixtla_client_historic`, `nixtla_client_anomaly_detection` and `nixtla_client_cross_validation`. If you have long series, you can use `max_insample_length` to only plot the last N historical values (the forecast will always be plotted in full). ```{r} nixtla_client_plot(df_exo_vars, fcst_exo_vars, max_insample_length = 500) ``` ```{r, include=FALSE} options(original_options) end_vignette() ```