--- title: "Sobol4R: Sobol indices for stochastic models" shorttitle: "Sobol4R: Sobol indices for stochastic models" author: - name: "Frédéric Bertrand" affiliation: - Cedric, Cnam, Paris email: frederic.bertrand@lecnam.net date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Sobol4R: Sobol indices for stochastic models} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "figures/sobol-stochastic-", fig.width = 7, fig.height = 5, dpi = 150, message = FALSE, warning = FALSE, eval=FALSE ) LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE") library(Sobol4R) library(sensitivity) set.seed(4669) ``` # Introduction This vignette illustrates the use of the Sobol4R package to compute Sobol indices for deterministic and stochastic versions of the classical Sobol g function. The designs are generated with `sensitivity::sobol` and the models are provided by Sobol4R, in particular - `sobol_g_function` for the deterministic g function - `sobol_g2_additive_noise` for a version of the model with additive Gaussian noise - `sobol_g2_with_covariate_noise` for a version with covariate dependent noise The stochastic models can be combined with the generic quantity of interest wrapper `sobol4r_qoi`. # Deterministic Sobol g function ## Order 1 and Total via `sensitivity::sobol()` ```{r g-det-design, cache=TRUE, eval=LOCAL} n <- 1e4 p <- 8 X1 <- data.frame(matrix(runif(p * n), nrow = n)) X2 <- data.frame(matrix(runif(p * n), nrow = n)) sob_det <- sobol4r_design(X1 = X1, X2 = X2, order = 2, nboot = 50) Y <- sobol_g_function(sob_det$X) sensitivity::tell(sob_det, Y) ``` ```{r g-det-plot, cache=TRUE, eval=LOCAL} print(sob_det) Sobol4R::autoplot(sob_det, ncol = 1) ``` ## Order 1 and Total via `sensitivity::sobol2007()` ```{r, cache=TRUE, eval=LOCAL} n <- 1e4 p <- 8 X1 <- data.frame(matrix(runif(p * n), nrow = n)) X2 <- data.frame(matrix(runif(p * n), nrow = n)) sob_det_2007 <- sobol4r_design( X1 = X1, X2 = X2, nboot = 50, type = "sobol2007" ) Y <- sobol_g_function(sob_det_2007$X) sensitivity::tell(sob_det_2007, Y) ``` ```{r g-det-plot_2007, cache=TRUE, eval=LOCAL} print(sob_det_2007) Sobol4R::autoplot(sob_det_2007) ``` # Random effect on the output We restrict the g function to the first two inputs and add a Gaussian noise term with zero mean and unit variance. ## Order 1 and Total via `sensitivity::sobol()` ```{r g-noise-const-run, cache=TRUE, eval=LOCAL} sob_noise_add <- sobol4r_design(X1 = X1[, 1:2], X2 = X2[, 1:2], order = 2, nboot = 50, type = "sobol") Y <- sobol_g2_additive_noise(sob_noise_add$X) sensitivity::tell(sob_noise_add, Y) ``` ```{r g-noise-const-plot, cache=TRUE, eval=LOCAL} print(sob_noise_add) Sobol4R::autoplot(sob_noise_add) ``` ## Order 1 and Total via `sensitivity::sobol2007()` ```{r g-noise-const-run-2007, cache=TRUE, eval=LOCAL} sob_noise_add <- sobol4r_design(X1 = X1[, 1:2], X2 = X2[, 1:2], nboot = 50, type = "sobol2007") Y <- sobol_g2_additive_noise(sob_noise_add$X) sensitivity::tell(sob_noise_add, Y) ``` ```{r g-noise-const-plot-2007, cache=TRUE, eval=LOCAL} print(sob_noise_add) Sobol4R::autoplot(sob_noise_add) ``` # Quantity of interest based on repeated runs Instead of a single noisy run, we can focus on a quantity of interest, here the conditional mean of the output given the inputs. This is approximated by repeated calls to the stochastic model. ## Order 1 and Total via `sensitivity::sobol()` ```{r g-noise-const-qoi-run, cache=TRUE, eval=LOCAL} sob_noise_const_qoi <- sobol4r_qoi_indices( model = sobol_g2_additive_noise, X1 = X1[, 1:2], X2 = X2[, 1:2], order = 2, nboot = 50, type = "sobol") ``` ```{r g-noise-const-qoi-plot, cache=TRUE, eval=LOCAL} print(sob_noise_const_qoi) Sobol4R::autoplot(sob_noise_const_qoi) ``` ## Order 1 and Total via `sensitivity::sobol2007()` ```{r g-noise-const-qoi-run-2007, cache=TRUE, eval=LOCAL} sob_noise_const_qoi <- sobol4r_qoi_indices( model = sobol_g2_additive_noise, X1 = X1[, 1:2], X2 = X2[, 1:2], nboot = 50, type = "sobol2007") ``` ```{r g-noise-const-qoi-plot-2007, cache=TRUE, eval=LOCAL} print(sob_noise_const_qoi) Sobol4R::autoplot(sob_noise_const_qoi) ``` # Covariate dependent noise We now add a third input which controls the mean of the Gaussian noise term. The mean is equal to the third input, and the variance is fixed. ```{r g-noise-cov-design, cache=TRUE, eval=LOCAL} X1_cov <- data.frame( C1 = runif(n), C2 = runif(n), C3 = runif(n, min = 1, max = 100) ) X2_cov <- data.frame( C1 = runif(n), C2 = runif(n), C3 = runif(n, min = 1, max = 100) ) ``` ## Order 1 and Total via `sensitivity::sobol()` ```{r g-noise-cov-run, cache=TRUE, eval=LOCAL} sob_cov_single <- sobol4r_design(X1 = X1_cov, X2 = X2_cov, order = 2, nboot = 50, type = "sobol") Y <- sobol_g2_additive_noise(sob_cov_single$X) sensitivity::tell(sob_cov_single, Y) ``` ```{r g-noise-cov-plot, cache=TRUE, eval=LOCAL} print(sob_cov_single) Sobol4R::autoplot(sob_cov_single) ``` ## Order 1 and Total via `sensitivity::sobol2007()` ```{r g-noise-cov-qoi-run, cache=TRUE, eval=LOCAL} sob_cov_qoi <- sobol4r_qoi_indices( model = sobol_g2_with_covariate_noise, X1 = X1_cov, X2 = X2_cov, nboot = 50, type = "sobol2007") ``` ```{r g-noise-cov-qoi-plot, cache=TRUE, eval=LOCAL} print(sob_cov_qoi) Sobol4R::autoplot(sob_cov_qoi) ``` # Conclusion This vignette shows how Sobol4R can be used to study the impact of randomness in the model output on Sobol indices. More advanced examples, including models with random distributional parameters, are presented in a separate vignette.