---
title: "Benchmarking the IncidencePrevalence R package"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{a06_benchmark}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

To check the performance of the IncidencePrevalence package we can use the benchmarkIncidencePrevalence(). This function generates some hypothetical study cohorts and the estimates incidence and prevalence using various settings and times how long these analyses take. 

We can start for example by benchmarking our example mock data which uses duckdb.

```{r, message=FALSE, warning=FALSE}
library(IncidencePrevalence)
library(visOmopResults)
library(dplyr)
library(ggplot2)

cdm <- mockIncidencePrevalence(
  sampleSize = 100,
  earliestObservationStartDate = as.Date("2010-01-01"),
  latestObservationStartDate = as.Date("2010-01-01"),
  minDaysToObservationEnd = 364,
  maxDaysToObservationEnd = 364,
  outPre = 0.1
)

timings <- benchmarkIncidencePrevalence(cdm)
timings |>
  glimpse()
```

We can see our results like so:

```{r}
visOmopTable(timings,
  hide = c(
    "variable_name", "variable_level",
    "strata_name", "strata_level"
  ),
  groupColumn = "task"
)
```

# Results from test databases
Here we can see the results from the running the benchmark on test datasets on different databases management systems. These benchmarks have already been run so we'll start by loading the results.
```{r}
test_db <- IncidencePrevalenceBenchmarkResults
test_db |>
  glimpse()
```

```{r}
visOmopTable(bind(timings, test_db),
  hide = c(
    "variable_name", "variable_level",
    "strata_name", "strata_level"
  ),
  groupColumn = "task"
)
```