--- title: "Cohort diagnostics" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{a03_CohortDiagnostics} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE, fig.width = 7 ) library(CDMConnector) if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = tempdir()) if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER")) if (!eunomia_is_available()) downloadEunomiaData() ``` ## Introduction In this example we're going to summarise cohort diagnostics results for cohorts of individuals with an ankle sprain, ankle fracture, forearm fracture, or a hip fracture using the Eunomia synthetic data. Again, we'll begin by creating our study cohorts. ```{r} library(CDMConnector) library(CohortConstructor) library(CodelistGenerator) library(PatientProfiles) library(CohortCharacteristics) library(PhenotypeR) library(dplyr) library(ggplot2) con <- DBI::dbConnect(duckdb::duckdb(), dbdir = CDMConnector::eunomia_dir() ) cdm <- CDMConnector::cdm_from_con(con, cdm_schem = "main", write_schema = "main", cdm_name = "Eunomia" ) cdm$injuries <- conceptCohort(cdm = cdm, conceptSet = list( "ankle_sprain" = 81151, "ankle_fracture" = 4059173, "forearm_fracture" = 4278672, "hip_fracture" = 4230399 ), name = "injuries") ``` ## Cohort diagnostics We can run cohort diagnostics analyses for each of our overall cohorts like so: ```{r} cohort_diag <- cohortDiagnostics(cdm$injuries) ``` Our results will include a summary of the overlap between our cohorts. We could visualise this ```{r} plotCohortOverlap(cohort_diag, uniqueCombinations = TRUE) ``` Moreover, our results will also include a summary of the characteristics of each cohort, stratified by age group and sex. ```{r} tableCharacteristics(cohort_diag, groupColumn = c("age_group", "sex")) ``` You can also visualise the age distribution: ```{r} tableCharacteristics(cohort_diag, groupColumn = c("age_group", "sex")) ```