--- title: "modelsummary_rms vignettes" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{modelsummary_rms-vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE) ``` # Introduction The **modelsummary_rms** function is designed to process output from models fitted using the **rms** package and generate a summarised dataframe of the results. The goal is to produce publication-ready summaries of the models. This vignette will guide you through the basic usage of the function and then move on to more advanced examples. ## Installation and Setup Make sure you have the required packages installed from CRAN or GitHub. Note, if you plan to output the results into Microsoft Word, we recommend also installing flextable and officer. ```{r} # Install the package if you haven't already # install.packages("rmsMD") library(rms) library(rmsMD) library(MASS) ``` # Basic Usage Here is a simple example using a linear regression model ("ordinary least squares"; OLS). The example data being used here is the built-in survey dataset from the MASS package. The models are for demonstration purposes only. The output dataframe contains the estimated coefficients, their 95% confidence intervals, and the associated p-values. These are in a publication ready format. ```{r} # Loading the built-in dataset from the MASS package: data("survey", package = "MASS") # Fit a linear regression model using the rms package: fit_ols <- ols(Wr.Hnd ~ Age + Exer + Sex, data = survey) # Generate a model summary, and assign it to rmsMD_summary rmsMD_summary <- modelsummary_rms(fit_ols) # displaying rmsMD dataframe output rmsMD_summary # rmsMD dataframe as a table knitr::kable(rmsMD_summary) ``` ## Customising the Output By default, the function uses the following stylistic settings: - **combine_ci = TRUE:** Combines the effect estimate and the 95% confidence interval into a single column. - **round_dp_coef = 3:** Rounds the effect estimates to three decimal places. - **round_dp_p = 3:** Rounds the p-values to three decimal places. You can modify these defaults to adjust the appearance of the output. ```{r} # Generate a model summary with custom styling options summary_custom <- modelsummary_rms(fit_ols, combine_ci = FALSE, round_dp_coef = 2, round_dp_p = 5) # to display the dataframe as a table knitr::kable(summary_custom) ``` ## Full Model Output By default, **modelsummary_rms** returns only the final formatted summary (i.e. `fullmodel = FALSE`). This does not include the model intercept, or information such as standard errors. This output is made to be concise and show the key results. If all information is required, you can set `fullmodel = TRUE`. This option returns additional results. ```{r} # Generate a model summary with custom styling options summary_fullmodel <- modelsummary_rms(fit_ols, combine_ci = FALSE, round_dp_coef = 2, round_dp_p = 5, fullmodel = TRUE) knitr::kable(summary_fullmodel) ``` ## Exponentiating Coefficients (including hazard ratios and odds ratios) Exponentiating the coefficients of certain models makes the interpretation more intuitive (e.g. as odds ratios in logistic regression and hazard ratios in Cox models). This is controlled using the `exp_coef` argument. The **modelsummary_rms** package automatically sets an appropriate value for `exp_coef` for the core **rms** models `ols`, `lrm`, and `cph`. This ensures OR and HR are displayed for logistic regression and Cox regression models respectively.Below is an example using **modelsummary_rms** on an **rms** logistic regression model. Note this automatically provides OR: ```{r} # Note: For demonstration, we create a binary outcome using the survey dataset. survey$BinaryOutcome <- ifelse(survey$Wr.Hnd > median(survey$Wr.Hnd, na.rm = TRUE), 1, 0) # fitting the model fit_lrm <- lrm(BinaryOutcome ~ Age + Exer + Sex, data = survey) # rmsMD summary summary_lrm <- modelsummary_rms(fit_lrm) # displaying as a table knitr::kable(summary_lrm) ``` The **modelsummary_rms** from **rmsMD** package is also capable of working with non-rms models, such as those fitted using base R functions like lm(). However, in these cases the package does not automatically determine the appropriate value for exp_coef, so it must be set manually. For example, when using a linear model (where exponentiation of coefficients is not required), you should explicitly set exp_coef = FALSE. ```{r} # Fit a simple linear model using lm() from base R (an example model fit without using rms package) fit_lm <- lm(Wr.Hnd ~ Age + Exer + Sex, data = survey) # Generate a model summary for the non-RMS model by explicitly setting exp_coef = FALSE summary_lm <- modelsummary_rms(fit_lm, exp_coef = FALSE) # display rmsMD results as a table knitr::kable(summary_lm) ``` # Restricted Cubic Splines Restricted Cubic Splines (RCS) are a flexible modelling tool used to capture non-linear relationships between predictors and outcomes. In medicine, for the majority of continuous variables (e.g. age, blood pressure, or biomarker levels) the assumption of linearity may not hold. A key highlight of the **rms** package is the ability to analyse variables using RCS. The **rmsMD** package is designed to report and summarise models that include RCS terms. Individual coefficients for RCS terms are difficult to interpret in isolation. Instead, an overall p-value can be generated to assess whether the overall relationship between the RCS variable and outcome is significant. By default **modelsummary_rms** removes the individual RCS coefficients, replacing them with the overall p-value for that variable. - **Display an overall p-value for the spline terms using the `rcs_overallp` option.** When this option is set to `TRUE` (which is the default), the function computes a single p-value that tests the overall significance of the spline terms for each variable. This overall p-value provides insight into whether the relationship between the predictor and the dependent variable is significant. - **Hide the individual spline coefficients using the `hide_rcs_coef` option.** Hiding the individual spline coefficients can be advantageous because these lack straightforward clinical interpretation. Instead, the focus is on the overall association captured by all RCS terms for that specific variable. This helps simplify the output. If the variable has a signficant association with outcome, we recommend plotting this relationship. Example of a model with RCS terms for the continuous outcome Age. The default settings are applied, which hides the individual RCS terms, and provides an overall p-value for the association of Age with outcome. ```{r} # Using the built-in dataset from the MASS package data("survey", package = "MASS") # Fit an OLS model including a restricted cubic spline for Age (with 4 knots) fit_spline <- ols(Wr.Hnd ~ rcs(Age, 4) + Exer + Sex, data = survey) # Generate an rmsMD model summary using default settings summary_spline <- modelsummary_rms(fit_spline) # Outputting this as a table knitr::kable(summary_spline) ``` ## Displaying RCS individual coefficients If individual RCS coefficients are required, these can be added in by setting `hide_rcs_coef` to `FALSE`: ```{r} # Fit an OLS model including a restricted cubic spline for Age (with 4 knots) fit_spline_hide <- ols(Wr.Hnd ~ rcs(Age, 4) + Exer + Sex, data = survey) # Generate a model summary with rcs_overallp set to TRUE and hide_rcs_coef set to TRUE summary_spline_hide <- modelsummary_rms(fit_spline_hide, hide_rcs_coef = FALSE) # Outputting this as a table knitr::kable(summary_spline_hide) ``` If overall p-values for the variables modelled with RCS are not wanted, `rcs_overallp` can be set to `FALSE`: ```{r} # Fit an OLS model including a restricted cubic spline for Age (with 4 knots) fit_spline_hide <- ols(Wr.Hnd ~ rcs(Age, 4) + Exer + Sex, data = survey) # Generate a model summary with rcs_overallp set to TRUE and hide_rcs_coef set to TRUE summary_spline_hide <- modelsummary_rms(fit_spline_hide, rcs_overallp = FALSE, hide_rcs_coef = FALSE) knitr::kable(summary_spline_hide) ``` # Models with Interactions In medical research, interactions can be critical, as the impact of a treatment or risk factor might differ across subgroups (for example, by age or sex). These interaction terms are handled by **modelsummary_rms**. Here is a simple example: ```{r} # Using the built-in dataset from the MASS package data("survey", package = "MASS") # Fit an OLS model using the rms package with an interaction between Age and Exer. fit_interact <- ols(Wr.Hnd ~ Age * Exer + Sex, data = survey) # Generate a model summary that includes the interaction term summary_interact <- modelsummary_rms(fit_interact) knitr::kable(summary_interact) ``` ## Interactions with RCS variables The **rms** package allows interactions with variables modelled using restricted cubic splines. In this setting, the individual coefficients for RCS terms and their interactions are difficult to interpret. **modelsummary_rms** handles this situation by providing overall p-values for RCS variables (which give the overall p-value taking into account all spline terms and all of their interaction terms), and overall p-values for the interactions (takes into account linear and non-linear terms), instead of the individual coefficients. As above, this can be altered by changing `rcs_overallp` and `hide_rcs_coef`. ```{r} # Using the built-in dataset from the MASS package data("survey", package = "MASS") # Fit an OLS model with a restricted cubic spline for Age and an interaction between Age and Exer. fit_spline_interact <- ols(Wr.Hnd ~ rcs(Age, 4) * Exer + Sex, data = survey) # Generate a model summary with default RCS output summary_spline_interact <- modelsummary_rms(fit_spline_interact) # Format the output as a nice table knitr::kable(summary_spline_interact) ``` # Exporting to Microsoft Word The output of **modelsummary_rms** is a dataframe, as this is easy to work with and further process if required. This dataframe output can easily be exported to a word document using **flextable** and **officer** packages. ```{r} library(officer) library(flextable) library(dplyr) # converting modelsummary_rms dataframe generated above into a flextable rmsMD_as_table <- flextable(rmsMD_summary) # use officer to create a doc <- read_docx() %>% body_add_flextable(rmsMD_as_table) %>% body_add_par("Model summary from rmsMD", style = "heading 2") # generating a temporary output path for demonstration. This would be replaced by the file path where the word document will be generated output_path <- file.path(tempdir(), "example_output.docx") # generating the word document print(doc, target = output_path) print(doc, target = "temp.docx") ```