---
title: "Introduction to mediation analysis with JSmediation"
author: "Cédric Batailler"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to mediation analysis with JSmediation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: "`r system.file('references.bib', package='JSmediation')`"
csl: "`r system.file('apa.csl', package='JSmediation')`"
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

The _JSmediation_ package was designed to make the code for testing mediation intuitive to write. In this vignette, we will use it to assess a simple mediation.

## Conducting a Simple Mediation

Simple mediation analysis tests whether the effect of an independent variable on a dependent variable goes through a third variable (the mediator). The `ho_et_al` data set, shipped with the _JSmediation_ package, contains data illustrating a case of simple mediation.

This data set contains the data collected by Ho et al. in a paper focusing on hypodescent [-@ho_youre_2017], a rule sometimes used when people perform multiracial categorization, whereby perceivers associate a biracial person more readily with their lower status group. In this experiment, Ho et al. [-@ho_youre_2017] hypothesized that Black American participants exposed to the discrimination of Black-White biracials would be more likely to associate Black-White biracials with Black Americans rather than with White Americans. In other words, participants in their experiment would be more likely to use the hypodescent rule when exposed to high discrimination content than when exposed to low discrimination content.
In the experiment that we will investigate, the authors went further and tested whether the effect of the discrimination condition on the use of hypodescent was mediated by a feeling of linked fate between the participants (Black Americans) and Black-White biracials [@ho_youre_2017].

In this vignette, we will use the `ho_et_al` data set to test whether __the feeling of linked fate mediates the relationship between exposure to high discrimination content and the use of hypodescent__ among Black Americans.

### Formalization of Simple Mediation

Simple mediation is often summarized with one equation [@baron_moderator-mediator_1986;@cohen_applied_1983]:

$$
c = c' + a \times b
$$

with $c$ the total effect of the independent variable ($X$) on the dependent variable ($Y$), $c'$ the direct effect of $X$ on $Y$, and $a \times b$ the indirect effect of $X$ on $Y$ through the mediator variable ($M$; see the Models section of the `mdt_simple` help page).

To assess whether the indirect effect differs from zero, one has to assess the significance of both $a$ (the effect of $X$ on $M$) and $b$ (the effect of $M$ on $Y$ controlling for the effect of $X$). Both $a$ and $b$ need to be simultaneously significant for an indirect effect to be claimed [@yzerbyt_new_2018].

Because we want to test whether the feeling of linked fate mediates the effect of the discrimination condition on the use of hypodescent, we must test whether the discrimination condition predicts the feeling of linked fate and whether the feeling of linked fate predicts the use of hypodescent (when controlling for the effect of the discrimination condition). The _JSmediation_ package will help us in that regard.

Our first step will be to attach the _JSmediation_ package to our environment. This will allow us to use the functions and data sets shipped with the package.

```{r}
library(JSmediation)
```

### Data Preparation

To begin the analysis, we will take a look at the `ho_et_al` data set.

```{r}
data(ho_et_al)
head(ho_et_al)
```

This data set contains 5 columns:

* `id`: a unique identifier for each participant,
* `condition`: the discrimination condition of the participants (either "Low discrimination" or "High discrimination"),
* `sdo`: a measure of the participant's Social Dominance Orientation (SDO), which is used extensively in our example of [moderated mediation](moderated_mediation_analysis.html),
* `linkedfate`: the feeling of linked fate between the participants and Black-White biracials,
* `hypodescent`: the tendency to use the hypodescent rule in multiracial categorization (see Ho et al., 2017).

This data set is almost ready for our analysis. The only thing that we need is a data frame (or a `tibble`) with the value of our different variables for each participant (i.e., the independent variable, the dependent variable, and the mediator).

Our data, however, must be properly formatted for the analysis. In particular, every variable must be coded as a numeric variable. Because the `condition` variable is coded as a character rather than as a numeric, a format which _JSmediation_ does not support, we will need to pre-process our data set.

Thanks to the `build_contrast` function, we will create a new variable in `ho_et_al` (`condition_c`) representing the discrimination condition as a numeric variable.

```{r}
ho_et_al$condition_c <- build_contrast(ho_et_al$condition,
                                       "Low discrimination",
                                       "High discrimination")

head(ho_et_al)
```

### Using `mdt_simple`

Now that we have a data frame ready for analysis, we will use the `mdt_simple` function to fit a simple mediation model. Every mediation model supported by _JSmediation_ comes with an `mdt_*` function. These functions require the user to indicate the data set used for the analysis, as well as the variables relevant for the analysis, through the function arguments. The function then runs the relevant linear regressions in order to test the conditions necessary for mediation.

```{r}
mediation_fit <- mdt_simple(ho_et_al,
                            IV = condition_c,
                            DV = hypodescent,
                            M  = linkedfate)
```

The `mediation_fit` model that we just created contains all the information necessary to use a joint-significance approach to assess simple mediation [@yzerbyt_new_2018].

### Working with `mediation_model` Objects

Before diving into the results, because the joint-significance approach runs linear regressions under the hood, we will test the assumptions of ordinary least squares (OLS) for each of the regressions used by `mdt_simple` [@judd_data_2017]. To do so, we will use the `check_model` function from the _performance_ package, which prints several diagnostic plots [@ludecke_performance_2021]^[Recent versions of _JSmediation_ offer the `check_assumptions` and `plot_assumptions` functions to help you check the OLS assumptions of the fitted model.]. We will first extract the models used by `mdt_simple`, and then run the `check_model` function.

The `extract_model` function will be helpful in that regard. This function takes a mediation model as a first argument, and the model name (or model index) as a second argument. It then returns a linear model object (i.e., an `lm` object).

```{r eval=rlang::is_installed("performance"), fig.height=8, fig.width=7, out.width="100%"}
first_model <- extract_model(mediation_fit, step = "X -> M")
performance::check_model(first_model)
```

We will do the same thing for the two other models `mdt_simple` has fitted.

```{r eval=rlang::is_installed("performance"), fig.height=8, fig.width=7, out.width="100%"}
second_model <- extract_model(mediation_fit, step = 2)
performance::check_model(second_model)
```

```{r eval=rlang::is_installed("performance"), fig.height=8, fig.width=7, out.width="100%"}
third_model <- extract_model(mediation_fit, step = 3)
performance::check_model(third_model)
```

Thanks to these plots, we can now interpret the results of the mediation knowing whether our data suffer from any violation of the OLS assumptions [@judd_data_2017].
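
The diagnostic plots above can be complemented by formal tests run on every regression at once. The chunk below is a minimal sketch rather than a recommended workflow: it rebuilds the model so that it runs on its own, and it assumes that `extract_models` returns one `lm` object per regression and that the _performance_ package is installed. The object names `all_models`, `normality_checks`, and `variance_checks` are ours and purely illustrative.

```{r eval=rlang::is_installed("performance")}
library(JSmediation)

# Rebuild the mediation model so that this chunk is self-contained.
data(ho_et_al)
ho_et_al$condition_c <- build_contrast(ho_et_al$condition,
                                       "Low discrimination",
                                       "High discrimination")
mediation_fit <- mdt_simple(ho_et_al,
                            IV = condition_c,
                            DV = hypodescent,
                            M  = linkedfate)

# Named list holding every regression fitted by mdt_simple
# (assumption: each element is a plain `lm` object).
all_models <- extract_models(mediation_fit)
names(all_models)

# Apply the same residual checks to each regression in turn.
normality_checks <- lapply(all_models, performance::check_normality)
variance_checks  <- lapply(all_models, performance::check_heteroscedasticity)

normality_checks
variance_checks
```

Looping over the named list avoids having to remember model names or indices. In recent versions of _JSmediation_, the `check_assumptions` function mentioned in the footnote may be a more convenient way to get similar information.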

### Interpreting the Results of a Mediation Model

Now that we have checked our assumptions, we can interpret our model. To do so, we simply have to print `mediation_fit`.

```{r, render="asis"}
mediation_fit
```

In this summary, we can see that both the $a$ and $b$ paths are significant, and we can therefore conclude that the indirect effect of the discrimination condition on hypodescent use through the feeling of linked fate is significant [@yzerbyt_new_2018].

### Reporting a Simple Mediation

Thanks to the `mdt_simple` function, we have almost all the information needed to report our joint-significance test [@yzerbyt_new_2018]. Besides reporting the significance of $a$ and $b$, it is sometimes recommended to report the index of the indirect effect, a single value accounting for $a \times b$. We can compute this index with Monte Carlo methods using the `add_index` function. This function adds the indirect effect to the model summary object.

```{r}
model_fit_with_index <- add_index(mediation_fit)
model_fit_with_index
```

The only thing left to do is to report the mediation analysis:

> First, we examined the effect of the discrimination condition (low vs. high) on hypodescent use. This analysis revealed a significant effect, _t_(822) = 2.13, _p_ = .034.
>
> We then tested our hypothesis of interest, namely, whether the sentiment of linked fate between Black Americans and Black-White biracials mediated the effect of the discrimination condition on hypodescent. To do so, we conducted a joint-significance test [@yzerbyt_new_2018]. This analysis revealed a significant effect of discrimination condition on linked fate, _t_(822) = 9.10, _p_ < .001, and a significant effect of linked fate on hypodescent, controlling for the discrimination condition, _t_(821) = 5.75, _p_ < .001. The effect of discrimination condition on hypodescent after controlling for the feeling of linked fate was no longer significant, _t_(821) = 0.33, _p_ = .742.
Consistent with this analysis, the Monte Carlo confidence interval for the indirect effect did not contain 0, CI95% [0.0889; 0.208].

This analysis reveals that the feeling of linked fate mediates the effect of the discrimination condition on hypodescent.

## Miscellaneous

_JSmediation_ makes conducting a mediation analysis easy with the `mdt_*` functions, but they are not the only functions in the package. Some functions will help with the linear regression models fitted during the analysis.

* `check_assumptions` tests every model's OLS assumptions using the _performance_ package.
* `plot_assumptions` plots diagnostics of the models' OLS assumptions using the _performance_ package.
* `extract_model` returns one of the models used (as an `lm` object).
* `extract_models` returns a named list of the models used.
* `extract_tidy_models` returns a data frame containing summary information for the models à la _broom_ [@robinson_broom_2021].
* `display_models` prints a summary of each `lm` model.

## References