sysVar_inOut_V05

Emily Butler, eabutler@u.arizona.edu & Ashley Kuelz, akuelz@email.arizona.edu

2020-05-08

Once you have gone through the steps for estimating dynamic trajectory profiles for one of the models (see “inertia_coordination” and “coupled_oscillator” vignettes), you are ready to use the profiles as either predictors or outcomes of system variables, e.g., any variables of interest that you think are related to your dyadic system, but change slower than the state variables used to represent the dynamics. We show only the basic steps for getting the profile memberships from the inertia-coordination model here (see other vignettes for full procedure).

library(rties)
data1 <- rties_ExampleDataFull

data2 <- dataPrep(basedata=data1, dyadId="couple", personId="person", obs_name="dial", dist_name="female", time_name="time", time_lag="absMaxCC") 
#> Joining by: dyad

ic <- indivInertCoord(prepData=data2, whichModel="inertCoord")

lpaData <- inspectProfiles(whichModel="inertCoord", prepData=data2, paramEst=ic$params, n_profiles=2) 

fullData <- makeFullData(basedata=data1, dyadId="couple", personId="person", dist_name="female", lpaData=lpaData, params=ic$params)
#> Joining by: dyad

Predicting the System Variable From the Profiles

The “sysVarOut” function uses the profile memberships to predict system variables, which can be either dyadic (sysVarType = “dyadic”), where both partners have the same score (e.g., relationship length) or individual (sysVarType = “indiv”), where the partners can have different scores (e.g., age). It takes as arguments the name of the dataframe containing the profile membership scores combined with your original dataframe (created by the “makeFullData” function), the name of the column in the dataframe containing the variable you would like to use as the system variable, and whether the system variable is “dyadic” or “individual”. For dyadic system variables, the only predictor is profile membership and the model is a regular regression model since all variables are at the level of the dyad. If the system variable is individual then the model is a random-intercept dyadic model and 3 models are estimated: 1) the main effect of profile membership (“profile”), 2) main effects of profile membership and the distinguishing variable (“profilePlusDist”), and 3) the interaction of profile membership and the distinguishing variable (“profileByDist”). If the system variable is not normally distributed, any of the generalized linear models supported by glm (for dyadic system variables) or glmer (for individual system variables) are available by specifying the “family” distribution (use ?sysVarOut for more information).

For normally distributed system variables, the function returns a list including the lm or lme objects containing the full results for each model (called “models”). Similarly, for non-normal system variables, the function returns a list of the glm or glmer objects containing the full results for the models.

We start with an example of a normally (or at least close enough to normally) distributed dyadic system variable, “dyadInfluence”, which was the average self-report for each couple of their attempts to influence each other during a conversation. We first run the “sysVarOut” function and then use the “sysVarOutResults” function to produce a set of results from the model. Since this is a dyadic system variable, the only model is the main effect of profile and we compare it to the base model. The results include a Likelihood Ratio test (LRT) of the overall model fit compared to the null (base) model, an omnibus anova table (which is identical to the LRT for a dyadic system variable), and the parameter estimates. In this example, we see there is no effect of profile on influence attempts.

sysOut <- sysVarOut(fullData=fullData, sysVar_name="dyadInfluence", sysVarType="dyadic", dist0name="Men", dist1name="Women")
#> Model names are base & profile
sysVarOutResults(sysOut$models$base, sysOut$models$profile)
#> $modelCompare
#> Analysis of Variance Table
#> 
#> Model 1: sysVar ~ 1
#> Model 2: sysVar ~ profile
#>   Res.Df    RSS Df Sum of Sq     F Pr(>F)
#> 1     71 95.420                          
#> 2     70 95.319  1   0.10076 0.074 0.7864
#> 
#> $omnibus
#> Analysis of Variance Table
#> 
#> Response: sysVar
#>           Df Sum Sq Mean Sq F value Pr(>F)
#> profile    1  0.101 0.10076   0.074 0.7864
#> Residuals 70 95.319 1.36170               
#> 
#> $paramEst
#> 
#> Call:
#> stats::lm(formula = sysVar ~ profile, data = basedata, na.action = na.exclude)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -2.46000 -0.71212 -0.08606  0.87333  3.04000 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)   
#> (Intercept) -0.54000    0.16503  -3.272  0.00166 **
#> profile2    -0.08121    0.29855  -0.272  0.78640   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.167 on 70 degrees of freedom
#> Multiple R-squared:  0.001056,   Adjusted R-squared:  -0.01321 
#> F-statistic: 0.074 on 1 and 70 DF,  p-value: 0.7864

We can also use the “sysVarOutPlots” function to produce a histogram of the residuals from the model and a boxplot of the system variable for each profile.

sysVarOutPlots(fullData=fullData, sysVar_name="dyadSup", sysVarType="dyadic", testModel=sysOut$models$profile, dist0name=NULL, dist1name=NULL)
#> [[1]]
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#> 
#> [[2]]

As a second example, we consider a binomial individual level system variable, “ambivB” which was a self-report of ambivalence about the relationship, answered yes vs. no. Since this is an individual system variable, three models are now estimated: 1) main effect of profile, 2) main effects of profile and the distinguishing variable, and 2) the interaction of profile by the distinguishing variable. We call “sysVarOutResults” three times, once for each model, again comparing each to the base model. The results contain the same information as described for the previous example, but since this a logistic model (due to the binomial system variable) the odds ratios obtained by exponentiating the parameter estimates are also provided. As in the previous example, we find no evidence of any effect of profiles or the distinguishing variable on ambivalence.

sysOut <- sysVarOut(fullData=fullData, sysVar_name="ambivB", sysVarType="indiv", dist0name="Men", dist1name="Women", family="binomial")
#> Model names are base, profile, profilePlusDist and profileByDist
sysVarOutResults(sysOut$models$base, sysOut$models$profile, Gaussian=F)
#> $modelCompare
#> Data: basedata
#> Models:
#> baseModel: sysVar ~ 1 + (1 | dyad)
#> testModel: sysVar ~ profile + (1 | dyad)
#>           npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
#> baseModel    2 182.65 188.50 -89.324   178.65                     
#> testModel    3 184.48 193.26 -89.239   178.48 0.1688  1     0.6812
#> 
#> $paramEst
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: sysVar ~ profile + (1 | dyad)
#>    Data: basedata
#> Control: lme4::glmerControl()
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    184.5    193.3    -89.2    178.5      135 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -1.3067 -0.9695  0.5603  0.6037  0.8209 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  dyad   (Intercept) 0.8434   0.9184  
#> Number of obs: 138, groups:  dyad, 72
#> 
#> Fixed effects:
#>             Estimate Std. Error z value Pr(>|z|)  
#> (Intercept)   0.7554     0.2988   2.528   0.0115 *
#> profile2     -0.1964     0.4792  -0.410   0.6819  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Correlation of Fixed Effects:
#>          (Intr)
#> profile2 -0.548
#> 
#> $oddsRatio
#> (Intercept)    profile2 
#>   2.1284133   0.8216445
sysVarOutResults(sysOut$models$base, sysOut$models$profilePlusDist, Gaussian=F)
#> $modelCompare
#> Data: basedata
#> Models:
#> baseModel: sysVar ~ 1 + (1 | dyad)
#> testModel: sysVar ~ profile + dist + (1 | dyad)
#>           npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
#> baseModel    2 182.65 188.50 -89.324   178.65                     
#> testModel    4 185.11 196.81 -88.553   177.11 1.5413  2     0.4627
#> 
#> $paramEst
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: sysVar ~ profile + dist + (1 | dyad)
#>    Data: basedata
#> Control: lme4::glmerControl()
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    185.1    196.8    -88.6    177.1      134 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -1.4658 -0.9916  0.5059  0.6612  0.9221 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  dyad   (Intercept) 0.9339   0.9664  
#> Number of obs: 138, groups:  dyad, 72
#> 
#> Fixed effects:
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)   0.5432     0.3481   1.560    0.119
#> profile2     -0.2020     0.4900  -0.412    0.680
#> distMen       0.4620     0.4003   1.154    0.248
#> 
#> Correlation of Fixed Effects:
#>          (Intr) profl2
#> profile2 -0.471       
#> distMen  -0.481 -0.020
#> 
#> $oddsRatio
#> (Intercept)    profile2     distMen 
#>   1.7215785   0.8170985   1.5872528
sysVarOutResults(sysOut$models$base, sysOut$models$profileByDist, Gaussian=F)
#> $modelCompare
#> Data: basedata
#> Models:
#> baseModel: sysVar ~ 1 + (1 | dyad)
#> testModel: sysVar ~ profile * dist + (1 | dyad)
#>           npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
#> baseModel    2 182.65 188.50 -89.324   178.65                     
#> testModel    5 186.94 201.58 -88.472   176.94 1.7038  3     0.6361
#> 
#> $paramEst
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: sysVar ~ profile * dist + (1 | dyad)
#>    Data: basedata
#> Control: lme4::glmerControl()
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    186.9    201.6    -88.5    176.9      133 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -1.5091 -0.9337  0.5003  0.6280  0.8829 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  dyad   (Intercept) 0.9446   0.9719  
#> Number of obs: 138, groups:  dyad, 72
#> 
#> Fixed effects:
#>                  Estimate Std. Error z value Pr(>|z|)
#> (Intercept)       0.49170    0.36962   1.330    0.183
#> profile2         -0.03919    0.63610  -0.062    0.951
#> distMen           0.57399    0.48985   1.172    0.241
#> profile2:distMen -0.33846    0.84055  -0.403    0.687
#> 
#> Correlation of Fixed Effects:
#>             (Intr) profl2 distMn
#> profile2    -0.554              
#> distMen     -0.561  0.346       
#> prfl2:dstMn  0.337 -0.636 -0.575
#> 
#> $oddsRatio
#>      (Intercept)         profile2          distMen profile2:distMen 
#>        1.6350861        0.9615645        1.7753405        0.7128705

We can again use the “sysVarOutPlots” function, this time with “binomial=TRUE”, to produce a plot of the predicted probabilities of the system variable for each profile.

sysVarOutPlots(fullData=fullData, sysVar_name="ambivB", sysVarType="indiv", testModel=sysOut$models$profileByDist, dist0name=NULL, dist1name=NULL, binomial=T)

Predicting Profile Membership From the System Variable

The “sysVarIn” function turns the direction of prediction around and uses the system variable to predict couples’ profile memberships. It takes as arguments the name of the dataframe containing the profile membership scores combined with your original dataframe (created by the “makeFullData” function), the name of the column in the dataframe containing the variable you would like to use as the system variable, whether the system variable is “dyadic” or “individual”, and the number of profiles. If there are 2 profiles, then binomial regression models are used. If there are more than 2 profiles then multinomial regression is used. For dyadic system variables, a couple’s shared score is the only predictor of their profile membership (the model is called “sysVarMain”). For individual system variables, two models are tested, one with the main effects of both partner’s system variable (“sysVarMain”) and one with the main effects and their interaction (“sysVarInteract”). In both cases an intercept-only model is included as a comparison point (called “base”). The function returns a list of the model objects and result summaries can be obtained with the “sysVarInResults” function. The results include a Chisquare Deviance test (ChiSq) of the overall model fit compared to the null (base) model, the parameter estimates and odds ratios obtained by exponentiating the parameter estimates.

We first consider a dyadic system variable and again find no evidence for it predicting the profiles.

sysIn <- sysVarIn(fullData=fullData, sysVar_name="dyadInfluence", n_profiles=2, sysVarType="dyadic")
#> Model names are base and sysVarMain
sysVarInResults(sysIn$models$base, sysIn$models$sysVarMain, n_profiles=2)
#> $modelCompare
#> Analysis of Deviance Table
#> 
#> Model 1: profileN ~ 1
#> Model 2: profileN ~ sysVar
#>   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
#> 1        71     88.632                     
#> 2        70     88.556  1 0.076169   0.7826
#> 
#> $paramEst
#> 
#> Call:
#> stats::glm(formula = profileN ~ sysVar, family = "binomial", 
#>     data = basedata, na.action = na.exclude)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -0.9083  -0.8605  -0.8413   1.5144   1.6022  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)   
#> (Intercept) -0.85667    0.28848  -2.970  0.00298 **
#> sysVar      -0.06146    0.22302  -0.276  0.78286   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 88.632  on 71  degrees of freedom
#> Residual deviance: 88.556  on 70  degrees of freedom
#> AIC: 92.556
#> 
#> Number of Fisher Scoring iterations: 4
#> 
#> 
#> $oddsRatio
#> (Intercept)      sysVar 
#>   0.4245736   0.9403886

We can also use the “sysVarInPlots” function to produce a plot of profile membership at observed values of the system variable.

inPlots <- sysVarInPlots(fullData=fullData, sysVar_name="dyadInfluence", sysVarType="dyadic", n_profiles=2)

Next we consider a categorical individual level system variable. Because it is at the individual level, there are two models estimated, one for the main effects of each partner’s system variable and one for their interaction, so we use the “sysVarInResults” function twice, once for each model. We still see no evidence of any effects.

sysIn <- sysVarIn(fullData=fullData, sysVar_name="conflictCat", n_profiles=2, sysVarType="indiv")
#> Model names are base, sysVarMain and sysVarInteract
sysVarInResults(sysIn$models$base, sysIn$models$sysVarMain, n_profiles=2)
#> $modelCompare
#> Analysis of Deviance Table
#> 
#> Model 1: profileN ~ 1
#> Model 2: profileN ~ sysVar0 + sysVar1
#>   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
#> 1        65     84.020                     
#> 2        61     82.709  4    1.311   0.8595
#> 
#> $paramEst
#> 
#> Call:
#> stats::glm(formula = profileN ~ sysVar0 + sysVar1, family = "binomial", 
#>     data = basedata, na.action = na.exclude)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -1.0151  -0.9857  -0.7755   1.3489   1.7717  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)  
#> (Intercept) -1.04739    0.58454  -1.792   0.0732 .
#> sysVar02     0.07459    0.57108   0.131   0.8961  
#> sysVar03    -0.28859    0.94766  -0.305   0.7607  
#> sysVar12     0.57825    0.62183   0.930   0.3524  
#> sysVar13    -0.07468    0.99398  -0.075   0.9401  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 84.020  on 65  degrees of freedom
#> Residual deviance: 82.709  on 61  degrees of freedom
#> AIC: 92.709
#> 
#> Number of Fisher Scoring iterations: 4
#> 
#> 
#> $oddsRatio
#> (Intercept)    sysVar02    sysVar03    sysVar12    sysVar13 
#>   0.3508509   1.0774383   0.7493213   1.7829217   0.9280382
sysVarInResults(sysIn$models$base, sysIn$models$sysVarInteract, n_profiles=2)
#> $modelCompare
#> Analysis of Deviance Table
#> 
#> Model 1: profileN ~ 1
#> Model 2: profileN ~ sysVar0 * sysVar1
#>   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
#> 1        65      84.02                     
#> 2        57      70.93  8   13.089   0.1088
#> 
#> $paramEst
#> 
#> Call:
#> stats::glm(formula = profileN ~ sysVar0 * sysVar1, family = "binomial", 
#>     data = basedata, na.action = na.exclude)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -1.1774  -0.9005  -0.6681   1.1774   1.7941  
#> 
#> Coefficients:
#>                     Estimate Std. Error z value Pr(>|z|)
#> (Intercept)          -0.4055     0.6455  -0.628    0.530
#> sysVar02            -18.1606  2306.1011  -0.008    0.994
#> sysVar03             18.9715  6522.6387   0.003    0.998
#> sysVar12             -0.3830     0.8412  -0.455    0.649
#> sysVar13            -18.1606  6522.6386  -0.003    0.998
#> sysVar02:sysVar12    18.9491  2306.1012   0.008    0.993
#> sysVar03:sysVar12   -19.5694  6522.6388  -0.003    0.998
#> sysVar02:sysVar13    36.0335  6918.3031   0.005    0.996
#> sysVar03:sysVar13   -18.9715 11297.5415  -0.002    0.999
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 84.02  on 65  degrees of freedom
#> Residual deviance: 70.93  on 57  degrees of freedom
#> AIC: 88.93
#> 
#> Number of Fisher Scoring iterations: 17
#> 
#> 
#> $oddsRatio
#>       (Intercept)          sysVar02          sysVar03          sysVar12 
#>      6.666667e-01      1.297030e-08      1.734732e+08      6.818182e-01 
#>          sysVar13 sysVar02:sysVar12 sysVar03:sysVar12 sysVar02:sysVar13 
#>      1.297030e-08      1.696182e+08      3.170519e-09      4.458214e+15 
#> sysVar03:sysVar13 
#>      5.764579e-09

Finally, we can produce a plot of the model predicted probabilities of being in each profile for the different combinations of the partner’s system variables.

inPlots <- sysVarInPlots(fullData=fullData, sysVar_name="conflictCat", sysVarType="indiv", n_profiles=2, testModel= sysIn$models$sysVarInteract, dist0name="men", dist1name="women")