Ordinal Regression with plssem

This vignette demonstrates how to fit an ordinal regression model with plssem using the titanic dataset.

In plssem, regression-style model syntax like y ~ x1 + x2 is supported. When the dependent variable (and/or predictors) are ordinal. Ordinal variables can be supplied via the ordered argument, or by making sure they are ordered in the dataset.

Data

head(titanic[, c("Survived", "Age", "Sex", "Female", "Pclass")])
#>   Survived Age    Sex Female Pclass
#> 1        0  22   male      0      3
#> 2        1  38 female      1      1
#> 3        1  26 female      1      3
#> 4        1  35 female      1      1
#> 5        0  35   male      0      3
#> 6        0  NA   male      0      3

Linear ordinal regression

This model predicts survival as a function of age and sex.

m_linear <- "Survived ~ Age + Female"

fit_linear <- pls(
  m_linear,
  data          = titanic,
  ordered       = "Survived",
  boot.R        = 50,
  bootstrap     = TRUE,
  boot.parallel = "multicore",
  boot.ncpus    = 2
)

summary(fit_linear)
#> plssem (0.1.1) ended normally after 1 iterations
#> 
#>   Estimator                                    OrdPLSc
#>   Link                                          PROBIT
#>                                                       
#>   Number of observations                           714
#>   Number of iterations                               1
#>   Number of latent variables                         0
#>   Number of observed variables                       3
#> 
#> Fit Measures:
#>   Chi-Square                                     0.000
#>   Degrees of Freedom                                 0
#>   SRMR                                           0.000
#>   RMSEA                                            NaN
#> 
#> R-squared (latents):
#>   Survived                                       0.347
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Survived ~    
#>     Age            -0.043      0.040   -1.076    0.282
#>     Female          0.584      0.028   21.200    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Age ~~        
#>     Female         -0.093      0.043   -2.194    0.028
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .Survived        0.653      0.033   19.749    0.000
#>     Age             1.000                             
#>     Female          1.000

Optional: evaluate predictive performance.

pls_predict(fit_linear, benchmark = "acc")
#> PlsSemPredict object
#> Available fields: $Y, $X.cont, $X.cont.pred, $X.ord, $X.ord.pred, $benchmark
#> 
#> Y [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.420 -0.530 -0.759
#> 2  0.743  0.571  1.317
#> 3  0.779 -0.255  1.317
#> 4  0.752  0.365  1.317
#> 5 -0.458  0.365 -0.759
#> 7 -0.515  1.673 -0.759
#> 
#> X.cont [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.653 -0.530 -0.759
#> 2  0.955  0.571  1.317
#> 3  0.955 -0.255  1.317
#> 4  0.955  0.365  1.317
#> 5 -0.653  0.365 -0.759
#> 7 -0.653  1.673 -0.759
#> 
#> X.cont.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.420 -0.530 -0.759
#> 2  0.743  0.571  1.317
#> 3  0.779 -0.255  1.317
#> 4  0.752  0.365  1.317
#> 5 -0.458  0.365 -0.759
#> 7 -0.515  1.673 -0.759
#> 
#> X.ord [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> X.ord.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> Benchmark summary
#> acc: n=1, mean=0.780, median=0.780, min=0.780, max=0.780
#>  variable value
#>  Survived 0.780

Non-linear ordinal regression (interaction)

To include a non-linear (interaction) effect, add an interaction term. With ordinal indicators and interactions, plssem automatically switches to the Monte-Carlo ordinal PLSc estimator.

m_int <- "Survived ~ Age + Female + Age:Female"

fit_int <- pls(
  m_int,
  data          = titanic,
  ordered       = "Survived",
  boot.R        = 50,
  bootstrap     = TRUE,
  boot.parallel = "multicore",
  boot.ncpus    = 2
)

summary(fit_int)
#> plssem (0.1.1) ended normally after 107 iterations
#> 
#>   Estimator                                  MCOrdPLSc
#>   Link                                          PROBIT
#>                                                       
#>   Number of observations                           714
#>   Number of iterations                             107
#>   Number of latent variables                         0
#>   Number of observed variables                       3
#> 
#> Fit Measures:
#>   Chi-Square                                     0.052
#>   Degrees of Freedom                                -3
#>   SRMR                                           0.002
#>   RMSEA                                            NaN
#> 
#> R-squared (latents):
#>   Survived                                       0.561
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Survived ~    
#>     Age            -0.073      0.044   -1.669    0.095
#>     Female          0.697      0.038   18.576    0.000
#>     Age:Female      0.250      0.067    3.735    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Age ~~        
#>     Female         -0.093      0.045   -2.053    0.040
#>     Age:Female      0.009      0.015    0.628    0.530
#>   Female ~~     
#>     Age:Female     -0.004      0.015   -0.251    0.802
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .Survived        0.439      0.072    6.086    0.000
#>     Age             1.000                             
#>     Female          1.000                             
#>     Age:Female      1.000
pls_predict(fit_int, benchmark = "acc")
#> PlsSemPredict object
#> Available fields: $Y, $X.cont, $X.cont.pred, $X.ord, $X.ord.pred, $benchmark
#> 
#> Y [714 x 4] (head)
#>   Survvd    Age Female Ag:Fml
#> 1 -0.366 -0.530 -0.759  0.495
#> 2  1.087  0.571  1.317  0.845
#> 3  0.876 -0.255  1.317 -0.242
#> 4  1.034  0.365  1.317  0.574
#> 5 -0.601  0.365 -0.759 -0.184
#> 7 -0.945  1.673 -0.759 -1.176
#> 
#> X.cont [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.644 -0.530 -0.759
#> 2  0.932  0.571  1.317
#> 3  0.932 -0.255  1.317
#> 4  0.932  0.365  1.317
#> 5 -0.644  0.365 -0.759
#> 7 -0.644  1.673 -0.759
#> 
#> X.cont.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.366 -0.530 -0.759
#> 2  1.087  0.571  1.317
#> 3  0.876 -0.255  1.317
#> 4  1.034  0.365  1.317
#> 5 -0.601  0.365 -0.759
#> 7 -0.945  1.673 -0.759
#> 
#> X.ord [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> X.ord.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> Benchmark summary
#> acc: n=1, mean=0.780, median=0.780, min=0.780, max=0.780
#>  variable value
#>  Survived 0.780