Plotting Penalty Matrices • penfa

Introduction

Aim. This vignette shows how to plot the penalty matrices from the single- and multiple-group penalized factor models estimated in vignette("automatic-tuning-selection") and “multiple-group-analysis”.

Data. For illustration purposes, we use the cross-cultural data set ccdata containing the standardized ratings to 12 items concerning organizational citizenship behavior. Employees from different countries were asked to rate their attitudes towards helping other employees and giving suggestions for improved work conditions. The items are thought to measure two latent factors: help, defined by the first seven items (h1 to h7), and voice, represented by the last five items (v1 to v5). See ?ccdata for details.

This data set is a standardized version of the one in the ccpsyc package, and only considers employees from Lebanon and Taiwan (i.e., "LEB", "TAIW"). This vignette is meant as a demo of the capabilities of penfa; please refer to Fischer et al. (2019) and Fischer and Karl (2019) for a description and analysis of these data.

Let us load and inspect ccdata.

library(penfa)
data(ccdata)

summary(ccdata)
##    country                h1                 h2                h3                 h4         
##  Length:767         Min.   :-2.62004   Min.   :-2.9034   Min.   :-2.63082   Min.   :-3.0441  
##  Class :character   1st Qu.:-0.69516   1st Qu.:-0.2163   1st Qu.:-0.70356   1st Qu.:-0.2720  
##  Mode  :character   Median :-0.05354   Median : 0.4554   Median :-0.06114   Median : 0.4211  
##                     Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.0000  
##                     3rd Qu.: 0.58809   3rd Qu.: 0.4554   3rd Qu.: 0.58128   3rd Qu.: 0.4211  
##                     Max.   : 1.22971   Max.   : 1.1272   Max.   : 1.22370   Max.   : 1.1141  
##        h5                h6                h7                v1                  v2           
##  Min.   :-2.9105   Min.   :-2.9541   Min.   :-2.8364   Min.   :-2.627694   Min.   :-2.674430  
##  1st Qu.:-0.8662   1st Qu.:-0.9092   1st Qu.:-0.7860   1st Qu.:-0.660770   1st Qu.:-0.671219  
##  Median : 0.4966   Median : 0.4541   Median :-0.1025   Median :-0.005129   Median :-0.003482  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.000000   Mean   : 0.000000  
##  3rd Qu.: 0.4966   3rd Qu.: 0.4541   3rd Qu.: 0.5810   3rd Qu.: 0.650512   3rd Qu.: 0.664255  
##  Max.   : 1.1781   Max.   : 1.1358   Max.   : 1.2645   Max.   : 1.306154   Max.   : 1.331992  
##        v3                 v4                 v5          
##  Min.   :-2.65214   Min.   :-2.65722   Min.   :-2.51971  
##  1st Qu.:-0.68800   1st Qu.:-0.68041   1st Qu.:-0.61127  
##  Median :-0.03329   Median :-0.02148   Median : 0.02488  
##  Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.00000  
##  3rd Qu.: 0.62142   3rd Qu.: 0.63746   3rd Qu.: 0.66103  
##  Max.   : 1.27613   Max.   : 1.29639   Max.   : 1.29718

Penalized factor analysis

Let us fit the penalized factor model with alasso and automatic tuning procedure as described in vignette("automatic-tuning-selection").

# Specify syntax
syntax = 'help  =~   h1 + h2 + h3 + h4 + h5 + h6 + h7 + 0*v1 + v2 + v3 + v4 + v5
          voice =~ 0*h1 + h2 + h3 + h4 + h5 + h6 + h7 +   v1 + v2 + v3 + v4 + v5'

# Get adaptive weights
mle.fit <- penfa(## factor model 
                 model = syntax, 
                 data  = ccdata,
                 std.lv = TRUE,
                 ## (no) penalization
                 pen.shrink = "none",
                 eta = list(shrink = c("none" = 0), diff = c("none" = 0)),
                 strategy = "fixed",
                 verbose = FALSE)
mle.weights <- coef(mle.fit)


# Model fit
alasso.fit <- penfa(## factor model
                    model  = syntax,
                    data   = ccdata,
                    std.lv = TRUE,
                    ## penalization
                    pen.shrink = "alasso",
                    eta = list(shrink = c("lambda" = 0.01), diff = c("none" = 0)),
                    ## automatic procedure
                    strategy = "auto",
                    gamma = 4,
                    ## alasso
                    weights = mle.weights,
                    verbose = FALSE)
alasso.fit
## penfa 0.1.1 reached convergence
## 
##   Number of observations                                    767
##                                                                
##   Estimator                                                PMLE
##   Optimization method                              trust-region
##   Information                                            fisher
##   Strategy                                                 auto
##   Number of iterations (total)                               58
##   Number of two-steps (automatic)                             2
##   Effective degrees of freedom                           27.129
##                                                                
##   Penalty function:                                            
##     Sparsity                                             alasso
##                                                                
##

The penalty matrix can be extracted through the penmat function.

alasso_penmat <- penmat(alasso.fit)

Applying the plot method to the penalty matrix allows us to visualize an interactive heatmap of the log of the absolute value of the estimated penalty matrix. Due to space constraints, it may occur that some of the parameter labels on the axes are hidden. If it is the case, users can zoom in on the area of interest and inspect the corresponding penalty values.

plot(alasso_penmat)

Penalized Multiple-Group Factor Analysis

Let us fit the penalized multiple-group factor model with alasso and automatic multiple tuning procedure as described in “multiple-group-analysis”.

# Specify syntax
syntax.mg = 'help  =~ 1*h1 +          h2 +          h3 + h4 + h5 + h6 + h7 + 0*v1 + v2 + v3 + v4 + v5
             voice =~ 0*h1 + start(0)*h2 + start(0)*h3 + h4 + h5 + h6 + h7 + 1*v1 + v2 + v3 + v4 + v5
             
             h2 + h3 + h4 + h5 + h6 + h7 + v2 + v3 + v4 + v5 ~ 1 
             help  ~ NA*1
             voice ~ NA*1 '


# Get adaptive weights
mle.fitMG <- penfa(## factor model
                   model = syntax.mg,
                   data  = ccdata,
                   group = "country",
                   ## (no) penalization
                   pen.shrink = "none",
                   pen.diff = "none",
                   eta = list(shrink = c("lambda" = 0), diff = c("none" = 0)),
                   strategy = "fixed",
                   verbose = FALSE) 
mle.weightsMG <- coef(mle.fitMG)

# Model fit
alasso.fitMG <- penfa(## factor model
                      model = syntax.mg,
                      data = ccdata,
                      group = "country",
                      int.lv.free = TRUE,
                      ## penalization
                      pen.shrink = "alasso",
                      pen.diff = "alasso",
                      eta = list(shrink = c("lambda" = 0.01),
                                 diff   = c("lambda" = 0.1, "tau" = 0.01)),
                      ## automatic procedure
                      strategy = "auto",
                      gamma = 4,
                      ## alasso
                      weights = mle.weightsMG)
## 
## Automatic procedure: 
## Iteration  1 : 0.00215674 827.66749181 0.00300201 
## Iteration  2 : 0.00196781 29643.22874071 0.00772724 
## 
## Largest absolute gradient value: 62.07895697
## Fisher information matrix is positive definite
## Eigenvalue range: [53.00926, 8.675704e+12]
## Trust region iterations: 21 
## Factor solution: admissible 
## Effective degrees of freedom: 56.75804

The complete penalty matrix is stored in alasso.fit@Penalize@Sh.info$S.h, but it can be easily extracted via the penmat function. This matrix is the sum of the penalty matrices for Penalty 1 (sparsity.penmat), Penalty 2 (loadinvariance.penmat), and Penalty 3 (intinvariance.penmat). Unique variances, factor (co)variances and factor means were not affected by the penalization, so their entries in the penalty matrices are equal to zero.

Through the plot method, we can visualize an interactive heatmap of the log of the absolute value of each estimated penalty matrix (because of the wide element range). Due to space constraints given by the large number of parameters, some parameter labels on the axes may be hidden. In this case, users can zoom in on the area of interest and inspect the penalty values for the parameters of interest.

full.penmat <- penmat(alasso.fitMG)
plot(full.penmat)

The above penalty matrix is the sum of the following three individuals penalty matrices.

Sparsity

This penalty matrix shrinks the small factor loadings of each group to zero. Apart from the group loadings, all the remaining entries of the penalty matrix are equal to zero.

sparsity.penmat <- penmat(alasso.fitMG, type = "shrink", which = "lambda")
plot(sparsity.penmat)

Loading invariance

This penalty matrix shrinks the pairwise group differences of the factor loadings to zero.

loadinvariance.penmat <- penmat(alasso.fitMG, type = "diff", which = "lambda")
plot(loadinvariance.penmat)

Intercept invariance

This penalty matrix shrinks the pairwise group differences of the intercepts to zero.

intinvariance.penmat <- penmat(alasso.fitMG, type = "diff", which = "tau")
plot(intinvariance.penmat)

R Session

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] penfa_0.1.1
## 
## loaded via a namespace (and not attached):
##   [1] nlme_3.1-152        matrixStats_0.59.0  fs_1.5.0           
##   [4] httr_1.4.2          rprojroot_2.0.2     evd_2.3-3          
##   [7] numDeriv_2016.8-1.1 tools_4.1.0         utf8_1.2.1         
##  [10] R6_2.5.0            lazyeval_0.2.2      rgeos_0.5-5        
##  [13] DBI_1.1.1           mgcv_1.8-35         colorspace_2.0-2   
##  [16] sp_1.4-5            tidyselect_1.1.1    mnormt_2.0.2       
##  [19] compiler_4.1.0      pspline_1.0-18      textshaping_0.3.5  
##  [22] desc_1.3.0          plotly_4.9.4.1      distrEx_2.8.0      
##  [25] scales_1.1.1        sfsmisc_1.1-11      mvtnorm_1.1-2      
##  [28] psych_2.1.6         pkgdown_1.6.1       systemfonts_1.0.2  
##  [31] stringr_1.4.0       digest_0.6.27       rmarkdown_2.9      
##  [34] trustOptim_0.8.6.2  pkgconfig_2.0.3     htmltools_0.5.1.1  
##  [37] VineCopula_2.4.2    fastmap_1.1.0       stabledist_0.7-1   
##  [40] ADGofTest_0.3       htmlwidgets_1.5.3   rlang_0.4.11       
##  [43] VGAM_1.1-5          cartography_3.0.0   farver_2.1.0       
##  [46] generics_0.1.0      jsonlite_1.7.2      crosstalk_1.1.1    
##  [49] dplyr_1.0.7         magrittr_2.0.1      scam_1.2-11        
##  [52] Matrix_1.3-3        Rcpp_1.0.7          munsell_0.5.0      
##  [55] fansi_0.5.0         abind_1.4-5         distr_2.8.0        
##  [58] lifecycle_1.0.0     stringi_1.7.3       yaml_2.2.1         
##  [61] MASS_7.3-54         gamlss.dist_5.3-2   matrixcalc_1.0-4   
##  [64] grid_4.1.0          parallel_4.1.0      crayon_1.4.1       
##  [67] lattice_0.20-44     splines_4.1.0       startupmsg_0.9.6   
##  [70] tmvnsim_1.0-2       knitr_1.33          ismev_1.42         
##  [73] pillar_1.6.1        stats4_4.1.0        GJRM_0.2-4         
##  [76] magic_1.5-9         glue_1.4.2          evaluate_0.14      
##  [79] trust_0.1-8         mitools_2.4         data.table_1.14.0  
##  [82] vctrs_0.3.8         tidyr_1.1.3         gtable_0.3.0       
##  [85] purrr_0.3.4         cachem_1.0.5        ggplot2_3.3.5      
##  [88] xfun_0.24           Rmpfr_0.8-4         survey_4.1-1       
##  [91] viridisLite_0.4.0   ragg_1.1.3          survival_3.2-11    
##  [94] pcaPP_1.9-74        gsl_2.1-6           tibble_3.1.2       
##  [97] copula_1.0-1        memoise_2.0.0       gmp_0.6-2          
## [100] ellipsis_0.3.2

References

Fischer, R., Ferreira, M. C., Van Meurs, N. et al. (2019). “Does Organizational Formalization Facilitate Voice and Helping Organizational Citizenship Behaviors? It Depends on (National) Uncertainty Norms.” Journal of International Business Studies, 50(1), 125-134. https://doi.org/10.1057/s41267-017-0132-6
Fischer, R., & Karl, J. A. (2019). “A Primer to (Cross-Cultural) Multi-Group Invariance Testing Possibilities in R.” Frontiers in psychology, 10, 1507. https://doi.org/10.3389/fpsyg.2019.01507
Geminiani, E. (2020). “A Penalized Likelihood-Based Framework for Single and Multiple-Group Factor Analysis Models.” PhD thesis, University of Bologna. http://amsdottorato.unibo.it/9355/
Geminiani, E., Marra, G., & Moustaki, I. (2021). “Single- and Multiple-Group Penalized Factor Analysis: A Trust-Region Algorithm Approach with Integrated Automatic Multiple Tuning Parameter Selection.” Psychometrika, 86(1), 65-95. https://doi.org/10.1007/s11336-021-09751-8