Introduction to mixedLSR

Setup

Mixed, low-rank, and sparse multivariate regression (mixedLSR) provides tools for mixture regression when the coefficient matrix is low-rank and sparse. mixedLSR identifies subgroups by alternating optimization, using simulated annealing to encourage convergence to a global optimum. The method is data-adaptive: it performs parameter selection automatically to identify low-rank substructures in the coefficient matrix.

library(mixedLSR)
set.seed(1)

Simulate Data

To demonstrate mixedLSR, we simulate a heterogeneous population in which the coefficient matrices are low-rank and sparse and the number of coefficients to estimate is much larger than the sample size (here, N = 100 observations, k = 2 subgroups, p = 30 predictors, and m = 35 responses).

sim <- simulate_lsr(N = 100, k = 2, p = 30, m = 35)
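
To make "low-rank and sparse" concrete, the following base R sketch builds a comparable data set by hand. It is an illustration only, not the internals of simulate_lsr(): the rank r, the number of active predictors s, the row-wise sparsity pattern, and the helper make_coef() are all assumptions chosen for the example.

```r
# Illustrative sketch only: NOT how simulate_lsr() is implemented.
set.seed(1)
N <- 100; k <- 2; p <- 30; m <- 35
r <- 2    # assumed rank of each coefficient matrix (hypothetical)
s <- 10   # assumed number of active predictors (hypothetical)

# Hypothetical helper: a p x m matrix of rank r whose rows are zero
# except for s randomly chosen "active" predictors.
make_coef <- function(p, m, r, s) {
  a <- matrix(0, p, m)
  rows <- sample(p, s)
  a[rows, ] <- matrix(rnorm(s * r), s, r) %*% matrix(rnorm(r * m), r, m)
  a
}

a_list <- lapply(seq_len(k), function(j) make_coef(p, m, r, s))
z <- sample(k, N, replace = TRUE)    # latent subgroup label per observation
x <- matrix(rnorm(N * p), N, p)      # predictors
y <- matrix(rnorm(N * m), N, m)      # start from noise, then add the signal
for (j in seq_len(k)) {
  idx <- z == j
  y[idx, ] <- y[idx, ] + x[idx, , drop = FALSE] %*% a_list[[j]]
}

n_coef <- k * p * m                                # coefficients to estimate
rank_a1 <- qr(a_list[[1]])$rank                    # numerical rank of subgroup 1
zero_rows <- sum(rowSums(a_list[[1]] != 0) == 0)   # inactive predictors
```

With these dimensions there are k * p * m coefficients in total, far more than the N observations available, which is the regime the low-rank and sparse structure is meant to handle.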

Compute Model

Next, we fit the model. We cap the number of iterations the model can run with the alt_iter, anneal_iter, and em_iter arguments.

model <- mixed_lsr(sim$x, sim$y, k = 2, alt_iter = 1, anneal_iter = 10, em_iter = 10, verbose = TRUE)
#> mixedLSR Start: 1 
#> Selecting Lambda..................................................
#> EM Step.....
#> Simulated Annealing Step
#> Full Cycle 1 
#> Computing Final Model...
#> Done!
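
The log above includes a simulated annealing step. As a rough intuition for what such a step does, here is a generic sketch on a toy problem: two subgroups with different slopes, where one label is flipped at a time and a worse assignment is occasionally accepted while the temperature is high. This is a textbook Metropolis-style scheme under assumed settings (the loss, the cooling rate 0.995, and 2000 proposals are hypothetical), not the algorithm implemented inside mixed_lsr().

```r
# Generic simulated annealing over cluster labels: NOT mixedLSR's internals.
set.seed(1)
n <- 60
x <- rnorm(n)
z_true <- rep(1:2, each = n / 2)
y <- c(2, -2)[z_true] * x + rnorm(n, sd = 0.1)   # slope 2 or -2 by subgroup

# Loss of a labelling: residual sum of squares after fitting one slope
# (regression through the origin) within each subgroup.
loss <- function(z) {
  total <- 0
  for (j in 1:2) {
    idx <- z == j
    if (sum(idx) < 2) return(Inf)                # forbid near-empty subgroups
    b <- sum(x[idx] * y[idx]) / sum(x[idx]^2)
    total <- total + sum((y[idx] - b * x[idx])^2)
  }
  total
}

z <- sample(1:2, n, replace = TRUE)              # random starting labels
start_loss <- loss(z)
cur <- start_loss
best <- cur
best_z <- z
temp <- 1                                        # assumed starting temperature
for (it in 1:2000) {
  cand <- z
  i <- sample(n, 1)
  cand[i] <- 3 - cand[i]                         # flip one label (1 <-> 2)
  cand_loss <- loss(cand)
  delta <- cand_loss - cur
  # Always accept improvements; accept worse moves with prob exp(-delta/temp).
  if (delta <= 0 || runif(1) < exp(-delta / temp)) {
    z <- cand
    cur <- cand_loss
    if (cur < best) {
      best <- cur
      best_z <- z
    }
  }
  temp <- temp * 0.995                           # assumed geometric cooling
}
```

Accepting some uphill moves early on is what lets the search leave a poor local optimum; as temp shrinks, the procedure behaves more and more like a greedy search.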

Clustering Performance

Next, we evaluate the clustering performance of mixedLSR by cross-tabulating the true and estimated partition labels and by computing the adjusted Rand index (ARI). In this case, mixedLSR clusters the data perfectly: every observation falls on the diagonal of the table and the ARI is 1.

table(sim$true, model$assign)
#>    
#>      1  2
#>   1 52  0
#>   2  0 48
ari <- mclust::adjustedRandIndex(sim$true, model$assign)
print(paste("ARI:",ari))
#> [1] "ARI: 1"
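
For readers who want to see what the ARI measures, the sketch below recomputes it from a contingency table in base R using the standard pair-counting formula. The helper ari_from_table() is a hypothetical name introduced here, not part of mixedLSR or mclust. The second call swaps the columns to show that the index does not depend on how the clusters are numbered, which matters because mixture labels are arbitrary.

```r
# Hypothetical helper: adjusted Rand index from a contingency table.
ari_from_table <- function(tab) {
  choose2 <- function(v) sum(choose(v, 2))   # number of pairs within counts
  n_pairs <- choose(sum(tab), 2)             # all pairs of observations
  index <- choose2(tab)                      # pairs grouped together in both
  row_pairs <- choose2(rowSums(tab))         # pairs together in partition 1
  col_pairs <- choose2(colSums(tab))         # pairs together in partition 2
  expected <- row_pairs * col_pairs / n_pairs
  max_index <- (row_pairs + col_pairs) / 2
  (index - expected) / (max_index - expected)
}

# The cross-tabulation reported above, entered by hand.
tab <- matrix(c(52, 0, 0, 48), nrow = 2)
ari_same <- ari_from_table(tab)
ari_swapped <- ari_from_table(tab[, 2:1])    # same partition, labels swapped
```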

Coefficient Heatmaps

Lastly, we view heatmaps of the estimated coefficient matrices (model$a) and compare them to the true simulated matrices (sim$a).

plot_lsr(model$a)

plot_lsr(sim$a)
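
A visual comparison can be complemented by a single number. The sketch below defines a relative Frobenius error between an estimated and a true coefficient matrix and applies it to two small toy matrices. The helper rel_frob_error() is a hypothetical name, and the toy inputs stand in for one estimated and one true matrix; how to extract matched per-subgroup matrices from model$a and sim$a is not shown here and depends on the structure of those objects.

```r
# Hypothetical helper: relative Frobenius error between two matrices.
rel_frob_error <- function(est, truth) {
  stopifnot(all(dim(est) == dim(truth)))
  norm(est - truth, type = "F") / norm(truth, type = "F")
}

# Toy stand-ins for a true and an estimated coefficient matrix.
truth <- matrix(c(1, 0, 0, 2), nrow = 2)
est <- truth + 0.1                    # every entry is off by 0.1
err <- rel_frob_error(est, truth)
err_exact <- rel_frob_error(truth, truth)
```

A value near zero indicates close agreement. Because subgroup labels are arbitrary, estimated and true matrices would need to be matched by subgroup before being compared this way.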

Reproducibility

sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mixedLSR_0.1.0 rmarkdown_2.29
#> 
#> loaded via a namespace (and not attached):
#>  [1] Matrix_1.7-3      gtable_0.3.6      jsonlite_2.0.0    compiler_4.4.3   
#>  [5] jquerylib_0.1.4   scales_1.3.0      yaml_2.3.10       fastmap_1.2.0    
#>  [9] lattice_0.22-6    ggplot2_3.5.1     R6_2.6.1          labeling_0.4.3   
#> [13] knitr_1.50        MASS_7.3-65       tibble_3.2.1      maketools_1.3.2  
#> [17] munsell_0.5.1     bslib_0.9.0       pillar_1.10.1     rlang_1.1.5      
#> [21] cachem_1.1.0      xfun_0.51         sass_0.4.9        sys_3.4.3        
#> [25] grpreg_3.5.0      viridisLite_0.4.2 cli_3.6.4         withr_3.0.2      
#> [29] magrittr_2.0.3    digest_0.6.37     grid_4.4.3        mclust_6.1.1     
#> [33] lifecycle_1.0.4   vctrs_0.6.5       evaluate_1.0.3    glue_1.8.0       
#> [37] farver_2.1.2      buildtools_1.0.0  colorspace_2.1-1  purrr_1.0.4      
#> [41] tools_4.4.3       pkgconfig_2.0.3   htmltools_0.5.8.1