| Title: | Mixed, Low-Rank, and Sparse Multivariate Regression on High-Dimensional Data |
|---|---|
| Description: | Mixed, low-rank, and sparse multivariate regression ('mixedLSR') provides tools for performing mixture regression when the coefficient matrix is low-rank and sparse. 'mixedLSR' allows subgroup identification by alternating optimization with simulated annealing to encourage global optimum convergence. This method is data-adaptive, automatically performing parameter selection to identify low-rank substructures in the coefficient matrix. |
| Authors: | Alexander White [aut, cre] |
| Maintainer: | Alexander White <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2025-03-02 06:06:48 UTC |
| Source: | https://github.com/alexanderjwhite/mixedlsr |

Compute Bayesian information criterion for a mixedLSR model
Usage:

    bic_lsr(a, n, llik)

| Argument | Description |
|---|---|
| a | A list of coefficient matrices. |
| n | The sample size. |
| llik | The log-likelihood of the model. |

Value: The BIC.
Examples:

    n <- 50
    simulate <- simulate_lsr(n)
    model <- mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)
    bic_lsr(model$A, n = n, model$llik)
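For orientation, a generic BIC trades off fit against model size as -2 * llik + log(n) * df. The sketch below is a minimal base-R illustration, not the package's implementation: the helper `bic_sketch` is hypothetical, and counting nonzero coefficients as the degrees of freedom is an assumption. The degrees-of-freedom rule that `bic_lsr` applies to low-rank matrices is not stated here and may differ.

```r
# Hypothetical helper, not part of mixedLSR: generic BIC = -2 * llik + log(n) * df.
# Assumption: df is the total number of nonzero coefficients across all groups.
bic_sketch <- function(a, n, llik) {
  df <- sum(vapply(a, function(mat) sum(mat != 0), integer(1)))
  -2 * llik + log(n) * df
}

# Two toy coefficient matrices with 3 and 2 nonzero entries, respectively.
a_toy <- list(
  matrix(c(1, 0, 0, 2, 0, 3), nrow = 2),
  matrix(c(0, 0, 4, 0, 5, 0), nrow = 2)
)
bic_toy <- bic_sketch(a_toy, n = 50, llik = -120)
```

Under this assumed rule, a sparser set of coefficient matrices lowers the penalty term, so a lower BIC favors models that fit well with fewer nonzero coefficients.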
Mixed Low-Rank and Sparse Multivariate Regression for High-Dimensional Data
Usage:

    mixed_lsr(
      x, y, k,
      nstart = 1,
      init_assign = NULL,
      init_lambda = NULL,
      alt_iter = 5,
      anneal_iter = 1000,
      em_iter = 1000,
      temp = 1000,
      mu = 0.95,
      eps = 1e-06,
      accept_prob = 0.95,
      sim_N = 200,
      verbose = TRUE
    )

| Argument | Description |
|---|---|
| x | A matrix of predictors. |
| y | A matrix of responses. |
| k | The number of groups. |
| nstart | The number of random initializations; the result with the maximum likelihood is returned. |
| init_assign | A vector of initial assignments, NULL by default. |
| init_lambda | A vector of initial values for the penalization parameter of each group, e.g., c(1,1,1). NULL by default. |
| alt_iter | The maximum number of times to alternate between the classification expectation maximization algorithm and the simulated annealing algorithm. |
| anneal_iter | The maximum number of simulated annealing iterations. |
| em_iter | The maximum number of EM iterations. |
| temp | The initial simulated annealing temperature, temp > 0. |
| mu | The fraction by which the simulated annealing temperature is decreased, 0 < mu < 1. Once the best configuration cannot be improved, the temperature T is reduced to (mu)T. |
| eps | The final simulated annealing temperature, eps > 0. |
| accept_prob | The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments are only small perturbations of the current assignment. When closer to 0, trial assignments are closer to random. |
| sim_N | The number of simulated annealing iterations for reaching equilibrium. |
| verbose | A boolean indicating whether to print to screen. |

Value: A list containing the likelihood, the partition, the coefficient matrices, and the BIC.
Examples:

    simulate <- simulate_lsr(50)
    mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)
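The `temp`, `mu`, `eps`, and `accept_prob` arguments describe a geometric cooling schedule and a perturbation-based proposal. The base-R sketch below illustrates how such parameters typically interact; it is not the package's code. The helper names `cooling_schedule` and `perturb_assignment` are hypothetical, and the proposal rule (each observation keeps its label with probability `accept_prob`, otherwise it is redrawn uniformly) is an assumption made for illustration.

```r
# Hypothetical helpers, not part of mixedLSR.

# Geometric cooling: T, mu * T, mu^2 * T, ... for as long as the temperature exceeds eps.
cooling_schedule <- function(temp = 1000, mu = 0.95, eps = 1e-06) {
  stopifnot(temp > 0, eps > 0, mu > 0, mu < 1)
  temps <- numeric(0)
  current <- temp
  while (current > eps) {
    temps <- c(temps, current)
    current <- mu * current
  }
  temps
}

# Assumed proposal: each observation keeps its label with probability
# accept_prob and is otherwise reassigned uniformly at random among k groups.
perturb_assignment <- function(assign, k, accept_prob = 0.95) {
  redraw <- runif(length(assign)) > accept_prob
  assign[redraw] <- sample.int(k, sum(redraw), replace = TRUE)
  assign
}

set.seed(1)
temps <- cooling_schedule()
current <- sample.int(2, 100, replace = TRUE)
trial <- perturb_assignment(current, k = 2)
```

With the documented defaults, this schedule passes through roughly 400 temperature levels before reaching `eps`. Under the assumed proposal, an `accept_prob` near 1 changes only a few labels per trial, which matches the "small perturbations" behavior described in the argument table.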
Heatmap Plot of the mixedLSR Coefficient Matrices
Usage:

    plot_lsr(a, abs = TRUE)

| Argument | Description |
|---|---|
| a | A coefficient matrix from a mixed_lsr model. |
| abs | A boolean indicating whether to take the absolute value of the coefficient matrix. |

Value: A ggplot2 heatmap of the coefficient matrix, separated by subgroup.
Examples:

    simulate <- simulate_lsr()
    plot_lsr(simulate$a)
Simulate Heterogeneous, Low-Rank, and Sparse Data
Usage:

    simulate_lsr(
      N = 100,
      k = 2,
      p = 30,
      m = 35,
      b = 1,
      d = 20,
      h = 0.2,
      case = "independent"
    )

| Argument | Description |
|---|---|
| N | The sample size, default = 100. |
| k | The number of groups, default = 2. |
| p | The number of predictor features, default = 30. |
| m | The number of response features, default = 35. |
| b | The signal-to-noise ratio, default = 1. |
| d | The singular value, default = 20. |
| h | The lower bound for the singular matrix simulation, default = 0.2. |
| case | The covariance case, "independent" or "dependent", default = "independent". |

Value: A list of simulation values, including the x matrix, the y matrix, the coefficients, and the true clustering assignments.
Examples:

    simulate_lsr()
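To make "low-rank and sparse" concrete, the base-R sketch below builds one p x m coefficient matrix that has rank at most r and only a few nonzero rows, then generates responses from it. This is an illustrative construction only, not the generator used by `simulate_lsr()`. The rank `r`, the number of active rows `s`, and the unit-variance noise are all assumptions.

```r
# Hypothetical construction, not the generator used by simulate_lsr().
set.seed(42)
p <- 30; m <- 35   # predictor and response dimensions (the documented defaults)
r <- 2             # assumed rank of the coefficient matrix
s <- 10            # assumed number of active (nonzero) predictor rows

u <- matrix(0, nrow = p, ncol = r)
u[seq_len(s), ] <- rnorm(s * r)                # only the first s rows carry signal
v <- matrix(rnorm(m * r), nrow = m, ncol = r)
a <- u %*% t(v)                                # p x m, rank at most r, row-sparse

x <- matrix(rnorm(100 * p), nrow = 100)
y <- x %*% a + matrix(rnorm(100 * m), nrow = 100)   # responses with unit-variance noise

rank_a <- qr(a)$rank
active_rows <- sum(rowSums(a != 0) > 0)
```

A row of zeros in `a` means the corresponding predictor has no effect on any response, which is the row-wise sparsity this sketch assumes. A mixture setting would repeat this construction once per group with a different `a`.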
Simulate Heterogeneous, Low-Rank, and Sparse Data with Autoregressive Response
Usage:

    simulate_response(
      N = 100,
      k = 2,
      p = 30,
      m = 35,
      b = 1,
      d = 20,
      h = 0.2,
      case = "independent",
      response = "independent"
    )

| Argument | Description |
|---|---|
| N | The sample size, default = 100. |
| k | The number of groups, default = 2. |
| p | The number of predictor features, default = 30. |
| m | The number of response features, default = 35. |
| b | The signal-to-noise ratio, default = 1. |
| d | The singular value, default = 20. |
| h | The lower bound for the singular matrix simulation, default = 0.2. |
| case | The covariance case, "independent" or "dependent", default = "independent". |

Value: A list of simulation values, including the x matrix, the y matrix, the coefficients, and the true clustering assignments.
Examples:

    simulate_response()
Simulate Heterogeneous, Low-Rank Data with Varying Sparsity
Usage:

    simulate_sparse(k = 2, dense = 0.1)

| Argument | Description |
|---|---|
| k | The number of groups, default = 2. |
| dense | The density ratio (must be greater than 0). |

Value: A list of simulation values, including the x matrix, the y matrix, the coefficients, and the true clustering assignments.
Examples:

    simulate_sparse()