# MatrixLMnet

Core functions to obtain L_{1}-penalized estimates for matrix linear models. See the associated paper, "Sparse matrix linear models for structured high-throughput data", and its reproducible code for more details.

`MatrixLMnet`

is related to the `MatrixLM`

package, which provides core functions for closed-form least squares estimates for matrix linear models.

## Installation

The `MatrixLMnet`

package can be installed by running:

```
using Pkg
Pkg.add("MatrixLMnet")
```

For the most recent version, use:

```
using Pkg
Pkg.add(url = "https://github.com/senresearch/MatrixLMnet.jl", rev="master")
```

## Usage

```
using MatrixLMnet
```

First, construct a `RawData`

object consisting of the response variable `Y`

and row/column covariates `X`

and `Z`

. All three matrices must be passed in as 2-dimensional arrays. Note that the `contr`

function can be used to set up treatment and/or sum contrasts for categorical variables stored in a DataFrame. By default, `contr`

generates treatment contrasts for all specified categorical variables (`"treat"`

). Other options include `"sum"`

for sum contrasts, `"noint"`

for treatment contrasts with no intercept, and `"sumnoint"`

for sum contrasts with no intercept.

```
using DataFrames
using Random
# Dimensions of matrices
n = 100
m = 250
# Number of column covariates
q = 20
# Randomly generate an X matrix of row covariates with 2 categorical variables
# and 4 continuous variables
Random.seed!(1)
X_df = hcat(DataFrame(catvar1=rand(1:5, n), catvar2=rand(["A", "B", "C"], n)),
DataFrame(rand(n,4)))
# Use the contr function to get contrasts for the two categorical variables
# (treatment contrasts for catvar1 and sum contrasts for catvar2).
# contr returns a DataFrame, so X needs to be converted into a 2d array.
X = convert(Array{Float64,2}, contr(X_df, [:catvar1, :catvar2],
["treat", "sum"]))
# Number of row covariates
p = size(X)[2]
# Randomly generate some data for column covariates Z and response variable Y
Z = rand(m,q)
B = rand(vcat(-5:5, zeros(19)),p,q)
E = randn(n,m)
Y = X*B*transpose(Z)+E
# Construct a RawData object
dat = RawData(Response(Y), Predictors(X, Z))
```

Create a 1d array of lambda penalty values to fit the estimates. If the lambdas are not in descending order, they will be automatically sorted by `mlmnet`

.

```
lambdas = reverse(1.8.^(1:10))
```

L_{1}-penalized estimates for matrix linear models can be obtained by running `mlmnet`

. In addition to the `RawData`

object and `lambdas`

, `mlmnet`

requires as an argument the function name for an algorithm used to fit L_{1}-penalized estimates. Current options are:`cd!`

(coordinate descent), `cd_active!`

(active coordinate descent), `ista!`

(ISTA with fixed step size), `fista!`

(FISTA with fixed step size), `fista_bt!`

(FISTA with backtracking), and `admm!`

(ADMM).

An object of type `Mlmnet`

will be returned, with variables for the penalized coefficient estimates (`B`

) and the lambda penalty values used (`lambdas`

). By default, `mlmnet`

estimates both row and column main effects (X and Z intercepts), but this behavior can be suppressed by setting `isXIntercept=false`

and/or `isZntercept=false`

; the intercepts will be regularized unless `isXInterceptReg=false`

and/or `isZInterceptReg=false`

. Individual `X`

(row) and `Z`

(column) effects can be left unregularized by manually passing in 1d boolean arrays of length `p`

and `q`

to indicate which effects should be regularized (`true`

) or not (`false`

) into `isXReg`

and `isZreg`

. By default, `mlmnet`

standardizes the columns of `X`

and `Z`

to have mean 0 and standard deviation 1 (`isStandardize=true`

). Additional keyword arguments include `isVerbose`

, which controls message printing; `thresh`

, the threshold at which the coefficients are considered to have converged; and `maxiter`

, the maximum number of iterations.

```
est = mlmnet(fista_bt!, dat, lambdas)
```

The functions for the algorithms used to fit the L_{1}-penalized estimates have keyword arguments that can be passed into `mlmnet`

when non-default behavior is desired. Irrelevant keyword arguments will be ignored.

`cd!`

(coordinate descent)

`isRandom= true`

: Bool; whether to use random or cyclic updates

`cd_active!`

(active coordinate descent)

`isRandom = true`

: Bool; whether to use random or cyclic updates

`ista!`

(ISTA with fixed step size)

`stepsize = 0.01`

: Float64; fixed step size for updates`setStepsize = true`

: Bool; whether the fixed step size should be calculated, overriding`stepsize`

`fista!`

(FISTA with fixed step size)

`stepsize = 0.01`

: Float64; fixed step size for updates`setStepsize = true`

: Bool; whether the fixed step size should be calculated, overriding`stepsize`

`fista_bt!`

(FISTA with backtracking)

`stepsize = 0.01`

: Float64; initial step size for updates`gamma = 0.5`

: Float64; multiplying factor for step size backtracking/line search

`admm!`

(ADMM)

`rho = 1.0`

: Float64; parameter that controls ADMM tuning`setRho = true`

: Float64; whether the ADMM tuning parameter should be calculated, overriding`rho`

`tau_incr = 2.0`

: Float64; parameter that controls the factor at which the ADMM tuning parameter increases`tau_decr = 2.0`

: Float64; parameter that controls the factor at which the ADMM tuning parameter decreases`mu = 10.0`

: Float64; parameter that controls the factor at which the primal and dual residuals should be within each other

The 3d array of coefficient estimates can be accessed using `coef(est)`

. Predicted values and residuals can be obtained by calling `predict`

and `resid`

. By default, both of these functions use the same data used to fit the model. However, a new `Predictors`

object can be passed into `predict`

as the `newPredictors`

argument and a new `RawData`

object can be passed into `resid`

as the `newData`

argument. For convenience, `fitted(est)`

will return the fitted values by calling `predict`

with the default arguments.

```
preds = predict(est)
resids = resid(est)
```

All four of these functions take an optional `lambda`

argument, in which case only the 2d array corresponding to that value of lambda will be returned, e.g. `coef(est, lambdas[1])`

. (If a lambda value that was not used in the fitting of the `Mlmnet`

object is passed in, an error will be rasied.) One can also extract coefficients as a flattened 2d array by calling `coef_2d(est)`

, for convenience when writing the results to flat files.

`mlmnet_perms`

permutes the response matrix `Y`

in a `RawData`

object and then calls `mlmnet`

. By default, the function used to permute `Y`

is `shuffle_rows`

, which shuffles the rows for `Y`

. Alternative functions for permuting `Y`

, such as `shuffle_cols`

, can be passed into the argument `permFun`

. Non-default behavior for `mlmnet`

can be specified by passing its keyword arguments into `mlmnet_perms`

.

```
estPerms = mlmnet_perms(fista_bt!, dat, lambdas)
```

Cross-validation for `mlmnet`

is implemented by `mlmnet_cv`

. The user can either manually specify the row/column folds of `Y`

as a 1d array of 1d arrays of row/column indices, or specify the number of folds that should be used. If the number of folds is specified, disjoint folds of approximately equal size will be generated from a call to `make_folds`

. Passing in `1`

for the number of row (or column) folds indicates that all of the rows (or columns) of `Y`

should be used in each fold. The advantage of manually passing in the row and/or column folds is that it allows the user to stratify or otherwise control the nature of the folds. For example, `make_folds_conds`

will generate folds for a set of categorical conditions and ensure that each condition is represented in each fold. Cross validation is computed in parallel when possible. Non-default behavior for `mlmnet`

can be specified by passing its keyword arguments into `mlmnet_cv`

.

In the call below, `mlmnet_cv`

generates 10 disjoint row folds but uses all columns of `Y`

in each fold (indicated by the `1`

). The function returns an `Mlmnet_cv`

object, which contains an array of the Mlmnet objects for each fold (`MLMNets`

); the lambda penalty values used (`lambdas`

); the row and column folds (`rowFolds`

and `colFolds`

); an array of the mean-squared error for each fold (`mse`

); and an array of the proportion of zero interaction effects for each fold (`propZero`

). The keyword argument `dig`

in `mlmnet_cv`

adjusts the level of precision when calculating the percent of zero coefficients. It defaults to `12`

.

```
Random.seed!(120)
estCVObjs = mlmnet_cv(fista_bt!, dat, lambdas, 10, 1)
```

`mlmnet_cv_summary`

displays a table of the average mean-squared error and proportion of zero coefficients across the folds for each value of lambda. The optimal lambda might be the one that minimizes the mean-squared error (MSE), or can be chosen based on a pre-determined proportion of zeros that is desired in the coefficient estimates.

```
println(mlmnet_cv_summary(estCVObjs))
```

The `lambda_min`

function returns the summary information for the lambdas that correspond to the minimum average test MSE across folds and the MSE that is one standard error greater.

```
lambda_min(estCVObjs)
```

Additional details can be found in the documentation for specific functions.