NonConvexPenalizedRegression.jl

Regularization paths for MCP and SCAD penalized regression models
Author pnavaro
Popularity
2 Stars
Updated Last
4 Months Ago
Started In
April 2020

Dev Build Status Codecov

NonConvexPenalizedRegression.jl

Regularization Paths for SCAD and MCP Penalized Regression Models

This is a quick, naive and partial Julia translation of the R package ncvreg. Only gaussian family is translated but if you need more, just ask, i can do it if it is useful for someone.

Algorithm is described in Breheny P and Huang J (2011) "Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection". Annals of Applied Statistics, 5: 232–253

I needed to do regression with SCAD penalty and I can't find it in any Julia package. Perharps it is now implemented in MLJLinearModels.jl.

julia> using LinearAlgebra 
julia> using Random
julia> using NonConvexPenalizedRegression 
julia> using RCall
julia> rng = MersenneTwister(1234);
julia> n, p = 50, 5
(50, 5)

julia> X = randn(rng, n, p)              # feature matrix
50×5 Array{Float64,2}:
  0.867347   -1.22672     0.183976    0.0377058   1.48494
 -0.901744   -0.541716   -1.27635    -0.490009    1.23969
  ⋮
 -1.00978    -1.66323    -0.744522    0.427383   -1.37986
 -0.543805   -0.521229   -0.191176   -0.492253   -0.984217

julia> a0 = collect(1:p)                # ground truths
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

julia> y = X * a0 + 0.1 * randn(n) # generate response
50-element Array{Float64,1}:
   6.411769869798991
  -1.59817739694925
   ⋮
 -11.769008550241434
  -9.107294931708777

julia> XX = hcat(X, randn(rng, n, p))
50×10 Array{Float64,2}:
  0.867347   -1.22672     0.183976    0.0377058  …   1.69129     0.969694    0.222167    -0.953909
  ⋮                                              ⋱
 -0.543805   -0.521229   -0.191176   -0.492253      -2.5788     -0.329958    0.00775707  -0.370354

julia> @rput XX
50×10 Array{Float64,2}:
  0.867347   -1.22672     0.183976    0.0377058  …   1.69129     0.969694    0.222167    -0.953909
  ⋮                                              ⋱
 -0.543805   -0.521229   -0.191176   -0.492253      -2.5788     -0.329958    0.00775707  -0.370354

julia> @rput y
50-element Array{Float64,1}:
   6.411769869798991
  -1.59817739694925
   ⋮
 -11.769008550241434
  -9.107294931708777

julia> R"library(ncvreg)"
RObject{StrSxp}
[1] "ncvreg"    "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "methods"   "base"

julia> R"scad <- coef(ncvreg(XX, y, lambda=0.2, penalty='SCAD', eps=.0001))"
RObject{RealSxp}
 (Intercept)           V1           V2           V3           V4           V5
-0.003322903  1.025666431  2.000108987  2.983498545  3.997543804  4.982264144
          V6           V7           V8           V9          V10
 0.000000000  0.000000000  0.000000000  0.000000000  0.000000000

julia> @rget scad
11-element Array{Float64,1}:
 -0.0033229032078964105
  1.0256664305062173
  2.000108987156345
  2.9834985454557157
  3.997543803872521
  4.982264143916537
  0.0
  0.0
  0.0
  0.0
  0.0

julia> println( " R scad = $scad")
 R scad = [-0.0033229032078964105, 1.0256664305062173, 2.000108987156345, 2.9834985454557157, 3.997543803872521, 4.982264143916537, 0.0, 0.0, 0.0, 0.0, 0.0]

julia> λ = [0.2]
1-element Array{Float64,1}:
 0.2

julia> scad = NonConvexPenalizedRegression.coef(SCAD(XX, y, λ))
SCAD([-0.003322960709765954; 1.0256660512338405; … ; 0.0; 0.0])

julia> println( " Julia scad = $scad")
 Julia scad = SCAD([-0.003322960709765954; 1.0256660512338405; 2.00010933635426; 2.983498839847109; 3.99754375703709; 4.982264100245242; 0.0; 0.0; 0.0; 0.0; 0.0])