SelfConcordantSmoothOptimization.jl
is a Julia package that implements the self-concordant regularization (SCORE) technique for (nonsmooth) convex optimization. In particular, SelfConcordantSmoothOptimization.jl considers problems of the form

$$\min_{x \in \mathbb{R}^n} \; f(x) + g(x),$$

where $f$ is smooth and convex, and $g$, which may be nonsmooth, is a proper, closed and convex regularization function.
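For instance, the $\ell_1$-regularized (sparse) logistic regression problem used in the examples below corresponds to

$$f(x) = \frac{1}{m}\sum_{i} \log\big(1 + \exp(-y_i\, a_i^\top x)\big), \qquad g(x) = \lambda \lVert x \rVert_1,$$

where $a_i^\top$ denotes the $i$-th row of the data matrix $A$, $y_i \in \{-1, 1\}$ are binary labels, and $\lambda > 0$ is the regularization weight (this notation is introduced here only for illustration).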
Install the package via Julia's REPL
with
using Pkg
Pkg.add("SelfConcordantSmoothOptimization")
# Load the package
using SelfConcordantSmoothOptimization
using Random, Distributions, SparseArrays
# Generate random data
Random.seed!(1234)
n, m = 100, 50;
A = Matrix(sprandn(n, m, 0.01));
y_prob = 1 ./ (1 .+ exp.(-A * zeros(m)));
y = rand.(Bernoulli.(y_prob));
unique_y = unique(y);
y = map(x -> x==unique_y[1] ? -1 : 1, y);
x0 = randn(m);
# Define objective function and choose problem parameters
f(A, y, x) = 1/m*sum(log.(1 .+ exp.(-y .* (A*x))));
# choose problem parameters
reg_name = "l1";
λ = 0.4;
μ = 1.0;
hμ = PHuberSmootherL1L2(μ);
# set problem
model = Problem(A, y, x0, f, λ);
# Choose method and run the solver (see ProxGGNSCORE below)
method = ProxNSCORE();
solution = iterate!(method, model, reg_name, hμ; max_epoch=100, x_tol=1e-6, f_tol=1e-6);
# get the solution x
solution.x
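As a quick check (not part of the package API), the composite objective value at the returned point can be evaluated directly from the quantities defined above; `norm` comes from the standard library LinearAlgebra:

```julia
using LinearAlgebra  # provides norm

# f(x) + λ‖x‖₁ evaluated at the solver's output
obj_at_solution = f(A, y, solution.x) + λ * norm(solution.x, 1)
```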
To use the ProxGGNSCORE algorithm, a model output function is required (note that f can be defined in two ways when using ProxGGNSCORE; Julia will decide which one to use at any instance -- thanks to the multiple dispatch feature):
# f as defined above
f(A, y, x) = 1/m*sum(log.(1 .+ exp.(-y .* (A*x))));
# f as a function of y and yhat
f(y, yhat) = -1/m*sum(y .* log.(yhat) .+ (1 .- y) .* log.(1 .- yhat))
# where yhat = Mfunc(A, x) is defined by the model output function
Mfunc(A, x) = vcat([1 ./ (1 .+ exp.(-A*x))]...)
# set problem
model = Problem(A, y, x0, f, λ; out_fn=Mfunc);
# Choose method and run the solver
method = ProxGGNSCORE();
solution = iterate!(method, model, reg_name, hμ; max_epoch=100, x_tol=1e-6, f_tol=1e-6);
# get the solution x
solution.x
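Both definitions above attach to the same function name f, so Julia's multiple dispatch picks the matching method at each call site; a purely illustrative way to see this:

```julia
# `f` now has two methods: f(A, y, x) and f(y, yhat)
length(methods(f))  # == 2
```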
By default, this package computes derivatives using ForwardDiff.jl. However, users can supply functions that compute the derivatives involved in the algorithms. This has at least two benefits:
- Faster computations
- Avoiding the need to define $\mathrm{f}$ twice when using ProxGGNSCORE
In the example above:
using LinearAlgebra  # for Diagonal used below
f(A, y, x) = 1/m*sum(log.(1 .+ exp.(-y .* (A*x))));
S(x) = exp.(-y .* (A*x));
# gradient of f wrt x:
function grad_fx(A, y, x)
Sx = S(x)
return -A' * (y .* (Sx ./ (1 .+ Sx))) / m
end;
# Hessian of f wrt x:
function hess_fx(A, y, x)
Sx = 1 ./ (1 .+ S(x))
W = Diagonal(Sx .* (1 .- Sx))
hess = A' * W * A
return hess / m
end;
# The following are used by ProxGGNSCORE
# Jacobian of yhat wrt x:
jac_yx(A, y, yhat, x) = vec(yhat .* (1 .- yhat)) .* A;
# gradient of \ell wrt yhat:
grad_fy(A, y, yhat) = (-y ./ yhat .+ (1 .- y) ./ (1 .- yhat))/m;
# Hessian of \ell wrt yhat:
hess_fy(A, y, yhat) = Diagonal((y ./ yhat.^2 + (1 .- y) ./ (1 .- yhat).^2)/m);
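For reference, the hand-coded derivatives above implement the following expressions, written here as a sketch with $s = \exp(-y \odot Ax)$, $\hat{y} = \sigma(Ax)$ ($\sigma$ the logistic sigmoid applied elementwise), and $\odot$ denoting elementwise multiplication; these symbols are introduced only for this illustration:

$$\nabla_x f(x) = -\frac{1}{m} A^\top\!\left(y \odot \frac{s}{1+s}\right), \qquad \nabla_x^2 f(x) = \frac{1}{m} A^\top W A, \quad W = \operatorname{diag}\!\big(\sigma(y \odot Ax) \odot (1-\sigma(y \odot Ax))\big),$$

and, for the quantities used by ProxGGNSCORE,

$$\frac{\partial \hat{y}}{\partial x} = \operatorname{diag}\!\big(\hat{y} \odot (1-\hat{y})\big)\, A, \qquad \nabla_{\hat{y}} \ell = \frac{1}{m}\!\left(-\frac{y}{\hat{y}} + \frac{1-y}{1-\hat{y}}\right), \qquad \nabla_{\hat{y}}^2 \ell = \frac{1}{m} \operatorname{diag}\!\left(\frac{y}{\hat{y}^2} + \frac{1-y}{(1-\hat{y})^2}\right),$$

where $\ell(y, \hat{y})$ is the cross-entropy loss defined earlier and divisions are elementwise.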
# Now (for ProxNSCORE):
model_n = Problem(A, y, x0, f, λ; grad_fx=grad_fx, hess_fx=hess_fx);
method_n = ProxNSCORE();
sol_n = iterate!(method_n, model_n, reg_name, hμ; max_epoch=100, x_tol=1e-6, f_tol=1e-6);
sol_n.x
# And for ProxGGNSCORE (does not require hess_fx):
model_ggn = Problem(A, y, x0, f, λ; out_fn=Mfunc, grad_fx=grad_fx, jac_yx=jac_yx, grad_fy=grad_fy, hess_fy=hess_fy);
method_ggn = ProxGGNSCORE();
sol_ggn = iterate!(method_ggn, model_ggn, reg_name, hμ; max_epoch=100, x_tol=1e-6, f_tol=1e-6);
sol_ggn.x
(For a sparse group lasso example, see the example from the paper in /examples/paper.)
Below is a summary of the implemented algorithms.

Algorithm | Supported |
---|---|
ProxNSCORE | |
ProxGGNSCORE | |
ProxQNSCORE | |
Arg | Description & usage |
---|---|
`ss_type` | `1` with a given `L` in `Problem` sets the step size using `L`; `1` without setting `L` in `Problem` sets the step size via `iterate!` (or a default value); `2` uses the "inverse" of the Barzilai-Borwein method to set the step size; `3` uses a line-search method to choose the step size. Default: `1`. E.g. `method = ProxGGNSCORE(ss_type=1)` |
`use_prox` | `true` uses the proximal scheme as described in the paper; `false` skips the proximal step and takes only the associated Newton-type/gradient-based step. Default: `true`. E.g. `method = ProxGGNSCORE(use_prox=true)` |
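For example, combining the two keyword arguments from the table above (the specific values here are arbitrary):

```julia
# Barzilai-Borwein-type step-size rule, with the proximal step enabled
method = ProxGGNSCORE(ss_type=2, use_prox=true)
```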
As the package name and description imply, the implemented algorithms use a generalized self-concordant smooth approximation of the (nonsmooth) regularization function $g$. The regularizer is selected via reg_name, which takes a string value in the iterate! function. We summarize below the currently implemented regularization functions, as well as the corresponding smoothing functions.
reg_name value | Implemented smoothing function(s) | Remark(s) |
---|---|---|
`"l1"` | `PHuberSmootherL1L2(μ)`, `OsBaSmootherL1L2(μ)` | |
`"l2"` | `PHuberSmootherL1L2(μ)`, `OsBaSmootherL1L2(μ)` | |
`"gl"` | `PHuberSmootherGL(μ, model)` | For the sparse group lasso regularizer |
`"indbox"` | `ExponentialSmootherIndBox(lb,ub,μ)`, `PHuberSmootherIndBox(lb,ub,μ)` | `lb`: lower bound in the box constraints; `ub`: upper bound in the box constraints |
Note: `PHuberSmootherL1L2`, `PHuberSmootherGL` and `ExponentialSmootherIndBox` are recommended.
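For instance, reusing the logistic regression setup from the examples above, an $\ell_2$-regularized variant can be solved by switching reg_name (a minimal sketch; the same Problem and smoother constructors apply):

```julia
hμ = PHuberSmootherL1L2(μ)  # works for both "l1" and "l2"
method = ProxGGNSCORE()
sol_l2 = iterate!(method, model, "l2", hμ; max_epoch=100, x_tol=1e-6, f_tol=1e-6)
```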
For more details and insights on the approach implemented in this package, please see the associated paper referenced under Citing below.
If you use SelfConcordantSmoothOptimization.jl
in your work, particularly the algorithms listed above, we kindly request that you cite the following paper:
@article{adeoye2023self,
title={Self-concordant Smoothing for Large-Scale Convex Composite Optimization},
author={Adeoye, Adeyemi D and Bemporad, Alberto},
journal={arXiv preprint arXiv:2309.01781},
year={2024}
}
Please use the GitHub issue tracker to report any issues. All types of issues are welcome, including bug reports, feature requests, implementations for a specific research problem, etc.