PosDefManifoldML.jl

Documentation

PosDefManifoldML is a Julia package for classifying data in the Riemannian manifold P of real or complex positive definite matrices. It is based on the PosDefManifold.jl, GLMNet.jl and LIBSVM.jl packages.

Machine learning (ML) in P can either operate directly on the manifold, which requires dedicated Riemannian methods, or project the data onto the tangent space, where standard (Euclidean) machine learning methods apply (e.g., linear discriminant analysis, support-vector machines, logistic regression, random forests, deep neural networks, etc.).
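To make the tangent-space route more concrete, here is a minimal sketch of one common way to project an SPD matrix onto the tangent space at a base point and turn it into an ordinary feature vector. It uses only Julia's standard LinearAlgebra library. The helper name `tangent_vector` is hypothetical and is not part of PosDefManifoldML; the package's own projection may differ in details such as the choice of base point.

```julia
using LinearAlgebra

# Hypothetical helper (not part of PosDefManifoldML):
# map an SPD matrix P to the tangent space at base point G and vectorize it.
function tangent_vector(P::AbstractMatrix, G::AbstractMatrix)
    W = inv(sqrt(Symmetric(G)))        # whitening matrix G^(-1/2)
    S = log(Symmetric(W * P * W))      # logarithmic map; S is symmetric
    n = size(S, 1)
    # Stack the upper triangle. Off-diagonal entries are scaled by sqrt(2)
    # so that the Euclidean norm of the vector equals the Frobenius norm of S.
    return [i == j ? S[i, j] : sqrt(2) * S[i, j] for j in 1:n for i in 1:j]
end

P = [2.0 0.5; 0.5 1.0]                 # a small SPD matrix
G = Matrix(1.0I, 2, 2)                 # base point: the identity (an assumption)
v = tangent_vector(P, G)               # feature vector of length n(n+1)/2
```

The resulting vector can be passed to any Euclidean classifier. In practice the base point is usually a mean of the training matrices rather than the identity used here.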

At present, PosDefManifoldML implements the Riemannian Minimum Distance to Mean (MDM) classifier, which operates directly in P, as well as elastic net logistic regression (including the pure Ridge and pure Lasso logistic regression models) and several support-vector machine classifiers, which operate in the tangent space. The tangent-space models can also be used with traditional (Euclidean) feature vectors, so the package also serves as an interface to the binomial family of generalized linear models implemented in GLMNet.jl and to all SVM models implemented in LIBSVM.jl.
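The idea behind an MDM classifier can be sketched in a few lines: compute one mean matrix per class, then assign each new matrix to the class whose mean is nearest. The sketch below is an illustration only and assumes the log-Euclidean metric for brevity, which is not necessarily the metric the package uses. The names `logeuclid_mean`, `logeuclid_dist`, `mdm_fit` and `mdm_predict` are hypothetical and are not part of PosDefManifoldML.

```julia
using LinearAlgebra

# Log-Euclidean mean of SPD matrices: matrix exponential of the average matrix log.
logeuclid_mean(Ps) =
    exp(Symmetric(sum(Matrix(log(Symmetric(P))) for P in Ps) / length(Ps)))

# Log-Euclidean distance between two SPD matrices.
logeuclid_dist(P, Q) = norm(log(Symmetric(P)) - log(Symmetric(Q)))

# "Training": compute one mean per class label.
function mdm_fit(Ps, y)
    labels = sort(unique(y))
    means = [logeuclid_mean(Ps[y .== c]) for c in labels]
    return labels, means
end

# "Prediction": assign each matrix the label of the nearest class mean.
function mdm_predict(labels, means, Ps)
    return [labels[argmin([logeuclid_dist(P, M) for M in means])] for P in Ps]
end

# Toy data: class 1 lies near the identity, class 2 near 5 times the identity.
PTr = [Matrix(1.0I, 2, 2), [1.2 0.1; 0.1 0.9], Matrix(5.0I, 2, 2), [5.5 0.2; 0.2 4.8]]
yTr = [1, 1, 2, 2]
labels, means = mdm_fit(PTr, yTr)
yPred = mdm_predict(labels, means, [[1.1 0.0; 0.0 1.0], [4.9 0.1; 0.1 5.2]])
```

On this toy data the first test matrix falls in class 1 and the second in class 2. The package's `fit` and `predict` functions shown in the examples below play the roles of `mdm_fit` and `mdm_predict` here.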

Installation

Run this line in Julia's REPL:

]add PosDefManifold PosDefManifoldML

Examples

using PosDefManifold, PosDefManifoldML

# simulate symmetric positive definite (SPD) matrices data for a 2-class problem.
# P is a vector of SPD matrices, y a vector of labels. Tr=training, Te=testing.
# SPD matrices will be all of size 10x10.
# The training set will have 30 matrices for class 1 and 40 for class 2.
# The testing set will have 60 matrices for class 1 and 80 for class 2.
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)

# # # MACHINE LEARNING IN THE PD MANIFOLD # # #

# (1)
# create and fit (train) a Riemannian Minimum Distance to Mean (MDM) model:
model=fit(MDM(), PTr, yTr)
#
# predict labels (classify the testing set):
yPred=predict(model, PTe, :l)
#
# prediction error (in proportion)
predictErr(yTe, yPred)
#
# predict probabilities for the matrices in `PTe` of belonging to each class:
predict(model, PTe, :p)

# (2)
# average accuracy obtained by 10-fold cross-validation:
cv = cvAcc(MDM(), PTr, yTr)

# # # MACHINE LEARNING IN THE TANGENT SPACE # # #

# (1)
# create and fit (train) LASSO Logistic Regression models
# finding the best model by cross-validation:
model=fit(ENLR(), PTr, yTr)
#
# predict labels (classify the testing set) using the 'best' model:
yPred=predict(model, PTe, :l)
#
# prediction error (in proportion)
predictErr(yTe, yPred)
#
# ...
#
# create and fit a RIDGE logistic regression model
model=fit(ENLR(), PTr, yTr; alpha=0)
#
#...
#
# create and fit an ELASTIC NET logistic regression model with alpha = 0.5
model=fit(ENLR(), PTr, yTr; alpha=0.5)

# (2)
# average accuracy obtained by 10-fold cross-validation:
cv = cvAcc(ENLR(), PTr, yTr; alpha=0.5)

# (1)
# create and fit (train) an SVM with Radial Basis kernel
# finding the best model by cross-validation:
model=fit(SVM(), PTr, yTr)
#
# predict labels (classify the testing set) using the 'best' model:
yPred=predict(model, PTe, :l)
#
# prediction error (in proportion)
predictErr(yTe, yPred)
#
# ...

# (2)
# average accuracy obtained by 10-fold cross-validation:
cv = cvAcc(SVM(), PTr, yTr)

About the Authors

Marco Congedo, corresponding author, is a Research Director of CNRS (Centre National de la Recherche Scientifique), working at UGA (University of Grenoble Alpes). Contact: marco dot congedo at gmail dot com

Anton Andreev is a Research Engineer working at the same institution.

Saloni Jain was, at the time of writing this package, a student at the Indian Institute of Technology, Kharagpur, India.
