SimSearchManifoldLearning.jl

Non-linear dimensional reduction using SimilaritySearch (ManifoldLearning and UMAP)
Author sadit
Popularity
9 Stars
Updated Last
10 Months Ago
Started In
March 2022

Dev Build Status

SimilaritySearch and ManifoldLearning (and UMAP)

This package provides some support to use SimilaritySearch with manifold learning methods. In particular, we implement the required methods to implement knn function for ManifoldLearning and also provides an UMAP implementation that takes advantage of many SimilaritySearch features like multithreading and data independency; it supports string, sets, vectors, etc. under diverse distance functions.

The ManifoldLearning support is limited to some structure specification due to the design of the package. See the ManifoldKnnIndex type in the documentation pages.

UMAP implementation

This package also provides a pure Julia implementation of the Uniform Manifold Approximation and Projection dimension reduction algorithm

McInnes, L, Healy, J, Melville, J, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. ArXiV 1802.03426, 2018

The implementation in this package is based on the UMAP.jl package by Dillon Gene Daudert and collaborators. Forked and adapted to work with SimilaritySearch and take advantage of multithreading systems.

The provided implementation partially supports the ManifoldLearning API using fit and predict and similar arguments. It can use any distance function from SimilaritySearch, Distances.jl, StringDistances.jl, or any distance function implemented by the user.

Additionally, it improves multithreading support in the KNN and the UMAP projection.

Examples and demonstrations

https://sadit.github.io/SimilaritySearchDemos/

NOTE: Currently, these examples are working with direct implementations of this package. This will be changed soon (I will start once the package becomes part of the general registry).

Disclaimer

This implementation is a work-in-progress. If you encounter any issues, please create an issue or make a pull request.