Welcome to the LearningAlgebraicVarieties package from the article Learning Algebraic Varieties from Samples by P. Breiding, S. Kalisnik, B. Sturmfels and M. Weinstein.
Special thanks to V. Valve, who helped update the PHCurve diagrams.
To install the package, open a new Julia session and type
]add LearningAlgebraicVarieties
After the installation is completed the command
using LearningAlgebraicVarieties
loads all the functions into the current session.
All functions accept m data points in ℝ^n or ℙ^(n-1) as an n×m matrix Ω whose columns are the data points; i.e., as an Array{T,2} with T <: Number.
We provide some datasets in the JLD2 data format. Download the file datasets.jld2 from this repository, navigate to its folder and use
using JLD2 # add this package with ]add JLD2
@load "datasets.jld2"
Your session should now contain a dictionary named data that holds some datasets.
Here is an example:
DimensionDiagrams(Ω, true)
plots all dimension diagrams for the data in projective space. On the other hand,
DimensionDiagrams(Ω, false, diagrams = [:CorrSum, :BoxCounting], eps_ticks = 10)
plots the dimension diagrams CorrSum and BoxCounting for data in Euclidean space. The estimates are computed for 10 values of ϵ between 0 and 1.
The complete syntax of the DimensionDiagrams function is as follows.
DimensionDiagrams(
Ω::Array{T,2},
projective::Bool;
diagrams = [:CorrSum, :BoxCounting, :NPCA, :MLE, :ANOVA, :PHCurve],
eps_ticks = 25,
fontsize = 16,
lw = 4
) where {T <: Number}
Here:
Ω is a matrix whose columns are the data points.
projective = false: makes diagrams in Euclidean space.
projective = true: makes diagrams in projective space.
There are some optional arguments.
diagrams: lists the dimension estimators to be plotted.
eps_ticks = k: puts k evenly spaced values of ϵ into [0,1]. The dimensions are computed at those values.
fontsize: sets the font size of the axes.
lw: sets the line width.
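To give a feel for what a correlation-sum diagram measures, here is a minimal sketch in plain Julia. It uses the textbook idea (the fraction of point pairs closer than ϵ, and the slope of its logarithm against log ϵ) and is not the package's implementation; the estimator behind :CorrSum may be defined differently. The helper names correlation_sum and corrsum_slope, the sample, and the two scales are assumptions made for illustration.

```julia
using LinearAlgebra

# Correlation sum: fraction of pairs of columns of Ω at Euclidean distance below ϵ.
function correlation_sum(Ω::AbstractMatrix, ϵ::Real)
    m = size(Ω, 2)
    close_pairs = count(norm(Ω[:, i] - Ω[:, j]) < ϵ for i in 1:m for j in (i + 1):m)
    return close_pairs / binomial(m, 2)
end

# Slope of log C(ϵ) against log ϵ between two scales ϵ1 < ϵ2.
function corrsum_slope(Ω, ϵ1, ϵ2)
    return (log(correlation_sum(Ω, ϵ2)) - log(correlation_sum(Ω, ϵ1))) /
           (log(ϵ2) - log(ϵ1))
end

# Hypothetical sample: 200 equally spaced points on the unit circle, one per column.
θ = range(0, 2π, length = 201)[1:200]
Ω = permutedims(hcat(cos.(θ), sin.(θ)))

# For a curve the slope should be close to 1.
slope = corrsum_slope(Ω, 0.1, 0.4)
```

The slope depends on the chosen scales, which is why a diagram over many values of ϵ is more informative than a single number.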
Here is an example.
MultivariateVandermondeMatrix(Ω, 2, true)
computes the multivariate Vandermonde matrix for the sample Ω and all monomials of degree 2. The value true in the last argument restricts the monomials to those of degree exactly 2, as needed for homogeneous equations. On the other hand, the Vandermonde matrix with all monomials of degree at most 2 is computed by
MultivariateVandermondeMatrix(Ω, 2, false)
It is also possible to define the exponents involved. For example,
exponents = [[1,0,0], [1,1,1]]
MultivariateVandermondeMatrix(Ω, exponents)
computes the multivariate Vandermonde matrix for Ω ⊂ ℝ^3 and the monomials x_1 and x_1 x_2 x_3.
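The role of the exponent vectors can be sketched by hand: each exponent vector names one monomial, and each data point is evaluated at every monomial. The function vandermonde_sketch below is a hypothetical stand-in written for illustration, not the package's code, and the row and column layout of the package's own matrix may differ.

```julia
# Hand-rolled multivariate Vandermonde matrix: one row per data point
# (column of Ω) and one column per exponent vector.
function vandermonde_sketch(Ω::AbstractMatrix, exponents::AbstractVector)
    m = size(Ω, 2)
    return [prod(Ω[:, i] .^ e) for i in 1:m, e in exponents]
end

# Two hypothetical points in ℝ^3, stored as columns: (1, 2, 3) and (2, 0.5, 4).
Ω = [1.0 2.0;
     2.0 0.5;
     3.0 4.0]

exponents = [[1, 0, 0], [1, 1, 1]]   # monomials x_1 and x_1 x_2 x_3
U = vandermonde_sketch(Ω, exponents)
# U == [1.0 6.0; 2.0 4.0]
```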
Here is the full syntax.
MultivariateVandermondeMatrix(Ω::Array{T},
d::Int64,
homogeneous_equations::Bool)
MultivariateVandermondeMatrix(data::Array{T},
exponents::Vector)
where
Ω is a matrix whose columns are the data points.
d is the degree of the monomials.
homogeneous_equations = true restricts the space of monomials to monomials of degree d.
homogeneous_equations = false computes all monomials of degree at most d.
exponents is an array of exponent vectors.
Here is an example.
FindEquations(Ω, :with_svd, 2, true)
finds homogeneous equations of degree 2 using SVD to compute the kernel of the Vandermonde matrix, while
FindEquations(Ω, :with_qr, 3, false)
finds all polynomials of degree at most 3 and uses QR to compute the kernel of the Vandermonde matrix.
To find all equations with support x_1 x_2 and x_1^2 using the reduced row echelon form to compute the kernel, type
exponents = [[1,1], [2,0]]
FindEquations(Ω, :with_rref, exponents)
A multivariate Vandermonde matrix may be passed to FindEquations:
M = MultivariateVandermondeMatrix(Ω, 2, false)
FindEquations(M, :with_svd, τ)
where τ is a tolerance value.
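The idea of computing a kernel up to a tolerance can be sketched with Julia's standard LinearAlgebra library. The example is an illustration under stated assumptions, not the package's implementation: the sample, the hand-built matrix M, and the rule "singular value below τ" are chosen here for demonstration, and the package may apply τ differently.

```julia
using LinearAlgebra

# Hypothetical sample: 30 equally spaced points on the unit circle, one per column.
θ = range(0, 2π, length = 31)[1:30]
Ω = permutedims(hcat(cos.(θ), sin.(θ)))

# Monomials of degree at most 2 in (x, y): 1, x, y, x^2, xy, y^2.
exponents = [[0, 0], [1, 0], [0, 1], [2, 0], [1, 1], [0, 2]]
M = [prod(Ω[:, i] .^ e) for i in 1:size(Ω, 2), e in exponents]

# Right singular vectors whose singular value is below τ span the numerical kernel.
τ = 1e-8
F = svd(M)
kernel = F.V[:, F.S .< τ]

# Normalize so that the coefficient of x^2 equals 1.
coeffs = kernel[:, 1] ./ kernel[4, 1]
# coeffs ≈ [-1, 0, 0, 1, 0, 1], i.e. the equation x^2 + y^2 - 1 = 0
```

A larger τ admits more approximate equations, which helps with noisy samples but can also produce polynomials that do not vanish on the underlying variety.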
The full syntax of FindEquations is as follows.
FindEquations(Ω::Array{T,2},
alg::Symbol,
d::Int64,
homogeneous_equations::Bool)
where {T<:Number}
FindEquations(Ω::Array{T,2},
alg::Symbol,
exponents::Array{Array{Int64,1},1})
where {T<:Number}
FindEquations(M::MultivariateVandermondeMatrix,
alg::Symbol,
τ::Float64)
Here:
Ω is a matrix whose columns are the data points.
alg is the algorithm that should be used (one of :with_svd, :with_qr, :with_rref).
d is the degree of the equations.
homogeneous_equations = true restricts the search space to homogeneous polynomials.
homogeneous_equations = false computes all polynomials of degree at most d.
exponents is an array of exponent vectors.
τ is the tolerance value.
Computing distances is a key aspect in both dimension estimation and persistent homology. Here are the functions with which we compute distances.
To compute the scaled Euclidean distances between the data points in Ω, type
ScaledEuclidean(Ω)
On the other hand, to compute the scaled Fubini-Study distances between the data points in Ω, type
ScaledFubiniStudy(Ω)
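For orientation, the two kinds of distance can be sketched in plain Julia. This is an illustration, not the package's code. In particular, the text does not say what "scaled" means; the sketch assumes it means dividing by the largest pairwise distance, and the helper names are hypothetical.

```julia
using LinearAlgebra

# Pairwise Euclidean distances between the columns of Ω.
function euclidean_distances(Ω)
    m = size(Ω, 2)
    return [norm(Ω[:, i] - Ω[:, j]) for i in 1:m, j in 1:m]
end

# Pairwise Fubini-Study distances: the angle between the lines spanned by the columns.
function fubini_study_distances(Ω)
    m = size(Ω, 2)
    cosines = [abs(dot(Ω[:, i], Ω[:, j])) / (norm(Ω[:, i]) * norm(Ω[:, j]))
               for i in 1:m, j in 1:m]
    # Clamp to guard against rounding slightly above 1 before taking acos.
    return acos.(clamp.(cosines, 0.0, 1.0))
end

# Assumption: "scaled" means dividing by the largest pairwise distance.
scaled(D) = D ./ maximum(D)

# Three hypothetical points (1,0), (0,1), (1,1), stored as columns.
Ω = [1.0 0.0 1.0;
     0.0 1.0 1.0]
D_euclid = scaled(euclidean_distances(Ω))
D_fs = scaled(fubini_study_distances(Ω))
```

Because of floating-point rounding, the diagonal of the Fubini-Study matrix may be a tiny positive number rather than exactly zero.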
Finally, the ellipsoid-driven complex is encoded in a distance matrix. It is computed by typing
EllipsoidDistances(Ω, f, λ)
where f is a vector of polynomials of the type provided by DynamicPolynomials.jl and λ sets the ratio between the principal axes of the ellipsoids.