QuasiMonteCarlo.jl

Lightweight and easy generation of quasi-Monte Carlo sequences with a ton of different methods on one API for easy parameter exploration in scientific machine learning (SciML)
Popularity
101 Stars
Updated Last
2 Months Ago
Started In
December 2019

QuasiMonteCarlo.jl

Join the chat at https://julialang.zulipchat.com #sciml-bridged Global Docs

codecov Build Status

ColPrac: Contributor's Guide on Collaborative Practices for Community Packages SciML Code Style

This is a lightweight package for generating Quasi-Monte Carlo (QMC) samples using various different methods.

Tutorials and Documentation

For information on using the package, see the stable documentation. Use the in-development documentation for the version of the documentation, which contains the unreleased features.

Example

using QuasiMonteCarlo, Distributions
lb = [0.1, -0.5]
ub = [1.0, 20.0]
n = 5
d = 2

s = QuasiMonteCarlo.sample(n, lb, ub, GridSample())
s = QuasiMonteCarlo.sample(n, lb, ub, Uniform())
s = QuasiMonteCarlo.sample(n, lb, ub, SobolSample())
s = QuasiMonteCarlo.sample(n, lb, ub, LatinHypercubeSample())
s = QuasiMonteCarlo.sample(n, lb, ub, LatticeRuleSample())
s = QuasiMonteCarlo.sample(n, lb, ub, HaltonSample())

The output s is a matrix, so one can use things like @uview from UnsafeArrays.jl for a stack-allocated view of the ith point:

using UnsafeArrays
@uview s[:, i]

API

Everything has the same interface:

A = QuasiMonteCarlo.sample(n, lb, ub, sample_method, output_type = Float64)

or to generate points directly in the unit box $[0,1]^d$

A = QuasiMonteCarlo.sample(n, d, sample_method, output_type = Float64) # = QuasiMonteCarlo.sample(n,zeros(d),ones(d),sample_method)

where:

  • n is the number of points to sample.
  • lb is the lower bound for each variable. The length determines the dimensionality.
  • ub is the upper bound.
  • d is the dimension of the unit box.
  • sample_method is the quasi-Monte Carlo sampling strategy.
  • output_type controls the output type, Float64, Float32, Rational (for exact digital net representation), etc. This feature does not yet work with every QMC sequence.

Additionally, there is a helper function for generating design matrices:

k = 2
As = QuasiMonteCarlo.generate_design_matrices(n,
    lb,
    ub,
    sample_method,
    k,
    output_type = Float64)

which returns As which is an array of k design matrices A[i] that are all sampled from the same low-discrepancy sequence.

Available Sampling Methods

Sampling methods SamplingAlgorithm are divided into two subtypes

  • DeterministicSamplingAlgorithm

    • GridSample for samples on a regular grid.
    • SobolSample for the Sobol sequence.
    • FaureSample for the Faure sequence.
    • LatticeRuleSample for a randomly-shifted rank-1 lattice rule.
    • HaltonSample for the Halton sequence.
    • GoldenSample for a Golden Ratio sequence.
    • KroneckerSample(alpha, s0) for a Kronecker sequence, where alpha is a length-d vector of irrational numbers (often sqrt(d)) and s0 is a length-d seed vector (often 0).
  • RandomSamplingAlgorithm

    • UniformSample for uniformly distributed random numbers.
    • LatinHypercubeSample for a Latin Hypercube.
    • Additionally, any Distribution can be used, and it will be sampled from.

Adding a new sampling method

Adding a new sampling method is a two-step process:

  1. Add a new SamplingAlgorithm type.
  2. Overload the sample function with the new type.

All sampling methods are expected to return a matrix with dimension d by n, where d is the dimension of the sample space and n is the number of samples.

Example

struct NewAmazingSamplingAlgorithm{OPTIONAL} <: SamplingAlgorithm end

function sample(n, lb, ub, ::NewAmazingSamplingAlgorithm)
    if lb isa Number
        ...
        return x
    else
        ...
        return reduce(hcat, x)
    end
end

Randomization of QMC sequences

Most of the previous methods are deterministic, i.e. sample(n, d, Sampler()::DeterministicSamplingAlgorithm) always produces the same sequence $X = (X_1, \dots, X_n)$. There are two ways to obtain a randomized sequence:

  • Either directly use QuasiMonteCarlo.sample(n, d, DeterministicSamplingAlgorithm(R = RandomizationMethod())) or sample(n, lb, up, DeterministicSamplingAlgorithm(R = RandomizationMethod())).
  • Or, given $n$ points $d$-dimensional points, all in $[0,1]^d$ one can do randomize(X, ::RandomizationMethod()) where $X$ is a $d\times n$-matrix.

The currently available randomization methods are:

  • Scrambling methods ScramblingMethods(b, pad, rng) where b is the base used to scramble and pad the number of bits in b-ary decomposition. pad is generally chosen as $\gtrsim \log_b(n)$. The implemented ScramblingMethods are

    • DigitalShift
    • MatousekScramble a.k.a. Linear Matrix Scramble.
    • OwenScramble a.k.a. Nested Uniform Scramble is the most understood theoretically, but is more costly to operate.
  • Shift(rng) a.k.a. Cranley Patterson Rotation.

For numerous independent randomization, use generate_design_matrices(n, d, ::DeterministicSamplingAlgorithm), ::RandomizationMethod, num_mats) where num_mats is the number of independent X generated.

Randomization Example

Randomization of a Faure sequence with various methods.

using QuasiMonteCarlo
m = 4
d = 3
b = QuasiMonteCarlo.nextprime(d)
N = b^m # Number of points
pad = m # Length of the b-ary decomposition number = sum(y[k]/b^k for k in 1:pad)

# Unrandomized (deterministic) low discrepancy sequence
x_faure = QuasiMonteCarlo.sample(N, d, FaureSample())

# Randomized version
x_nus = randomize(x_faure, OwenScramble(base = b, pad = pad)) # equivalent to sample(N, d, FaureSample(R = OwenScramble(base = b, pad = pad)))
x_lms = randomize(x_faure, MatousekScramble(base = b, pad = pad))
x_digital_shift = randomize(x_faure, DigitalShift(base = b, pad = pad))
x_shift = randomize(x_faure, Shift())
x_uniform = rand(d, N) # plain i.i.d. uniform
using Plots
# Setting I like for plotting
default(fontfamily = "Computer Modern",
    linewidth = 1,
    label = nothing,
    grid = true,
    framestyle = :default)

Plot the resulting sequences along dimensions 1 and 3.

d1 = 1
d2 = 3 # you can try every combination of dimensions (d1, d2)
sequences = [x_uniform, x_faure, x_shift, x_digital_shift, x_lms, x_nus]
names = [
    "Uniform",
    "Faure (deterministic)",
    "Shift",
    "Digital Shift",
    "Matousek Scramble",
    "Owen Scramble"
]
p = [plot(thickness_scaling = 1.5, aspect_ratio = :equal) for i in sequences]
for (i, x) in enumerate(sequences)
    scatter!(p[i], x[d1, :], x[d2, :], ms = 0.9, c = 1, grid = false)
    title!(names[i])
    xlims!(p[i], (0, 1))
    ylims!(p[i], (0, 1))
    yticks!(p[i], [0, 1])
    xticks!(p[i], [0, 1])
    hline!(p[i], range(0, 1, step = 1 / 4), c = :gray, alpha = 0.2)
    vline!(p[i], range(0, 1, step = 1 / 4), c = :gray, alpha = 0.2)
    hline!(p[i], range(0, 1, step = 1 / 2), c = :gray, alpha = 0.8)
    vline!(p[i], range(0, 1, step = 1 / 2), c = :gray, alpha = 0.8)
end
plot(p..., size = (1500, 900))

Different randomization methods of the same initial set of points