KDEstimation.jl

Provides a general framework for implementing and performing Kernel Density Estimation
Author: m-wells
Started in October 2019

KDEstimation (Kernel Density Estimation)


This package provides a general framework for implementing Kernel Density Estimation methods.

Univariate KDE

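The density estimator

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$

where

  • $\hat{f}(x)$ is the estimator
  • $K$ is the kernel function
  • $h$ is the bandwidth

can be evaluated using one of three implemented methods.

  • Direct()
    • $O(n^2)$ where $n$ is the sample size
  • Binned()
    • $O(M^2)$ where $M$ is the number of evaluation points
    • a default $M$ is used if none is given
  • FFT()
    • $O(M \log M)$ where $M$ is the number of evaluation points
    • $M = 4096$ by default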
Multivariate KDE (work in progress)

Kernels implemented

See the Wikipedia article "Kernel (statistics)" for background on these kernel functions.

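| Kernel            | Support   |
|-------------------|-----------|
| Biweight          | bounded   |
| Cosine            | bounded   |
| Epanechnikov      | bounded   |
| Logistic          | unbounded |
| Normal            | unbounded |
| SymTriangularDist | bounded   |
| Triweight         | bounded   |
| Uniform           | bounded   |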

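This package uses Distributions.jl to supply kernels such that

$$K_h(x - x_i) = \mathrm{pdf}(D(x_i, h), x)$$

where

$$K_h(u) = \frac{1}{h} K\left(\frac{u}{h}\right)$$

and $D$ is one of the kernels listed in the table above.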
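The idea of using a location-scale distribution as a kernel can be checked directly with Distributions.jl. The sketch below assumes the kernel at bandwidth `h` is the density of a distribution centered at the sample point `xi` with scale `h`. The values of `xi`, `h`, and `x` are arbitrary, and nothing here calls this package.

```julia
using Distributions

xi, h, x = 0.5, 0.2, 0.6

# Kernel supplied as a location-scale distribution centered at xi with scale h.
via_distribution = pdf(Normal(xi, h), x)

# The same value from the standard kernel, rescaled by the bandwidth.
via_rescaling = pdf(Normal(0.0, 1.0), (x - xi) / h) / h

# Epanechnikov(μ, σ) in Distributions.jl follows the same pattern.
epan_distribution = pdf(Epanechnikov(xi, h), x)
epan_rescaling = pdf(Epanechnikov(0.0, 1.0), (x - xi) / h) / h
```

In both cases the two expressions agree up to floating-point rounding.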

Note about the Uniform distribution

Distributions.jl defines (loc, scale) = (a, b - a), where a and b are the lower and upper bounds, respectively. This package accounts for this inconsistency by evaluating the Uniform kernel as

.
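The mismatch in parameterization can be seen directly in Distributions.jl. This short sketch only inspects the distribution objects and makes no claim about how this package handles the Uniform kernel internally. The bounds are arbitrary example values.

```julia
using Distributions

a, b = 1.0, 3.0
d = Uniform(a, b)

# Distributions.jl reports the lower bound as the location and the width as the scale.
loc, scl = location(d), scale(d)

# The center of the distribution is its mean, not its location.
center = mean(d)

# By contrast, Normal(μ, σ) is centered on its first parameter.
normal_center = mean(Normal(a, b))
```

Here `loc` is the lower bound `a`, while `center` is the midpoint `(a + b) / 2`, so a Uniform kernel cannot be centered on a sample point by passing that point as the first argument.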

Bandwidth selection via Least Squares Cross Validation

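The objective function to minimize is given by

$$\mathrm{LSCV}(h) = \int \hat{f}(x)^2 \, dx - \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{-i}(x_i)$$

where

$$\hat{f}_{-i}(x) = \frac{1}{(n-1)h} \sum_{j \neq i} K\left(\frac{x - x_j}{h}\right)$$

is the leave-one-out estimator.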

This has also been implemented using Direct, Binned, and FFT methods.
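To make the least squares cross validation objective concrete, here is a minimal sketch of a direct O(n²) evaluation for the normal kernel in plain Julia. It is not this package's implementation: `normal_pdf` and `lscv_direct` are hypothetical names, the data and candidate grid are made up, and a coarse grid search stands in for a proper optimizer. For the normal kernel the integral of the squared estimate has a closed form, because convolving two normal kernels of scale h gives a normal density of scale √2·h.

```julia
# Normal density with mean 0 and standard deviation s.
normal_pdf(u, s) = exp(-(u / s)^2 / 2) / (s * sqrt(2π))

# Direct O(n²) evaluation of the LSCV objective for the normal kernel.
function lscv_direct(h::Real, data::AbstractVector{<:Real})
    n = length(data)
    integral_term = 0.0   # n² · ∫ f̂(x)² dx, accumulated pair by pair
    loo_term = 0.0        # Σᵢ Σ_{j≠i} K_h(xᵢ - xⱼ)
    for i in 1:n, j in 1:n
        d = data[i] - data[j]
        # Convolution of two normal kernels of scale h: normal of scale √2·h.
        integral_term += normal_pdf(d, sqrt(2) * h)
        i != j && (loo_term += normal_pdf(d, h))
    end
    return integral_term / n^2 - 2 * loo_term / (n * (n - 1))
end

# Coarse grid search over candidate bandwidths (hypothetical data and grid).
data = [-1.3, -0.4, -0.1, 0.2, 0.5, 0.9, 1.6]
candidates = range(0.05, 2.0; length=200)
scores = [lscv_direct(h, data) for h in candidates]
h_best = candidates[argmin(scores)]
```

The double loop is what makes the direct method quadratic in the sample size.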

Example usage

using KDEstimation, Distributions
# set a seed for reproducibility
using Random: seed!
seed!(1234)
# generate random data
x = randn(1000)
rot = rule_of_thumb2(Normal,x)
println("rule of thumb: ", rot)
lscv_res = lscv(Normal,x,FFT())

rule of thumb: 0.2676817928332638

LSCV{Normal,FFT(4096),1}
Results of Optimization Algorithm
 * Algorithm: Golden Section Search
 * Search Interval: [0.128205, 0.195830]
 * Minimizer: 1.616402e-01
 * Minimum: -2.789090e-01
 * Iterations: 34
 * Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
 * Objective Function Calls: 35

Visualization using Plots.jl

using Plots; pyplot()
plot(lscv_res)


h = minimizer(lscv_res)
fkde = kde(Normal, h, x, FFT())
frot = kde(Normal, rot, x, FFT())
# these can be called like functions
@show fkde(0.3)
@show frot(-2)

fkde(0.3) = 0.37927382397190534
frot(-2) = 0.05601509471009895

plot(fkde, label="LSCV", lw=2)
plot!(frot, label="Rule of thumb", lw=2)


Further Reading

This work has been heavily influenced by Artur Gramacki's book "Nonparametric Kernel Density Estimation and Its Computational Aspects".
