The purpose of this package is to provide a general framework for implementing Kernel Density Estimation methods.
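The density estimator

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$

where
- $\hat{f}$ is the estimator
- $K$ is the kernel function
- $h$ is the bandwidth
- $n$ is the number of observations $x_i$

can be evaluated using one of three implemented methods.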
- `Direct()`
- `Binned()`
- `FFT()`
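As an illustration of what direct evaluation of the estimator involves, here is a minimal sketch in plain Julia. It assumes a standard normal kernel, and `normal_kernel` and `direct_kde` are hypothetical helper names, not part of this package's API.

```julia
# Standard normal kernel K(u).
normal_kernel(u) = exp(-u^2 / 2) / sqrt(2π)

# Direct evaluation of f̂(x) = (1 / (n h)) * Σ K((x - xᵢ) / h).
# `direct_kde` is a hypothetical helper, not the package's implementation.
function direct_kde(x::Real, data::AbstractVector{<:Real}, h::Real; K = normal_kernel)
    h > 0 || throw(ArgumentError("bandwidth h must be positive"))
    n = length(data)
    return sum(K((x - xi) / h) for xi in data) / (n * h)
end

# Evaluate the estimate on a small grid of points.
data = [-1.2, -0.4, 0.1, 0.3, 1.5]
grid = range(-4, 4; length = 9)
density = [direct_kde(x, data, 0.5) for x in grid]
```

Each evaluation point touches every observation, so the cost of this sketch grows with the product of the sample size and the number of evaluation points.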
Here is a link to the relevant Wikipedia article.
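Kernel | Support
---|---
Biweight | $\lvert u \rvert \le 1$
Cosine | $\lvert u \rvert \le 1$
Epanechnikov | $\lvert u \rvert \le 1$
Logistic | unbounded
Normal | unbounded
SymTriangularDist | $\lvert u \rvert \le 1$
Triweight | $\lvert u \rvert \le 1$
Uniform | bounded (see the note below)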
This package uses Distributions.jl to supply kernels such that

$$K_h(u) = \mathrm{pdf}(D(0, h), u)$$

where

$$K_h(u) = \frac{1}{h} K\left(\frac{u}{h}\right)$$

and $D$ is one of the kernels listed in the table above.
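The scaling behaviour of a location-scale kernel can be checked numerically with Distributions.jl. The sketch below assumes each kernel type accepts a `(location, scale)` pair in its constructor, which holds for the kernels in the table other than Uniform. `kernel_scaling_holds` is a hypothetical helper used only for this illustration.

```julia
using Distributions

# Check that pdf(D(0, h), u) equals (1/h) * pdf(D(0, 1), u / h) for a kernel type D.
# `kernel_scaling_holds` is a hypothetical helper, not part of any package.
function kernel_scaling_holds(D, h, u)
    scaled = pdf(D(0.0, h), u)              # kernel with scale h
    rescaled = pdf(D(0.0, 1.0), u / h) / h  # unit-scale kernel, rescaled by hand
    return isapprox(scaled, rescaled)
end

# Uniform is left out because it is parameterized by its bounds instead.
kernels = (Biweight, Cosine, Epanechnikov, Logistic, Normal, SymTriangularDist, Triweight)
for D in kernels
    println(D, ": ", kernel_scaling_holds(D, 0.7, 0.25))
end
```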
Note: for the Uniform distribution, Distributions.jl defines `(loc, scale) = (a, b - a)`, where `a` and `b` are the lower and upper bounds, respectively. This package accounts for this inconsistency when evaluating the Uniform kernel.
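The Uniform parameterization can be inspected directly in Distributions.jl. The short sketch below shows the `(loc, scale)` convention; the centered construction at the end is only an illustration of a zero-centered uniform density and is not necessarily what this package does internally.

```julia
using Distributions

a, b = -1.0, 1.0
d = Uniform(a, b)

# Distributions.jl reports the lower bound as the location and the width as the scale.
@assert location(d) == a
@assert scale(d) == b - a

# Illustration only (an assumption, not the package's internal choice):
# a uniform density centered at zero with half-width h is Uniform(-h, h).
h = 0.5
centered = Uniform(-h, h)
@assert pdf(centered, 0.0) ≈ 1 / (2h)
```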
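The least-squares cross-validation (LSCV) objective function to minimize over the bandwidth $h$ is given by

$$\mathrm{LSCV}(h) = \int \hat{f}_h(x)^2 \, dx - \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{-i,h}(x_i)$$

where

$$\hat{f}_{-i,h}(x) = \frac{1}{(n-1)h} \sum_{j \ne i} K\left(\frac{x - x_j}{h}\right)$$

is the estimator computed with the $i$-th observation left out.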
This has also been implemented using the `Direct`, `Binned`, and `FFT` methods.
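To make the cross-validation objective concrete, here is a minimal direct sketch in plain Julia. It assumes the standard least-squares cross-validation form (integrated squared estimate minus twice the mean leave-one-out estimate) and a normal kernel, for which the integral has a closed form. `normal_pdf` and `lscv_objective` are hypothetical names, and the grid search at the end stands in for a proper one-dimensional optimizer.

```julia
# Normal density with standard deviation s, evaluated at t.
normal_pdf(t, s) = exp(-(t / s)^2 / 2) / (s * sqrt(2π))

# Direct O(n²) evaluation of the LSCV objective for a normal kernel.
# `lscv_objective` is a hypothetical helper, not the package's implementation.
function lscv_objective(h::Real, data::AbstractVector{<:Real})
    n = length(data)
    n > 1 || throw(ArgumentError("need at least two observations"))
    integral = 0.0  # accumulates n² * ∫ f̂(x)² dx
    loo = 0.0       # accumulates (n - 1) * Σᵢ f̂₋ᵢ(xᵢ)
    for i in 1:n, j in 1:n
        t = data[i] - data[j]
        # Two normal kernels of scale h convolve to a normal density of scale √2·h.
        integral += normal_pdf(t, sqrt(2) * h)
        if i != j
            loo += normal_pdf(t, h)
        end
    end
    return integral / n^2 - 2 * loo / (n * (n - 1))
end

# Crude grid search over candidate bandwidths.
data = [-1.2, -0.4, 0.1, 0.3, 1.5, 0.8, -0.7]
hs = range(0.1, 1.5; length = 50)
h_best = hs[argmin([lscv_objective(h, data) for h in hs])]
```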
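```julia
using KDEstimation, Distributions
# set a seed for reproducibility
using StableRNGs
rng = StableRNG(1111)
# generate random data
x = randn(rng, 100)
# rule-of-thumb bandwidth
rot = rule_of_thumb2(Normal, x)
println("rule of thumb: ", rot)
# bandwidth selection by least-squares cross-validation, evaluated with the FFT method
lscv_res = lscv(Normal, x, FFT())
```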
```
rule of thumb: 0.3955940866915174
LSCV{Normal,FFT(4096),1}
Results of Optimization Algorithm
 * Algorithm: Golden Section Search
 * Search Interval: [0.289408, 0.389348]
 * Minimizer: 3.457826e-01
 * Minimum: -2.834224e-01
 * Iterations: 33
 * Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
 * Objective Function Calls: 34
```
Visualization using Plots.jl
```julia
using Plots; pyplot()
plot(lscv_res)
```
```julia
# bandwidth selected by LSCV
h = minimizer(lscv_res)
fkde = kde(Normal, h, x, FFT())
frot = kde(Normal, rot, x, FFT())
# these are callable
@show fkde(0.3);
@show frot(-2);
```
```
fkde(0.3) = 0.38237039523949345
frot(-2) = 0.04546902308913938
```
```julia
plot(fkde, label="LSCV", lw=2)
plot!(frot, label="Rule of thumb", lw=2)
```
This work has been heavily influenced by Artur Gramacki's book "Nonparametric Kernel Density Estimation and Its Computational Aspects".