ZVSimulator package provides a framework for assessing the zero variance (ZV) principle for Monte Carlo or
random sampling via simulation.
ZVSimulator is used for assessing the effectiveness of the zero variance principle on arbitrary MCMC samplers
or on arbitrary univariate or multivariate distributions. The simulation-based evaluation of ZV's effectiveness is
made by estimating the variance reduction factor (VRF), which is the ratio of the variance of the original estimator
over the variance of the ZV estimator.
To compute the variance of the original estimator of a summary statistic, several Monte Carlo chains are simulated from a given MCMC sampler or several sets of independently and identically distributed (i.i.d.) samples are randomly drawn from a given distribution. Then the summary statistic (typically the mean) is computed for each chain or for each sample set and the sample variance of these summary statistics (typically the sample variance of these means) is computed. This way the variance of the original estimator for a given summary statistic is computed. The value of the score function, that is the value of the gradient of the log-likelihood, along the chain iterations or along the i.i.d. samples is stored and is employed in order to calculate the corresponding ZV estimator.
The main features of the
ZVSimulator from the user's perspective are the following:
- Integrated usage with the
MCMCpackages. The user can define distributions or MCMC tasks by using the standard interface of
MCMC, so as to pass them to the
- Minimal effort to set up a ZV simulation due to the high level interface with
MCMC. This facilitates shifting the user's focus from programming towards the statistical issue of ZV's effectiveness. For example, the gradient computation is done by using the relevant score functions defined in
Distributionsand the ZV coefficients are calculated using the relevant ZV functions in
- Assessment of ZV for any summary statistic by passing a function that transforms the simulated samples.
- Lower level functionality, which allows assessing ZV for any user-defined random process leaving room for future developments of ZV on more stochastic models.
- Parallel implementation of the
ZVSimulator, which provides faster computations on several workers.
ZV for a univariate distribution
As an example of how one can compute the VRFs for a univariate distribution, consider a t-distribution with 5 degrees
of freedom. The distribution is defined by calling
TDist(5.) using the relevant constructor from
psim_rand_vrf() function is invoked on the t-density, from which 100 sample sets are drawn, each of size
1000. Note that if julia is started by
julia -p 4 for instance, then the simulation will run on 4 workers on the
local machine. The code for this example is the following and can also be found in
using Distributions, ZVSimulator d = TDist(5.) results = psim_rand_vrf(d, nsets=100, nsamples=1000)
ZV for MCMC
examples/mcmc_mvtdist.jl provides an example of computing the VRFs for a Monte Carlo simulation from a multivariate
Student target distribution. The MCMC tasks are defined by using the (model, sampler, runner) triplet-interface of
MCMC package. Then
psim_serialmc_vrf() is invoked on these tasks to calculate the VRFs by simulating 100 chains,
each consisting of 10000 iterations of which the first 1000 are discarded as burnin.
using Distributions, MCMC, ZVSimulator function C(n::Int, c::Float64) X = eye(n) [(j <= n-i) ? X[i+j, i] = X[i, i+j] = c^j : nothing for i = 1:(n-1), j = 1:(n-1)] X end df, npars, a = 5., 4, 0.25 c = ((df-2)/df)*C(npars, a) mcmodel = model(MvTDist(df, zeros(npars), c), init=zeros(npars)) samplers = [HMCDA() for i in 1:100] tasks = mcmodel * samplers * SerialMC(steps=10000, burnin=1000) results = psim_serialmc_vrf(tasks)