The ZVSimulator
package provides a framework for assessing the zero variance (ZV) principle for Monte Carlo or
random sampling via simulation.
The ZVSimulator
is used for assessing the effectiveness of the zero variance principle on arbitrary MCMC samplers
or on arbitrary univariate or multivariate distributions. The simulation-based evaluation of ZV's effectiveness is
made by estimating the variance reduction factor (VRF), which is the ratio of the variance of the original estimator
over the variance of the ZV estimator.
To compute the variance of the original estimator of a summary statistic, several Monte Carlo chains are simulated from a given MCMC sampler or several sets of independently and identically distributed (i.i.d.) samples are randomly drawn from a given distribution. Then the summary statistic (typically the mean) is computed for each chain or for each sample set and the sample variance of these summary statistics (typically the sample variance of these means) is computed. This way the variance of the original estimator for a given summary statistic is computed. The value of the score function, that is the value of the gradient of the log-likelihood, along the chain iterations or along the i.i.d. samples is stored and is employed in order to calculate the corresponding ZV estimator.
The main features of the ZVSimulator
from the user's perspective are the following:
- Integrated usage with the
Distributions
andMCMC
packages. The user can define distributions or MCMC tasks by using the standard interface ofDistributions
orMCMC
, so as to pass them to theZVSimulator
. - Minimal effort to set up a ZV simulation due to the high level interface with
Distributions
andMCMC
. This facilitates shifting the user's focus from programming towards the statistical issue of ZV's effectiveness. For example, the gradient computation is done by using the relevant score functions defined inDistributions
and the ZV coefficients are calculated using the relevant ZV functions inMCMC
. - Assessment of ZV for any summary statistic by passing a function that transforms the simulated samples.
- Lower level functionality, which allows assessing ZV for any user-defined random process leaving room for future developments of ZV on more stochastic models.
- Parallel implementation of the
ZVSimulator
, which provides faster computations on several workers.
As an example of how one can compute the VRFs for a univariate distribution, consider a t-distribution with 5 degrees
of freedom. The distribution is defined by calling TDist(5.)
using the relevant constructor from Distributions
.
Then the psim_rand_vrf()
function is invoked on the t-density, from which 100 sample sets are drawn, each of size
1000. Note that if julia is started by julia -p 4
for instance, then the simulation will run on 4 workers on the
local machine. The code for this example is the following and can also be found in examples/rand_tdist.jl
:
using Distributions, ZVSimulator
d = TDist(5.)
results = psim_rand_vrf(d, nsets=100, nsamples=1000)
examples/mcmc_mvtdist.jl
provides an example of computing the VRFs for a Monte Carlo simulation from a multivariate
Student target distribution. The MCMC tasks are defined by using the (model, sampler, runner) triplet-interface of
the MCMC
package. Then psim_serialmc_vrf()
is invoked on these tasks to calculate the VRFs by simulating 100 chains,
each consisting of 10000 iterations of which the first 1000 are discarded as burnin.
using Distributions, MCMC, ZVSimulator
function C(n::Int, c::Float64)
X = eye(n)
[(j <= n-i) ? X[i+j, i] = X[i, i+j] = c^j : nothing for i = 1:(n-1), j = 1:(n-1)]
X
end
df, npars, a = 5., 4, 0.25
c = ((df-2)/df)*C(npars, a)
mcmodel = model(MvTDist(df, zeros(npars), c), init=zeros(npars))
samplers = [HMCDA() for i in 1:100]
tasks = mcmodel * samplers * SerialMC(steps=10000, burnin=1000)
results = psim_serialmc_vrf(tasks)