Bootstrap.jl: Statistical Bootstrapping

Motivation

Bootstrapping is a widely applicable technique for statistical estimation.

Functionality

Bootstrapping statistics with different resampling methods:
- Random resampling with replacement (BasicSampling)
- Antithetic resampling, introducing negative correlation between samples (AntitheticSampling)
- Balanced random resampling, reducing bias (BalancedSampling)
- Exact resampling, iterating through all unique resamples (ExactSampling): deterministic bootstrap, suited for small sample sizes
- Resampling of residuals in generalized linear models (ResidualSampling, WildSampling)
- Maximum Entropy bootstrapping for dependent and non-stationary datasets (MaximumEntropySampling)

Confidence intervals:
- Basic (BasicConfInt)
- Percentile (PercentileConfInt)
- Normal distribution (NormalConfInt)
- Studentized (StudentConfInt)
- Bias-corrected and accelerated (BCa) (BCaConfInt)

Installation

The Bootstrap package is part of the Julia ecosystem, and the latest release can be installed with

  using Pkg
  Pkg.add("Bootstrap")

More details on packages and how to manage them can be found in the package section of the Julia documentation.

Examples

This example illustrates the basic usage and cornerstone functions of the package. More elaborate cases are covered in the documentation notebooks.

  using Bootstrap

Our observations in some_data are sampled from a standard normal distribution.

  some_data = randn(100);

Let's bootstrap the standard deviation (std) of our data, based on 1000 resamples and with different bootstrapping approaches.

  using Statistics  # the `std` methods live here
  
  n_boot = 1000

  ## basic bootstrap
  bs1 = bootstrap(std, some_data, BasicSampling(n_boot))

  ## balanced bootstrap
  bs2 = bootstrap(std, some_data, BalancedSampling(n_boot))
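To make the difference between the two sampling schemes concrete, here is a minimal sketch of how resampling indices could be generated by hand. It uses only the standard library; the helpers `basic_indices` and `balanced_indices` are hypothetical names for illustration and are not part of the package, whose internals may differ.

```julia
using Random

# Hypothetical helper: ordinary resampling. Each column is one resample,
# drawn independently with replacement from 1:n.
basic_indices(rng, n, nrun) = rand(rng, 1:n, n, nrun)

# Hypothetical helper: balanced resampling. Concatenate `nrun` copies of 1:n,
# shuffle the pool, and cut it into `nrun` columns of length n.
function balanced_indices(rng, n, nrun)
    pool = repeat(1:n, nrun)
    shuffle!(rng, pool)
    return reshape(pool, n, nrun)
end

rng = MersenneTwister(1)
idx = balanced_indices(rng, 5, 4)

# Across all balanced resamples, every observation occurs exactly `nrun` times.
counts = [count(==(i), idx) for i in 1:5]
```

With ordinary resampling the overall counts vary from run to run; the balanced scheme fixes them, which is what removes part of the simulation error in the bias estimate.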

We can explore the properties of the bootstrapped samples, for example, the estimated bias and standard error of our statistic.

  bias(bs1)
  stderror(bs1)
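As a rough illustration of what these two quantities mean, the sketch below computes them by hand from the usual textbook definitions: the bias estimate is the mean of the resampled statistics minus the statistic on the original data, and the standard error is the standard deviation of the resampled statistics. The variable names are made up for this example, and the package's own implementation is assumed, not confirmed, to follow the same definitions.

```julia
using Random, Statistics

rng = MersenneTwister(42)
data = randn(rng, 100)
n = length(data)

# Statistic evaluated on the original sample.
t0 = std(data)

# Statistic evaluated on 1000 resamples drawn with replacement.
t1 = [std(data[rand(rng, 1:n, n)]) for _ in 1:1000]

boot_bias = mean(t1) - t0   # estimated bias of the statistic
boot_se = std(t1)           # estimated standard error of the statistic
```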

Furthermore, we can estimate confidence intervals (CIs) for our statistic of interest, based on the bootstrapped samples. confint returns a Tuple of Tuples, where each Tuple is of the form (statistic_value, lower_confidence_bound, upper_confidence_bound). A confidence interval is returned for each variable in the bootstrap model.

  ## calculate 95% confidence intervals
  cil = 0.95;

  ## basic CI
  bci1 = confint(bs1, BasicConfInt(cil));

  ## percentile CI
  bci2 = confint(bs1, PercentileConfInt(cil));

  ## BCa CI
  bci3 = confint(bs1, BCaConfInt(cil));

  ## Normal CI
  bci4 = confint(bs1, NormalConfInt(cil));
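A short sketch of how the result of confint might be unpacked, using only the calls shown above. Since std yields a single value, the outer tuple is assumed to hold exactly one inner tuple, containing the estimate followed by the lower and upper bounds; the names `estimate`, `lower`, and `upper` are chosen here for illustration.

```julia
using Bootstrap, Statistics

some_data = randn(100)
bs1 = bootstrap(std, some_data, BasicSampling(1000))

# One inner tuple per bootstrapped variable; here there is only one.
ci = confint(bs1, PercentileConfInt(0.95))
estimate, lower, upper = ci[1]

println("std ≈ ", estimate, ", 95% CI: [", lower, ", ", upper, "]")
```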

References

The Wikipedia article on bootstrapping is a comprehensive introduction to the topic. An extensive description of the bootstrap is the focus of the book by Davison and Hinkley (1997), Bootstrap Methods and Their Application. Most of the methodology covered in the book is implemented in the boot package for the R programming language. More references are listed in the documentation for further reading.