Independent Hypothesis Weighting for multiple testing with side-information in Julia
Author nignatiadis
Started in October 2021


This package provides a preliminary implementation of the Independent Hypothesis Weighting method for multiple testing with side-information, as described in:

Ignatiadis N, Huber W (2021). “Covariate powered cross-weighted multiple testing.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 83: 720-751.

Ignatiadis N, Klaus B, Zaugg J, Huber W (2016). “Data-driven hypothesis weighting increases detection power in genome-scale multiple testing.” Nature Methods, 13: 577–580. doi: 10.1038/nmeth.3885.

This package is a work in progress, so we currently recommend using the R package IHW, which is available on Bioconductor. See also the MultipleTesting.jl package, which provides methods for multiple testing without side-information; this package builds on the interface defined there.

Example Usage

Load packages:

using Distributions
using IndependentHypothesisWeighting
using Random
using StatsBase

Generate synthetic data: 10,000 p-values that can be partitioned into two groups, as encoded by the side-information Xs.

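# Group label (1 or 2) for each of the 10,000 hypotheses, drawn at random
Xs = CategoricalVector(sample(1:2, 10000))
# Group 1: p-values from a beta-uniform mixture; group 2: Uniform(0, 1) p-values
Ps = rand(BetaUniformMixtureModel(0.7, 0.2), 10000) .* (Xs.==1) .+ rand(Uniform(), 10000) .* (Xs.==2)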

Suppose we seek to control the false discovery rate at 10%. As a baseline that does not use the grouping side-information, we can run the Benjamini-Hochberg procedure:

α = 0.1
sum(adjust(Ps, BenjaminiHochberg()) .<= α ) # 580 discoveries
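
To make the baseline concrete, here is a minimal sketch of how Benjamini-Hochberg adjusted p-values can be computed in plain Julia. The function name `bh_adjust` and the toy p-values are made up for illustration; this is not the code that MultipleTesting.jl runs internally.

```julia
# Sketch of Benjamini-Hochberg adjusted p-values (hypothetical helper, not the library code).
function bh_adjust(ps::AbstractVector{<:Real})
    m = length(ps)
    order = sortperm(ps)                     # indices that sort the p-values ascending
    scaled = ps[order] .* m ./ (1:m)         # p_(k) * m / k for the k-th smallest p-value
    # Enforce monotonicity: running minimum taken from the largest p-value downwards
    adjusted_sorted = reverse(accumulate(min, reverse(scaled)))
    adjusted = similar(adjusted_sorted)
    adjusted[order] = min.(adjusted_sorted, 1.0)   # map back to the input order, cap at 1
    return adjusted
end

ps_toy = [0.01, 0.04, 0.03, 0.5]             # toy p-values, chosen by hand
adj = bh_adjust(ps_toy)
n_rejected = sum(adj .<= 0.1)                # discoveries at a 10% FDR target
```

A hypothesis counts as a discovery when its adjusted p-value is at most α, which is the comparison used in the call to `adjust` above.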

With Benjamini-Hochberg, 580 p-values are significant in the run shown here; the exact count depends on the random draw. Next, we run Independent Hypothesis Weighting (IHW) with the grouping side-information:

ihw_grenander = IHW(weight_learner = GrenanderLearner(), α = α)
ihw_grenander_fit = fit(ihw_grenander, Ps, Xs)
sum(adjust(ihw_grenander_fit) .<= α) # 677 discoveries

In this run, IHW increased the number of significant discoveries from 580 to 677.
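
The gain comes from weighting: as described in the papers cited above, IHW assigns each hypothesis a data-driven weight based on its covariate and then applies a weighted Benjamini-Hochberg procedure. The sketch below illustrates only the weighted step, with hand-picked weights rather than learned ones. The function name `weighted_bh_rejections`, the toy p-values, and the weights are assumptions made for illustration and are not part of this package's API.

```julia
# Sketch of the weighted Benjamini-Hochberg step-up rule (hypothetical helper).
# Weights are nonnegative and average to 1, so the overall error budget is unchanged.
function weighted_bh_rejections(ps, ws, α)
    @assert length(ps) == length(ws)
    @assert all(ws .>= 0) && isapprox(sum(ws), length(ws))
    m = length(ps)
    # Weighted p-values; a zero weight means the hypothesis can never be rejected
    qs = [w > 0 ? p / w : Inf for (p, w) in zip(ps, ws)]
    sorted_qs = sort(qs)
    # Largest k such that the k-th smallest weighted p-value is <= α * k / m
    k = findlast(i -> sorted_qs[i] <= α * i / m, 1:m)
    k === nothing && return falses(m)
    return qs .<= sorted_qs[k]
end

ps_toy = [0.02, 0.03, 0.06, 0.6, 0.7, 0.9]   # first three resemble a group with signal
ws_flat = ones(6)                            # equal weights reduce to plain BH
ws_grouped = [1.5, 1.5, 1.5, 0.5, 0.5, 0.5]  # hand-picked: upweight the promising group

unweighted = weighted_bh_rejections(ps_toy, ws_flat, 0.1)
weighted = weighted_bh_rejections(ps_toy, ws_grouped, 0.1)
```

On this toy input, shifting weight toward the group with small p-values lets one more hypothesis pass the step-up threshold. In IHW the weights are learned from the data rather than chosen by hand, which is the role of the `weight_learner` argument in the example above.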