MendelImpute.jl

OpenMendel package for haplotyping and imputation
Author OpenMendel
Started in June 2017

Installation

Download and install Julia. Within Julia, copy and paste the following:

using Pkg
pkg"add https://github.com/OpenMendel/SnpArrays.jl"
pkg"add https://github.com/OpenMendel/VCFTools.jl"
pkg"add https://github.com/OpenMendel/BGEN.jl"
pkg"add https://github.com/OpenMendel/MendelImpute.jl"

This package supports Julia v1.6+.
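If you are unsure whether your Julia installation meets this requirement, a quick check along the following lines may help. This is only a sketch: `MIN_JULIA` and `supported` are illustrative names, not part of MendelImpute.jl.

```julia
# Minimum Julia version stated in this README (illustrative constant name).
const MIN_JULIA = v"1.6"

# VERSION is a built-in constant holding the running Julia version.
supported = VERSION >= MIN_JULIA

if !supported
    @warn "Julia $VERSION is older than $MIN_JULIA; installation may fail."
end
```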

Note: BGEN format support is currently experimental and is not guaranteed to work properly.

Documentation

Example run:

The following uses data under the data/ directory.

# load package & cd to data directory
using MendelImpute
cd(normpath(MendelImpute.datadir()))

# compress reference haplotypes from .vcf.gz to .jlso format
reffile = "ref.excludeTarget.vcf.gz"       # reference VCF file
tgtfile = "target.typedOnly.masked.vcf.gz" # target VCF file (GWAS file)
outfile = "ref.excludeTarget.jlso"         # output file name (end in .jlso)
compress_haplotypes(reffile, tgtfile, outfile)

# phase & impute
tgtfile = "target.typedOnly.masked.vcf.gz" # target VCF file (GWAS file)
reffile = "ref.excludeTarget.jlso"         # compressed reference file
outfile = "imputed.vcf.gz"                 # output file name
phase(tgtfile, reffile, outfile);

# check error rate (since data was simulated)
using VCFTools
Ximputed = convert_gt(Float64, "imputed.vcf.gz")  # imputed genotypes
Xtrue = convert_gt(Float64, "target.full.vcf.gz") # true genotypes
m, n = size(Xtrue) # matrix dimensions
error_rate = sum(Xtrue .!= Ximputed) / m / n
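The error rate above is the fraction of genotype entries where the imputed value differs from the truth. If either matrix could contain `missing` entries, the plain `sum` would itself return `missing`. The sketch below shows one way to skip such entries. It uses small made-up matrices rather than the files above, and `discordance_rate` is a hypothetical helper, not a MendelImpute.jl or VCFTools.jl function.

```julia
# Toy genotype matrices (dosages 0, 1, 2); values are made up for illustration.
Xtrue    = Union{Missing, Float64}[0 1 2; 1 1 0; 2 0 missing]
Ximputed = Union{Missing, Float64}[0 1 2; 1 2 0; 2 0 1]

# Hypothetical helper: fraction of entries that differ, ignoring positions
# where either matrix has a missing value.
function discordance_rate(A::AbstractMatrix, B::AbstractMatrix)
    size(A) == size(B) || throw(DimensionMismatch("matrices must have the same size"))
    mismatches = 0
    compared = 0
    for (a, b) in zip(A, B)
        (ismissing(a) || ismissing(b)) && continue  # skip untyped entries
        compared += 1
        mismatches += (a != b)
    end
    return compared == 0 ? NaN : mismatches / compared
end

rate = discordance_rate(Xtrue, Ximputed)  # 1 mismatch out of 8 compared entries
```

When neither matrix has missing entries, this gives the same value as `sum(Xtrue .!= Ximputed) / m / n`.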

For a more realistic example, see the detailed example in the documentation.

Bug Fixes and User Support

If you encounter a bug or need user support, please open a new issue on GitHub. For bug reports, please provide as much detail as possible, ideally a minimal, reproducible sequence of code that leads to the error.

PRs and feature requests are welcome!

Citation

Our paper is on bioRxiv. If you want to cite MendelImpute.jl, please cite:

@article{mendelimpute,
    title = {{A Fast Data-Driven Method for Genotype Imputation, Phasing, and Local Ancestry Inference: MendelImpute.jl}},
    author = {Chu, Benjamin B and Sobel, Eric M and Wasiolek, Rory and Sinsheimer, Janet S and Zhou, Hua and Lange, Kenneth},
    year = {2020},
    journal = {bioRxiv},
    doi = {10.1101/2020.10.24.353755}
}

Acknowledgement

This project is supported by the National Institutes of Health under NIGMS awards R01GM053275 and R25GM103774 and NHGRI award R01HG006139.