ReadDatastores.jl

Datastores for reads, not your papa's FASTQ files.
Author BioJulia
Started In
July 2019

ReadDatastores

License: MIT

Description

Not your papa's FASTQ files.

ReadDatastores provides a set of datastore types for storing read datasets on disk and randomly accessing their sequences. Each datastore type is optimised for the kind of read data it stores.

Using these datastores gives better performance than using text files that store reads (see FASTX.jl, XAM.jl, etc.), because the sequences are already stored in the succinct bit encodings of BioSequences.jl, and the preset formats and layouts of the binary files mean there is no need to constantly validate the input.
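To illustrate why a fixed binary layout makes random access cheap, here is a minimal sketch using only Julia's Base library. It is not the ReadDatastores file format: the 16-byte header, the fixed read length, and the helper names (pack, unpack, write_store, read_at) are all assumptions made up for this example. Each base takes 2 bits, and because every record has the same size, the i-th read can be located with a single seek and decoded without parsing or validating any text.

```julia
# Hypothetical layout for illustration only, NOT the ReadDatastores format:
# [UInt64 read length][UInt64 number of reads][fixed-size 2-bit packed records...]

const ENCODE = Dict('A' => 0x00, 'C' => 0x01, 'G' => 0x02, 'T' => 0x03)
const DECODE = ('A', 'C', 'G', 'T')

# Bytes needed to hold `len` bases at 4 bases per byte.
record_bytes(len::Integer) = cld(len, 4)

# Pack a DNA string into bytes, 2 bits per base.
function pack(seq::AbstractString, len::Integer)
    length(seq) == len || throw(ArgumentError("every read must have length $len"))
    bytes = zeros(UInt8, record_bytes(len))
    for (i, base) in enumerate(seq)
        shift = 2 * ((i - 1) % 4)
        bytes[(i - 1) ÷ 4 + 1] |= ENCODE[base] << shift
    end
    return bytes
end

# Decode `len` bases from a packed record.
function unpack(bytes::Vector{UInt8}, len::Integer)
    chars = Vector{Char}(undef, len)
    for i in 1:len
        shift = 2 * ((i - 1) % 4)
        code = (bytes[(i - 1) ÷ 4 + 1] >> shift) & 0x03
        chars[i] = DECODE[code + 1]
    end
    return join(chars)
end

# Write the header followed by one fixed-size record per read.
function write_store(path, reads, len)
    open(path, "w") do io
        write(io, UInt64(len))
        write(io, UInt64(length(reads)))
        for r in reads
            write(io, pack(r, len))
        end
    end
end

# Fetch the i-th read by seeking straight to its record.
function read_at(path, i)
    open(path, "r") do io
        len = Int(read(io, UInt64))
        n = Int(read(io, UInt64))
        1 <= i <= n || throw(BoundsError())
        seek(io, 16 + (i - 1) * record_bytes(len))
        return unpack(read(io, record_bytes(len)), len)
    end
end

path = tempname()
write_store(path, ["ACGTAC", "TTGACA", "GGGCCC"], 6)
second = read_at(path, 2)
```

Validation happens once, when the store is written; reading a record afterwards is a seek plus a few bit shifts.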

  • A paired read datastore is provided for paired-end reads and long mate-pairs (Illumina MiSeq etc.).
  • A long read datastore is provided for long reads (Nanopore, PacBio etc.).
  • A linked read datastore is provided for shorter reads that are linked or grouped using some additional (typically proximity-based) tag (e.g. 10x).

The datastores can also be buffered, trading some RAM for faster iteration over (sequential access to) the reads.
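The trade-off behind buffering can be sketched with Base Julia alone. This is not the package's buffering API: foreach_record and its buffer_records keyword are hypothetical names for this example. The idea is to pull many fixed-size records from disk in one read and then hand them out from memory, so that sequential iteration issues far fewer I/O calls at the cost of holding the buffer in RAM.

```julia
# Hypothetical helper for illustration; not part of ReadDatastores.
# Calls `f` on each fixed-size record, refilling an in-memory buffer as needed.
function foreach_record(f, path::AbstractString, record_size::Integer;
                        buffer_records::Integer = 1024)
    buffer = Vector{UInt8}(undef, record_size * buffer_records)
    open(path, "r") do io
        while !eof(io)
            # One disk read fills up to `buffer_records` records.
            nread = readbytes!(io, buffer, length(buffer))
            for offset in 0:record_size:(nread - record_size)
                f(view(buffer, offset + 1:offset + record_size))
            end
        end
    end
end

# Four 3-byte records, read through a buffer that holds two records at a time.
path = tempname()
write(path, UInt8.(1:12))
sums = Int[]
foreach_record(path, 3; buffer_records = 2) do rec
    push!(sums, sum(Int, rec))
end
```

A larger buffer_records value means fewer disk reads per pass but more memory held for the duration of the iteration.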

Installation

You can install ReadDatastores from the Julia REPL. Press ] to enter pkg mode, and enter the following:

add ReadDatastores

If you are interested in the cutting edge of the development, please check out the master branch to try new features before release.
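The same steps can be run non-interactively through Julia's Pkg API, which may be handy in scripts. A minimal sketch, assuming the package is registered under the name ReadDatastores and that the development branch is called master:

```julia
using Pkg

# Install the latest registered release.
Pkg.add("ReadDatastores")

# Or track the development branch instead
# (pkg-mode equivalent: add ReadDatastores#master).
Pkg.add(name = "ReadDatastores", rev = "master")
```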

Testing

ReadDatastores is tested against Julia 1.X on Linux, OS X, and Windows.

Contributing

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

Take a look at the contributing files for detailed contributor and maintainer guidelines, and the code of conduct.

Financial contributions

We also welcome financial contributions in full transparency on our open collective. Anyone can file an expense. If the expense makes sense for the development of the community, it will be "merged" in the ledger of our open collective by the core contributors and the person who filed the expense will be reimbursed.

Backers & Sponsors

Thank you to all our backers and sponsors!

Love our work and community? Become a backer.

Does your company use BioJulia? Help keep BioJulia feature-rich and healthy by sponsoring the project. Your logo will show up here with a link to your website.

Questions?

If you have a question about contributing to or using BioJulia software, come on over and chat to us on Gitter, or try the Bio category of the Julia Discourse site.