PreprocessMD.jl

Medically-informed data preprocessing for machine learning
Author: bcbi
Popularity: 6 stars
Last updated: 1 year ago
Started: May 2022

Summary

PreprocessMD.jl provides a suite of functions for preprocessing biomedical data. The package is scoped to medical data preprocessing: its functions are specific to biomedical research but general enough for widespread use. These tools are developed for the OMOP Common Data Model [1], especially the MIMIC-IV demo set [2].

Following the definitions of Hu et al. [3], we consider data preprocessing to comprise project-level data manipulations. This distinguishes it from upstream data cleaning (e.g., error correction and standardization), which is typically performed over an entire database, and from downstream data preparation (e.g., labelling and classification), which may vary across the analyses within a project.

Usage

An example pipeline is available in the documentation.

Features

Planned features for PreprocessMD.jl include:

  • Summaries and feasibility checks
  • Feature extraction
  • Variable derivation
  • Data imputation
  • Dimension reduction
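
To make one of the planned items concrete, here is a minimal sketch of what data imputation can look like at the project level. It is not PreprocessMD.jl's API: the function name `impute_mean` and the heart-rate readings are hypothetical and exist only for illustration. Mean imputation is just one simple strategy and may not be the one the package adopts. The sketch uses only Julia's `Statistics` standard library.

```julia
using Statistics

# Hypothetical helper, NOT part of PreprocessMD.jl: replace each `missing`
# entry with the mean of the observed entries in the same vector.
function impute_mean(values::AbstractVector{<:Union{Missing,Real}})
    observed = collect(skipmissing(values))
    # Without at least one observed value there is no mean to fill with.
    isempty(observed) && throw(ArgumentError("no observed values to impute from"))
    fill_value = mean(observed)
    return [ismissing(v) ? fill_value : float(v) for v in values]
end

# Made-up heart-rate readings with two gaps (illustrative data only).
heart_rate = [72, missing, 88, missing, 80]
imputed = impute_mean(heart_rate)
```

Here the two gaps are filled with 80.0, the mean of the three observed readings. Mean imputation shrinks the variance of the imputed column, so a real pipeline would weigh it against alternatives before relying on it.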

Footnotes

  1. https://ohdsi.github.io/CommonDataModel/

  2. https://physionet.org/content/mimic-iv-demo-omop/0.9/

  3. Wu, Hulin, Jose Miguel Yamal, Ashraf Yaseen, and Vahed Maroufy, eds. Statistics and Machine Learning Methods for EHR Data: From Data Extraction to Data Analytics. CRC Press, 2020.
