DataBench.jl

A package to benchmark data manipulation in Julia vs R data.table
Popularity: 1 Star
Updated Last: 6 Months Ago
Started In: October 2017

DataBench - a Julia vs R data manipulation benchmark suite

A comparison of data manipulation performance in Julia and R data.table, using synthetic data and the GE Flight Quest data

Set up instructions

# Pkg.add("DataBench")
  1. Change data_path in settings.csv to a directory that you can write to
  2. Download the 7z file (https://www.kaggle.com/c/flight/download/InitialTrainingSet_rev1.7z)
  3. Extract it into the folder data_path/InitialTrainingSet_rev1
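
Before running the benchmarks, it can help to confirm that the steps above were completed. The following is a minimal sketch and not part of DataBench: `check_setup` is a hypothetical helper, and because the layout of settings.csv is not documented here, it takes the directory as an argument instead of reading the file. It only checks that the directory is writable and that the InitialTrainingSet_rev1 folder exists.

```julia
# Hypothetical helper, not part of DataBench: verifies the set-up steps.
# Pass the same directory that you entered as data_path in settings.csv.
function check_setup(data_path::AbstractString)
    # A missing directory can be neither writable nor contain the data.
    isdir(data_path) || return (writable = false, extracted = false)

    # Try to create and delete a temporary file to confirm write access.
    writable = try
        path, io = mktemp(data_path)
        close(io)
        rm(path)
        true
    catch
        false
    end

    # Step 3 should have produced this folder inside data_path.
    extracted = isdir(joinpath(data_path, "InitialTrainingSet_rev1"))
    return (writable = writable, extracted = extracted)
end

# Example call on a fresh temporary directory (stands in for your data_path).
status = check_setup(mktempdir())
println(status)
```

If `writable` is false, pick a different data_path; if `extracted` is false, repeat steps 2 and 3.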

Synthetic benchmarks

Adapted from data.table's official benchmarks
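
To give a feel for what a synthetic benchmark of this kind measures, here is a rough sketch that times a grouped sum over randomly generated data. It is not DataBench's code and does not reproduce data.table's benchmark scripts: the function `group_sum`, the sizes `N` and `K`, and the use of a plain `Dict` in place of a data frame library are all assumptions made to keep the example self-contained.

```julia
using Random

# Hypothetical sketch, not DataBench's code: sum `v` within each group in `id`.
function group_sum(id::Vector{Int}, v::Vector{Float64})
    sums = Dict{Int,Float64}()
    for i in eachindex(id, v)
        sums[id[i]] = get(sums, id[i], 0.0) + v[i]
    end
    return sums
end

# Synthetic data: N rows spread over K groups (sizes chosen arbitrarily).
N, K = 1_000_000, 100
rng = MersenneTwister(1)
id = rand(rng, 1:K, N)
v = rand(rng, N)

# Warm-up call on a small slice so that compilation time is not measured.
group_sum(id[1:10], v[1:10])

result = group_sum(id, v)
elapsed = @elapsed group_sum(id, v)
println("grouped sum over ", N, " rows took ", elapsed, " seconds")
```

A single timing like this is noisy; repeating the measurement several times and reporting the minimum or median gives a fairer comparison.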

"Real-life" benchmarks

Uses GE Flight Quest data, the largest tabular dataset on Kaggle at the time of writing

Companion post

Speed of data manipulations in Julia vs R

Similar repos

https://github.com/szilard/benchm-databases