MLBenchmarks.jl

ML model benchmarks on public datasets
Author: Evovest
Started in: December 2023

This repo provides Julia-based benchmarks for ML algorithms on tabular data. It was developed to support both the NeuroTreeModels.jl and EvoTrees.jl projects.

Methodology

For each dataset and algorithm, the following procedure is applied:

  • Data is split into three parts: train, eval and test
  • A random grid of 16 hyper-parameter configurations is generated
  • For each configuration, a model is trained on the train data until the evaluation metric tracked on the eval data stops improving (early stopping)
  • The trained model is evaluated on the test data
  • The metrics presented below are those obtained on the test data by the model that achieved the best eval metric.
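
The steps above can be sketched as follows. This is a minimal illustration in Python, not the repo's Julia code: the synthetic data, the linear model trained by gradient descent, the hyper-parameter names (`lr`, `l2`), their sampling ranges, and the `patience` rule are all assumptions made for the example.

```python
import random

def make_data(n, rng):
    # Hypothetical synthetic regression problem: y = 3x + 1 + noise.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]

def mse(params, data):
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def fit_early_stopping(train, eval_, lr, l2, max_iter=500, patience=10):
    """Gradient descent on a linear model, stopped once the eval MSE
    has not improved for `patience` consecutive iterations."""
    w, b = 0.0, 0.0
    best_params, best_eval, stale = (w, b), mse((w, b), eval_), 0
    n = len(train)
    for _ in range(max_iter):
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / n + 2 * l2 * w
        gb = sum(2 * (w * x + b - y) for x, y in train) / n
        w, b = w - lr * gw, b - lr * gb
        current = mse((w, b), eval_)
        if current < best_eval:
            best_params, best_eval, stale = (w, b), current, 0
        else:
            stale += 1
            if stale >= patience:
                break
    # Return the parameters from the best eval iteration, not the last one.
    return best_params, best_eval

def run_benchmark(seed=0, n_configs=16):
    rng = random.Random(seed)
    data = make_data(300, rng)
    # Three-way split: train, eval and test.
    train, eval_, test = data[:200], data[200:250], data[250:]
    # Random grid: sample n_configs hyper-parameter configurations.
    grid = [
        {"lr": 10 ** rng.uniform(-3.0, -0.5), "l2": 10 ** rng.uniform(-4.0, -1.0)}
        for _ in range(n_configs)
    ]
    results = []
    for config in grid:
        params, eval_mse = fit_early_stopping(train, eval_, **config)
        results.append(
            {"config": config, "eval_mse": eval_mse, "test_mse": mse(params, test)}
        )
    # Report the test metric of the configuration with the best eval metric.
    best = min(results, key=lambda r: r["eval_mse"])
    return best, results
```

Calling `run_benchmark()` returns the selected configuration together with all 16 results. The test data plays no role in training or model selection; it is only read once per configuration to produce the reported metric.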

Datasets

The following selection of common tabular datasets is covered:

  • Year: mean squared error (MSE) regression
  • MSRank: ranking problem treated as mean squared error regression
  • YahooRank: ranking problem treated as mean squared error regression
  • Higgs: binary classification with logistic regression
  • Boston Housing: mean squared error regression
  • Titanic: binary classification with logistic regression
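
For reference, the two objectives named in this list, along with the accuracy metric reported for the classification datasets, can be written out as below. This is a sketch under assumptions: "logistic regression" is taken to mean the logistic loss on predicted probabilities (the `logloss` column in the results), and the clipping constant `eps` and the 0.5 `threshold` are choices made for the example, not values taken from the repo.

```python
import math

def mean_squared_error(y_true, y_pred):
    # Average of squared residuals; the regression objective.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def logloss(y_true, p_pred, eps=1e-15):
    # Binary logistic loss; y_true holds 0/1 labels and p_pred holds
    # predicted probabilities of class 1, clipped away from 0 and 1.
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

def accuracy(y_true, p_pred, threshold=0.5):
    # Share of observations whose thresholded prediction matches the label.
    hits = sum(int(p >= threshold) == t for t, p in zip(y_true, p_pred))
    return hits / len(y_true)
```

Lower is better for both losses, while higher is better for accuracy.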

Algorithms

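The following algorithms, considered state of the art on tabular data tasks, are compared (labels used in the result tables in parentheses):

  • NeuroTreeModels.jl (neurotrees)
  • EvoTrees.jl (evotrees)
  • XGBoost (xgboost)
  • LightGBM (lightgbm)
  • CatBoost (catboost)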

Boston

model_type  train_time  mse   gini
neurotrees  16.6        13.2  0.951
evotrees    0.392       23.5  0.932
xgboost     0.103       21.6  0.931
lightgbm    0.406       26.7  0.931
catboost    0.127       14.9  0.944

Titanic

model_type  train_time  logloss  accuracy
neurotrees  7.95        0.445    0.821
evotrees    0.11        0.405    0.821
xgboost     0.0512      0.412    0.799
lightgbm    0.128       0.388    0.828
catboost    0.264       0.393    0.843

Year

model_type  train_time  mse   gini
neurotrees  308.0       76.8  0.651
evotrees    71.9        80.4  0.626
xgboost     33.8        82.0  0.614
lightgbm    15.2        79.4  0.633
catboost    127.0       80.2  0.630

MSRank

model_type  train_time  mse    ndcg
neurotrees  85.1        0.577  0.467
evotrees    39.8        0.554  0.505
xgboost     19.4        0.554  0.501
lightgbm    38.5        0.553  0.507
catboost    112.0       0.553  0.504

Yahoo

model_type  train_time  mse    ndcg
neurotrees  299.0       0.583  0.781
evotrees    442.0       0.545  0.797
xgboost     129.0       0.544  0.797
lightgbm    215.0       0.539  0.798
catboost    241.0       0.555  0.796

Higgs

model_type  train_time  logloss  accuracy
neurotrees  15900.0     0.453    0.781
evotrees    2710.0      0.465    0.775
xgboost     1390.0      0.464    0.776
lightgbm    993.0       0.464    0.774
catboost    8020.0      0.463    0.776

References