SoleModels.jl

Symbolic modeling in Julia!
Author aclai-lab
Started In
November 2021

SoleModels.jl – Symbolic Learning Models


In a nutshell

SoleModels.jl defines the building blocks of symbolic modeling and learning. It features:

  • Definitions for symbolic models (decision trees/forests, rules, branches, etc.);
  • Tools for evaluating them and extracting rules from them;
  • Support for mixed, neuro-symbolic computation.

These definitions provide a unified base for implementing symbolic algorithms, such as:

  • Decision tree/random forest learning;
  • Classification/regression rule extraction;
  • Association rule mining.

Models

Basic models are:

  • Leaf models: wrapping native Julia computation (e.g., constants, functions);
  • Rules: structures with IF antecedent THEN consequent END semantics;
  • Branches: structures with IF antecedent THEN pos_consequent ELSE neg_consequent END semantics.

Remember that:

  • An antecedent is a logical formula that can be checked on a logical interpretation (that is, an instance of a symbolic learning dataset), yielding a truth value (e.g., true/false);
  • A consequent is another model, for example, a (final) constant model or branch to be applied.

Within this framework, a decision tree is nothing other than a branch whose consequents are themselves branches or final (leaf) models. Note that antecedents can consist of logical formulas and, in that case, the symbolic models can be applied to logical interpretations. For more information, refer to SoleLogics.jl, the underlying logical layer.
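
To make the IF/THEN/ELSE semantics above concrete, here is a minimal sketch in plain Julia. It does not use the SoleModels.jl API: the names `ToyModel`, `ToyLeaf`, `ToyRule`, `ToyBranch` and `toyapply` are hypothetical and exist only for illustration, and antecedents are assumed to be ordinary Julia predicates rather than logical formulas.

```julia
# Toy sketch, NOT the SoleModels.jl API: all names below are hypothetical.
abstract type ToyModel end

# Leaf model: wraps a constant outcome.
struct ToyLeaf <: ToyModel
    outcome::String
end

# Rule: IF antecedent THEN consequent END.
struct ToyRule <: ToyModel
    antecedent::Function   # instance -> Bool
    consequent::ToyModel
end

# Branch: IF antecedent THEN pos ELSE neg END.
struct ToyBranch <: ToyModel
    antecedent::Function   # instance -> Bool
    pos::ToyModel
    neg::ToyModel
end

# A leaf returns its outcome; a rule yields `nothing` when its antecedent
# does not hold; a branch recurses into the matching consequent.
toyapply(m::ToyLeaf, instance) = m.outcome
toyapply(m::ToyRule, instance) =
    m.antecedent(instance) ? toyapply(m.consequent, instance) : nothing
toyapply(m::ToyBranch, instance) =
    m.antecedent(instance) ? toyapply(m.pos, instance) : toyapply(m.neg, instance)

# A decision tree: a branch whose consequents are a leaf and another branch.
tree = ToyBranch(x -> x.petal_length < 2.5,
    ToyLeaf("setosa"),
    ToyBranch(x -> x.petal_width < 1.75,
        ToyLeaf("versicolor"),
        ToyLeaf("virginica")))

rule = ToyRule(x -> x.petal_length < 2.5, ToyLeaf("setosa"))

toyapply(tree, (petal_length = 1.4, petal_width = 0.2))  # "setosa"
toyapply(tree, (petal_length = 5.0, petal_width = 2.0))  # "virginica"
toyapply(rule, (petal_length = 5.0, petal_width = 0.2))  # nothing
```

The only difference between a rule and a branch in this sketch is what happens when the antecedent fails: the rule has no answer, while the branch falls through to its negative consequent.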

Other noteworthy models include:

  • Decision List (or decision table): see Wikipedia;
  • Decision Tree: see Wikipedia;
  • Decision Forest (or tree ensemble): see Wikipedia;
  • Mixed Symbolic Model: a nested structure, mixture of many symbolic models.
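
As an illustration of the first item, a decision list can be pictured as an ordered sequence of rules plus a default outcome, where the first rule whose antecedent holds decides the result. The sketch below is plain Julia under that assumption; `ToyDecisionList` and `toyapply` are hypothetical names, not SoleModels.jl types or functions.

```julia
# Toy sketch, NOT the SoleModels.jl API: all names below are hypothetical.
struct ToyDecisionList
    rules::Vector{Pair{Function,String}}   # antecedent => outcome, in priority order
    default::String                        # outcome when no antecedent holds
end

# Scan the rules in order; the first antecedent that holds decides the outcome.
function toyapply(dl::ToyDecisionList, instance)
    for (antecedent, outcome) in dl.rules
        antecedent(instance) && return outcome
    end
    return dl.default
end

dl = ToyDecisionList(
    Pair{Function,String}[
        (x -> x.petal_length < 2.5) => "setosa",
        (x -> x.petal_width < 1.75) => "versicolor",
    ],
    "virginica",
)

toyapply(dl, (petal_length = 1.4, petal_width = 0.2))  # "setosa"
toyapply(dl, (petal_length = 5.0, petal_width = 2.0))  # "virginica" (default)
```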

Usage: rule extraction from a decision tree

First, train a decision tree:

# Load packages (Pkg must be loaded before Pkg.add can be called)
using Pkg
begin
    Pkg.add("MLJ"); using MLJ
    Pkg.add("MLJDecisionTreeInterface"); using MLJDecisionTreeInterface
    Pkg.add("DataFrames"); using DataFrames
    Pkg.add("Random"); using Random
end

# Load dataset
X, y = begin
    X, y = @load_iris;
    X = DataFrame(X)
    X, y
end

# Split dataset
X_train, y_train, X_test, y_test = begin
    train, test = partition(eachindex(y), 0.8, shuffle=true, rng = Random.MersenneTwister(42));
    X_train, y_train = X[train, :], y[train];
    X_test, y_test = X[test, :], y[test];
    X_train, y_train, X_test, y_test
end;

# Train tree
mach = begin
    Tree = MLJ.@load DecisionTreeClassifier pkg=DecisionTree
    model = Tree(max_depth=-1, rng = Random.MersenneTwister(42))
    machine(model, X_train, y_train) |> fit!
end

# Inspect the tree
🌱 = fitted_params(mach).tree

Then, port it to Sole and play with it:

Pkg.add("SoleDecisionTreeInterface"); using SoleDecisionTreeInterface

# Convert to 🌞-compliant model
🌲 = solemodel(🌱);

# Print model
printmodel(🌲);

# Inspect the rules
listrules(🌲)

# Inspect rule metrics
metricstable(🌲)

# Inspect normalized rule metrics
metricstable(🌲, normalize = true)

# Make test instances flow into the model, so that test metrics can, then, be computed.
apply!(🌲, X_test, y_test)

# Pretty table of rules and their metrics
metricstable(🌲; normalize = true, metrics_kwargs = (; additional_metrics = (; height = r->SoleLogics.height(antecedent(r)))))

# Join some rules for the same class into a single, sufficient and necessary condition for that class
metricstable(joinrules(🌲; min_ncovered = 1, normalize = true))
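
For intuition about what extracting rules from a tree amounts to, the sketch below walks a toy tree and emits one rule per root-to-leaf path, with the conjunction of the conditions met along the path as its antecedent. This is an illustration only, not a description of how `listrules` is implemented; `PathNode`, `PathLeaf`, `PathBranch` and `toyrules` are hypothetical names, and conditions are kept as plain strings for readability.

```julia
# Toy sketch, NOT the SoleModels.jl API: all names below are hypothetical.
abstract type PathNode end

struct PathLeaf <: PathNode
    outcome::String
end

struct PathBranch <: PathNode
    condition::String   # human-readable antecedent, e.g. "petal_length < 2.5"
    pos::PathNode       # followed when the condition holds
    neg::PathNode       # followed when the condition does not hold
end

# Collect one (conditions, outcome) pair per root-to-leaf path.
toyrules(n::PathLeaf, path = String[]) = [(copy(path), n.outcome)]
function toyrules(n::PathBranch, path = String[])
    vcat(
        toyrules(n.pos, [path; n.condition]),
        toyrules(n.neg, [path; "¬(" * n.condition * ")"]),
    )
end

tree = PathBranch("petal_length < 2.5",
    PathLeaf("setosa"),
    PathBranch("petal_width < 1.75",
        PathLeaf("versicolor"),
        PathLeaf("virginica")))

# Print each path as an IF ... THEN ... rule.
for (conds, outcome) in toyrules(tree)
    println("IF ", join(conds, " ∧ "), " THEN ", outcome)
end
```

A tree with three leaves yields three rules here; joining rules that share an outcome would then be a disjunction of their antecedents.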

Dataset structures (for logical symbolic learning)

Learning logical models (that is, models with logical formulas as antecedents) often requires performing model checking many times. SoleModels.jl provides a set of structures for representing logical datasets, specifically optimized for multiple model checking operations.
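
One generic way to make repeated model checking cheaper is to memoize truth values, so that a condition already checked on an instance is never recomputed. The sketch below shows that idea in plain Julia. It is an assumption about the general technique, not a description of the structures SoleModels.jl actually provides; `MemoDataset` and `toycheck!` are hypothetical names.

```julia
# Toy sketch, NOT the SoleModels.jl API: all names below are hypothetical.
struct MemoDataset{T}
    instances::Vector{T}
    cache::Dict{Tuple{String,Int},Bool}   # (condition name, instance index) => truth value
end

# Convenience constructor: start with an empty cache.
MemoDataset(instances::Vector{T}) where {T} =
    MemoDataset{T}(instances, Dict{Tuple{String,Int},Bool}())

# Check a named condition on instance `i`, reusing the cached result if present.
function toycheck!(d::MemoDataset, name::String, cond::Function, i::Int)
    get!(d.cache, (name, i)) do
        cond(d.instances[i])
    end
end

# Count how many times the condition is actually evaluated.
calls = Ref(0)
cond = x -> (calls[] += 1; x > 3)

d = MemoDataset([1, 5, 7])
toycheck!(d, "x > 3", cond, 2)   # evaluates the condition: true
toycheck!(d, "x > 3", cond, 2)   # served from the cache: true
calls[]                          # 1
```

During learning, the same conditions tend to be checked on the same instances many times, which is where a cache of this kind pays off.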

About

The package is developed by the ACLAI Lab @ University of Ferrara.

SoleModels.jl mainly builds upon SoleLogics.jl and SoleData.jl, and it is the core module of Sole.jl, an open-source framework for symbolic machine learning.