SoleModels.jl defines the building blocks of symbolic modeling and learning. It features:
- Definitions for symbolic models (decision trees/forests, rules, branches, etc.);
- Tools for evaluating them and extracting rules from them;
- Support for mixed, neuro-symbolic computation.
These definitions provide a unified base for implementing symbolic algorithms, such as:
- Decision tree/random forest learning;
- Classification/regression rule extraction;
- Association rule mining.
Basic models are:
- Leaf models: wrapping native Julia computation (e.g., constants, functions);
- Rules: structures with IF antecedent THEN consequent END semantics;
- Branches: structures with IF antecedent THEN pos_consequent ELSE neg_consequent END semantics.
Remember that:
- An antecedent is a logical formula that can be checked on a logical interpretation (that is, an instance of a symbolic learning dataset), yielding a truth value (e.g., true/false);
- A consequent is another model, for example, a (final) constant model or a branch to be applied.
Within this framework, a decision tree is nothing other than a branch whose consequents are themselves branches or final (leaf) models. Note that antecedents can consist of logical formulas, in which case the symbolic models can be applied to logical interpretations. For more information, refer to SoleLogics.jl, the underlying logical layer.
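To make the "tree as nested branches" reading concrete, here is a minimal sketch in plain Julia. The types and functions (ToyLeaf, ToyBranch, toyapply, toyrules) are hypothetical names invented for illustration and are not the SoleModels.jl API; the thresholds are made up as well.

```julia
# Toy illustration only: hypothetical names, NOT the SoleModels.jl API.
abstract type ToyModel end

# Leaf model: wraps a constant outcome.
struct ToyLeaf <: ToyModel
    outcome::String
end

# Branch: IF antecedent THEN pos ELSE neg END.
struct ToyBranch <: ToyModel
    label::String          # human-readable form of the antecedent
    antecedent::Function   # instance -> Bool
    pos::ToyModel
    neg::ToyModel
end

# Applying a model: leaves return their outcome, branches check the antecedent and recurse.
toyapply(m::ToyLeaf, instance) = m.outcome
toyapply(m::ToyBranch, instance) =
    m.antecedent(instance) ? toyapply(m.pos, instance) : toyapply(m.neg, instance)

# Each root-to-leaf path reads as one rule: IF (conjunction of conditions) THEN outcome.
toyrules(m::ToyLeaf, path = String[]) =
    [(isempty(path) ? "TRUE" : join(path, " AND ")) => m.outcome]
toyrules(m::ToyBranch, path = String[]) = vcat(
    toyrules(m.pos, [path; m.label]),
    toyrules(m.neg, [path; "NOT (" * m.label * ")"]),
)

# A two-level tree: a branch whose consequents are a leaf and another branch.
tree = ToyBranch("petal_length < 2.5", x -> x.petal_length < 2.5,
    ToyLeaf("setosa"),
    ToyBranch("petal_width < 1.8", x -> x.petal_width < 1.8,
        ToyLeaf("versicolor"),
        ToyLeaf("virginica")))

prediction = toyapply(tree, (petal_length = 5.1, petal_width = 2.0))
rules = toyrules(tree)
```

In this sketch, rule extraction is just a walk over root-to-leaf paths: positive branches contribute the condition, negative branches contribute its negation.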
Other noteworthy models include:
- Decision List (or decision table): see Wikipedia;
- Decision Tree: see Wikipedia;
- Decision Forest (or tree ensemble): see Wikipedia;
- Mixed Symbolic Model: a nested structure, mixture of many symbolic models.
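As a rough illustration of the first item, a decision list can be read as an ordered sequence of rules plus a default outcome: the first rule whose antecedent holds decides the result. The sketch below uses plain Julia with hypothetical names (ToyRule, ToyDecisionList, toyapply) and made-up thresholds; it is not the SoleModels.jl API.

```julia
# Toy illustration only: hypothetical names, NOT the SoleModels.jl API.
# A rule pairs an antecedent (instance -> Bool) with a constant consequent.
struct ToyRule
    antecedent::Function
    consequent::String
end

# A decision list: ordered rules plus a default consequent.
struct ToyDecisionList
    rules::Vector{ToyRule}
    default::String
end

# Return the consequent of the first rule whose antecedent holds, else the default.
function toyapply(dl::ToyDecisionList, instance)
    for rule in dl.rules
        rule.antecedent(instance) && return rule.consequent
    end
    return dl.default
end

dl = ToyDecisionList(
    [
        ToyRule(x -> x.petal_length < 2.5, "setosa"),
        ToyRule(x -> x.petal_width < 1.8, "versicolor"),
    ],
    "virginica",
)

predictions = [toyapply(dl, x) for x in [
    (petal_length = 1.4, petal_width = 0.2),
    (petal_length = 4.5, petal_width = 1.5),
    (petal_length = 5.9, petal_width = 2.1),
]]
```

Rule order matters here: the second rule is only reached by instances that failed the first one.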
First, train a decision tree:
# Load packages
begin
    import Pkg  # needed for the Pkg.add calls below
    Pkg.add("MLJ"); using MLJ
    Pkg.add("MLJDecisionTreeInterface"); using MLJDecisionTreeInterface
    Pkg.add("DataFrames"); using DataFrames
    Pkg.add("Random"); using Random
end
# Load dataset
X, y = begin
    X, y = @load_iris;
    X = DataFrame(X)
    X, y
end
# Split dataset
X_train, y_train, X_test, y_test = begin
    train, test = partition(eachindex(y), 0.8, shuffle=true, rng = Random.MersenneTwister(42));
    X_train, y_train = X[train, :], y[train];
    X_test, y_test = X[test, :], y[test];
    X_train, y_train, X_test, y_test
end;
# Train tree
mach = begin
    Tree = MLJ.@load DecisionTreeClassifier pkg=DecisionTree
    model = Tree(max_depth=-1, rng = Random.MersenneTwister(42))
    machine(model, X_train, y_train) |> fit!
end
# Inspect the tree
🌱 = fitted_params(mach).tree
Then, port it to Sole and play with it:
Pkg.add("SoleDecisionTreeInterface"); using SoleDecisionTreeInterface
# Convert to a Sole-compliant model
🌲 = solemodel(🌱);
# Print model
printmodel(🌲);
# Inspect the rules
listrules(🌲)
# Inspect rule metrics
metricstable(🌲)
# Inspect normalized rule metrics
metricstable(🌲, normalize = true)
# Make test instances flow into the model, so that test metrics can then be computed
apply!(🌲, X_test, y_test)
# Pretty table of rules and their metrics
metricstable(🌲; normalize = true, metrics_kwargs = (; additional_metrics = (; height = r->SoleLogics.height(antecedent(r)))))
# Join some rules for the same class into a single, sufficient and necessary condition for that class
metricstable(joinrules(🌲; min_ncovered = 1, normalize = true))
Learning logical models (that is, models with logical formulas as antecedents) often requires performing model checking many times. SoleModels.jl provides a set of structures for representing logical datasets, specifically optimized for multiple model checking operations.
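One general way to make repeated checks cheap is to remember the truth value of each (condition, instance) pair the first time it is computed. The following plain-Julia sketch shows that idea only; ToyMemoDataset and toycheck are hypothetical names, and the actual structures in SoleModels.jl may work quite differently.

```julia
# Toy illustration only: hypothetical names, NOT the SoleModels.jl API.
# A dataset wrapper that remembers the outcome of every check it has performed.
mutable struct ToyMemoDataset{T}
    instances::Vector{T}
    memo::Dict{Tuple{String,Int},Bool}   # (condition name, instance index) -> truth value
    nevaluations::Int                    # how many times a condition was actually evaluated
end

ToyMemoDataset(instances::Vector{T}) where {T} =
    ToyMemoDataset{T}(instances, Dict{Tuple{String,Int},Bool}(), 0)

# Check a named condition on the i-th instance, reusing a cached result when available.
function toycheck(ds::ToyMemoDataset, name::String, condition, i::Int)
    get!(ds.memo, (name, i)) do
        ds.nevaluations += 1
        condition(ds.instances[i])
    end
end

ds = ToyMemoDataset([(petal_length = 1.4,), (petal_length = 4.5,), (petal_length = 5.9,)])
short_petal = x -> x.petal_length < 2.5

# Two passes ask the same questions; only the first pass evaluates the condition.
first_pass  = [toycheck(ds, "petal_length < 2.5", short_petal, i) for i in 1:3]
second_pass = [toycheck(ds, "petal_length < 2.5", short_petal, i) for i in 1:3]
```

The trade-off in this sketch is memory for time: the cache grows with the number of distinct (condition, instance) pairs that are checked.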
The package is developed by the ACLAI Lab @ University of Ferrara.
SoleModels.jl mainly builds upon SoleLogics.jl and SoleData.jl, and it is the core module of Sole.jl, an open-source framework for symbolic machine learning.