This package implements the simplest version of Differential Evolution, but adds the ability to specify custom 'predictors', which make it possible to incorporate knowledge about the problem into the optimization.
For example, standard machine learning approaches could be trained on a set of examples of the optimization problem, and their predictions can be used to guide the optimization.
The key idea of Differential Evolution, leaving aside the additions of mutation and crossover, is to evolve a population of candidates toward the optimum by:
- selecting 3 random candidates `a`, `b`, `c`
- comparing the costs of `a` and `a' = a + lambda * (c - b)`
- replacing `a` with `a'` if `a'` has lower cost
Predictors make it possible to compare `a` not only to `a'`, but also to one or more hypotheses `a'' = predict(a)`. Predictions are computed for the entire population at once, so the signature of a predictor is actually `predictor(currentgeneration, currentcosts)` (a sketch of the basic step follows below).
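As a concrete illustration of this step, here is a minimal sketch; `destep!` is a hypothetical function written for exposition and is not this package's internal code. It assumes the population is a matrix with one candidate per column:

```julia
# Sketch of one DE generation. For brevity, a, b, c are not forced to be
# distinct, and bounds/rounding are ignored.
function destep!(pop, costs, costf, lambda)
    npop = size(pop, 2)
    for _ in 1:npop
        a, b, c = rand(1:npop, 3)            # pick three random candidates
        candidate = pop[:, a] + lambda * (pop[:, c] - pop[:, b])
        cost = costf(candidate)
        if cost < costs[a]                   # replace a with a' if a' is better
            pop[:, a] = candidate
            costs[a] = cost
        end
    end
end
```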
A minimal example:

```julia
optimum = [-2 4]'
costf(x) = (r = sqrt(sum((x-optimum).^2)); sin(r)^2+r/2)

best, info = de(costf, [-10,-10], [10,10])
@show optimum best optimum-best
```

`de` has the following signature:
```julia
de(costfunction, minima, maxima; kargs...)
```

The `costfunction` is expected to take a single parameter of type `Array{Number,2}` and return a scalar value.
`minima` and `maxima` are arrays of type `Array{Number,2}` specifying the possible range of values passed to `costfunction`.
The optimization process can be controlled through the following keyword arguments of `de`, listed with their default values (a usage sketch follows the list):
- `npop = 100`, number of candidates in the population
- `maxiter = 1e6`, maximum number of iterations
- `lambda = 0.85`, value of `lambda` used in the default DE predictor (see above)
- `initpop = mi .+ rand(length(mi), npop) .* (ma - mi)`, initial population
- `recordhistory = false`, record details about each candidate's evolution
- `continueabove = -Inf`, stop when the cost of the best candidate reaches this value
- `replaceworst = 0.0`, percentage specified as a value in the range `0.0 ... 1.0`. In each iteration this percentage of candidates is replaced with copies of the best candidates
- `roundto = 1e-6`, size of the grid in the parameter space where candidates live. Set to `0` for full numerical precision. Can also be a vector the size of `minima`/`maxima`, in which case the grid resolution can be different for each parameter. E.g. to enforce integers, set to `1`
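For illustration, a hypothetical call combining several of these keyword arguments (the argument names are the ones listed above; the values are arbitrary):

```julia
# Hypothetical call: values are illustrative only.
best, info = de(costf, [-10,-10], [10,10];
                npop = 50,           # smaller population
                maxiter = 100000,    # fewer iterations
                replaceworst = 0.1,  # copy best over worst 10% each iteration
                roundto = 1)         # restrict all parameters to integers
```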
In this example we assume that a model was trained on examples of the problem in a preceding training phase, and that we use the resulting model in our predictor:
```julia
# example predictor which learned to predict offsets
predictor(pop, costs) = pop + predict(somemodel, pop)

best, info = de(costf, [-10,-10], [10,10], {predictor, :default})
```
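For a self-contained illustration that does not require a trained model, here is a hypothetical predictor, `towardsbest`, which simply moves every candidate halfway toward the currently best candidate; it relies only on the documented `(population, costs)` signature:

```julia
# Hypothetical predictor: move every candidate halfway toward the
# current best candidate (one candidate per column).
function towardsbest(pop, costs)
    best = pop[:, indmin(costs)]   # column of the lowest-cost candidate
    pop .+ 0.5 .* (best .- pop)    # hypotheses for the whole population
end

best, info = de(costf, [-10,-10], [10,10], {towardsbest, :default})
```

Since hypotheses are only accepted when they lower a candidate's cost, a crude predictor like this can at worst waste cost evaluations, not degrade the result.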
There is a helper function available to compare different strategies:

```julia
using InformedDEHelpers, PyPlot

analyze(f, mi, ma, [
	("default", {:default}, {}),
	("pred first", {predictor, :default}, {}),
	("pred last", {:default, predictor}, {}),
	("pred only", {predictor}, {}),
	("tryall", {predictor, :default}, {(:tryallpredictors,true)})
])
```

