CombinatorialBandits.jl

Implementation of several algorithms for combinatorial bandits, a kind of reinforcement learning.
Author dourouc05
Popularity
4 Stars
Updated Last
6 Months Ago
Started In
January 2020

CombinatorialBandits

Project Status: Active – The project has reached a stable, usable state and is being actively developed. The MIT License Build Status Coverage Status codecov.io

This package implements several algorithms to deal with combinatorial multi-armed bandit (CMAB), including the first polynomial-time optimum-regret algorithm AESCB (described in our paper).

See also Bandits.jl, focusing on multi-armed bandits (i.e. not combinatorial).

To install:

Pkg.add("CombinatorialBandits")

Example usage:

using CombinatorialBandits, Distributions

n = 20
m = 8
ε = 0.1
distr = Distribution[Bernoulli(.5 + ((i % 3 == 0) ? ε : -ε)) for i in 1:n]

i = MSet(distr, 8, MSetAlgosSolver())
@time simulate(i, ThompsonSampling(), 200)
@time simulate(i, LLR(), 200)
@time simulate(i, CUCB(), 200)
@time simulate(i, ESCB2(ESCB2Budgeted(.1, true)), 200)

Citing

If you use this package in your research, please cite this preprint:

@misc{cuvelier2020,
    title={Statistically Efficient, Polynomial-Time Algorithms for Combinatorial Semi Bandits},
    author={Thibaut Cuvelier and Richard Combes and Eric Gourdin},
    year={2020},
    eprint={2002.07258},
    archivePrefix={arXiv},
    primaryClass={stat.ML}
}

Used By Packages

No packages found.