This package implements several algorithms to deal with combinatorial multi-armed bandit (CMAB), including the first polynomial-time optimum-regret algorithms: AESCB (described in our paper) and AOSSB (article in press).
See also Bandits.jl, focusing on multi-armed bandits (i.e. not combinatorial).
To install:
]add CombinatorialBandits
Example usage:
using CombinatorialBandits, Distributions
n = 20
m = 8
ε = 0.1
distr = Distribution[Bernoulli(.5 + ((i % 3 == 0) ? ε : -ε)) for i in 1:n]
i = MSet(distr, 8, MSetAlgosSolver())
@time simulate(i, ThompsonSampling(), 200)
@time simulate(i, LLR(), 200)
@time simulate(i, CUCB(), 200)
@time simulate(i, ESCB2(ESCB2Budgeted(.1, true)), 200)
If you use this package in your research, please cite either article:
@article{cuvelier2021aescb,
author = {Cuvelier, Thibaut and Combes, Richard and Gourdin, Eric},
title = {Statistically Efficient, Polynomial-Time Algorithms for Combinatorial Semi-Bandits},
year = {2021},
issue_date = {March 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {5},
number = {1},
url = {https://doi.org/10.1145/3447387},
doi = {10.1145/3447387},
journal = {Proc. ACM Meas. Anal. Comput. Syst.},
month = feb,
articleno = {09},
numpages = {31},
keywords = {combinatorial bandits, combinatorial optimization, bandits}
}
@article{cuvelier2021glpg,
title={Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time},
author={Cuvelier, Thibaut and Combes, Richard and Gourdin, Eric},
journal={arXiv preprint arXiv:2102.07254},
year={2021}
}