This Julia package implements the QMDP approximate solver for POMDP/MDP planning. The QMDP solver is documented in:
- Michael Littman, Anthony Cassandra, and Leslie Kaelbling. "Learning policies for partially observable environments: Scaling up." In Proceedings of the Twelfth International Conference on Machine Learning, pages 362--370, San Francisco, CA, 1995.
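QMDP works by computing the optimal Q-values of the fully observable MDP underlying the POMDP and then acting greedily with respect to the belief-weighted Q-values, Q(b, a) = Σ_s b(s) Q_MDP(s, a). Because this approximation assumes the state becomes fully observable after the next step, QMDP policies do not take actions purely to gather information. Below is a minimal sketch of the action-selection rule for illustration only; qmdp_action and Q_MDP are hypothetical names, not part of the package:

```julia
# Sketch of the QMDP action rule (not the package implementation):
# given a matrix of MDP Q-values (states × actions) and a belief vector
# over states, pick the action maximizing the belief-weighted Q-value.
function qmdp_action(Q_MDP::AbstractMatrix, b::AbstractVector)
    qb = Q_MDP' * b      # Q(b, a) = Σ_s b(s) Q_MDP(s, a), one entry per action
    return argmax(qb)    # index of the best action
end
```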
Install the package with the Julia package manager:

```julia
import Pkg
Pkg.add("QMDP")
```
Once installed, initialize your POMDP and run the solver:

```julia
using QMDP

pomdp = MyPOMDP() # initialize POMDP

# initialize the solver
# keyword args are the maximum number of iterations the solver will run for and the Bellman residual tolerance
solver = QMDPSolver(max_iterations=20,
                    belres=1e-3,
                    verbose=true)

# run the solver
policy = solve(solver, pomdp)
```
To compute the optimal action for a given belief, define the belief with the POMDPs.jl distribution interface, or use the DiscreteBelief provided by POMDPTools:
```julia
using POMDPTools
b = uniform_belief(pomdp) # initialize to a uniform belief
a = action(policy, b)
```
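In a closed-loop setting, the belief should be updated after each action and observation before querying the policy again. The sketch below assumes the DiscreteUpdater and the standard POMDPs.jl initialize_belief/update interface; the observation o is a placeholder for whatever the environment returns:

```julia
using POMDPs, POMDPTools

up = DiscreteUpdater(pomdp)                    # exact Bayesian filter for discrete POMDPs
b = initialize_belief(up, initialstate(pomdp)) # belief from the initial state distribution
a = action(policy, b)
# after executing a and receiving an observation o from the environment:
# b = update(up, b, a, o)
```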
To use the efficient SparseValueIterationSolver from DiscreteValueIteration.jl, pass it directly to the QMDPSolver constructor:
```julia
using QMDP, DiscreteValueIteration

pomdp = MyPOMDP()
solver = QMDPSolver(SparseValueIterationSolver(max_iterations=20, verbose=true))
policy = solve(solver, pomdp)
```
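The SparseValueIterationSolver represents the transition model with sparse matrices, which typically makes value iteration much faster on large, sparsely connected state spaces.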