CommonRLInterface
This package is designed for two purposes:
- to provide compatibility between different reinforcement learning (RL) environment interfaces - for example, an algorithm that uses YourRLInterface should be able to use an environment from MyRLInterface without depending on MyRLInterface, as long as they both support CommonRLInterface.
- to provide a very basic interface for users to write their own RL environments and algorithms.
To accomplish this, the package provides a single abstract environment type, AbstractEnv, and a small required interface; a larger optional interface will be added soon.
Required Interface
The interface has only three required functions:
step!(env, a) # returns an observation, reward, done, and info
reset!(env) # returns an observation
actions(env) # returns the set of all possible actions for the environment
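For example, a single interaction using only these functions might look like the sketch below, where env is assumed to be any environment implementing the required interface:

o = reset!(env)                                      # start a new episode and get the first observation
o, r, done, info = step!(env, first(actions(env)))   # take one step with an arbitrary action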
Optional Interface
In the near future, a number of optional interface functions will be added. Please file an issue if you would like to see a particular interface function.
Additional info
What does it mean for an RL Framework to "support" CommonRLInterface?
Suppose you have an abstract environment type in your package called YourEnv
. Support for AbstractEnv
means:
- You provide the convert methods
  convert(Type{YourEnv}, ::AbstractEnv)
  convert(Type{AbstractEnv}, ::YourEnv)
  If there are additional options in the conversion, you are encouraged to create and document constructors with additional arguments.
- You provide an implementation of the interface functions from your framework using only functions from CommonRLInterface.
- You implement at minimum
  CommonRLInterface.reset!(::YourCommonEnv)
  CommonRLInterface.step!(::YourCommonEnv, a)
  CommonRLInterface.actions(::YourCommonEnv)
  and as many optional functions as you'd like to support, where YourCommonEnv is the concrete type returned by convert(Type{AbstractEnv}, ::YourEnv). A minimal sketch of this glue code follows the list.
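As a rough sketch, support for a hypothetical framework might look like the code below. Here YourEnv, your_reset!, your_step!, and your_actions stand in for your framework's own types and functions, and the convert method to YourEnv is simplified to handle only the wrapper type:

using CommonRLInterface

# hypothetical wrapper that adapts YourEnv to the common interface
struct YourCommonEnv <: AbstractEnv
    env::YourEnv
end

Base.convert(::Type{AbstractEnv}, env::YourEnv) = YourCommonEnv(env)
Base.convert(::Type{YourEnv}, env::YourCommonEnv) = env.env

# forward the required functions to your framework's existing functions;
# your_step! is assumed to return (observation, reward, done, info)
CommonRLInterface.reset!(w::YourCommonEnv) = your_reset!(w.env)
CommonRLInterface.step!(w::YourCommonEnv, a) = your_step!(w.env, a)
CommonRLInterface.actions(w::YourCommonEnv) = your_actions(w.env)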
What does an environment implementation look like?
A 1-D LQR problem with discrete actions might look like this:
using CommonRLInterface

mutable struct LQREnv <: AbstractEnv
    s::Float64
end

function CommonRLInterface.reset!(m::LQREnv)
    m.s = 0.0 # the assignment also returns the initial observation
end

function CommonRLInterface.step!(m::LQREnv, a)
    r = -m.s^2 - a^2                # quadratic state and action cost
    sp = m.s = m.s + a + randn()    # linear dynamics with Gaussian noise
    return sp, r, false, NamedTuple()
end
CommonRLInterface.actions(m::LQREnv) = (-1.0, 0.0, 1.0)
# from version 0.2 on, you can implement optional functions like this:
# @provide CommonRLInterface.clone(m::LQREnv) = LQREnv(m.s)
What does a simulation with a random policy look like?
env = YourEnv()
done = false
o = reset!(env)
acts = actions(env)
rsum = 0.0
while !done
    o, r, done, info = step!(env, rand(acts))
    rsum += r
end
@show rsum
What does it mean for an algorithm to "support" CommonRLInterface?
You should have a method of your solver or algorithm that accepts an AbstractEnv, perhaps handling it by converting it to your framework first, e.g.
solve(env::AbstractEnv) = solve(convert(YourEnv, env))
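An algorithm can also work with the common interface directly instead of converting. The following is a minimal sketch (the function name, step cap, and random policy are illustrative assumptions, not part of the interface):

using CommonRLInterface

# evaluate a uniformly random policy on any environment that supports the required interface
function evaluate_random_policy(env::AbstractEnv; max_steps=1000)
    reset!(env)
    total = 0.0
    for _ in 1:max_steps
        _, r, done, _ = step!(env, rand(actions(env)))
        total += r
        done && break
    end
    return total
end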