A space is simply a set of objects. In a reinforcement learning context, spaces define the sets of possible states, actions, and observations.
In Julia, spaces can be represented by a variety of objects. For instance, a small discrete action set might be represented with ["up", "left", "down", "right"]
, or an interval of real numbers might be represented with an object from the IntervalSets
package. In general, the space defined by any Julia object is the set of objects x
for which x in space
returns true
.
In addition to establishing the definition above, this package provides three useful tools:
- Traits to communicate about the properties of spaces, e.g. whether they are continuous or discrete, how many subspaces they have, and how to interact with them.
- Functions such as
product
for constructing more complex spaces - Constructors to for spaces whose elements are arrays, such as
ArraySpace
andBox
.
Since a space is simply a set of objects, a wide variety of common Julia types including Vector
, Set
, Tuple
, and Dict
1can represent a space.
Because of this inclusive definition, there is a very minimal interface that all spaces are expected to implement. Specifically, it consists of
in(x, space)
, which tests whetherx
is a member of the setspace
(this can also be called with thex in space
syntax).rand(space)
, which returns a valid member of the set2.eltype(space)
, which returns the type of the elements in the space.
In addition, the SpaceStyle
trait is always defined. Calling SpaceStyle(space)
will return either a FiniteSpaceStyle
, ContinuousSpaceStyle
, HybridSpaceStyle
, or an UnknownSpaceStyle
object.
Spaces with a finite number of elements have FiniteSpaceStyle
. These spaces are guaranteed to be iterable, implementing Julia's iteration interface. In particular collect(space)
will return all elements in an array.
Continuous spaces represent sets that have an uncountable number of elements they have a SpaceStyle
of type ContinuousSpaceStyle
. CommonRLSpaces does not adopt a rigorous mathematical definition of a continuous set, but, roughly, elements in the interior of a continuous space have other elements very close to them.
Continuous spaces have some additional interface functions:
bounds(space)
returns upper and lower bounds in a tuple. For example, ifspace
is a unit circle,bounds(space)
will return([-1.0, -1.0], [1.0, 1.0])
. This allows agents to choose policies that appropriately cover the space e.g. a normal distribution with a mean ofmean(bounds(space))
and a standard deviation of half the distance between the bounds.clamp(x, space)
returns an element ofspace
that is nearx
. i.e. ifspace
is a unit circle,clamp([2.0, 0.0], space)
might return[1.0, 0.0]
. This allows for a convenient way for an agent to find a valid action if they sample actions from a distribution that doesn't match the space exactly (e.g. a normal distribution).clamp!(x, space)
, similar toclamp
, but clampsx
in place.
The interface for hybrid continuous-discrete spaces is currently planned, but not yet defined. If the space style is not FiniteSpaceStyle
or ContinuousSpaceStyle
, it is UnknownSpaceStyle
.
[need to figure this out, but I think elsize(space)
should return the size of the arrays in the space]
The Cartesian product of two spaces a
and b
can be constructed with c = product(a, b)
.
The exact form of the resulting space is unspecified and should be considered an implementation detail. The only guarantees are (1) that there will be one unique element of c
for every combination of one object from a
and one object from b
and (2) that the resulting space conforms to the interface above.
The TupleSpaceProduct
constructor provides a specialized Cartesian product where each element is a tuple, i.e. TupleSpaceProduct(a, b)
has elements of type Tuple{eltype(a), eltype(b)}
.
1Note: the elements of a space represented by a Dict
are key-value Pair
s.
2[TODO: should we make any guarantees about whether rand(space)
is drawn from a uniform distribution?]
Category | Style | Example |
---|---|---|
Enumerable discrete space | FiniteSpaceStyle{()}() |
(:cat, :dog) , 0:1 , ["a","b","c"] |
One dimensional continuous space | ContinuousSpaceStyle{()}() |
-1.2..3.3 , Interval(1.0, 2.0) |
Multi-dimensional discrete space | FiniteSpaceStyle{(3,4)}() |
ArraySpace((:cat, :dog), 3, 4) , ArraySpace(0:1, 3, 4) , ArraySpace(1:2, 3, 4) , ArraySpace(Bool, 3, 4) |
Multi-dimensional variable discrete space | FiniteSpaceStyle{(2,)}() |
product((:cat, :dog), (:litchi, :longan, :mango)) , product(-1:1, (false, true)) |
Multi-dimensional continuous space | ContinuousSpaceStyle{(2,)}() or ContinuousSpaceStyle{(3,4)}() |
Box([-1.0, -2.0], [2.0, 4.0]) , product(-1.2..3.3, -4.6..5.0) , ArraySpace(-1.2..3.3, 3, 4) , ArraySpace(Float32, 3, 4) |
Multi-dimensional hybrid space [planned for future] | HybridSpaceStyle{(2,),()}() |
product(-1.2..3.3, -4.6..5.0, [:cat, :dog]) , product(Box([-1.0, -2.0], [2.0, 4.0]), [1,2,3]) |
julia> using CommonRLSpaces
julia> s = (:litchi, :longan, :mango)
julia> rand(s)
:litchi
julia> rand(s) in s
true
julia> length(s)
3
julia> s = ArraySpace(1:5, 2,3)
CommonRLSpaces.RepeatedSpace{UnitRange{Int64}, Tuple{Int64, Int64}}(1:5, (2, 3))
julia> rand(s)
2×3 Matrix{Int64}:
4 1 1
3 2 2
julia> rand(s) in s
true
julia> SpaceStyle(s)
FiniteSpaceStyle()
julia> elsize(s)
(2, 3)
julia> s = product(-1..1, 0..1)
Box{StaticArraysCore.SVector{2, Float64}}([-1.0, 0.0], [1.0, 1.0])
julia> rand(s)
2-element StaticArraysCore.SVector{2, Float64} with indices SOneTo(2):
0.03049072910834738
0.6295234114874269
julia> rand(s) in s
true
julia> SpaceStyle(s)
ContinuousSpaceStyle()
julia> elsize(s)
(2,)
julia> bounds(s)
([-1.0, 0.0], [1.0, 1.0])
julia> clamp([5, 5], s)
2-element StaticArraysCore.SizedVector{2, Float64, Vector{Float64}} with indices SOneTo(2):
1.0
1.0