DecisionMakingUtils.jl

This package provides utility structs and functions to support solving RL problems
Author DecisionMakingAI
Popularity
1 Star
Updated Last
9 Months Ago
Started In
February 2021

DecisionMakingUtils

Dev Build Status Coverage

This package contains utility functions used through other DecisionMakingAI repositories. Currently, there is functionality for creating a Fourier basis, Tile coding, normalizing features, and linear function modeling. Tabular models are also available as a special case of tile coding.

The following is an example of creating a tile coding-based q function.

using DecisionMakingUtils
using Flux: Chain

X = [1.0 2.0; -3.0 4.0]  # Assume X represents the ranges of the state features where the first (second) column represents the minimum (maximum).  
state_dims = size(X, 1)
num_tiles = 5
num_tilings = 4
num_actions = 4

nrm = ZeroOneNormalization(X)  # TileCoding assumes the features are normalized to [0,1]. Wrapping tiles will make features >=1 wrap around to start from 0
nbuff = zeros(state_dims)  # Buffer to prevent allocations of the feature normalization
nf = BufferedFunction(nrm, nbuff)  # wrapper function to hold the buffer
tc = TileCodingBasis(state_dims, num_tiles, num_tilings=num_tilings, tiling_type=:wrap)
ϕ = Chain(nf, tc)  # chain the normalization and tile coding into one function

num_outputs = 1 # if you want to predict successor features, this should be length(tc)
m = TileCodingModel(ϕ, num_tiles=size(tc)[1], num_tilings=num_tilings, num_outputs=num_outputs,num_actions=num_actions)
buff = LinearBuffer(m)
bf = BufferedFunction(m, buff)
s = rand(state_dims)
qs = bf(s) # value of each q-function
qsa = bf(s, 1) # value of first action in state s
qs, grad = value_withgrad(bf, s) # same as qs above, plus the gradient w.r.t. each action this is just phi(s) for each a. grad has shape of params(m)
qsa, grad = value_withgrad(bf, s, 1) # q value and derivative w.r.t. that action in state s

Here is an example using the FourierBasis.

dorder = 2  # order of the basis for all coupled terms. The number of features grow exponentially with this parameter 
iorder = 4  # order of the basis for each individual state feature. The number of features grows linearly with this parameter 
full = true  # if true it computes both sine and cosine of the features, otherwise only cosine will be computed
fb = FourierBasis(state_dims, dorder, iorder, full)  # assumes features are normalized to [0,1]
fbuff = FourierBasisBuffer(fb)  # creates buffer to avoid allocations
num_features = length(fb)  # gets the total number of features output by the basis function
basisf = BufferedFunction(fb, fbuff)
ϕ = Chain(nf, basisf)

m = LinearModel(ϕ, num_features, num_actions=num_actions)
buff = LinearBuffer(m)
bf = BufferedFunction(m, buff)

bf([1.1, 0.0], 1)  # q-value for first action at the given state features
v, g = value_withgrad(bf, [1.1, 0.0], 1)  # q-value and partial derivative with respect to the model weights

Used By Packages

No packages found.