Deep and convolutional neural networks with feedback operations in Flux.
This package implements deep neural networks with feedback. This means that the output of higher/later layers can serve as an input to lower/earlier layers at the next timestep.
Most deep learning frameworks do not support this form of recurrence in a straightforward manner. Usually recurrence is limited to a single layer, implemented as an RNN cell. This package essentially turns the whole network into a single RNN cell with support for arbitrary connectivity.
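To make the cell analogy concrete: a Flux RNN cell maps a state and an input to an updated state and an output, and a FeedbackChain (introduced below) exposes the same interface for an entire network. A minimal sketch of that interface, shown here with a plain Flux RNNCell for illustration (constructor signatures vary across Flux versions):

```julia
using Flux

# the generic recurrent-cell interface: (state, input) -> (new_state, output)
cell = Flux.RNNCell(10, 5)
h = zeros(Float32, 5)               # initial hidden state
h, y = cell(h, randn(Float32, 10))  # one timestep
```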
The package can be installed using Pkg.add():

```julia
using Pkg
Pkg.add("FeedbackNets")
```

or using the REPL shorthand:

```julia
] add FeedbackNets
```

The package depends on Flux and on CuArrays for GPU support. For more
details on Julia package management, see the Julia Pkg documentation.
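For GPU execution, models and inputs are moved over in the usual Flux way. A minimal sketch, assuming a working CUDA setup (shown with a plain Dense layer; the same pattern applies to the feedback networks defined below):

```julia
using Flux, CuArrays

m = Dense(10, 5, relu) |> gpu   # move layer parameters to the GPU
x = gpu(randn(Float32, 10))     # move input data to the GPU
y = m(x)                        # runs on the GPU
```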
Once the package is installed, you can load it as usual:

```julia
using FeedbackNets
```

Typically, you'll want to load Flux as well for its network layers:

```julia
using Flux
```

The core of the package is the FeedbackChain, a type that behaves much
like a normal Flux.Chain. It treats normal Flux layers as one would
expect. However, it can contain two additional elements: Splitters and
Mergers. These two types are used to structure feedback in a network, i.e., to
enable higher levels of the chain to provide input to lower levels in the next
timestep.
A Splitter marks a point in the forward stream from which feedback is provided.
As the FeedbackChain traverses the feedforward stream, it records the
intermediate output at each Splitter and adds it to a state dictionary.
A Merger marks a location at which feedback is folded back into the
feedforward stream. Each Merger contains the name of the Splitter from which
it gets feedback, an operation (e.g., a ConvTranspose or a Chain) to apply
to the feedback, and a binary operation (e.g., +) that it uses to combine
forward and feedback input.
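For instance, a convolutional feedback network might record the output of a conv block and feed it back through a ConvTranspose that restores the original resolution. A sketch with hypothetical layer sizes (not taken from the package's examples):

```julia
using Flux, FeedbackNets

# feedback recorded at "conv2" (16 channels, half resolution) is upsampled
# back to 8 channels at full resolution and added to the forward input
convnet = FeedbackChain(
    Merger("conv2", ConvTranspose((4, 4), 16 => 8, relu; stride=2, pad=1), +),
    Conv((3, 3), 8 => 16, relu; pad=1),
    MaxPool((2, 2)),
    Splitter("conv2")
)
```

For a 16×16×8×1 input, the state stored under "conv2" would be 8×8×16×1, and the ConvTranspose maps it back to 16×16×8×1 so that + is well-defined; a matching initial state would be Dict("conv2" => zeros(Float32, 8, 8, 16, 1)).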
For example, a simple FeedbackChain may contain a Dense layer that maps ten
input units to five outputs and a feedback path that has another Dense layer
with the inverse connectivity.
```julia
net = FeedbackChain(
    Merger("fork1", Dense(5, 10, relu), +),
    Dense(10, 5, relu),
    Splitter("fork1")
)
```

At each timestep, this network will take the previous state of fork1, pass it
through the 5-to-10 unit Dense layer and add it to the 10-unit input. The
result is then passed through the 10-to-5 Dense layer to produce the output of
the network, which is stored for the next timestep by fork1.
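In other words, a single timestep of this chain is roughly equivalent to the following plain Flux computation (a conceptual sketch; fb and ff are hypothetical names for the two Dense layers, not part of the package API):

```julia
using Flux

fb = Dense(5, 10, relu)   # the feedback operation inside the Merger
ff = Dense(10, 5, relu)   # the feedforward layer

# one timestep: merge the stored feedback into the input, feed forward,
# and record the result as the new state of "fork1"
function step(h_fork1, x)
    out = ff(x .+ fb(h_fork1))
    return out, out        # (new "fork1" state, network output)
end
```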
In order to apply net to an input, we need to pass it a dictionary with the
current or initial state of fork1:
```julia
x = randn(10)
h = Dict("fork1" => zeros(5))
h, out = net(h, x)
```

A FeedbackChain can be wrapped in a Flux.Recur in order to have it handle the
state internally. This requires that an initial state dictionary is provided.
```julia
net = Flux.Recur(net, h)
out = net(x)
```

Each subsequent call to net then advances the network by one timestep, updating the stored state internally.

The project is MIT licensed. See LICENSE for details.