Fancy memoizing for expensive functions in Julia.
Author ExpandingMan
13 Stars
Updated Last
4 Months Ago
Started In
March 2017


Anamnesis [an-am-nee-sis]:

  1. the recollection or rememberance of the past
  2. recollection of the ideas which the soul had known in a previous existence, especially by means of reasoning

A package for doing unobtrusive memoization of computationally expensive functions in Julia.

This package is for Julia 0.7 and later.

The @anamnesis Macro

On Function Declarations

The "standard" implementation of memoization is to simply declare a function that will automatically memoize all of its results. This can be done with

@anamnesis function SU2::Real, n::AbstractVector{T}) where T<:Real
    println("Fire walk with me")
    σ⃗ = [Complex{T}[0 1; 1 0], Complex{T}[0 -im; im 0], Complex{T}[1 0; 0 -1]]

SU2(0, [1, 0, 0])  # calls SU2, prints "Fire walk with me"
SU2(0, [1, 0, 0])  # the previous result is retrieved, does not print

This also works for single-line function declearations such as @anamnesis f(x) = x^2 + 1.

Note that Anamnesis.jl also supports keyword arguments in all use cases.

On Blocks

One of the design goals of Anamnesis.jl is to allow memoization as unobtrusively as possible, preferably without the need to change any other code. To this end, we provide the @anamnesis macro also for blocks. For example

@anamnesis begin
    t = f(x) + g(x)
    z = exp(-λ*t)
    f(z) + f(x) - g(x)
end f

In this example, all calls to f are memoized, but calls to other functions are not. This was achieved by naming f after the block. This can be done with arbitrarily many functions so, for example, if we placed a g after the above example it would have memoized g as well. It is not necessary to declare f with @anamnesis to do this, in fact, these are mutually exclusive use cases. Note that this can be done with any expression, not just a begin end block. For example, we could do

@anamnesis function h(x)
    o = f(x) + g(x) - f(x^2)
    f(o^2) + f(x^2)
end f

What distinguishes this example from the memoization of h itself is the extra f added to the end.

The @mem Macro

One of the simplest uses of Anamnesis.jl is the @mem macro. This macro will memoize only the function call for which it was called, for example

@mem f(1)  # the result is memoized
f(2)  # this is a normal function call
@mem f(1)  # f is not evaluated here, instead the previous value is retrieved
@mem f(2)  # now the value for f(2) is stored

The Scribe Objects

Anamnesis.jl uses objects called Scribes to perform memoization. The objects are just the function itself combined with a cache of seen values. If you prefer to use Anamnesis.jl more explicitly, you can simply use the scribe objects directly. For example

f(x) = abs2(gamma(x))
sf = Scribe(f)

sf(0)  # this result is memoized
f(0)  # function calls work normally; results are not memoized
sf(0)  # this of course retrieves the previously memoized result

Accessing Scribe Objects Generated by Macros

The macros described above generally create scribe objects and place them in the current global scope. These can be accessed with the @scribeof macro. For example

@mem f(0)
@scribeof(f)  # this is the scribe object generated by mem
@scribeof(f)(0)  # you can call these normally

Memoized functions created with @anamnesis on function declarations need to be accessed differently, in this case whatever name you gave the function is the name of a new function that wraps the Scribe.

@anamnesis h(x) = exp(x)
@scribeof(h)  # ERROR

@rawfunc(h)  # this is the non-memoized function
@rawfunc(h)(0)  # you can call it normally, it will not memoize

@scribeofrawfunc(h)  # this is the scribe object you are calling with h itself

typeof(h) <: Function  # the reason Anamnesis is designed this way, instead of just passing you a Scribe is so that you are still creating a bonified function

The Value Cache and Specifying Return Types

By default Scribes use an IdDict{Any,Any} to cache its values, however any AbstractDict object can be used in its place. One can even supply an AbstractDict with pre-existing values. For example

sf = Scribe(f, Dict{Any,Any}((1,)=>0))

sf(1) # return 0 regardless of return value of f

Explicitly specifying the value cache can also be useful for improving performance. If you know the argument or (more importantly) return types of your function, you can tell the compiler about it by supplyinig an appropriately typed AbstractDict. For example

sf = Scribe(f, Dict{Tuple{Real,Real},Float64}())

sf(1, 2)  # the return values are now guaranteed type stable.  if f does not return a Float64 an error will be thrown

sg = Scribe(g, Dict{Any,ComplexF64}())  # a more common use case is to specify only the return type

Note that return types can also be specified by overloading

Anamnesis.returntype(::typeof(f), argtypes) = AbstractFloat
Anamnesis.returntype(::typeof(f), ::Type{Tuple{String}}) = AbstractString
# here you are promising the compiler that f returns an AbstractFloat unless you pass it a String in which case it returns AbstractString

Note that even if you do not specify a return type in any way Anamnesis.jl uses inference to attempt to determine one so that its output should be type stable in most cases (although it contains type unstable code internally).

Performance and When to Use

Julia is a scientific programming language which was designed from the ground up with computational efficiency as one of its goals. As such, even ad hoc and "naive" functions written in Julia tend to be extremely efficient. Furthermore, any memoization implementation is liable to have some performance pain points: in general they contain type unstable code (even if they have type stable output) and they must include a value cache which is accessible from the same scope as the function itself so that global objects are potentially involved. Because of this, memoization comes with a significant overhead, even with Julia's highly efficient AbstractDict implementations.

I find that a good rule of thumb is that you should never memoize functions which take less than a few microseconds to evaluate. Typically I'd expect you to use memoization with functions which typicall take on the order of milliseconds or more to evaluate.

Long Term Plans

I started Anamnesis.jl as an experiment mainly with the goal in mind of doing inter-process memoization, e.g. so that you could exit a program completely and then retrieve memoized values from serialized data on disk. I find that I very frequently have need for this sort of thing when doing experimentation in scientific programming and data science as often I have functions that take a very long time (on the order of seconds) to evaluate, and I don't want to have to do that every time I change some unrelated piece of code. I initially had some inellegant code which worked, but was not a good long term solution. I've now decided to make sure I have a really good implementation of basic memoization and build up from there.

Hopefully I'll have a chance to work on some of these more "exotic" features soon, and I already have concrete plans about how to achieve this.