AxisArrays.jl
This package for the Julia language provides an array type (the AxisArray
) that knows about its dimension names and axis values.
This allows for indexing by name without incurring any runtime overhead.
This permits one to implement algorithms that are oblivious to the storage order of the underlying arrays.
AxisArrays can also be indexed by the values along their axes, allowing column names or interval selections.
In contrast to similar approaches in Images.jl and NamedArrays.jl, this allows for typestable selection of dimensions and compiletime axis lookup. It is also better suited for regularly sampled axes, like samples over time.
Collaboration is welcome! This is still a workinprogress. See the roadmap for the project's current direction.
Axis{}
and keywords
Note about An AxisArray
stores an object of type Axis{:name}
for each dimension,
containing both the name (a Symbol
) and the "axis values" (an AbstractVector
).
These types are what made compiletime lookup possible.
Instead of providing them explicitly, it is now possible to use keyword arguments
for both construction and indexing:
V = AxisArray(rand(10); row='a':'j') # AxisArray(rand(10), Axis{:row}('a':'j'))
V[row='c'] == V[Axis{:row}('c')] == V[row=3] == V[3]
axes()
and indices()
Note about The function AxisArrays.axes
returns the tuple of such Axis
objects.
Since Julia version 0.7, Base.axes(V) == (1:10,)
gives instead the range of possible
ordinary integer indices. (This was called Base.indices
.) Since both names are exported,
this collision results in a warning if you try to use axes
without qualification:
julia> axes([1,2])
WARNING: both AxisArrays and Base export "axes"; uses of it in module Main must be qualified
ERROR: UndefVarError: axes not defined
Packages that are upgrading to support Julia 0.7+ should:
 Replace all uses of the
axes
function with the fullyqualifiedAxisArrays.axes
 Replace all uses of the deprecated
indices
function with the unqualifiedaxes
 Immediately after
using AxisArrays
, addconst axes = Base.axes
In the future, AxisArrays will be looking for a new name for its functionality.
This will allow you to use the idiomatic Base
name and offers an easy upgrade
path to whatever the new name will be.
Example of currentlyimplemented behavior:
julia> using Pkg; pkg"add AxisArrays Unitful"
julia> using AxisArrays, Unitful, Random
julia> fs = 40000; # Generate a 40kHz noisy signal, with spikelike stuff added for testing
julia> import Unitful: s, ms, µs
julia> rng = Random.MersenneTwister(123); # Seed a random number generator for repeatable examples
julia> y = randn(rng, 60*fs+1)*3;
julia> for spk = (sin.(0.8:0.2:8.6) .* [0:0.01:.1; .15:.1:.95; 1:.05:.05] .* 50,
sin.(0.8:0.4:8.6) .* [0:0.02:.1; .15:.1:1; 1:.2:.1] .* 50)
i = rand(rng, round(Int,.001fs):1fs)
while i+length(spk)1 < length(y)
y[i:i+length(spk)1] += spk
i += rand(rng, round(Int,.001fs):1fs)
end
end
julia> A = AxisArray(hcat(y, 2 .* y); time = (0s:1s/fs:60s), chan = ([:c1, :c2]))
2dimensional AxisArray{Float64,2,...} with axes:
:time, 0.0 s:2.5e5 s:60.0 s
:chan, Symbol[:c1, :c2]
And data, a 2400001×2 Array{Float64,2}:
3.5708 7.14161
6.14454 12.2891
3.42795 6.85591
1.37825 2.75649
1.19004 2.38007
1.99414 3.98828
2.9429 5.88581
0.226449 0.452898
0.821446 1.64289
0.582687 1.16537
⋮
3.50593 7.01187
2.26783 4.53565
0.16902 0.33804
3.84852 7.69703
0.226457 0.452914
0.560809 1.12162
4.67663 9.35326
2.41005 4.8201
3.71612 7.43224
AxisArrays behave like regular arrays, but they additionally use the axis information to enable all sorts of fancy behaviors. For example, we can specify indices in any order, just so long as we annotate them with the axis name:
julia> A[time=4] # or A[Axis{:time}(4)]
1dimensional AxisArray{Float64,1,...} with axes:
:chan, Symbol[:c1, :c2]
And data, a 2element Array{Float64,1}:
1.37825
2.75649
julia> A[chan = :c2, time = 1:5] # or A[Axis{:chan}(:c2), Axis{:time}(1:5)]
1dimensional AxisArray{Float64,1,...} with axes:
:time, 0.0 s:2.5e5 s:0.0001 s
And data, a 5element Array{Float64,1}:
7.14161
12.2891
6.85591
2.75649
2.38007
We can also index by the values of each axis using an Interval
type that
selects all values between two endpoints a .. b
or the axis values directly.
Notice that the returned AxisArray still has axis information itself... and it
still has the correct time information for those datapoints!
julia> A[40µs .. 220µs, :c1]
1dimensional AxisArray{Float64,1,...} with axes:
:time, 5.0e5 s:2.5e5 s:0.0002 s
And data, a 7element Array{Float64,1}:
3.42795
1.37825
1.19004
1.99414
2.9429
0.226449
0.821446
julia> AxisArrays.axes(ans, 1)
AxisArrays.Axis{:time,StepRangeLen{Quantity{Float64, Dimensions:{𝐓}, Units:{s}},Base.TwicePrecision{Quantity{Float64, Dimensions:{𝐓}, Units:{s}}},Base.TwicePrecision{Quantity{Float64, Dimensions:{𝐓}, Units:{s}}}}}(5.0e5 s:2.5e5 s:0.0002 s)
You can also index by a single value using atvalue(t)
.
This function is not needed for categorical axes like :chan
here,
as :c1
is a Symbol
which can't be confused with an integer index.
Using atvalue()
will drop a dimension (like using a single integer).
Indexing with an Interval(lo, hi)
type retains dimensions, even
when the ends of the interval are equal (like using a range 1:1
):
julia> A[atvalue(2.5e5s), :c1]
6.14453912336772
julia> A[2.5e5s..2.5e5s, :c1]
1dimensional AxisArray{Float64,1,...} with axes:
:time, 2.5e5 s:2.5e5 s:2.5e5 s
And data, a 1element Array{Float64,1}:
6.14454
You can even index by multiple values by broadcasting atvalue
over an array:
julia> A[atvalue.([2.5e5s, 75.0µs])]
2dimensional AxisArray{Float64,2,...} with axes:
:time, Quantity{Float64, Dimensions:{𝐓}, Units:{s}}[2.5e5 s, 7.5e5 s]
:chan, Symbol[:c1, :c2]
And data, a 2×2 Array{Float64,2}:
6.14454 12.2891
1.37825 2.75649
Sometimes, though, what we're really interested in is a window of time about a
specific index. One of the operations above (looking for values in the window from 40µs
to 220µs) might be more clearly expressed as a symmetrical window about a
specific index where we know something interesting happened. To represent this,
we use the atindex
function:
julia> A[atindex(90µs .. 90µs, 5), :c2]
1dimensional AxisArray{Float64,1,...} with axes:
:time_sub, 7.5e5 s:2.5e5 s:7.500000000000002e5 s
And data, a 7element Array{Float64,1}:
6.85591
2.75649
2.38007
3.98828
5.88581
0.452898
1.64289
Note that the returned AxisArray has its time axis shifted to represent the interval about the given index! This simple concept can be extended to some very powerful behaviors. For example, let's threshold our data and find windows about those threshold crossings.
julia> idxs = findall(diff(A[:,:c1] .< 15) .> 0);
julia> spks = A[atindex(200µs .. 800µs, idxs), :c1]
2dimensional AxisArray{Float64,2,...} with axes:
:time_sub, 0.0002 s:2.5e5 s:0.0008 s
:time_rep, Quantity{Float64, Dimensions:{𝐓}, Units:{s}}[0.162 s, 0.20045 s, 0.28495 s, 0.530325 s, 0.821725 s, 1.0453 s, 1.11967 s, 1.1523 s, 1.22085 s, 1.6253 s … 57.0094 s, 57.5818 s, 57.8716 s, 57.8806 s, 58.4353 s, 58.7041 s, 59.1015 s, 59.1783 s, 59.425 s, 59.5657 s]
And data, a 41×247 Array{Float64,2}:
0.672063 7.25649 0.633375 … 1.54583 5.81194 4.706
1.65182 2.57487 0.477408 3.09505 3.52478 4.13037
4.46035 2.11313 4.78372 1.23385 7.2525 3.57485
5.25651 2.19785 3.05933 0.965021 6.78414 5.94854
7.8537 0.345008 0.960533 0.812989 0.336715 0.303909
0.466816 0.643649 3.67087 … 3.92978 3.1242 0.789722
6.0445 13.2441 4.60716 0.265144 4.50987 8.84897
9.21703 13.2254 14.4409 8.6664 13.3457 11.6213
16.1809 22.7037 25.023 15.9376 28.0817 16.996
23.2671 31.2021 25.3787 24.4914 32.2599 26.1118
⋮ ⋱ ⋮
0.301629 0.0683982 4.36574 1.92362 5.12333 3.4431
4.7182 1.18615 4.40717 4.51757 8.64314 0.0800021
2.43775 0.151882 1.40817 3.38555 2.23418 0.728549
3.2482 0.60967 0.471288 … 2.53395 0.468817 3.65905
4.26967 2.24747 3.13758 1.74967 4.5052 0.145357
0.752487 1.69446 1.20491 1.71429 1.81936 0.290158
4.64348 3.94187 1.59213 7.15428 0.539748 4.82309
1.09652 2.66999 0.521931 3.80528 1.70421 3.40583
0.94341 2.60785 3.34291 … 1.10584 4.31118 3.6404
By indexing with a repeated interval, we have added a dimension to the output! The returned AxisArray's columns specify each repetition of the interval, and each datapoint in the column represents a timepoint within that interval, adjusted by the time of the theshold crossing. The best part here is that the returned matrix knows precisely where its data came from, and has labeled its dimensions appropriately. Not only is there the proper time base for each waveform, but we also have recorded the event times as the axis across the columns.
Indexing
Two main types of Axes supported by default include:

Categorical axis  These are vectors of labels, normally symbols or strings. Elements or slices can be selected by elements or vectors of elements.

Dimensional axis  These are sorted vectors or iterators that can be selected by
Intervals
. These are commonly used for sequences of times or datetimes. For regular sample rates, ranges can be used.
Here is an example with a Dimensional axis representing a time sequence along rows and a Categorical axis of symbols for column headers.
B = AxisArray(reshape(1:15, 5, 3), .1:.1:0.5, [:a, :b, :c])
B[row = (0.2..0.4)] # restrict the AxisArray along the time axis
B[0.0..0.3, [:a, :c]] # select an interval and two of the columns
Userdefined axis types can be added along with custom indexing behaviors.
Example: compute the intensityweighted mean along the z axis
B = AxisArray(randn(100,100,100), :x, :y, :z)
Itotal = sumz = 0.0
for iter in CartesianIndices(Base.axes(B)) # traverses in storage order for cache efficiency
global Itotal, sumz
I = B[iter] # intensity in a single voxel
Itotal += I
sumz += I * iter[axisdim(B, Axis{:z})] # axisdim "looks up" the z dimension
end
meanz = sumz/Itotal
The intention is that all of these operations are just as efficient as they would be if you used traditional positionbased indexing with all the inherent assumptions about the storage order of B
.