PermutedArrays

It's very common in Julia to build <:AbstractArray data structures with memory layouts different than an array of pointers. Examples include StructArrays, ArraysOfArrays, and TupleVectors, and of course standard Arrays where each element is the same immutable type.

This is great, until you need to permute the values. For example, say we have a TupleVector

using TupleVectors
using PermutedArrays
using Random
using BenchmarkHistograms

n = 100
k = 1000
x = TupleVectors.chainvec(randn(k), n);
for j in 2:n
    x[j] .= randn(k);
end

tv1 = TupleVector((x=x,));

Then the cost of a random permutation is

julia> @btime permute!($tv1, p) setup=(p=randperm(n));
  3.204 ms (502 allocations: 76.31 MiB)

PermutedArrays lets us reduce this to

julia> @btime permute!($tv2, p) setup=(p=randperm(n));
  438.015 ns (3 allocations: 1.81 KiB)

Note that doing this by converting to a Vector would require copying the data:

julia> @btime tv3 = Vector($tv1);
  818.211 ns (1 allocation: 7.19 KiB)

But after paying that cost, that approach can be just as fast:

julia> tv3 = Vector(tv1);

julia> @btime permute!($tv3, p) setup=(p=randperm(n));
  473.760 ns (1 allocation: 896 bytes)

However, the memory requirements have now doubled, and the Vector loses the original advantages of the packed format.

Implementation

The setup is relatively simple:

struct PermutedVector{T, V} <: AbstractVector{T}
    data::V
    perm::Vector{Int}
    iperm::Vector{Int}
end

perm is the permutation to be applied implicitly to data, and iperm is its inverse.

Many permutations are built from a composition of "swaps", each exchanging two elements. This is expressed in swap!, with a default implementation

function swap!(v::AbstractVector, i::Int, j::Int)
    (v[i], v[j]) = (v[j], v[i])
    return v
end

Suggest Category

PermutedArrays

Implementation

Julia Packages