It's very common in Julia to build <:AbstractArray data structures with memory layouts different than an array of pointers. Examples include StructArrays, ArraysOfArrays, and TupleVectors, and of course standard Arrays where each element is the same immutable type.
This is great, until you need to permute the values. For example, say we have a TupleVector
using TupleVectors
using PermutedArrays
using Random
using BenchmarkHistograms
n = 100
k = 1000
x = TupleVectors.chainvec(randn(k), n);
for j in 2:n
x[j] .= randn(k);
end
tv1 = TupleVector((x=x,));Then the cost of a random permutation is
julia> @btime permute!($tv1, p) setup=(p=randperm(n));
3.204 ms (502 allocations: 76.31 MiB)PermutedArrays lets us reduce this to
julia> @btime permute!($tv2, p) setup=(p=randperm(n));
438.015 ns (3 allocations: 1.81 KiB)Note that doing this by converting to a Vector would require copying the data:
julia> @btime tv3 = Vector($tv1);
818.211 ns (1 allocation: 7.19 KiB)But after paying that cost, that approach can be just as fast:
julia> tv3 = Vector(tv1);
julia> @btime permute!($tv3, p) setup=(p=randperm(n));
473.760 ns (1 allocation: 896 bytes)However, the memory requirements have now doubled, and the Vector loses the original advantages of the packed format.
The setup is relatively simple:
struct PermutedVector{T, V} <: AbstractVector{T}
data::V
perm::Vector{Int}
iperm::Vector{Int}
endperm is the permutation to be applied implicitly to data, and iperm is its inverse.
Many permutations are built from a composition of "swaps", each exchanging two elements. This is expressed in swap!, with a default implementation
function swap!(v::AbstractVector, i::Int, j::Int)
(v[i], v[j]) = (v[j], v[i])
return v
end