Updates of high order cumulant tensors
Author iitis
Popularity
3 Stars
Updated Last
1 Year Ago
Started In
September 2017

Updates following statistics of `n`-variate data

• `d`'th moment tensor,
• an array of moment tensors of order `1,2,...,d`.

Given `t` realisations of `n`-variate data: `X ∈ ℜᵗˣⁿ`, and the update `X + ∈ ℜᵘˣⁿ` returns array of updated cumulant tensors of order `1,2,...,d`.

To store symmetric tensors uses a `SymmetricTensors` type, requires SymmetricTensors.jl. To convert to array, run:

`julia> Array(st::SymmetricTensors{T, d})`

to convert back, run:

`julia>  SymmetricTensor(a::Array{T,d})`

Requires Cumulants.jl.

As of 01/01/2017 kdomino is the lead maintainer of this package.

## Installation

Within Julia, run

`pkg> add CumulantsUpdates`

to install the files. Julia 1.0 or later is required.

## Parallel computation

For parallel computation just run

```julia> addprocs(n)

## Functions

### Data update

To update data `X ∈ ℜᵗˣⁿ` by the update `X+ ∈ ℜᵘˣⁿ` in the observation window of size `t` and get `ℜᵗˣⁿ ∋ X' = vcat(X,X+)[1+u:end, :]` run:

`julia> dataupdat(X::Matrix{T}, Xplus::Matrix{T}) where T<:AbstractFloat`

the condition `u < t` must be fulfilled.

```julia> a = ones(4,4)
4×4 Array{Float64,2}:
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0

julia> b = zeros(2,4)
2×4 Array{Float64,2}:
0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0

julia> dataupdat(a,b)
4×4 Array{Float64,2}:
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0
0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0```

### Moment update

To update the moment tensor `M::SymmetricTensor{T, d}` for data `X ∈ ℜᵗˣⁿ`, given the update `X+ ∈ ℜᵘˣⁿ` where `u < t` run

`julia> momentupdat(M::SymmetricTensor{T, d}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}`

Returns a `SymmetricTensor{T, d}` of the moment tensor of updated multivariate data: `ℜᵗˣⁿ ∈ X' = dataupdat(X,X+)`, i.e.: `M = moment(X, d)`, `momentupdat(M, X, X+) = moment(X', d)`. If `u ≪ t` `momentupdat()` is much faster than a recalculation.

```julia> x = ones(6, 2);

julia> SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]

[1.0 1.0; 1.0 1.0]], 2, 1, 2, true)

julia> y = 2*ones(2,2);

julia> momentupdat(m, x, y)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[3.33333 3.33333; 3.33333 3.33333]

[3.33333 3.33333; 3.33333 3.33333]], 2, 1, 2, true)
```

Function `momentarray(X, d)` can be used to compute an array of moments of order `1, ..., d` of data `X ∈ ℜᵗˣⁿ`

`julia> momentarray(X::Matrix{T}, d::Int, b::Int) where {T<:AbstractFloat, d}`

`b` - is a `SymmetricTensor` parameter, a block size.

One can update an array of moments by calling:

`julia> momentupdat(M::Array{SymmetricTensor{T, d}}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}`

Returns an `Array{SymmetricTensor{T, d}}` of moment tensors of order `1, ..., d` of updated multivariate data: `ℜᵗˣⁿ ∋ X' = dataupdat(X,X+)`, i.e. `Mₐᵣ = momentarray(X, d)`, `momentupdat(Mₐᵣ, X, X+) = momentarray(X', d)`.

### Cumulants update

Presented functions are design for sequent update of multivariate cumulant tensors. Hence it can be applied in a data streaming scheme. Suppose one has data `X ∈ ℜᵗˣⁿ` and subsequent coming updates `X+ ∈ ℜᵘˣⁿ` such that `u ≪ t`. Suppose one want to compute cumulant tensors in an observation window of size `t` each time the update comes. To store data amd moments we use the following structure `mutable struct DataMoments{T <: AbstractFloat}` with following fields: `X` - data, `d` - maximal order of a moment series, `b` - a block size, `M` - moment series. To initialise, we can use the following constructor:

`julia> DataMoments(X::Matrix{T}, d::Int, b::Int) where T <: AbstractFloat`

here an array of moments will be computed. To update a DataMoments structure and compute updated cumulants run:

```julia> cumulantsupdate!(dm::DataMoments{T}, Xplus::Matrix{T}) where T <: AbstractFloat
```

The function updates a DataMoment structure and returns a series of cumulants of order `1, ..., dm.d` for updated data. See:

```julia> x = ones(10,2);

julia> s = DataMoments(x, 4, 2);

julia> y = zeros(4,2);

julia> cumulantsupdate!(s,y)[4]
SymmetricTensor{Float64,4}(Union{Nothing, Array{Float64,4}}[[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]], 2, 1, 2, true)
```

To save the DataMoments structure use:

`julia> savedm(dm::DataMoments, dir::String)`

`julia> loaddm(dir::String)`

returns a DataMoments structure stored in a given directory.

### The p-norm

`julia> norm(st::SymmetricTensor{T, d}, p::Union{Float64, Int}) where {T<:AbstractFloat, d}`

Returns a `p`-norm of the tensor stored as `SymmetricTensors`, supported for `k ≠ 0`. The output of `norm(st, p) = norn(convert(Array, st),p)`. However `norm(st::SymmetricTensor, p)` uses the block structure implemented in `SymmetricTensors`, hence is faster and decreases the computer memory requirement.

```julia> te = [-0.112639 0.124715 0.124715 0.268717 0.124715 0.268717 0.268717 0.046154];

julia> st = SymmetricTensor(reshape(te, (2,2,2)));

julia> norm(st)
0.5273572868359742

julia> norm(st, 2.5)
0.4468668679541424

julia> norm(st, 1)
1.339089```

### Convert cumulants to moments and moments to cumulants

Given `M` a vector of moments of order `1, ..., d` to change it to a vector of cumulants of order `1, ..., d` using

`julia> function moms2cums!(M::Vector{SymmetricTensor{T}}) where T <: AbstractFloat`

One can convert a vector of cumulants `c` to a vector of moments by running

`julia> function cums2moms(c::Vector{SymmetricTensor{T}}) where T <: AbstractFloat`
```julia> m = momentarray(ones(20,3), 3, 2)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] nothing; nothing nothing]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)

julia> moms2cums!(m)

julia> m
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0]
[0.0 0.0; 0.0 0.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)

julia> cums2moms(m)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)

```

# Performance tests

To analyse the computational time of cumulants updates vs `Cumulants.jl` recalculation, we supply the executable script `comptimes.jl`. The script saves computational times to the `res/*.jld` file. The scripts accept following parameters:

• `-d (Int)`: cumulant's maximum order, by default `d = 4`,
• `-n (vararg Int)`: numbers of marginal variables, by default `n = 40`,
• `-t (Int)`: number of realisations of random variable, by default `t = 500000`,
• `-u (vararg Int)`: number of realisations of update, by default `u = 10000, 15000, 20000`,
• `-b (Int)`: blocks size, by default `b = 4`,
• `-p (Int)`: numbers of processes, by default `p = 3`.

To analyse the computational time of cumulants updates for different block sizes `1 < b ≤ Int(√n)+2`, we supply the executable script `comptimesblocks.jl`. The script saves computational times to the `res/*.jld` file. The scripts accept following parameters:

• `-d (Int)`: cumulant's order, by default `d = 4`,
• `-n (Int)`: numbers of marginal variables, by default `n = 48`,
• `-u (vararg Int)`: number of realisations of the update, by default `u = 10000, 20000`.
• `-p (Int)`: numbers of processes, by default `p = 3`.

To analyse the computational time of cumulants updates for different number of workers, we supply the executable script `comptimesprocs.jl`. The script saves computational times to the `res/*.jld` file. The scripts accept following parameters:

• `-d (Int)`: cumulant's order, by default `d = 4`,
• `-n (Int)`: numbers of marginal variables, by default `n = 48`,
• `-u (vararg Int)`: number of realisations of the update, by default `u = 10000, 20000`,
• `-b (Int)`: blocks size, by default `b = 4`,
• `-p (Int)`: maximal numbers of processes, by default `p = 6`.

To plot computational times run executable `res/plotcomptimes.jl` on chosen `*.jld` file.

# Citing this work

Krzysztof Domino, Piotr Gawron, An algorithm for arbitrary–order cumulant tensor calculation in a sliding window of data streams, International Journal of Applied Mathematics and Computer Science, vol. 29, issue 1, pp. 206, 2019 (https://doi.org/10.2478/amcs-2019-0015 )

This project was partially financed by the National Science Centre, Poland – project number 2014/15/B/ST6/05204.