Calculates cumulant tensors of any order for multivariate data.
Functions return tensor or array of tensors in SymmetricTensors type. Requires SymmetricTensors.jl. To convert to array, run:
julia> Array(data::SymmetricTensors{T, N})As of 01/01/2017 kdomino is the lead maintainer of this package.
Within Julia, run
pkg> add Cumulantsto install the files. Julia 1.0 or later is required.
julia> moment(data::Matrix{T}, m::Int, b::Int = 2) where T<: AbstractFloatReturns a SymmetricTensor{T, m} of the moment of order m of multivariate data represented by a t by n matrix, i.e. data with n marginal variables and t realisations. The argument b with defalt value 2, is an optional Int that determines the size
of the blocks in SymmetricTensors type.
julia> data = reshape(collect(1.:15.),(5,3))
5×3 Array{Float64,2}:
1.0 6.0 11.0
2.0 7.0 12.0
3.0 8.0 13.0
4.0 9.0 14.0
5.0 10.0 15.0julia> m = moment(data, 3)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[45.0 100.0; 100.0 230.0]
[100.0 230.0; 230.0 560.0] nothing; nothing nothing]
Union{Nothing, Array{Float64,3}}[[155.0 360.0; 360.0 890.0] [565.0; 1420.0]; nothing [2275.0]], 2, 2, 3, false)To convert to array use:
julia> Array(m)
3×3×3 Array{Float64,3}:
[:, :, 1] =
45.0 100.0 155.0
100.0 230.0 360.0
155.0 360.0 565.0
[:, :, 2] =
100.0 230.0 360.0
230.0 560.0 890.0
360.0 890.0 1420.0
[:, :, 3] =
155.0 360.0 565.0
360.0 890.0 1420.0
565.0 1420.0 2275.0julia> cumulants(data::Matrix{T}, m::Int = 4, b::Int = 2) where T<: AbstractFloatReturns a vector of SymmetricTensor{T, i} i = 1,2,3,...,m of cumulants of
order 1,2,3,...,m. Cumulants are calculated for multivariate data represented
by matrix of size t by n, i.e. data with n marginal variables and t
realisations.
julia> c = cumulants(data, 3);
julia> c[2]
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[2.0 2.0; 2.0 2.0] [2.0; 2.0]; nothing [2.0]], 2, 2, 3, false)
julia> c[3]
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0]
[0.0 0.0; 0.0 0.0] nothing; nothing nothing]
Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; nothing [0.0]], 2, 2, 3, false)
To convert to array:
julia> Array(c[2])
3×3 Array{Float64,2}:
2.0 2.0 2.0
2.0 2.0 2.0
2.0 2.0 2.0
julia> Array(c[3])
3×3×3 Array{Float64,3}:
[:, :, 1] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
[:, :, 2] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
[:, :, 3] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0The argument b with default value 2, is an optional Int
that determines a size of blocks in SymmetricTensors type. This block size b is the parameter that affect
the algorithm performance, for most cases the performance is optimal for b = 2, 3. The block size must
fulfil 0 < b ≦ size(data, 2) otherwise error will be raised. For the performance analysis for various bolck sizes see Section 5.2.1 in
Krzysztof Domino, Piotr Gawron, Łukasz Pawela, Efficient Computation of Higher-Order Cumulant Tensors, SIAM J. Sci. Comput. 40, A1590 (2018) , https://arxiv.org/abs/1701.05420. For benchmarking one can also use
benchmarks/comptimeblocks.jl
The purpose of this package is to compute moments and cumulants for multivariate data. It works for univariate data X structured in the form of matrix with size(X, 2) = 1 if taking b=1. Such univariate application is not efficient however.
julia> X = [1., 2., 3., 4.];
julia> X = reshape(X, (4,1));
julia> c = cumulants(X,4,1);
julia> map(x -> Array(x)[1], c)
4-element Array{Float64,1}:
2.5
1.25
0.0
-2.125We do not suply exact univariate fisher's k-statistics.
Parallel computation is efficient for large number of data realisations, e.g. t = 1000000. For parallel computation just run
julia> addprocs(n)
julia> @everywhere using CumulantsNaive algorithms of moment and cumulant tensors calculations are also available.
julia> naivemoment(data::Matrix{T}, m::Int = 4) where T<: AbstractFloatReturns array{T, m} of the m'th moment of data. calculated using a naive algorithm.
julia> naivemoment(data, 3)
3×3×3 Array{Float64,3}:
[:, :, 1] =
45.0 100.0 155.0
100.0 230.0 360.0
155.0 360.0 565.0
[:, :, 2] =
100.0 230.0 360.0
230.0 560.0 890.0
360.0 890.0 1420.0
[:, :, 3] =
155.0 360.0 565.0
360.0 890.0 1420.0
565.0 1420.0 2275.0julia> naivecumulant(data::Matrix{T}, m::Int = 4) where T<: AbstractFloatReturns Array{T, m} of the m'th cumulant of data, calculated using a naive algorithm. Works for 1 <= m < 7, for m >= 7 throws exception.
julia> naivecumulant(data, 2)
3×3 Array{Float64,2}:
2.0 2.0 2.0
2.0 2.0 2.0
2.0 2.0 2.0julia> naivecumulant(data, 3)
3×3×3 Array{Float64,3}:
[:, :, 1] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
[:, :, 2] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
[:, :, 3] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0To analyse the computational time of cumulants vs naivecumulants and moment vs naivemoment, we supply the executable script comptimes.jl.
This script returns to a .jld file computational times, given following parameters:
-m (Int): cumulant's order, by defaultm = 4,-n (vararg Int): numbers of marginal variables, by defaultm = 20 24 28,-t (vararg Int): number of realisations of random variable, by defalutt = 10000. Be careful while usingn>4and largem, where naive algorithms might need a large computational time and memory usage. Naive algorithms does not use the block structures, hence they computes and stores a whole cumulant tensor regardless its symmetry. All comparisons performed by this script use one core.
To analyse the computational time of cumulants for different block sizes 1 =< b =< Int(sqrt(n)), we supply the executable script comptimeblocks.jl.
This script returns to a .jld file computational times, given following parameters:
-m (Int): cumulant's order, by defaultm = 4,-n (Int): numbers of marginal variables, by defaultm = 48,-t (vararg Int): number of realisations of random variable, by defaultt = 10000 20000. Computational times and parameters are saved in the .jld file in /res directory. All comparisons performed by this script use one core.
To analyse the computational time of moment on different numbers of processes, we supply the executable script comptimeprocs.jl.
This script returns to a .jld file computational times, given following parameters:
-m (Int): moment's order, by defaultm = 4,-n (Int): numbers of marginal variables, by defaultm = 50,-t (Int): number of realisations of random variable, by defaultt = 100000,-p (Int): maximal number of processes, by defaultp = 4,-b (Int): blocks size, by defaultb = 2.
All result files are saved in /res directory. To plot a graph run /res/plotcomptimes.jl followed by a *.jld file name
For the computational example on data use the following.
The script gandata.jl generates t = 150000000 realisations of n = 4 dimensional data form the t-multivariate distribution with ν = 14 degrees of freedom, and theoretical
super-diagonal elements of those cumulants. Results are saved in data/datafortests.jld
The script testondata.jl computes cumulant tensors of order m = 1 - 6 for data/datafortests.jld, results are saved in data/cumulants.jld.
To read cumulants.jld please run
julia> using JLD
julia> using SymmetricTensors
julia> load("cumulants.jld")
To plot super-diagonal elements of those cumulants and their theoretical values from t-student distribution pleas run plotsuperdiag.jl
Krzysztof Domino, Piotr Gawron, Łukasz Pawela, Efficient Computation of Higher-Order Cumulant Tensors, SIAM J. Sci. Comput. 40, A1590 (2018) , https://arxiv.org/abs/1701.05420
This project was partially financed by the National Science Centre, Poland – project number 2014/15/B/ST6/05204.