LazyTables.jl

A simple table interface for Julia that is fast and just works.
Author: m-wells
Started in: September 2022

All the good of TypedTables.jl but FASTER and without as many allocations!

A LazyTable is basically a TypedTables.Table but better. At worst, a LazyTable will perform just as well as a Table. But at its best it can be hundreds of times faster with no allocations. Here are some benchmarks which you can run for yourself.

Benchmarks

All benchmarks are performed using the same data for each table. The tables have 1000 rows and 52 columns (column names are :A to :Z and :a to :z) where each column is a Vector.
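
The construction of the benchmark tables is not shown here. A minimal setup sketch that matches the description above might look like the following. The names `lazytab` and `typetab` are taken from the benchmarks; the `Float64` element type, the use of `rand`, and the helper names `colnames` and `cols` are assumptions.

```julia
using LazyTables, TypedTables, BenchmarkTools

# 52 column names: :A to :Z followed by :a to :z
colnames = Symbol.(vcat('A':'Z', 'a':'z'))

# One Vector{Float64} with 1000 entries per column (element type is an assumption)
cols = NamedTuple{Tuple(colnames)}(Tuple(rand(1000) for _ in colnames))

# Build both tables from the same data by splatting the columns as keyword arguments
typetab = Table(; cols...)
lazytab = LazyTable(; cols...)
```

Interpolating the tables with `$` in the `@benchmark` calls keeps global-variable lookup out of the measured time.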

# LazyTable is 68.64 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> @benchmark values($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 961 evaluations.
 Range (min … max):  86.613 ns … 115.137 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     86.665 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.101 ns ±   1.653 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark values($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 6 evaluations.
 Range (min … max):  5.437 μs … 142.746 μs  ┊ GC (min … max): 0.00% … 93.06%
 Time  (median):     5.686 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.979 μs ±   3.144 μs  ┊ GC (mean ± σ):  1.24% ±  2.29%
 Memory estimate: 4.78 KiB, allocs estimate: 126.
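
The summary comments appear to be computed from the mean times and the memory and allocation estimates of each pair of trials. That is inferred from the numbers rather than stated anywhere, so treat the following as a sketch of the arithmetic; the helper names `speedup` and `percent_fewer` are made up for illustration.

```julia
# Values copied from the two trials above
lazy_mean_ns  = 87.101     # LazyTable mean time in nanoseconds
typed_mean_ns = 5.979e3    # Table mean time: 5.979 μs expressed in nanoseconds
lazy_allocs, typed_allocs = 0, 126

# Hypothetical helpers for the two kinds of summary figure
speedup(slow, fast) = slow / fast
percent_fewer(a, b) = 100 * (1 - a / b)

println(round(speedup(typed_mean_ns, lazy_mean_ns), digits = 2))      # 68.64
println(round(percent_fewer(lazy_allocs, typed_allocs), digits = 2))  # 100.0
```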

# LazyTable is 47.10 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> @benchmark pairs($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 961 evaluations.
 Range (min … max):  86.620 ns … 113.056 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     86.674 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.068 ns ±   1.517 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark pairs($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.729 μs … 129.913 μs  ┊ GC (min … max): 0.00% … 94.93%
 Time  (median):     3.872 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.101 μs ±   2.678 μs  ┊ GC (mean ± σ):  1.41% ±  2.12%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 87.58 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> f(x) = Iterators.map(x -> x.a, x)

julia> @benchmark f($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 988 evaluations.
 Range (min … max):  46.230 ns … 66.602 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     46.272 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.568 ns ±  1.127 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.725 μs … 120.630 μs  ┊ GC (min … max): 0.00% … 95.06%
 Time  (median):     3.847 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.078 μs ±   2.568 μs  ┊ GC (mean ± σ):  1.36% ±  2.12%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 89.21 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> f(x) = x[500]

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 988 evaluations.
 Range (min … max):  45.949 ns … 67.608 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     46.002 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.240 ns ±  0.993 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.746 μs … 124.082 μs  ┊ GC (min … max): 0.00% … 94.28%
 Time  (median):     3.910 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.125 μs ±   2.650 μs  ┊ GC (mean ± σ):  1.40% ±  2.13%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 1.70 times FASTER than TypedTables.Table
# LazyTable used 94.69% LESS memory than TypedTables.Table
# LazyTable made 62.83% FEWER allocations than TypedTables.Table

julia> f(x) = sum(x[500])

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 6 evaluations.
 Range (min … max):  5.034 μs … 199.959 μs  ┊ GC (min … max): 0.00% … 95.24%
 Time  (median):     5.466 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.402 μs ±   1.964 μs  ┊ GC (mean ± σ):  0.35% ±  0.95%

 Memory estimate: 1.31 KiB, allocs estimate: 84.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):  7.946 μs … 363.028 μs  ┊ GC (min … max): 0.00% … 94.70%
 Time  (median):     8.261 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.206 μs ±  12.401 μs  ┊ GC (mean ± σ):  5.08% ±  3.71%

 Memory estimate: 24.72 KiB, allocs estimate: 226.

# this is a draw

julia> f(x) = filter(<(0.5), x.z)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.019 μs … 91.588 μs  ┊ GC (min … max): 0.00% … 97.02%
 Time  (median):     1.056 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.188 μs ±  1.237 μs  ┊ GC (mean ± σ):  1.44% ±  1.38%

 Memory estimate: 1.06 KiB, allocs estimate: 1.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.017 μs … 90.782 μs  ┊ GC (min … max): 0.00% … 96.64%
 Time  (median):     1.055 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.195 μs ±  1.277 μs  ┊ GC (mean ± σ):  1.47% ±  1.37%

 Memory estimate: 1.06 KiB, allocs estimate: 1.

# LazyTable is 245.80 times FASTER than TypedTables.Table
# LazyTable used 99.46% LESS memory than TypedTables.Table
# LazyTable made 99.88% FEWER allocations than TypedTables.Table

julia> f(x) = filter(x -> x.z < 0.5, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  13.791 μs …  1.433 ms  ┊ GC (min … max): 0.00% … 97.16%
 Time  (median):     14.262 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.707 μs ± 25.517 μs  ┊ GC (mean ± σ):  3.18% ±  1.95%

 Memory estimate: 19.62 KiB, allocs estimate: 113.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 1294 samples with 1 evaluation.
 Range (min … max):  3.677 ms …   6.025 ms  ┊ GC (min … max): 0.00% … 23.36%
 Time  (median):     3.743 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.861 ms ± 340.557 μs  ┊ GC (mean ± σ):  2.11% ±  5.71%

 Memory estimate: 3.57 MiB, allocs estimate: 91112.

# this is a draw

julia> f(x) = Iterators.filter(x -> x.z < 0.5, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  5.858 ns … 22.717 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.903 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.026 ns ±  0.303 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  5.863 ns … 21.733 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.888 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.005 ns ±  0.359 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.
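
A likely reason for the draw: `Iterators.filter` is lazy, so the benchmark only measures the construction of the iterator wrapper and never touches a row. The sketch below illustrates this with plain Julia; the vector of NamedTuples is a stand-in for a table and is not taken from the benchmark setup.

```julia
# Stand-in rows; plain NamedTuples are used here instead of a table (assumption)
rows = [(x = rand(), z = rand()) for _ in 1:1000]

lazy  = Iterators.filter(r -> r.z < 0.5, rows)  # builds a wrapper, evaluates nothing yet
eager = filter(r -> r.z < 0.5, rows)            # scans every row and allocates a new vector

# The predicate only runs once the lazy iterator is consumed
@assert collect(lazy) == eager
```

To compare the row-access cost of the two table types, the iterator would have to be consumed inside the benchmarked function, for example with `collect` or `count`.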

# this is a draw

julia> f(x) = vcat(x, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  45.199 μs …  1.421 ms  ┊ GC (min … max):  0.00% … 91.16%
 Time  (median):     58.151 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   72.780 μs ± 98.033 μs  ┊ GC (mean ± σ):  14.86% ± 10.44%

 Memory estimate: 619.73 KiB, allocs estimate: 61.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  49.442 μs …  1.421 ms  ┊ GC (min … max):  0.00% … 88.79%
 Time  (median):     58.011 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   70.518 μs ± 93.531 μs  ┊ GC (mean ± σ):  14.89% ± 10.42%

 Memory estimate: 619.30 KiB, allocs estimate: 60.

Usage

Full integration with the Tables.jl interface.

LazyTable should function as a drop-in replacement for TypedTables.Table.

A LazyTable actually uses Table as its store and can be constructed the same way.

julia> using LazyTables

julia> lazytable = LazyTable(x = rand(10), y = rand(10))
LazyTable with 2 columns with 10 rows:
╭─────┬───────────┬───────────╮
│ row │     x     │     y     │
├─────┼───────────┼───────────┤
│   1 │ 0.26997   │ 0.662442  │
│   2 │ 0.315106  │ 0.745717  │
│   3 │ 0.700736  │ 0.499348  │
│   4 │ 0.531262  │ 0.387146  │
│   5 │ 0.961951  │ 0.531365  │
│   6 │ 0.22444   │ 0.498552  │
│   7 │ 0.0450473 │ 0.648617  │
│   8 │ 0.182706  │ 0.0796079 │
│   9 │ 0.216163  │ 0.437709  │
│  10 │ 0.929186  │ 0.899007  │
╰─────┴───────────┴───────────╯

julia> lazytable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> lazytable[1].x
0.26997004281231074
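
Since the package advertises full Tables.jl integration, the generic Tables.jl functions should accept a LazyTable. The sketch below assumes that this integration covers `Tables.istable`, `Tables.columntable`, and `Tables.rowtable`; it has not been checked against the package source.

```julia
using LazyTables, Tables

lazytable = LazyTable(x = rand(10), y = rand(10))

# Query whether the object advertises itself as a Tables.jl source
Tables.istable(lazytable)

# Materialize the generic Tables.jl representations
coltable = Tables.columntable(lazytable)  # NamedTuple of column vectors
rowtable = Tables.rowtable(lazytable)     # Vector of NamedTuple rows
```

Any sink that consumes a Tables.jl source, such as a CSV writer or a DataFrame constructor, should then be able to take the LazyTable directly.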

Differences from TypedTables.Table

LazyRow

Indexing a LazyTable by row does not return a NamedTuple. Instead, it returns a LazyRow, which should act and feel just like a NamedTuple (for the most part).

julia> using TypedTables: Table

julia> typetable = lazytable |> Table;

julia> lt1 = lazytable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> tt1 = typetable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> lt1[:x]
0.26997004281231074

julia> tt1[:x]
0.26997004281231074

However, you can assign values through a LazyRow into the LazyTable, which isn't possible with the NamedTuple rows returned by Table.

julia> lt1.x = 10
10

julia> tt1.x = 10
ERROR: setfield!: immutable struct of type NamedTuple cannot be changed

julia> lt1
(x = 10.0, y = 0.6624416805539212)

julia> tt1
(x = 0.26997004281231074, y = 0.6624416805539212)
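
To show why a lazy row can write through to its table while a NamedTuple cannot, here is a minimal sketch of a row view in plain Julia. It is not the LazyTables implementation; the `RowView` type and its fields are hypothetical and only illustrate the general technique of storing a reference to the columns plus a row index.

```julia
# Hypothetical row view: holds the column store and a row index, copies nothing
struct RowView{C<:NamedTuple}
    columns::C
    index::Int
end

# Reading a property looks up the element in the underlying column
Base.getproperty(r::RowView, name::Symbol) =
    getfield(r, :columns)[name][getfield(r, :index)]

# Writing a property mutates the underlying column in place
Base.setproperty!(r::RowView, name::Symbol, value) =
    (getfield(r, :columns)[name][getfield(r, :index)] = value)

Base.propertynames(r::RowView) = keys(getfield(r, :columns))

cols = (x = [0.1, 0.2], y = [0.3, 0.4])
row = RowView(cols, 1)
row.x = 10                 # converted to 10.0 and stored in cols.x
@assert cols.x[1] == 10.0  # the change is visible in the column itself
```

Because such a view never materializes a full NamedTuple for the row, accessing a single field does not depend on the number of columns, which is consistent with the allocation-free row benchmarks above.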

Indexing

To make indexing consistent between rows and tables, columns of a LazyTable can be accessed via either the property interface or the dictionary-like indexing interface:

julia> lazytable.x === lazytable[:x]
true

julia> typetable[:x]
ERROR: ArgumentError: invalid index: :x of type Symbol
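
Symbol indexing is convenient when the column name is only known at run time. A short usage sketch, using only the `lazytable[:x]` form shown above:

```julia
using LazyTables

lazytable = LazyTable(x = rand(10), y = rand(10))

# Column names held in variables can be used directly as indices
for name in (:x, :y)
    column = lazytable[name]
    println(name, ": mean = ", sum(column) / length(column))
end
```

With a Table, the equivalent lookup would go through `getproperty(typetable, name)`.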

Multidimensional Columns

Just like Table, LazyTable supports multidimensional columns:

julia> lazymatrixtable = hcat(lazytable, lazytable)
LazyTable with 2 columns with 20 rows:
╭─────┬───────┬───────────┬───────────╮
│ row │ index │     x     │     y     │
├─────┼───────┼───────────┼───────────┤
│   1 │ 1, 1  │ 10.0      │ 0.662442  │
│   2 │ 2, 1  │ 0.315106  │ 0.745717  │
│   3 │ 3, 1  │ 0.700736  │ 0.499348  │
│   4 │ 4, 1  │ 0.531262  │ 0.387146  │
│   5 │ 5, 1  │ 0.961951  │ 0.531365  │
│   6 │ 6, 1  │ 0.22444   │ 0.498552  │
│   7 │ 7, 1  │ 0.0450473 │ 0.648617  │
│   8 │ 8, 1  │ 0.182706  │ 0.0796079 │
│   9 │ 9, 1  │ 0.216163  │ 0.437709  │
│  10 │ 10, 1 │ 0.929186  │ 0.899007  │
│  11 │ 1, 2  │ 10.0      │ 0.662442  │
│  12 │ 2, 2  │ 0.315106  │ 0.745717  │
│  13 │ 3, 2  │ 0.700736  │ 0.499348  │
│  14 │ 4, 2  │ 0.531262  │ 0.387146  │
│  15 │ 5, 2  │ 0.961951  │ 0.531365  │
│  16 │ 6, 2  │ 0.22444   │ 0.498552  │
│  17 │ 7, 2  │ 0.0450473 │ 0.648617  │
│  18 │ 8, 2  │ 0.182706  │ 0.0796079 │
│  19 │ 9, 2  │ 0.216163  │ 0.437709  │
│  20 │ 10, 2 │ 0.929186  │ 0.899007  │
╰─────┴───────┴───────────┴───────────╯

julia> lazymatrixtable[5,2]
(0.9619514860009788, 0.5313645724703538)

Table, however, throws an error when asked to display multidimensional (Array) columns.

julia> typematrixtable = lazymatrixtable |> Table;

julia> lazymatrixtable[5,2]
(0.9619514860009788, 0.5313645724703538)

julia> typematrixtable
Table with 2 columns and 10×2 rowsError showing value of type Table{NamedTuple{(:x, :y), Tuple{Float64, Float64}}, 2, NamedTuple{(:x, :y), Tuple{Matrix{Float64}, Matrix{Float64}}}}:
ERROR: MethodError: no method matching isassigned(::Matrix{Float64}, ::CartesianIndex{2})
