LazyTables.jl

A simple table interface for Julia that is fast and just works.
Author m-wells
Started In
September 2022

All the good of TypedTables.jl but FASTER and without as many allocations!

A LazyTable is essentially a TypedTables.Table with cheaper row access. At worst, a LazyTable performs as well as a Table. At best, in the benchmarks below, it is tens to hundreds of times faster, and plain row access makes no allocations. You can run the benchmarks below yourself.

Benchmarks

All benchmarks are performed using the same data for each table. The tables have 1000 rows and 52 columns (column names are :A to :Z and :a to :z) where each column is a Vector.
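
The tables themselves are not constructed in this README. A minimal sketch of one way to build them from the description above, assuming random Float64 columns (the element type is not stated, but the filter benchmarks compare against 0.5) and reusing the names `lazytab` and `typetab` from the benchmarks; `colnames` and `cols` are illustrative names only:

```julia
using LazyTables
using TypedTables: Table

# 52 column names: :A to :Z followed by :a to :z
colnames = Symbol.(vcat('A':'Z', 'a':'z'))

# One 1000-element Vector per column (random Float64 data is an assumption)
cols = NamedTuple{Tuple(colnames)}(Tuple(rand(1000) for _ in colnames))

# Keyword construction, as in the Usage section; both tables hold the same values
lazytab = LazyTable(; cols...)
typetab = Table(lazytab)
```

The benchmarks additionally need `using BenchmarkTools` for the `@benchmark` macro.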

# LazyTable is 68.64 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> @benchmark values($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 961 evaluations.
 Range (min … max):  86.613 ns … 115.137 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     86.665 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.101 ns ±   1.653 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark values($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 6 evaluations.
 Range (min … max):  5.437 μs … 142.746 μs  ┊ GC (min … max): 0.00% … 93.06%
 Time  (median):     5.686 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.979 μs ±   3.144 μs  ┊ GC (mean ± σ):  1.24% ±  2.29%
 Memory estimate: 4.78 KiB, allocs estimate: 126.

# LazyTable is 47.10 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> @benchmark pairs($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 961 evaluations.
 Range (min … max):  86.620 ns … 113.056 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     86.674 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.068 ns ±   1.517 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark pairs($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.729 μs … 129.913 μs  ┊ GC (min … max): 0.00% … 94.93%
 Time  (median):     3.872 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.101 μs ±   2.678 μs  ┊ GC (mean ± σ):  1.41% ±  2.12%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 87.58 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> f(x) = Iterators.map(x -> x.a, x)

julia> @benchmark f($lazytab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 988 evaluations.
 Range (min … max):  46.230 ns … 66.602 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     46.272 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.568 ns ±  1.127 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab[i]) setup=(i=rand(1:1000))
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.725 μs … 120.630 μs  ┊ GC (min … max): 0.00% … 95.06%
 Time  (median):     3.847 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.078 μs ±   2.568 μs  ┊ GC (mean ± σ):  1.36% ±  2.12%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 89.21 times FASTER than TypedTables.Table
# LazyTable used 100.00% LESS memory than TypedTables.Table
# LazyTable made 100.00% FEWER allocations than TypedTables.Table

julia> f(x) = x[500]

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 988 evaluations.
 Range (min … max):  45.949 ns … 67.608 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     46.002 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.240 ns ±  0.993 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.746 μs … 124.082 μs  ┊ GC (min … max): 0.00% … 94.28%
 Time  (median):     3.910 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.125 μs ±   2.650 μs  ┊ GC (mean ± σ):  1.40% ±  2.13%

 Memory estimate: 3.48 KiB, allocs estimate: 91.

# LazyTable is 1.70 times FASTER than TypedTables.Table
# LazyTable used 94.69% LESS memory than TypedTables.Table
# LazyTable made 62.83% FEWER allocations than TypedTables.Table

julia> f(x) = sum(x[500])

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 6 evaluations.
 Range (min … max):  5.034 μs … 199.959 μs  ┊ GC (min … max): 0.00% … 95.24%
 Time  (median):     5.466 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.402 μs ±   1.964 μs  ┊ GC (mean ± σ):  0.35% ±  0.95%

 Memory estimate: 1.31 KiB, allocs estimate: 84.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):  7.946 μs … 363.028 μs  ┊ GC (min … max): 0.00% … 94.70%
 Time  (median):     8.261 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.206 μs ±  12.401 μs  ┊ GC (mean ± σ):  5.08% ±  3.71%

 Memory estimate: 24.72 KiB, allocs estimate: 226.

# this is a draw

julia> f(x) = filter(<(0.5), x.z)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.019 μs … 91.588 μs  ┊ GC (min … max): 0.00% … 97.02%
 Time  (median):     1.056 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.188 μs ±  1.237 μs  ┊ GC (mean ± σ):  1.44% ±  1.38%

 Memory estimate: 1.06 KiB, allocs estimate: 1.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.017 μs … 90.782 μs  ┊ GC (min … max): 0.00% … 96.64%
 Time  (median):     1.055 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.195 μs ±  1.277 μs  ┊ GC (mean ± σ):  1.47% ±  1.37%

 Memory estimate: 1.06 KiB, allocs estimate: 1.

# LazyTable is 245.80 times FASTER than TypedTables.Table
# LazyTable used 99.46% LESS memory than TypedTables.Table
# LazyTable made 99.88% FEWER allocations than TypedTables.Table

julia> f(x) = filter(x -> x.z < 0.5, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  13.791 μs …  1.433 ms  ┊ GC (min … max): 0.00% … 97.16%
 Time  (median):     14.262 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.707 μs ± 25.517 μs  ┊ GC (mean ± σ):  3.18% ±  1.95%

 Memory estimate: 19.62 KiB, allocs estimate: 113.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 1294 samples with 1 evaluation.
 Range (min … max):  3.677 ms …   6.025 ms  ┊ GC (min … max): 0.00% … 23.36%
 Time  (median):     3.743 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.861 ms ± 340.557 μs  ┊ GC (mean ± σ):  2.11% ±  5.71%

 Memory estimate: 3.57 MiB, allocs estimate: 91112.

# this is a draw

julia> f(x) = Iterators.filter(x -> x.z < 0.5, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  5.858 ns … 22.717 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.903 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.026 ns ±  0.303 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  5.863 ns … 21.733 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.888 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.005 ns ±  0.359 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

# this is a draw

julia> f(x) = vcat(x, x)

julia> @benchmark f($lazytab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  45.199 μs …  1.421 ms  ┊ GC (min … max):  0.00% … 91.16%
 Time  (median):     58.151 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   72.780 μs ± 98.033 μs  ┊ GC (mean ± σ):  14.86% ± 10.44%

 Memory estimate: 619.73 KiB, allocs estimate: 61.

julia> @benchmark f($typetab)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  49.442 μs …  1.421 ms  ┊ GC (min … max):  0.00% … 88.79%
 Time  (median):     58.011 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   70.518 μs ± 93.531 μs  ┊ GC (mean ± σ):  14.89% ± 10.42%

 Memory estimate: 619.30 KiB, allocs estimate: 60.
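
The summary comments that precede each benchmark pair appear to be ratios of the reported means and estimates. A small sketch of that arithmetic, using the printed figures from the `sum(x[500])` benchmark; the helper names `speedup` and `percent_less` are hypothetical, and because the printed values are rounded the results can differ from the quoted figures in the last digit:

```julia
# Figures copied from the `f(x) = sum(x[500])` benchmark output above
lazy_mean_us, typed_mean_us = 5.402, 9.206     # mean time in μs
lazy_mem_kib, typed_mem_kib = 1.31, 24.72      # memory estimate in KiB
lazy_allocs, typed_allocs = 84, 226            # allocation estimate

# How many times faster the quicker run is
speedup(slow, fast) = slow / fast

# Percentage reduction of `small` relative to `large`
percent_less(small, large) = 100 * (1 - small / large)

println("times faster:        ", round(speedup(typed_mean_us, lazy_mean_us); digits = 2))
println("% less memory:       ", round(percent_less(lazy_mem_kib, typed_mem_kib); digits = 2))
println("% fewer allocations: ", round(percent_less(lazy_allocs, typed_allocs); digits = 2))
```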

Usage

Full integration with the Tables.jl interface.

LazyTable should function as a drop-in replacement for TypedTables.Table.

A LazyTable actually uses Table as its store and can be constructed the same way.

julia> using LazyTables

julia> lazytable = LazyTable(x = rand(10), y = rand(10))
LazyTable with 2 columns with 10 rows:
╭─────┬───────────┬───────────╮
│ row │     x     │     y     │
├─────┼───────────┼───────────┤
│   1 │ 0.26997   │ 0.662442  │
│   2 │ 0.315106  │ 0.745717  │
│   3 │ 0.700736  │ 0.499348  │
│   4 │ 0.531262  │ 0.387146  │
│   5 │ 0.961951  │ 0.531365  │
│   6 │ 0.22444   │ 0.498552  │
│   7 │ 0.0450473 │ 0.648617  │
│   8 │ 0.182706  │ 0.0796079 │
│   9 │ 0.216163  │ 0.437709  │
│  10 │ 0.929186  │ 0.899007  │
╰─────┴───────────┴───────────╯

julia> lazytable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> lazytable[1].x
0.26997004281231074
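
As a hedged illustration of the Tables.jl integration mentioned at the start of this section, the sketch below passes a LazyTable to two generic Tables.jl functions. It assumes Tables.jl is available in the environment and that LazyTable implements the interface as stated; the README does not show these calls itself:

```julia
using LazyTables
import Tables

lt = LazyTable(x = [1.0, 2.0, 3.0], y = [4.0, 5.0, 6.0])

# Generic Tables.jl query: is this object a table source?
Tables.istable(lt)

# Materialize the columns as a NamedTuple of vectors
cols = Tables.columntable(lt)
cols.x
```

Any package that consumes Tables.jl sources should then be able to accept a LazyTable in the same way it accepts a Table.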

Differences from TypedTables.Table

LazyRow

Indexing a LazyTable does not return a NamedTuple. Instead, it returns a LazyRow, which should act and feel just like a NamedTuple (for the most part).

julia> using TypedTables: Table

julia> typetable = lazytable |> Table;

julia> lt1 = lazytable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> tt1 = typetable[1]
(x = 0.26997004281231074, y = 0.6624416805539212)

julia> lt1[:x]
0.26997004281231074

julia> tt1[:x]
0.26997004281231074

However, you can assign values through a LazyRow into the LazyTable, which isn't possible with the NamedTuple rows of a Table.

julia> lt1.x = 10
10

julia> tt1.x = 10
ERROR: setfield!: immutable struct of type NamedTuple cannot be changed

julia> lt1
(x = 10.0, y = 0.6624416805539212)

julia> tt1
(x = 0.26997004281231074, y = 0.6624416805539212)
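
Because assignment writes through to the table, a LazyRow can be used to update a table in place. A minimal sketch, relying only on the row indexing and property assignment shown above; the data and the clamping rule are made up for illustration:

```julia
using LazyTables

lt = LazyTable(x = [0.2, 0.8, 0.5], y = [1.0, 2.0, 3.0])

# Clamp column x to at most 0.5, one row at a time
for i in 1:3
    row = lt[i]          # a LazyRow referring to row i of lt
    if row.x > 0.5
        row.x = 0.5      # writes into lt, not into a copy
    end
end

lt.x
```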

Indexing

To make indexing consistent between rows and tables, columns of the LazyTable can be accessed via the property interface or the dictionary-like index interface:

julia> lazytable.x === lazytable[:x]
true

julia> typetable[:x]
ERROR: ArgumentError: invalid index: :x of type Symbol
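
One consequence is that a single accessor can serve both rows and tables. A small sketch, where `field` is a hypothetical helper name and only the Symbol indexing shown above is used:

```julia
using LazyTables

lt = LazyTable(x = [1.0, 2.0], y = [3.0, 4.0])

# The same Symbol index works on a LazyTable (returns a column)
# and on a LazyRow (returns a single value)
field(obj, name::Symbol) = obj[name]

field(lt, :x)       # whole column x
field(lt[2], :y)    # value of y in row 2
```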

Multidimensional Columns

Just like Table, LazyTable supports multidimensional columns.

julia> lazymatrixtable = hcat(lazytable, lazytable)
LazyTable with 2 columns with 20 rows:
╭─────┬───────┬───────────┬───────────╮
│ row │ index │     x     │     y     │
├─────┼───────┼───────────┼───────────┤
│   1 │ 1, 1  │ 10.0      │ 0.662442  │
│   2 │ 2, 1  │ 0.315106  │ 0.745717  │
│   3 │ 3, 1  │ 0.700736  │ 0.499348  │
│   4 │ 4, 1  │ 0.531262  │ 0.387146  │
│   5 │ 5, 1  │ 0.961951  │ 0.531365  │
│   6 │ 6, 1  │ 0.22444   │ 0.498552  │
│   7 │ 7, 1  │ 0.0450473 │ 0.648617  │
│   8 │ 8, 1  │ 0.182706  │ 0.0796079 │
│   9 │ 9, 1  │ 0.216163  │ 0.437709  │
│  10 │ 10, 1 │ 0.929186  │ 0.899007  │
│  11 │ 1, 2  │ 10.0      │ 0.662442  │
│  12 │ 2, 2  │ 0.315106  │ 0.745717  │
│  13 │ 3, 2  │ 0.700736  │ 0.499348  │
│  14 │ 4, 2  │ 0.531262  │ 0.387146  │
│  15 │ 5, 2  │ 0.961951  │ 0.531365  │
│  16 │ 6, 2  │ 0.22444   │ 0.498552  │
│  17 │ 7, 2  │ 0.0450473 │ 0.648617  │
│  18 │ 8, 2  │ 0.182706  │ 0.0796079 │
│  19 │ 9, 2  │ 0.216163  │ 0.437709  │
│  20 │ 10, 2 │ 0.929186  │ 0.899007  │
╰─────┴───────┴───────────┴───────────╯

julia> lazymatrixtable[5,2]
(0.9619514860009788, 0.5313645724703538)

Table, however, errors when showing multidimensional Array columns.

julia> typematrixtable = lazymatrixtable |> Table;

julia> lazymatrixtable[5,2]
(0.9619514860009788, 0.5313645724703538)

julia> typematrixtable
Table with 2 columns and 10×2 rowsError showing value of type Table{NamedTuple{(:x, :y), Tuple{Float64, Float64}}, 2, NamedTuple{(:x, :y), Tuple{Matrix{Float64}, Matrix{Float64}}}}:
ERROR: MethodError: no method matching isassigned(::Matrix{Float64}, ::CartesianIndex{2})
