BenchmarkHistograms
Wraps BenchmarkTools.jl to provide a UnicodePlots.jl-powered show method for @benchmark. This is accomplished by a custom @benchmark macro which wraps the output in a BenchmarkPlot struct with a custom show method.
This means one should not call using on both BenchmarkHistograms and BenchmarkTools in the same namespace, or else the two @benchmark macros will conflict ("WARNING: using BenchmarkTools.@benchmark in module Main conflicts with an existing identifier."). However, BenchmarkHistograms re-exports all of BenchmarkTools (including the module BenchmarkTools itself), so you can simply call using BenchmarkHistograms instead.
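As a minimal sketch of the single-import workflow described above, assuming BenchmarkHistograms is installed in the active environment: one using statement should be enough to reach both the plotting @benchmark and the re-exported BenchmarkTools names. The choice of @btime and DEFAULT_PARAMETERS here is only for illustration.

```julia
# Load only BenchmarkHistograms; do not also write `using BenchmarkTools`.
using BenchmarkHistograms

# @btime is defined by BenchmarkTools and is reachable through the re-export.
@btime sin(x) setup=(x = rand())

# The module name BenchmarkTools is re-exported as well, so qualified access works.
budget = BenchmarkTools.DEFAULT_PARAMETERS.seconds
println("default time budget per benchmark: ", budget, " s")
```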
Providing this functionality in BenchmarkTools itself was discussed in https://github.com/JuliaCI/BenchmarkTools.jl/pull/180.
Use the setting BenchmarkHistograms.NBINS[] to change the number of histogram bins used, e.g. BenchmarkHistograms.NBINS[] = 10 to use 10 bins.
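Because NBINS is a global setting, one possible pattern is to save the current value, change it for a single plot, and restore it afterwards. This is a sketch under the assumption that the setting is read when the result is displayed; old_nbins and b are illustrative names.

```julia
using BenchmarkHistograms

old_nbins = BenchmarkHistograms.NBINS[]   # remember the current setting
BenchmarkHistograms.NBINS[] = 10          # ask for 10 bins

b = @benchmark sin(x) setup=(x = rand())
display(b)                                # histogram drawn with the new setting

BenchmarkHistograms.NBINS[] = old_nbins   # restore the previous setting
```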
Example
One just uses BenchmarkHistograms instead of BenchmarkTools, e.g.
using BenchmarkHistograms
@benchmark sin(x) setup=(x=rand())
samples: 10000; evals/sample: 1000; memory estimate: 0 bytes; allocs estimate: 0
┌ ┐
[ 4.0, 6.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 7823
[ 6.0, 8.0) ┤▇▇▇▇▇▇▇ 1643
[ 8.0, 10.0) ┤▇▇ 529
[10.0, 12.0) ┤ 2
[12.0, 14.0) ┤ 2
ns [14.0, 16.0) ┤ 0
[16.0, 18.0) ┤ 0
[18.0, 20.0) ┤ 0
[20.0, 22.0) ┤ 0
[22.0, 24.0) ┤ 0
[24.0, 26.0) ┤ 0
[26.0, 28.0) ┤ 1
└ ┘
Counts
min: 4.916 ns (0.00% GC); mean: 5.724 ns (0.00% GC); median: 5.208 ns (0.00% GC); max: 27.458 ns (0.00% GC).
That benchmark does not have a very interesting distribution, but it's not hard to find more interesting cases.
@benchmark 5 ∈ v setup=(v = sort(rand(1:10000, 10000)))
samples: 3192; evals/sample: 1000; memory estimate: 0 bytes; allocs estimate: 0
┌ ┐
[ 0.0, 500.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2036
[ 500.0, 1000.0) ┤ 0
[1000.0, 1500.0) ┤ 0
ns [1500.0, 2000.0) ┤ 0
[2000.0, 2500.0) ┤ 0
[2500.0, 3000.0) ┤ 0
[3000.0, 3500.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 1156
└ ┘
Counts
min: 1.875 ns (0.00% GC); mean: 1.141 μs (0.00% GC); median: 4.521 ns (0.00% GC); max: 3.315 μs (0.00% GC).
Here, we see a bimodal distribution: when 5 is indeed in the vector, we find it very quickly, in the 0-500 ns bin (thanks to sort, which places it at the front). When 5 is not present, we need to check every entry to be sure, and we end up in the 3000-3500 ns bin.
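The relative heights of the two modes can be checked with a little arithmetic. Each of the 10000 entries is drawn independently and uniformly from 1:10000, so the chance that no entry equals 5 is (1 - 1/10000)^10000. The sketch below uses only Base Julia and compares the result with the share of samples in the fast bin of the histogram above (2036 of 3192); the variable names are illustrative.

```julia
n = 10_000                      # vector length, and size of the range 1:10000

p_absent  = (1 - 1/n)^n         # chance that none of the n draws equals 5
p_present = 1 - p_absent        # chance that 5 appears at least once

# Share of benchmark samples that landed in the fast bin above.
observed_fast = 2036 / 3192

println("expected fraction with 5 present: ", round(p_present, digits=3))
println("observed fraction in fast bin:    ", round(observed_fast, digits=3))
```

Both numbers come out near 0.63, which is consistent with the fast mode being the "5 is present" case.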
Without the sort, we end up with more of a uniform distribution:
@benchmark 5 ∈ v setup=(v = rand(1:10000, 10000))
samples: 2461; evals/sample: 999; memory estimate: 0 bytes; allocs estimate: 0
┌ ┐
[ 0.0, 500.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 364
[ 500.0, 1000.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇ 327
[1000.0, 1500.0) ┤▇▇▇▇▇▇▇▇▇▇ 266
ns [1500.0, 2000.0) ┤▇▇▇▇▇▇▇▇ 214
[2000.0, 2500.0) ┤▇▇▇▇▇▇▇▇ 213
[2500.0, 3000.0) ┤▇▇▇▇▇ 146
[3000.0, 3500.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 931
└ ┘
Counts
min: 8.842 ns (0.00% GC); mean: 1.972 μs (0.00% GC); median: 2.154 μs (0.00% GC); max: 3.364 μs (0.00% GC).
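A rough way to see why the unsorted case spreads out: assuming the search scans from the front and stops at the first match, the work is proportional to the index of the first 5, or to the full length when 5 is absent. The sketch below simulates that index using only Base and the Random standard library. The helper steps is hypothetical and not part of either package, and step counts are a stand-in for time, not a measurement.

```julia
using Random

Random.seed!(1)  # fixed seed so the sketch is repeatable

# Hypothetical helper: number of entries a front-to-back scan for 5 inspects.
steps(v) = something(findfirst(==(5), v), length(v))

counts = [steps(rand(1:10_000, 10_000)) for _ in 1:2_000]

# Trials that scanned the whole vector (almost always because 5 is absent).
frac_full = count(==(10_000), counts) / length(counts)

# Trials that stopped in the first half of the vector.
frac_early = count(<=(5_000), counts) / length(counts)

println("full scans:            ", round(frac_full, digits=2))
println("stopped in first half: ", round(frac_early, digits=2))
```

Under this model the early stops are spread across the whole index range while a sizeable share of trials scan everything, which matches the roughly flat histogram with a spike in its last bin.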
This function, kindly supplied by Mason Protter, gives a somewhat more Gaussian distribution of times:
f() = sum((sin(i) for i in 1:round(Int, 1000 + 100*randn())))
@benchmark f()
samples: 10000; evals/sample: 1; memory estimate: 0 bytes; allocs estimate: 0
┌ ┐
[ 8000.0, 9000.0) ┤ 12
[ 9000.0, 10000.0) ┤▇ 117
[10000.0, 11000.0) ┤▇▇▇▇▇▇▇ 635
[11000.0, 12000.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 1810
[12000.0, 13000.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2959
[13000.0, 14000.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2460
ns [14000.0, 15000.0) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 1451
[15000.0, 16000.0) ┤▇▇▇▇▇ 456
[16000.0, 17000.0) ┤▇ 89
[17000.0, 18000.0) ┤ 9
[18000.0, 19000.0) ┤ 1
[19000.0, 20000.0) ┤ 0
[20000.0, 21000.0) ┤ 1
└ ┘
Counts
min: 8.109 μs (0.00% GC); mean: 12.865 μs (0.00% GC); median: 12.820 μs (0.00% GC); max: 20.459 μs (0.00% GC).
See also https://tratt.net/laurie/blog/entries/minimum_times_tend_to_mislead_when_benchmarking.html for another example of where looking at the whole histogram can be useful in benchmarking.
This page was generated using Literate.jl.