ThreadPinning.jl
Readily pin Julia threads to CPU processors
Demonstration
Dual-socket system where each CPU has 40 hardware threads (20 CPU-cores with 2-way SMT).
Check out the documentation to learn how to use ThreadPinning.jl.
Installation
Note: Only Linux is supported!
The package is registered. Hence, you can simply use
] add ThreadPinning
to add the package to your Julia environment.
Prerequisites
To gather information about the hardware topology of the system (e.g. sockets and memory domains), ThreadPinning.jl uses lscpu. The latter must therefore be available (i.e. be on PATH), which should automatically be the case on virtually all Linux systems.
In the unlikely case that lscpu isn't already installed on your system, here are a few ways to get it:
- install util-linux via your system's package manager or manually from here
- download the same as a Julia artifact: util_linux_jll.jl
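To check from within Julia whether lscpu is visible on PATH, a minimal sketch along the following lines should work. It relies only on `Sys.which` from Julia's standard library; the helper name `lscpu_available` is made up for this example and is not part of ThreadPinning.jl.

```julia
# Hypothetical helper (not part of ThreadPinning.jl):
# returns true if an executable named `lscpu` can be found on PATH.
lscpu_available() = Sys.which("lscpu") !== nothing

if lscpu_available()
    # Sys.which returns the full path of the executable it found.
    println("lscpu found at: ", Sys.which("lscpu"))
else
    println("lscpu not found; install util-linux or use util_linux_jll.jl")
end
```

If the check fails inside a job script but succeeds in an interactive shell, the two environments probably see a different PATH.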
Autoupdate setting
By default, ThreadPinning.jl queries the system topology using lscpu on startup (i.e. at runtime). This is quite costly but unfortunately necessary, since you might have precompiled the package on one machine and use it from another (think, e.g., of login and compute nodes of an HPC cluster). However, you can tell ThreadPinning.jl to permanently skip this autoupdate at runtime and to always use the system topology that was present at compile time (i.e. when precompiling the package). This is perfectly safe if you don't use the same Julia depot on different machines, in particular if you're a "standard user" who uses Julia on a desktop computer or laptop, and it can reduce the package load time significantly. To do so, simply call ThreadPinning.Prefs.set_autoupdate(false).
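In a Julia session, disabling the autoupdate could look roughly like the sketch below. It assumes ThreadPinning.jl is already installed in the active environment and uses only the call named above. Since the setting is described as permanent, running it once should be enough. That passing `true` turns the autoupdate back on is an assumption, not something stated here.

```julia
# One-time setup; assumes ThreadPinning.jl is installed in the active environment.
using ThreadPinning

# Skip the lscpu query at load time and reuse the system topology
# that was recorded when the package was precompiled.
ThreadPinning.Prefs.set_autoupdate(false)
```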
Why pin Julia threads?
Because
- it affects performance (MFlops/s), in particular on HPC clusters with multiple NUMA domains
- it allows you to utilize performance counters inside of CPU-cores for hardware-performance monitoring
- it makes performance benchmarks more reliable (i.e. less random/noisy)
- ...
Documentation
For more information, please find the documentation here.
Acknowledgements
CI infrastructure is provided by the Paderborn Center for Parallel Computing (PC²)