SIMDscan.jl

Author: schrimpf
Popularity: 2 stars
Last updated: 8 months ago
Started in: September 2023

SIMDscan

A fast scan using SIMD instructions.

A scan, or prefix operation, is a generalization of a cumulative sum. Given a sequence $x_1, x_2, \ldots, x_n$ and an associative operator $\oplus$, the scan is:

$$x_1, x_2 \oplus x_1, x_3 \oplus x_2 \oplus x_1, ... , x_n \oplus x_{n-1} \oplus \cdots \oplus x_1$$
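As a concrete illustration of this definition, here is a minimal serial sketch in plain Julia. The name `scan_reference!` is hypothetical and is not part of this package; it only shows the operand order used in the formula, where the newest element sits on the left of $\oplus$.

```julia
# Hypothetical reference helper, not part of SIMDscan.jl.
# Overwrites x[i] with x[i] ⊕ x[i-1] ⊕ ... ⊕ x[1], keeping the newest
# element on the left of the operator, as in the formula above.
function scan_reference!(op, x::AbstractVector)
    for i in (firstindex(x) + 1):lastindex(x)
        # x[i-1] already holds the scan of the earlier elements.
        x[i] = op(x[i], x[i-1])
    end
    return x
end

# With + the scan is a cumulative sum.
sums = scan_reference!(+, [1, 2, 3, 4])

# String concatenation is associative but not commutative,
# so it makes the operand order visible.
words = scan_reference!(*, ["a", "b", "c"])
```

With `+` this produces `[1, 3, 6, 10]`, and with string concatenation it produces `["a", "ba", "cba"]`.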

The scan can be parallelized when $\oplus$ is associative. This package provides an in-place scan implementation using SIMD, scan_simd!(⊕, x). For testing and performance comparison, there is also a serial implementation, scan_serial!(⊕, x).
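To see why associativity allows parallel evaluation, the following sketch uses a textbook doubling scheme (often called a Hillis-Steele scan). It is an illustration only: the name `scan_doubling!` is hypothetical, and this is not claimed to be the algorithm the package uses internally. It assumes a 1-based vector.

```julia
# Hypothetical illustration, not part of SIMDscan.jl and not necessarily
# the algorithm behind scan_simd!. Assumes x uses 1-based indexing.
function scan_doubling!(op, x::AbstractVector)
    n = length(x)
    shift = 1
    while shift < n
        prev = copy(x)  # snapshot, so every update in this pass reads old values
        # The iterations of this loop do not depend on each other,
        # so a pass could be vectorized or run in parallel.
        for i in (shift + 1):n
            # prev[i] covers elements i, i-1, ..., i-shift+1;
            # prev[i-shift] covers the shift elements before those.
            x[i] = op(prev[i], prev[i - shift])
        end
        shift *= 2
    end
    return x
end

letters = scan_doubling!(*, ["a", "b", "c", "d", "e"])
```

Regrouping the operands this way is only valid because $\oplus$ is associative. The scheme needs about $\log_2 n$ passes, but it performs more total operations than the serial loop, so whether it pays off depends on how many elements each pass can process at once.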

Usage

See the docs

Benchmarks

See the benchmarks section of the docs. With 512-bit SIMD vectors, scan_simd! appears to be about 4 times faster than the serial implementation. With 256-bit SIMD vectors, the gain is smaller but still notable. The benchmarks run on GitHub Actions, so the results and the CPU will vary from commit to commit. Performance also depends on the problem size and the $\oplus$ operator.
