GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
202 Stars
Updated Last
7 Months Ago
Started In
June 2019


Join the chat at https://julialang.zulipchat.com #sciml-bridged Global Docs Build status


ColPrac: Contributor's Guide on Collaborative Practices for Community Packages SciML Code Style

This library is a component package of the DifferentialEquations.jl ecosystem. It includes functionality for making use of GPUs in the differential equation solvers.

The two ways to accelerate ODE solvers with GPUs

There are two very different ways that one can accelerate an ODE solution with GPUs. There is one case where u is very big and f is very expensive but very structured, and you use GPUs to accelerate the computation of said f. The other use case is where u is very small but you want to solve the ODE f over many different initial conditions (u0) or parameters p. In that case, you can use GPUs to parallelize over different parameters and initial conditions. In other words:

Type of Problem SciML Solution
Accelerate a big ODE Use CUDA.jl's CuArray as u0
Solve the same ODE with many u0 and p Use DiffEqGPU.jl's EnsembleGPUArray and EnsembleGPUKernel

Supported GPUs

SciML's GPU support extends to a wide array of hardware, including:

GPU Manufacturer GPU Kernel Language Julia Support Package Backend Type
Intel OneAPI OneAPI.jl oneAPI.oneAPIBackend()
Apple (M-Series) Metal Metal.jl Metal.MetalBackend()

For this tutorial we will demonstrate the CUDA backend for NVIDIA GPUs, though any of the other GPUs can be used by simply swapping out the backend choice.

Example of Within-Method GPU Parallelism

using OrdinaryDiffEq, CUDA, LinearAlgebra
u0 = cu(rand(1000))
A = cu(randn(1000, 1000))
f(du, u, p, t) = mul!(du, A, u)
prob = ODEProblem(f, u0, (0.0f0, 1.0f0)) # Float32 is better on GPUs!
sol = solve(prob, Tsit5())

Example of Parameter-Parallelism with GPU Ensemble Methods

using DiffEqGPU, CUDA, OrdinaryDiffEq, StaticArrays

function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] *- u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)

u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p = (@SVector rand(Float32, 3)) .* p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)

@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(CUDA.CUDABackend()),
                  trajectories = 10_000, adaptive = false, dt = 0.1f0)