## DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
Popularity
283 Stars
Updated Last
13 Days Ago
Started In
June 2019

# DiffEqGPU

This library is a component package of the DifferentialEquations.jl ecosystem. It includes functionality for making use of GPUs in the differential equation solvers.

## The two ways to accelerate ODE solvers with GPUs

There are two very different ways that one can accelerate an ODE solution with GPUs. There is one case where `u` is very big and `f` is very expensive but very structured, and you use GPUs to accelerate the computation of said `f`. The other use case is where `u` is very small but you want to solve the ODE `f` over many different initial conditions (`u0`) or parameters `p`. In that case, you can use GPUs to parallelize over different parameters and initial conditions. In other words:

Type of Problem SciML Solution
Accelerate a big ODE Use CUDA.jl's CuArray as `u0`
Solve the same ODE with many `u0` and `p` Use DiffEqGPU.jl's `EnsembleGPUArray` and `EnsembleGPUKernel`

## Supported GPUs

SciML's GPU support extends to a wide array of hardware, including:

GPU Manufacturer GPU Kernel Language Julia Support Package Backend Type
NVIDIA CUDA CUDA.jl `CUDA.CUDABackend()`
AMD ROCm AMDGPU.jl `AMDGPU.ROCBackend()`
Intel OneAPI OneAPI.jl `oneAPI.oneAPIBackend()`
Apple (M-Series) Metal Metal.jl `Metal.MetalBackend()`

For this tutorial we will demonstrate the CUDA backend for NVIDIA GPUs, though any of the other GPUs can be used by simply swapping out the `backend` choice.

## Example of Within-Method GPU Parallelism

```using OrdinaryDiffEq, CUDA, LinearAlgebra
u0 = cu(rand(1000))
A = cu(randn(1000, 1000))
f(du, u, p, t) = mul!(du, A, u)
prob = ODEProblem(f, u0, (0.0f0, 1.0f0)) # Float32 is better on GPUs!
sol = solve(prob, Tsit5())```

## Example of Parameter-Parallelism with GPU Ensemble Methods

```using DiffEqGPU, CUDA, OrdinaryDiffEq, StaticArrays

function lorenz(u, p, t)
σ = p[1]
ρ = p[2]
β = p[3]
du1 = σ * (u[2] - u[1])
du2 = u[1] * (ρ - u[3]) - u[2]
du3 = u[1] * u[2] - β * u[3]
return SVector{3}(du1, du2, du3)
end

u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p = (@SVector rand(Float32, 3)) .* p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)

@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(CUDA.CUDABackend()),
trajectories = 10_000, adaptive = false, dt = 0.1f0)```

## Benchmarks

Curious about our claims? See https://github.com/utkarsh530/GPUODEBenchmarks for comparsion of our GPU solvers against CPUs and GPUs implementation in C++, JAX and PyTorch.

## Citation

If you are using `DiffEqGPU.jl` in your work, consider citing our paper:

``````@article{utkarsh2024automated,
title={Automated translation and accelerated solving of differential equations on multiple GPU platforms},
author={Utkarsh, Utkarsh and Churavy, Valentin and Ma, Yingbo and Besard, Tim and Srisuma, Prakitr and Gymnich, Tim and Gerlach, Adam R and Edelman, Alan and Barbastathis, George and Braatz, Richard D and others},
journal={Computer Methods in Applied Mechanics and Engineering},
volume={419},
pages={116591},
year={2024},
publisher={Elsevier}
}
``````

### Required Packages

View all packages

### Used By Packages

No packages found.