Convenience wrappers to incomplete factorizations from CUSPARSE to be used for iterative solvers of sparse linear systems on the GPU. The implementations are adapted from the great tutorial of the Krylov.jl package.
Supports both Incomplete Cholesky factorization with zero fill-in (ic0
)
for symmetric positive definite matrices and Incomplete LU factorization with zero fill-in (ilu0
).
The created preconditioners can be used with any iterative linear solver that supports right-preconditioning (e.g. gmres
from Krylov.jl).
The 1D Heat Equation
with homogeneous Dirichlet boundary conditions
and a backward-in-time central-in-space discretization (implicit Euler) requires the solution to a linear system of equations in each iteration
for the vector of (interior) degrees of freedom. The matrix
It is unconditionally stable for all
Let's first assemble the matrix
using SparseArrays
using LinearAlgebra
N = 10_000 # Interior DoF without the 2 boundary points
Δx = 1 / (N + 1)
Δt = 0.1
α = 1.0
A = spdiagm(-1 => -α * Δt / Δx^2 * ones(N-1),
0 => 1 .+ 2 * α * Δt / Δx^2 * ones(N),
1 => -α * Δt / Δx^2 * ones(N-1))
mesh = range(0.0, 1.0, length=N+2)
u = ifelse.((mesh .> 0.2) .& (mesh .< 0.4), 1.0, 0.0)[2:end-1]
Let's move the matrix
using CUDA
using CUDA.CUSPARSE
using CUDAPreconditioners
A_gpu = CuSparseMatrixCSR(A)
P_r_gpu = ic0(A_gpu)
Then, we can solve the linear system on the GPU using the cg
(conjugate
gradient) solver from
Krylov.jl.
using Krylov
u_gpu = CuArray(u)
# Solve without a preconditioner
u_gpu_next_no_preconditioner, solve_stats_no_preconditioner = cg(
A_gpu,
u_gpu,
)
# Solve with the preconditioner
u_gpu_next_with_preconditioner, solve_stats_with_preconditioner = cg(
A_gpu,
u_gpu,
M=P_r_gpu,
ldiv=true,
)
@show solve_stats_no_preconditioner.niter # -> 9999
@show solve_stats_with_preconditioner.niter # -> 7
The preconditioned version takes way fewer iterations to converge.
The 1D Advection Equation
with periodic boundary conditions
and a backward-in-time first-order upwind discretization (implicit Euler) requires the solution to a linear system of equations in each iteration.
If we fix
The system matrix is singular, i.e.
Again, let's first assemble the matrix
using SparseArrays
using LinearAlgebra
N = 10_000 # Interior DoF without the right periodic boundary point
Δx = 1 / N
Δt = 0.1
c = 1.0
A = spdiagm(-1 => -c * Δt / Δx * ones(N-1),
0 => 1 .+ c * Δt / Δx * ones(N),
1 => -c * Δt / Δx * ones(N-1))
A[end, 1] = -c * Δt / Δx
mesh = range(0.0, 1.0, length=N+1)
u = ifelse.((mesh .> 0.2) .& (mesh .< 0.4), 1.0, 0.0)[1:end-1]
Let's move the matrix ilu0
preconditioner here, because the matrix is not symmetric.
using CUDA
using CUDA.CUSPARSE
using CUDAPreconditioners
A_gpu = CuSparseMatrixCSR(A)
P_r_gpu = ilu0(A_gpu)
Then, we can solve the linear system on the GPU using the bicgstab
(BiConjugate Gradient Stabilized) solver from
Krylov.jl.
using Krylov
u_gpu = CuArray(u)
# Solve without a preconditioner
u_gpu_next_no_preconditioner, solve_stats_no_preconditioner = bicgstab(
A_gpu,
u_gpu,
itmax=100_000,
)
# Solve with the preconditioner
u_gpu_next_with_preconditioner, solve_stats_with_preconditioner = bicgstab(
A_gpu,
u_gpu,
N=P_r_gpu,
ldiv=true,
itmax=100_000,
)
@show solve_stats_no_preconditioner.niter # -> 58020
@show solve_stats_with_preconditioner.niter # -> 1