**WaterLily.jl** is a simple and fast fluid simulator written in pure Julia. This project is supported by awesome libraries developed within the Julia scientific community, and it aims to accelerate and enhance fluid simulations. Watch the JuliaCon2024 talk here:

If you have used WaterLily for research, please **cite us**! The 2024 paper describes the main features of the solver and provides benchmarking, validation, and profiling results.

```
@misc{WeymouthFont2024,
title = {WaterLily.jl: A differentiable and backend-agnostic Julia solver to simulate incompressible viscous flow and dynamic bodies},
author = {Gabriel D. Weymouth and Bernat Font},
url = {https://arxiv.org/abs/2407.16032},
eprint = {2407.16032},
archivePrefix = {arXiv},
year = {2024},
primaryClass = {physics.flu-dyn}
}
```

WaterLily solves the unsteady incompressible 2D or 3D Navier-Stokes equations on a Cartesian grid. The pressure Poisson equation is solved with a geometric multigrid method. Solid boundaries are modelled using the Boundary Data Immersion Method. The solver can run on serial CPU, multi-threaded CPU, or GPU backends.

WaterLily lets the user can set the domain size and boundary conditions, the fluid viscosity (which determines the Reynolds number), and immerse solid obstacles. A large selection of examples, notebooks, and tutorials are found in the WaterLily-Examples repository. Here, we will illustrate the basics by simulating and plotting the flow over a circle.

We define the size of the simulation domain as `n`

by `m`

cells. The circle has radius `m/8`

and is centered at `(m/2,m/2)`

. The flow boundary conditions are `(U,0)`

, where we set `U=1`

, and the Reynolds number is `Re=U*radius/ν`

where `ν`

(Greek "nu" U+03BD, not Latin lowercase "v") is the kinematic viscosity of the fluid.

```
using WaterLily
function circle(n,m;Re=100,U=1)
# signed distance function to circle
radius, center = m/8, m/2-1
sdf(x,t) = √sum(abs2, x .- center) - radius
Simulation((n,m), # domain size
(U,0), # domain velocity (& velocity scale)
2radius; # length scale
ν=U*2radius/Re, # fluid viscosity
body=AutoBody(sdf)) # geometry
end
```

The circle geometry is defined using a signed distance function. The `AutoBody`

function uses automatic differentiation to infer the other geometric parameters of the body automatically. Replace the circle's distance function with any other, and now you have the flow around something else... such as a donut or the Julia logo. For more complex geometries, ParametricBodies.jl defines a `body`

using any parametric curve, such as a spline. See that repo (and the video above) for examples.

The code block above return a `Simulation`

with the parameters we've defined. Now we can initialize a simulation (first line) and step it forward in time (second line)

```
circ = circle(3*2^5,2^6)
sim_step!(circ)
```

Note we've set `n,m`

to be multiples of powers of 2, which is important when using the (very fast) geometric multi-grid solver.

We can now access and plot whatever variables we like. For example, we can plot the x-component of the velocity field using

```
using Plots
u = circ.flow.u[:,:,1] # first component is x
contourf(u') # transpose the array for the plot
```

As you can see, the velocity within the circle is zero, the velocity far from the circle is one, and there are accelerated and decelerated regions around the circle. The `sim_step!`

has only taken a single time step, and this initial flow around our circle looks similar to the potential flow because the viscous boundary layer has not separated yet.

A set of flow metric functions have been implemented, and we can use them to measure the simulation. The following code block defines a function to step the simulation to time `t`

and then use the `pressure_force`

metric to measure the force on the immersed body. The function is applied over a time range, and the forces are plotted.

```
function get_forces!(sim,t)
sim_step!(sim,t,remeasure=false)
force = WaterLily.pressure_force(sim)
force./(0.5sim.L*sim.U^2) # scale the forces!
end
# Simulate through the time range and get forces
time = 1:0.1:50 # time scale is sim.L/sim.U
forces = [get_forces!(circ,t) for t in time];
#Plot it
plot(time,[first.(forces), last.(forces)],
labels=permutedims(["drag","lift"]),
xlabel="tU/L",
ylabel="Pressure force coefficients")
```

We can also plot the vorticity field instead of the u-velocity to see a snap-shot of the wake.

```
# Use curl(velocity) to compute vorticity `inside` the domain
ω = zeros(size(u));
@inside ω[I] = WaterLily.curl(3,I,circ.flow.u)*circ.L/circ.U
# Plot it
clims = (-6,6)
contourf(clamp.(ω,clims...)'; clims,
color=palette(:RdBu,9),linewidth=0,levels=8,
aspect_ratio=:equal,border=:none)
```

As you can see, WaterLily correctly predicts that the flow is unsteady, with an alternating vortex street wake, leading to an oscillating side force and drag force.

WaterLily uses KernelAbstractions.jl to multi-thread on CPU and run on GPU backends. The implementation method and speed-up are documented in the 2024 paper, with costs as low as 1.44 nano-seconds measured per degree of freedom and time step!

Note that multi-threading requires *starting* Julia with the `--threads`

argument, see the multi-threading section of the manual. If you are running Julia with multiple threads, KernelAbstractions will detect this and multi-thread the loops automatically.

Running on a GPU requires initializing the `Simulation`

memory on the GPU, and care needs to be taken to move the data back to the CPU for visualization. As an example, let's compare a **3D** GPU simulation of a sphere to the **2D** multi-threaded CPU circle defined above

```
using CUDA,WaterLily
function sphere(n,m;Re=100,U=1,T=Float64,mem=Array)
radius, center = m/8, m/2-1
body = AutoBody((x,t)->√sum(abs2, x .- center) - radius)
Simulation((n,m,m),(U,0,0), # 3D array size and BCs
2radius;ν=U*2radius/Re,body, # no change
T, # Floating point type
mem) # memory type
end
@assert CUDA.functional() # is your CUDA GPU working??
GPUsim = sphere(3*2^5,2^6;T=Float32,mem=CuArray); # 3D GPU sim!
println(length(GPUsim.flow.u)) # 1.3M degrees-of freedom!
sim_step!(GPUsim) # compile GPU code & run one step
@time sim_step!(GPUsim,50,remeasure=false) # 40s!!
CPUsim = circle(3*2^5,2^6); # 2D CPU sim
println(length(CPUsim.flow.u)) # 0.013M degrees-of freedom!
sim_step!(CPUsim) # compile GPU code & run one step
println(Threads.nthreads()) # I'm using 8 threads
@time sim_step!(CPUsim,50,remeasure=false) # 28s!!
```

As you can see, the 3D sphere set-up is almost identical to the 2D circle, but using 3D arrays means there are almost 1.3M degrees-of-freedom, 100x bigger than in 2D. Never the less, the simulation is quite fast on the GPU, only around 40% slower than the much smaller 2D simulation on a CPU with 8 threads. See the 2024 paper and the examples repo for many more non-trivial examples including running on AMD GPUs.

Finally, KernelAbstractions does incur some CPU allocations for every loop, but other than this `sim_step!`

is completely non-allocating. This is one reason why the speed-up improves as the size of the simulation increases.

We always appreciate new contributions, so please submit a pull request with your changes and help us make WaterLily even better! Note that contributions need to be submitted together with benchmark results - WaterLily should always be fast! 😃 For this, we have a fully automated benchmarking suite that conducts performance tests. In short, to compare your changes with the latest WaterLily, clone the that repo and run the benchmarks with

```
git clone https://github.com/WaterLily-jl/WaterLily-Benchmarks && cd WaterLily-Benchmarks
sh benchmark.sh -wd "<your/waterlily/path>" -w "<your_waterlily_branch> master"
julia --project compare.jl
```

This will run benchmarks for CPU and GPU backends. If you do not have a GPU, simply pass `-b "Array"`

when runnning `benchmark.sh`

. More information on the benchmark suite is available in that README.

Of course, ideas, suggestions, and questions are welcome too! Please raise an issue to address any of these.

- Immerse obstacles defined by 3D meshes (Meshing.jl)
- Multi-CPU/GPU simulations (#141)
- Free-surface physics with (Volume-of-Fluid) or other methods.