Author siravan
5 Stars
Updated Last
1 Year Ago
Started In
May 2021

Using SciML Symbolics to Solve Perturbation Problems


Symbolics.jl is a fast and modern Computer Algebra System (CAS) written in the Julia Programming Language. It is an integral part of the SciML ecosystem of differential equation solvers and scientific machine learning packages. While Symbolics.jl is primarily designed for modern scientific computing (e.g., auto-differentiation, machine learning), it is a powerful CAS and can also be useful for classic scientific computing. One such application is using the perturbation theory to solve algebraic and differential equations.

Perturbation methods are a collection of techniques to solve problems that generally don't have a closed solution but depend on a tunable parameter and have closed-form or easy solutions for some values of the parameter. The main idea is to assume a solution as a power series in the tunable parameter (say ๐œ€), such that ๐œ€ = 0 corresponds to an easy solution.

We will discuss the general steps of the perturbation methods in four examples below. One hallmark of the perturbation method is the generation of long and convoluted intermediate equations, which are subjected to algorithmic and mechanical manipulations. Therefore, these problems are well suited for CAS. In fact, CAS softwares have been used to help with the perturbation calculations since the 1950s.

In this tutorial our goal is to show how to use Julia and Symbolics.jl to solve simple perturbation problems. The code for the for examples (test_* functions) and the helper functions are in src/perturb.jl.

Solving the Quintic

We start with the "hello world!" analog of the perturbation problems: solving the quintic (fifth-order) equations. We want to find a real valued ๐‘ฅ such that ๐‘ฅโต + ๐‘ฅ = 1. According to the Abel's theorem, a general quintic equation does not have a closed form solution. Of course, we can easily solve this equation numerically; for example, by using the Newton's method. We use the following implementation of the Newton's method:

using Symbolics, SymbolicUtils

function solve_newton(f, x, xโ‚€; abstol=1e-8, maxiter=50)
    xโ‚™ = Float64(xโ‚€)
    fโ‚™โ‚Šโ‚ = x - f / Symbolics.derivative(f, x)

    for i = 1:maxiter
        xโ‚™โ‚Šโ‚ = substitute(fโ‚™โ‚Šโ‚, Dict(x => xโ‚™))
        if abs(xโ‚™โ‚Šโ‚ - xโ‚™) < abstol
            return xโ‚™โ‚Šโ‚
            xโ‚™ = xโ‚™โ‚Šโ‚
    return xโ‚™โ‚Šโ‚

In this code, Symbolics.derivative(eq, x) does exactly what it names implies: it calculates the symbolic derivative of eq (a Symbolics.jl expression) with respect to x (a Symbolics.jl variable). We use Symbolics.substitute(eq, D) to evaluate the update formula by substituting variables or sub-expressions (defined as a dictionary D) in eq. It should be noted that substitute is the workhorse of our code and will be used multiple times in the rest of this tutorial. solve_newton is written with simplicity and clarity in mind and not performance but suffices for our purpose.

Let's go back to our quintic. We can define a Symbolics variable as @variables x and then solve the equation solve_newton(x^5 + x - 1, x, 1.0) (here, xโ‚€ = 0 is our first guess). The answer is x = 0.7549. Now, let's see how we can solve this problem using the perturbation method.

We introduce a tuning parameter ๐œ€ into our equation: ๐‘ฅโต + ๐‘ฅ = 1. If ๐œ€ = 1, we get our original problem. For ๐œ€ = 0, the problem transforms to an easy one: ๐‘ฅโต = 1 which has a solution ๐‘ฅ = 1 (and four complex solutions which we ignore here). We expand ๐‘ฅ as a power series on ๐œ€:

๐‘ฅ(๐œ€) = ๐‘Žโ‚€ + ๐‘Žโ‚๐œ€ + ๐‘Žโ‚‚๐œ€ยฒ + ๐‘‚(๐œ€ยณ)

๐‘Žโ‚€ is the solution of the easy equation, therefore ๐‘Žโ‚€ = 1. Substituting into the original problem,

(1 + ๐‘Žโ‚๐œ€ + ๐‘Žโ‚‚๐œ€ยฒ)โต + ๐œ€ (1 + ๐‘Žโ‚๐œ€ + ๐‘Žโ‚‚๐œ€ยฒ) - 1 = 0

Expanding the equations, we get

๐œ€ (1 + 5๐‘Žโ‚) + ๐œ€ยฒ (๐‘Žโ‚ + 5๐‘Žโ‚‚ + 10๐‘Žโ‚ยฒ) + ๐‘‚(๐œ€ยณ) = 0

This equation should hold for each power of ๐œ€. Therefore,

1 + 5๐‘Žโ‚ = 0,


๐‘Žโ‚ + 5๐‘Žโ‚‚ + 10๐‘Žโ‚ยฒ = 0.

This system of equations does not initially seem to be linear because of the presence of terms like 10๐‘Žโ‚ยฒ, but upon closer inspection is found to be in fact linear (this is a feature of the permutation method). In addition, the system is in a triangular form, meaning the first equation depends only on ๐‘Žโ‚, the second one on ๐‘Žโ‚ and ๐‘Žโ‚‚, such that we can replace the result of ๐‘Žโ‚ from the first one into the second equation and remove the non-linear term. We solve the first equation to get ๐‘Žโ‚ = -1/5. Substituting in the second one and solve for ๐‘Žโ‚‚:

๐‘Žโ‚‚ = (-1/5 + 10(-(1/5)ยฒ) / 5 = -1/25


๐‘ฅ(๐œ€) = 1 - ๐œ€ / 5 - ๐œ€ยฒ / 25 + ๐‘‚(๐œ€ยณ)

Solving the original problem, ๐‘ฅ(1) = 0.76, compared to 0.75487767 calculated numerically. We can improve the accuracy by including more terms in the expansion of ๐‘ฅ. However, the calculations, while straightforward, become messy and intractable to do manually very quickly. This is why a CAS is very helpful to solve perturbation problems.

Now, let's see how we can do these calculations in Julia (see test_quintic function). Let n = 2 be the order of the expansion. We start by defining the symbolic variables:

  @variables @variables ฯต a[1:n]        

Then, we define x = 1 + a[1]*ฯต + a[2]*ฯต^2. Note that in test_quintic we use the helper function def_taylor to define x by calling it as x = def_taylor(ฯต, a, 1), where the arguments are the expansion variable, an array of parameters, and the constant term. The next step is to substitute x in the problem equation eq = x^5 + ฯต*x - 1. Now, eq is

  ฯต*(1 + aโ‚*ฯต + aโ‚‚*(ฯต^2)) + (1 + aโ‚*ฯต + aโ‚‚*(ฯต^2))^5 - 1

Or in the expanded form (calculated as expand(eq)):

ฯต + aโ‚*(ฯต^2) + aโ‚‚*(ฯต^3) + (aโ‚^5)*(ฯต^5) + (aโ‚‚^5)*(ฯต^10) + 5aโ‚*ฯต + 5aโ‚‚*(ฯต^2) +
10(aโ‚^2)*(ฯต^2) + 10(aโ‚^3)*(ฯต^3) + 5(aโ‚^4)*(ฯต^4) + 10(aโ‚‚^2)*(ฯต^4) +
10(aโ‚‚^3)*(ฯต^6) + 5(aโ‚‚^4)*(ฯต^8) + 20aโ‚*aโ‚‚*(ฯต^3) + 30aโ‚*(aโ‚‚^2)*(ฯต^5) +
20aโ‚*(aโ‚‚^3)*(ฯต^7) + 5aโ‚*(aโ‚‚^4)*(ฯต^9) + 30aโ‚‚*(aโ‚^2)*(ฯต^4) + 20aโ‚‚*(aโ‚^3)*(ฯต^5) +
5aโ‚‚*(aโ‚^4)*(ฯต^6) + 30(aโ‚^2)*(aโ‚‚^2)*(ฯต^6) + 10(aโ‚^2)*(aโ‚‚^3)*(ฯต^8) +

We need a way to get the coefficients of different powers of ๐œ€. Function collect_powers(eq, x, ns) returns the powers of variable x in expression eq. Argument ns is the range of the powers.

function collect_powers(eq, x, ns; max_power=100)
    eq = substitute(expand(eq), Dict(x^j => 0 for j=last(ns)+1:max_power))

    eqs = []
    for i in ns
        powers = Dict(x^j => (i==j ? 1 : 0) for j=1:last(ns))
        push!(eqs, substitute(eq, powers))

For example, collect_powers(eq, ฯต, 1:2) means the coefficients of ฯต and ฯต^2 in eq and returns a list of expressions [1 + 5aโ‚, aโ‚ + 5aโ‚‚ + 10(aโ‚^2)]. We assign this list to eqs.

collect_powers uses substitute to find the coefficient of a given power of x by passing a Dict with all powers of x set to 0, except the target power which is set to 1. The following experssion returns the coefficient of ฯต^2 in eq,

  substitute(expand(eq), Dict(
    ฯต => 0,
    ฯต^2 => 1,
    ฯต^3 => 0,
    ฯต^4 => 0,
    ฯต^5 => 0,
    ฯต^6 => 0,
    ฯต^7 => 0,
    ฯต^8 => 0)

Back to our problem. Having the coefficients of the powers of ฯต, we can set each equation to 0 and solve the system of linear equations to find the numerical values of the coefficients. Symbolics.jl has a function Symbolics.solve_for that can solve systems of linear equations. We can start by solving eqs[1] for aโ‚ and then substitute this in eqs[2] and solve for aโ‚‚. This process is done by function solve_coef(eqs, ps):

function solve_coef(eqs, ps)
    vals = Dict()

    for i = 1:length(ps)
        eq = substitute(eqs[i], vals)
        vals[ps[i]] = Symbolics.solve_for(eq ~ 0, ps[i])

Here, eqs is an array of expressions (assumed to be equal to 0) and ps is an array of variables. The result is a dictionary of variable => value pairs. For example, solve_coef(eqs, a) returns Dict(aโ‚ => -0.2, aโ‚‚ => -0.04). Substituting back in the definition of x, i.e., x = 1 + a[1]*ฯต + a[2]*ฯต^2, we obtain X = ฯต -> 1 - 0.2*ฯต - 0.04*ฯต^2. Therefore, the solution to our original problem becomes X(1), which is equal to 0.76.

We can use larger values of n to improve the accuracy of estimations:

n x
1 0.8
2 0.76
3 0.752
4 0.752
5 0.7533
6 0.7543
7 0.7548
8 0.7550

Remember the numerical value is 0.7549.

The two functions collect_powers and solve_coef(eqs, a) are used in all the examples in this tutorial.

Solving the Kepler's Equation

Historically, the perturbation methods were first invented to solve orbital calculations needed to calculate the orbit of the Moon and the planets. In homage to this history, our second example has a celestial theme. Our goal is solve the Kepler's equation:

๐ธ - ๐‘’ sin(๐ธ) = ๐‘€.

where ๐‘’ is the eccentricity of the elliptical orbit, ๐‘€ is the mean anomaly, and ๐ธ (unknown) is the eccentric anomaly (the angle between the position of a planet in an elliptical orbit and the point of periapsis). This equation is central to solving two-body Keplerian orbits. We want to find a function ๐ธ(๐‘€; ๐‘’).

Similar to the first example, it is easy to solve this problem using the Newton's method. For example, let ๐‘’ = 0.01671 (the eccentricity of the Earth) and ๐‘€ = ฯ€/2. We have solve_newton(x - e*sin(x) - M, x, M) equals to 1.5875 (compared to ฯ€/2 = 1.5708). Now, we try to solve the same problem using the perturbation techniques (see function test_kepler).

For ๐‘’ = 0, we have ๐ธ = ๐‘€. Therefore, we can use ๐‘’ as our perturbation parameter. For consistency, we rename it to ๐œ€. We start by defining the variables (assuming n = 3):

  @variables ฯต M a[1:n]
  x = def_taylor(ฯต, n, M)  

The problem equation is eq = E - ฯต * sin(E) - M. We further simplify by substituting sin with its power series (using expand_sin helper function):

sin(๐ธ) = ๐‘ฅ - ๐‘ฅยณ / 6 + ๐‘ฅโต / 120 - ๐‘ฅโท / 5040 + ๐‘‚(๐‘ฅโน).

We follow the same algorithm as before. We collect the coefficients of the powers of ๐œ€:

  eqs = collect_powers(eq, ฯต, 1:n)

and then solve for a:

  vals = solve_coef(eqs, a)

Finally, we substitute vals back in x to get a formula to calculate E:

  X = substitute(x, vals)
  substitute(X, Dict(ฯต => 0.01671, M => ฯ€/2))

The result is 1.5876, compared to the numerical value of 1.5875. It is customary to order X based on the powers of M instead of ฯต. We can calculate this series as collect_powers(sol, M, 0:3) . The result (after manual cleanup) is

  E(M, ฯต) =
    (1 + ฯต + ฯต^2 + ฯต^3)*M
    - (ฯต + 4*ฯต^2 + 10*ฯต^3)*M^3/6
    + (ฯต + 16*ฯต^2 + 91*ฯต^3)*M^5/120

Comparing the formula to the one for ๐ธ in the Wikipedia article on the Kepler's equation:

The first deviation is in the coefficient of ๐œ€ยณ๐‘€โต.

The Trajectory of a Ball!

In the first two examples, we applied the permutation method to algebraic problems. However, the main power of the permutation method is to solve differential equations (usually ODEs, but also occasionally PDEs). Surprisingly, the main procedure developed to solve algebraic problems works well for differential equations. In fact, we will use the same two helper functions, collect_powers and solve_coef. The main difference is in the way we expand the dependent variables. For algebraic problems, the coefficients of ๐œ€ are constants; whereas, for differential equations, they are functions of the dependent variable (usually time).

For the first example on how to solve an ODE, we have chosen a simple and well-behaved problem. The problem is a variation of a standard first-year physics problem: what is the trajectory of an object (say, a ball or a rocket) thrown vertically at velocity ๐‘ฃ from the surface of a planet. Assuming a constant acceleration of gravity, ๐‘”, every burgeoning physicist knows the answer: ๐‘ฅ(๐‘ก) = ๐‘ฅ(0) + ๐‘ฃ๐‘ก - ๐‘”๐‘กยฒ/2. However, what happens if ๐‘” is not constant? Specifically, ๐‘” is inversely proportional to the distant from the center of the planet. If ๐‘ฃ is large, the assumption of constant gravity does not hold. However, unless ๐‘ฃ is large compared to the escape velocity, the correction is usually small. After simplifications and reframing in dimensionless variables, the problem becomes ๐‘ฅฬˆ(๐‘ก) = -(1 + ๐œ€๐‘ฅ(๐‘ก))โปยฒ, assuming ๐‘ฅ(0) = 0, and ๐‘ฅฬ‡(0) = 1. Note that for ๐œ€ = 0, this equation transforms back to the standard one.

Let's start with defining the variables

  @variables ฯต t y[1:n](t) โˆ‚โˆ‚y[1:n]

Next, we define ๐‘ฅ (for n = 3):

  x = y[1] + y[2]*ฯต + y[3]*ฯต^2

We need the second derivative of x. It may seem that we can do this using Differential(t); however, this operation needs to wait! Instead, we define dummy variables โˆ‚โˆ‚y as the placeholder for the second derivatives and define

  โˆ‚โˆ‚x = โˆ‚โˆ‚y[1] + โˆ‚โˆ‚y[2]*ฯต + โˆ‚โˆ‚y[3]*ฯต^2

as the second derivative of x. After rearrangement, our governing equation is ๐‘ฅฬˆ(๐‘ก)(1 + ๐œ€๐‘ฅ(๐‘ก))ยฒ + 1 = 0, or

  eq = โˆ‚โˆ‚x * (1 + ฯต*x)^2 + 1

The next steps are the same as before (however, note that we pass 0:n-1 to collect_powers because the zeroth order term is needed here)

  eqs = collect_powers(eq, ฯต, 0:n-1)
  vals = solve_coef(eqs, โˆ‚โˆ‚y)

At this stage,

  vals = Dict(
    โˆ‚โˆ‚yโ‚ => -1.0,
    โˆ‚โˆ‚yโ‚‚ => 2.0yโ‚(t),
    โˆ‚โˆ‚yโ‚ƒ => 2.0yโ‚‚(t) - (3.0(yโ‚(t)^2))

Our system of ODEs is forming. Note the triangular form of the relationship. This is time to convert โˆ‚โˆ‚s to the correct Symbolics.jl form by substitution:

  D = Differential(t)
  subs = Dict(โˆ‚โˆ‚y[i] => D(D(y[i])) for i = 1:n)
  eqs = [substitute(first(v), subs) ~ substitute(last(v), subs) for v in vals]

Now, eqs becomes

  [Differential(t)(Differential(t)(yโ‚(t))) ~ -1.0,
   Differential(t)(Differential(t)(yโ‚‚(t))) ~ 2.0yโ‚(t),
   Differential(t)(Differential(t)(yโ‚ƒ(t))) ~ 2.0yโ‚‚(t) - (3.0(yโ‚(t)^2))]

We are nearly there! From this point on, the rest is standard ODE solving procedures. Potentially we can use a symbolic ODE solver to find a closed form solution to this problem. However, Symbolics.jl currently does not support this functionality. Instead, we solve the problem numerically. We form an ODESystem, lower the order (convert second derivatives to first), generate an ODEProblem (after passing the correct initial conditions), and, finally, solve it.

  using ModelingToolkit, DifferentialEquations

  sys = ODESystem(eqs, t)
  sys = ode_order_lowering(sys)
  prob = ODEProblem(sys, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0], (0, 5.0))
  sol = solve(prob; dtmax=0.01)

The solution to the problem can be written as

  X = ฯต -> sol[y[1]] .+ sol[y[2]] * ฯต .+ sol[y[3]] * ฯต^2

The following figure is generated by running

  using Plots

  plot(sol.t, hcat([X(ฯต) for ฯต = 0.0:0.1:0.5]...))    

and shows the trajectories for a range of ฯต:

As expected, the higher ฯต is (meaning the gravity is less with altitude), the object goes higher and stays up for a longer duration. Of course, we could have solved the problem directly using as ODE solver. One of the benefits of the perturbation method is that we need to run the ODE solver only once and then can just calculate the answer for different values of ฯต; whereas, if we were using the direct method, we needed to run the solver once for each value of ฯต.

A Weakly Nonlinear Oscillator

For our final example, we have chosen a simple example from a very important class of problems, the nonlinear oscillators. As we will see, perturbation theory has difficulty providing a good solution to this problem, but the process is instructive. This example follows closely chapter 7.6 of Nonlinear Dynamics and Chaos by Steven Strogatz.

The problem is to solve ๐‘ฅฬˆ(๐‘ก) + 2๐œ€๐‘ฅฬ‡ + ๐‘ฅ = 0, assuming ๐‘ฅ(0) = 0, and ๐‘ฅฬ‡(0) = 1. If ๐œ€ = 0, the problem reduces to the simple linear harmonic oscillator with the exact solution ๐‘ฅ(t) = sin(๐‘ก). We follow the same steps as the previous example.

  @variables ฯต t y[1:n](t) โˆ‚y[1:n] โˆ‚โˆ‚y[1:n] # n = 3
  x = y[1] + y[2]*ฯต + y[3]*ฯต^2
  โˆ‚x = โˆ‚y[1] + โˆ‚y[2]*ฯต + โˆ‚y[3]*ฯต^2
  โˆ‚โˆ‚x = โˆ‚โˆ‚y[1] + โˆ‚โˆ‚y[2]*ฯต + โˆ‚โˆ‚y[3]*ฯต^2

Note that now we also need the first derivative terms. Continuing,

  eq = โˆ‚โˆ‚x + 2*ฯต*โˆ‚x + x
  eqs = collect_powers(eq, ฯต, 0:n-1)
  vals = solve_coef(eqs, โˆ‚โˆ‚y)

Let's inspect vals:

  vals = Dict(
    โˆ‚โˆ‚yโ‚ => -yโ‚(t),
    โˆ‚โˆ‚yโ‚‚ => -2.0โˆ‚yโ‚ - yโ‚‚(t),
    โˆ‚โˆ‚yโ‚ƒ => -2.0โˆ‚yโ‚‚ - yโ‚ƒ(t))

Next, we need to replace โˆ‚s and โˆ‚โˆ‚s with their Symbolics.jl counterparts:

  D = Differential(t)
  subs1 = Dict(โˆ‚y[i] => D(y[i]) for i = 1:n)
  subs2 = Dict(โˆ‚โˆ‚y[i] => D(D(y[i])) for i = 1:n)
  subs = subs1 โˆช subs2
  eqs = [substitute(first(v), subs) ~ substitute(last(v), subs) for v in vals]

We continue with converting to an ODEProblem, solving it, and finally plot the results against the exact solution to the original problem.

  sys = ODESystem(eqs, t)
  sys = ode_order_lowering(sys)
  prob = ODEProblem(sys, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0], (0, 50.0))
  sol = solve(prob; dtmax=0.01)

  T = sol.t
  X = ฯต -> sol[y[1]] .+ sol[y[2]] * ฯต .+ sol[y[3]] * ฯต^2
  Y = ฯต -> exp.(-ฯต*T) .* sin.(sqrt(1 - ฯต^2)*T) / sqrt(1 - ฯต^2)    # exact solution

  plot(sol.t, [Y(0.1), X(0.1)])

The result is (compare to Figure 7.6.2 in Nonlinear Dynamics and Chaos)

The two curves fit well for the first couple of cycles, but then the perturbation method curve diverges from the true solution. The main reason is that the problem has two or more time-scales.