# Richardson package for Julia

The `Richardson`

package provides a function `extrapolate`

that
extrapolates a given function `f(x)`

to `f(x0)`

, evaluating
`f`

only at a geometric sequence of points `> x0`

(or optionally `< x0`

).

The key algorithm is Richardson extrapolation using a Neville–Aitken
tableau, which adaptively increases the degree of an extrapolation
polynomial until convergence is achieved to a desired tolerance
(or convergence stalls due to e.g. floating-point errors). This
allows one to obtain `f(x0)`

to high-order accuracy, assuming
that `f(x0+h)`

has a Taylor series or some other power
series in `h`

. (See e.g. these course notes by Prof. Flaherty at RPI.)

## Usage

```
extrapolate(f, h; contract=0.125, x0=zero(h), power=1,
atol=0, rtol=atol>0 ? 0 : sqrt(ε), maxeval=typemax(Int), breaktol=2)
```

Extrapolate `f(x)`

to `f₀ ≈ f(x0)`

, evaluating `f`

only at `x > x0`

points
(or `x < x0`

if `h < 0`

) using Richardson extrapolation starting at
`x=x₀+h`

. It returns a tuple `(f₀, err)`

of the estimated `f(x0)`

and an error estimate.

The return value of `f`

can be any type supporting `±`

and `norm`

operations (i.e. a normed vector space).
Similarly, `h`

and `x0`

can be in any normed vector space,
in which case `extrapolate`

performs Richardson extrapolation
of `f(x0+s*h)`

to `s=0⁺`

(i.e. it takes the limit as `x`

goes
to `x0`

along the `h`

direction).

On each step of Richardson extrapolation, it shrinks `x-x0`

by
a factor of `contract`

, stopping when the estimated error is
`< max(rtol*norm(f₀), atol)`

, when the estimated error
increases by more than `breaktol`

(e.g. due to numerical errors in the
computation of `f`

), when `f`

returns a non-finite value (`NaN`

or `Inf`

),
or when `f`

has been evaluated `maxeval`

times. Note that
if the function may converge to zero, you may want
specify a nonzero `atol`

(which cannot be set by default
because it depends on the scale/units of `f`

); alternatively,
in such cases `extrapolate`

will halt when it becomes
limited by the floating-point precision. (Passing `breaktol=Inf`

can be useful to force `extrapolate`

to continue shrinking `h`

even
if polynomial extrapolation is initially failing to converge,
possibly at the cost of extraneous function evaluations.)

If `x0 = ±∞`

(`±Inf`

), then `extrapolate`

computes the limit of
`f(x)`

as `x ⟶ ±∞`

using geometrically *increasing* values
of `h`

(by factors of `1/contract`

).

In general, the starting `h`

should be large enough that `f(x0+h)`

can be computed accurately and efficiently (e.g. without
severe cancellation errors), but small enough that `f`

does not
oscillate much between `x0`

and `x0+h`

. i.e. `h`

should be a typical
scale over which the function `f`

varies significantly.

Technically, Richardson extrapolation assumes that `f(x0+h)`

can
be expanded in a power series in `h^power`

, where the default
`power=1`

corresponds to an ordinary Taylor series (i.e. assuming
`f`

is analytic at `x0`

). If this is not true, you may obtain
slow convergence from `extrapolate`

, but you can pass a different
value of `power`

(e.g. `power=0.5`

) if your `f`

has some different
(Puiseux) power-series expansion. Conversely, if `f`

is
an *even* function around `x0`

, i.e. `f(x0+h) == f(x0-h)`

,
so that its Taylor series contains only *even* powers of `h`

,
you can accelerate convergence by passing `power=2`

.

`extrapolate(fh_itr; power=1, atol=0, rtol=0, maxeval=typemax(Int), breaktol=Inf)`

Similar to `extrapolate(f, h)`

, performs Richardson extrapolation of a sequence of
values `f(h)`

to `h → 0`

, but takes an iterable collection `fh_itr`

of a
sequence of `(f(h), h)`

tuples (in order of decreasing `|h|`

).

There is no `contract`

keyword argument since the contraction factors are determined
by the sequence of `h`

values (which need not contract by the same amount). The
tolerances `atol`

and `rtol`

both default to `0`

so that by default it examines
*all* of the values in the `fh_itr`

collection. Otherwise, the keyword arguments
have the same meanings as in `extrapolate(f, h)`

.

`extrapolate!(fh::AbstractVector; power=1, atol=0, rtol=0, maxeval=typemax(Int), breaktol=Inf)`

Similar to `extrapolate(fh_itr)`

, performs Richardson extrapolation on an array `fh`

of `(f(h), h)`

tuples (in order of decreasing `|h|`

), but overwrites the array
`fh`

in-place with intermediate calculations.

(Thus, the array `fh`

must be a vector of `Tuple{T,H}`

values, where `H<:Number`

is
the type of `h`

and `T`

is the type of the extrapolated `f(0)`

**result**. This `T`

should be a floating-point type, i.e. `fh`

should contain `float(f(h))`

if the
function you are extrapolating is not already floating-point-valued.)

## Examples

For example, let's extrapolate `sin(x)/x`

to `x=0`

(where the correct answer is `1`

) starting at `x=1`

, printing out the `x`

value at each step so that we can see what the algorithm is doing.

(Since `f`

is passed as the first argument to `extrapolate`

, we
can use Julia's `do`

syntax to conveniently define a multi-line
anonymous function to pass.)

```
extrapolate(1.0, rtol=1e-10) do x
@show x
sin(x)/x
end
```

giving the output:

```
x = 1.0
x = 0.125
x = 0.015625
x = 0.001953125
x = 0.000244140625
x = 3.0517578125e-5
(1.0000000000000002, 2.0838886172214188e-13)
```

That is, it evaluates our function `sin(x)/x`

for 6 different values of `x`

and returns `1.0000000000000002`

, which is accurate to machine precision (the error is `≈ 2.2e-16`

). The returned error estimate of `2e-13`

is conservative, which is typical for extrapolating well-behaved functions.

Since `sin(x)/x`

is an *even* (symmetric) function around `x=0`

,
its Taylor series contains only even powers of `x`

. We can
exploit this fact to *accelerate convergence for even functions* by
passing `power=2`

to `extrapolate`

:

```
extrapolate(1.0, rtol=1e-10, power=2) do x
@show x
sin(x)/x
end
```

gives

```
x = 1.0
x = 0.125
x = 0.015625
x = 0.001953125
x = 0.000244140625
(1.0, 0.0)
```

which converged to machine precision (in fact, the exact result) in only 5 function evaluations (1 fewer than above).

### Infinite limits

Using the `x0`

keyword argument, we can compute the limit of `f(x)`

as `x ⟶ x0`

. In fact, you can pass `x0 = Inf`

to compute a limit as
`x ⟶ ∞`

(which is accomplished internally by a change of variables `x = 1/u`

and performing Richardson extrapolation to `u=0`

). For example:

```
extrapolate(1.0, x0=Inf) do x
@show x
(x^2 + 3x - 2) / (x^2 + 5)
end
```

gives

```
x = 1.0
x = 8.0
x = 64.0
x = 512.0
x = 4096.0
x = 32768.0
x = 262144.0
(1.0000000000000002, 1.2938539128981574e-12)
```

which is the correct result (`1.0`

) to machine precision.

#### Extrapolating series

One nice application of infinite limits is extrapolating infinite series. If we start with an integer `x`

, then the default `contract=0.125`

will increase `x`

by a factor of `8.0`

on each iteration, so `x`

will always be an exact integer and we can use it as the number of terms in a series.

For example, suppose we are computing the infinite series `1/1² + 1/2² + 1/3² + ⋯`

. This is the famous Basel problem, and it converges to `π²/6 ≈ 1.644934066848226…`

. If we compute it by brute force, however, we need quite a few terms to get high accuracy:

```
julia> sum(n -> 1/n^2, 1:100) - π^2/6
-0.009950166663333482
julia> sum(n -> 1/n^2, 1:10^4) - π^2/6
-9.99950001654426e-5
julia> sum(n -> 1/n^2, 1:10^9) - π^2/6
-9.999985284281365e-10
```

Even with 10⁹ terms we get only about 9 digits. Instead, we can use `extrapolate`

(starting at 1 term):

```
julia> val, err = extrapolate(1, x0=Inf) do N
@show N
sum(n -> 1/n^2, 1:Int(N))
end
N = 1.0
N = 8.0
N = 64.0
N = 512.0
N = 4096.0
N = 32768.0
(1.6449340668482288, 4.384936858059518e-12)
julia> (val - π^2/6)/(π^2/6)
1.4848562646983628e-15
```

By `32768`

terms, the extrapolated value is accurate to about 15 digits.

### Numerical derivatives

A classic application of Richardson extrapolation is the accurate evaluation of derivatives via finite-difference approximations (although analytical derivatives, e.g. by automatic differentiation, are typically vastly more efficient when they are available). In this example, we use Richardson extrapolation on the forward-difference approximation `f'(x) ≈ (f(x+h)-f(x))/h`

, for which the error decreases as `O(h)`

but a naive application to a very small `h`

will yield a huge cancellation error from floating-point roundoff effects. We differentiate `f(x)=sin(x)`

at `x=1`

, for which the correct answer is `cos(1) ≈ 0.5403023058681397174009366...`

, starting with `h=0.1`

```
extrapolate(0.1, rtol=0) do h
@show h
(sin(1+h) - sin(1)) / h
end
```

Although we gave an `rtol`

of `0`

, the `extrapolate`

function will terminate after a finite number of steps when it detects that improvements are limited by floating-point error:

```
h = 0.1
h = 0.0125
h = 0.0015625
h = 0.0001953125
h = 2.44140625e-5
h = 3.0517578125e-6
(0.5403023058683176, 1.7075230118734908e-12)
```

The output `0.5403023058683176`

differs from `cos(1)`

by `≈ 1.779e-13`

, so in this case the returned error estimate is only a little conservative. Unlike the `sin(x)/x`

example, `extrapolate`

is not able
to attain machine precision (the floating-point cancellation error in this function is quite severe for small `h`

!), but it is able to get surprisingly close.

Another possibility for a finite-difference/Richardson combination was suggested by Ridders (1982), who computed both `f'(x)`

and `f''(x)`

(the first and second derivatives) simultaneously using a center-difference approximation, which requires two new `f(x)`

evaluations for each `h`

. In particular, the center-difference approximations are `f'(x) ≈ (f(x+h)-f(x-h))/2h`

and `f''(x) ≈ (f(x+h)-2f(x)+f(x-h))/h²`

, both of which have errors that go as `O(h²)`

. We can plug both of these functions *simultaneously* into `extrapolate`

(so that they share `f(x±h)`

evaluations) by using a vector-valued function returning `[f', f'']`

. Moreover, since these center-difference approximations are even functions of `h`

(identical for `±h`

), we can pass `power=2`

to `extrapolate`

in order to exploit the even-power Taylor expansion. Here is a function implementing both of these ideas:

```
# returns (f'(x), f''(x))
function riddersderiv2(f, x, h; atol=0, rtol=atol>0 ? 0 : sqrt(eps(typeof(float(real(x+h))))), contract=0.5)
f₀ = f(x)
val, err = extrapolate(h, atol=atol, rtol=rtol, contract=contract, power=2) do h
f₊, f₋ = f(x+h), f(x-h)
[(f₊-f₋)/2h, (f₊-2f₀+f₋)/h^2]
end
return val[1], val[2]
end
```

(This code could be made even more efficient by using StaticArrays.jl for the `[f', f'']`

vector.) The original paper by Ridders accomplishes something similar in `< 20`

lines of TI-59 calculator code, by the way; so much for high-level languages!

For example,

```
julia> riddersderiv2(1, 0.1, rtol=0) do x
@show x
sin(x)
end
x = 1
x = 1.1
x = 0.9
x = 1.05
x = 0.95
x = 1.025
x = 0.975
x = 1.0125
x = 0.9875
x = 1.00625
x = 0.99375
(0.5403023058681394, -0.841470984807975)
```

evaluates the first and second derivatives of `sin(x)`

at `x=1`

and obtains the correct answer `(cos(1), -sin(1))`

to about 15 and 13 decimal digits, respectively, using 11 function evaluations.

### Handling problematic convergence

It is useful to consider a finite-difference approximation for the derivative of
the function `1/x`

at some `x ≠ 0`

: i.e. computing the limit of `f(h) = (1/(x+h) - 1/x) / h`

as `h`

goes to zero similar to above.

This function `f(h)`

has a pole at `h=-x`

, i.e. `f(-x)`

blows up. This means
that the Taylor series of `f(h)`

only converges for `h`

values small enough to avoid this pole, and in
fact for `|h| < |x|`

. Since Richardson extrapolation is essentially
approximating the Taylor series, this means that the extrapolation process doesn't
converge if the starting `h`

is too large, and `extrapolation`

will give up and
halt with the wrong answer.

This lack of convergence is easily observed: set `x=0.01`

(where the correct derivative of `1/x`

is `-10000`

) and consider what happens for a starting `h`

that is too large compared to `|x|`

:

```
julia> extrapolate(1.0) do h
@show h
x = 0.01
(1/(x+h) - 1/x) / h
end
h = 1.0
h = 0.125
h = 0.015625
(-832.4165749908325, 733.4066740007335)
```

Before reaching an `|h| < 0.01`

where the power series could begin to converge, `extrapolate`

gave up and returned a wrong answer (with a large error estimate to let you know that the result is garbage)! In contrast, if we start with a small enough `h`

then it converges just fine and returns the correct answer (`-10000`

) to nearly machine precision:

```
julia> extrapolate(0.01) do h
@show h
x = 0.01
(1/(x+h) - 1/x) / h
end
h = 0.01
h = 0.00125
h = 0.00015625
h = 1.953125e-5
h = 2.44140625e-6
h = 3.0517578125e-7
(-10000.000000000211, 4.066770998178981e-6)
```

Of course, if you know that your function blows up like this, it is easy to choose good initial `h`

, but how can we persuade `extrapolate`

to do a better job automatically?

The trick is to use the `breaktol`

keyword argument. `breaktol`

defaults to `2`

,
which means that `extrapolate`

gives up if the best error estimate increases by
more than a factor of 2 from one iteration to the next. Ordinarily, this
kind of breakdown in convergence arises because you hit the limits of floating-point precision, and halting extrapolation is the right thing to do. Here, however, it would converge if we just continued shrinking `h`

. So, we simply set `breaktol=Inf`

to force extrapolation to continue, which works even for a large initial `h=1000.0`

:

```
julia> extrapolate(1000.0, breaktol=Inf) do h
@show h
x = 0.01
(1/(x+h) - 1/x) / h
end
h = 1000.0
h = 125.0
h = 15.625
h = 1.953125
h = 0.244140625
h = 0.030517578125
h = 0.003814697265625
h = 0.000476837158203125
h = 5.9604644775390625e-5
h = 7.450580596923828e-6
h = 9.313225746154785e-7
h = 1.1641532182693481e-7
(-10000.000000029328, 5.8616933529265225e-8)
```

Not that it continues extrapolating until it reaches small `h`

values where the power-series converges, and in the end it again returns the correct answer to nearly machine precision (and would reach machine precision if we set a smaller `rtol`

). (The `extrapolate`

function *automatically* discards the initial points where the polynomial extrapolation fails.)