This package implements tools for integrating out (marginalizing) parameters
from log-probability functions, such as likelihoods and posteriors of statistical
models. This approach is useful for fitting models that incorporate random effects or
latent variables that are important for structuring the model, but whose exact
values may not be of interest in and of themselves. In the language of regression,
these are known as "random effects," and in other contexts, they may be called "nuisance
parameters." Whatever they are called, we hope that our log-density function can be
optimized faster and/or more reliably if we focus on the parameters we
are *actually* interested in, averaging the log-density over all possible values of
the ones we are not.

A basic example is shown below. We start out by writing a function for our target log-
density. This function must have the signature `f(u, data)`

, where `u`

is a numeric vector
of parameters and `data`

contains any data, or fixed parameters, needed by the function.
The `data`

argument can be anything you'd like, such as a `Vector`

, `NamedTuple`

,
`DataFrame`

, or some other `struct`

. (If your function does not depend on any external data,
just ignore that argument in the function body)

⚠️ Note that the convention for this package is to write the function as apositivelog-density.

```
using MarginalLogDensities, Distributions, LinearAlgebra
N = 3
σ = 1.5
dist = MvNormal(σ^2 * I(3))
data = (N=N, dist=dist)
function logdensity(u, data)
return logpdf(data.dist, u)
end
```

We then set up a marginalized version of this function. First, we set an initial
value for our parameter vector `u`

, as well as a set of indices `iw`

indicating
the subset of `u`

that we want to integrate out.

```
u = [1.1, -0.3, 0.1] # arbitrary initial values
iw = [1, 3]
```

We create the marginalized version of the function by calling the `MarginalLogDensity`

constructor with the original function `logdensity`

, the full parameter vector `u`

, the
indices of the marginalized values `iw`

, and the `data`

. (If your function does not depend on
the `data`

argument, it can be omitted here.)

`marginal_logdensity = MarginalLogDensity(logdensity, u, iw, data)`

Here, we are saying that we want to calculate `logdensity`

as a function of `u[2]`

only,
while integrating over all possible values of `u[1]`

and `u[3]`

. In the laguage of
mathematics, if we write `u -> logdensity(u, data)`

as `v -> marginal_logdensity(v, data)`

is calculating

By default, this package uses Laplace's method to approximate this integral. The Laplace approximation
is fast in high dimensions, and works well for log-densities that are approximately Gaussian. The
marginalization can also be done via numerical integration, a.k.a. cubature, which may be more accurate
but will not scale well for higher dimensional problems. You can choose which marginalizer to use by passing the
appropriate `AbstractMarginalizer`

to `MarginalLogDensity`

:

```
MarginalLogDensity(logdensity, u, iw, data, Cubature())
MarginalLogDensity(logdensity, u, iw, data, LaplaceApprox())
```

Both `Cubature()`

and `LaplaceApprox()`

can be specified with various options; refer to their
docstrings for details.

After defining `marginal_logdensity`

, you can call it just like the original function,
with the difference that you only need to supply `v`

, the subset of parameters you're
interested in, rather than the entire set `u`

. You also can re-run the same
`MarginalLogDensity`

with different `data`

if you want (though if you're depending on
the sparsity of your changing the `data`

causes the sparsity).

```
initial_v = [1.0] # another arbitrary starting value
length(initial_v) == N - length(iw) # true
marginal_logdensity(initial_v, data)
# -1.5466258635350596
```

Compare this value to the analytic marginal density, i.e. the log-probability of

```
logpdf(Normal(0, 1.5), 1.0)
# -1.5466258635350594
```

The point of doing all this was to find an optimal set of parameters `v`

for
your data. This package includes an interface to Optimization.jl that
works directly with a `MarginalLogDensity`

object, making optimization easy. The simplest
way is to construct an `OptimizationProblem`

directly from the `MarginalLogDensity`

and
`solve`

it:

```
using Optimization, OptimizationOptimJL
opt_problem = OptimizationProblem(marginal_logdensity, v0)
opt_solution = solve(opt_problem, NelderMead())
```

If you want more control over options, for instance setting an AD backend, you can
construct an `OptimizationFunction`

explicitly:

```
opt_function = OptimizationFunction(marginal_logdensity, AutoFiniteDiff())
opt_problem = OptimizationProblem(opt_function, v0)
opt_solution = solve(opt_problem, LBFGS())
```

Note that at present we can't differentiate through the Laplace approximation, so outer
optimizations like this need to either use a gradient-free solver (like `NelderMead()`

),
or a finite-difference backend (like `AutoFiniteDiff()`

). This is on the list of planned
improvements.

A more realistic application to a mixed-effects regression can be found in this example script.