# DirichletProcessMixtures.jl

This package implements Dirichlet Process Mixture Models in Julia using variational inference for truncated stick-breaking representation of Dirichlet Process.

## (almost) infinite mixture of Gaussians

Most likely you need this package especially for this purpose, this is how to do Gaussian clustering. You may check demo code which contains almost all functionality you may need.

First off, you define your prior over parameters of mixture component (i.e. mean and precision matrix) using `NormalWishart`

distribution:

```
using DirichletProcessMixtures
using Distributions
prior = NormalWishart(zeros(2), 1e-7, eye(2) / 4, 4.0001)
```

Then you generate your mixture

```
x = ... # your data, x[:, i] - is i-th data point
T = 20 # truncation level
alpha = 0.1 # Dirichlet process parameter, controls how many clusters you need a priori
gm, theta, predictive_likelihood = gaussian_mixture(prior, T, alpha, x)
```

`gm`

is an internal representation of mixture model. `theta`

is array of size `T`

whose elements refer to parameters of posterior `NormalWishart`

's. Finally, `predictive_likelihood`

is a function which takes a matrix containing test data and returns per-point test loglikelihood. Now we can perform inference in our model

```
function iter_callback(mix::TSBPMM, iter::Int64, lower_bound::Float64)
pl = sum(predictive_likelihood(xtest)) / M
println("iteration $iter test likelihood=$pl, lower_bound=$lower_bound")
end
maxiter = 200
ltol = 1e-5
niter = infer(gm, maxiter, ltol; iter_callback=iter_callback)
```

You may see that `infer`

method performs not more than `maxiter`

iterations until lower bound tolerance reaches `ltol`

value, calling `iter_callback`

at each iteration if provided.

Another useful quantities you may need from mixture model:

`gm.z`

- TxN array with expected mixture component assignments`gm.qv`

- posterior`Beta`

distributions for stick-breaking proportions

## General interface

It is also possible to implement custom mixture models with conjugate priors for mixture components, but this remains to be documented yet. For a reference implementation of custom mixture model use mixture of Gaussians.