# Amos Storkey

## Approximate Inference

### Exact inference using Belief Propagation

Directed acyclic probabilistic graphs (belief networks) can be used to specify conditional independence relationships between random variables. These conditional independence relationships can be used to simplify the process of inference in belief networks.

Working with belief networks is a three stage process. Firstly the network is set up, and the probabilistic relationships between nodes is specified to represent the form of prior distribution which is assumed. Secondly data is received. This data gives information about some of the nodes of the network. It is used to `instantiate' those nodes, by fixing them to their observed value. Lastly, conditioning is performed: the probability distribution of the other nodes is calculated conditioned on the fixed value of the instantiated nodes.

Computationally speaking, the calculation of this conditional distribution is not straightforward. However in special situations the conditioning process can be simplified. Pearl observed that in singly connected (polytree) belief networks inference can be done with computational complexity that is linear in the number of nodes.

### Approximation Methods

Very often though the underlying graph is not singly connected. What can be done in these situations? For simple cases, a junction tree algorithm can be used to produce a polytree of graph cliques. However this scales badly in the size of the largest clique in the (triangulated) graph. In many realistic situations this is very large.

For most realistic scenarios people resort to approximation schemes. Some of the schemes considered here include Monte Carlo approaches, maximum posterior approximations, parametric approximations, variational methods and loopy propagation. Each of these has their advantages and disadvantages, and it is important to compare them, and to learn more about the value of each.

In most cases the work done here focusses on the use of these techniques for dynamic trees. Junction tree methods are impossible for dynamic trees because the largest clique is of the order of the network size.