next up previous
Next: Sampling algorithms Up: An introduction to Markov Previous: Markov chains on countable

Monte Carlo integration

 ² The extensive use of Markov chains in statistical applications relies on the possibility of estimating the properties of a probability distribution tex2html_wrap_inline432 through Monte Carlo sampling. Indeed, let us assume that we want to compute the expectation under tex2html_wrap_inline432 of a function g defined on F:
equation61
where g is a tex2html_wrap_inline432-measurable function. The law of the large number ensures that the arithmetic mean of an independent sample tex2html_wrap_inline488 with distribution tex2html_wrap_inline432 converges towards the integral:
 equation63
This is the principle of Monte Carlo integration: each time we want to estimate a (possibly highly dimensional) integral under the distribution tex2html_wrap_inline432, we sample the distribution tex2html_wrap_inline432 and compute the arithmetic mean. Especially, the probability under tex2html_wrap_inline432 of any particular event can be computed:
eqnarray70
A nice feature about Monte Carlo integration is that the approximation in equation 6 does not depend on the dimension of the state space, but only on the size of the sample. Indeed, under the assumption of independence, the variance of the estimate tex2html_wrap_inline498 of equation 6 writes:
equation79

What happens if we now sample the distribution tex2html_wrap_inline432 with a Markov chain Z? The main difference is that the samples are no longer independent since consecutive states of a Markov chain are linked by the transition kernel K. However, if Z is an irreducible ergodic Markov chain with stationary distribution tex2html_wrap_inline432, the ergodic theorem implies the convergences of the sample means [5]:
equation84
The variance of the estimate tex2html_wrap_inline498 now depends on the correlation between consecutive states of the chain:
equation90
Let us define the integral range as the sum of the correlations [7, 8]:
 equation99
The variance of the estimate tex2html_wrap_inline498 can then be approximated by:
equation106
Hence, the larger the integral range, the larger the size m of the sample needs to be to obtain an estimate of the same quality of the independent estimate. The choice of the transition kernel K controls the convergence properties of the Monte Carlo integrals and is very important in practice. Transition kernels which increase the mixing within the chain (and thus decrease the correlation between consecutive states) must be preferred.


next up previous
Next: Sampling algorithms Up: An introduction to Markov Previous: Markov chains on countable

Bob Fisher
Fri Jul 26 09:56:32 BST 2002