7.5.2. The MCMC algorithms

In the MCMC approach, the candidate-generating density is traditionally denoted \(q(x,y)\). If this density satisfied the reversibility condition in Equation 7.13 for all \(x\) and \(y\), the search would be over, but this is very unlikely. What is more likely is a situation such as \(\pi(x)q(x,y) > \pi(y)q(y,x)\), which means that moves from \(x\) to \(y\) occur too often (and, symmetrically, moves from \(y\) to \(x\) too rarely).

The proposed way to correct this is to introduce a probability of move \(\alpha(x,y) \leq 1\) that is injected into the reversibility condition so that it holds: moves in the direction that occurs too often are only accepted with a probability smaller than one. Without going into too much detail (see [CG95], which discusses this nicely), the probability of move is usually set to

\[\alpha(x,y) = \begin{cases} \min\left[\dfrac{\pi(y)q(y,x)}{\pi(x)q(x,y)},\, 1\right] & \text{if } \pi(x)q(x,y) > 0,\\[4pt] 1 & \text{otherwise.} \end{cases}\]
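This definition translates directly into code. The following Python sketch assumes that the target density \(\pi\) and the candidate-generating density \(q\) are available as callables; the names `pi`, `q` and `acceptance_probability` are placeholders chosen for this example:

```python
def acceptance_probability(x, y, pi, q):
    """Probability of move alpha(x, y) for a candidate y proposed from x."""
    denom = pi(x) * q(x, y)
    if denom > 0.0:
        return min(pi(y) * q(y, x) / denom, 1.0)
    return 1.0  # the "otherwise" branch of the definition above
```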

If the chain is currently at a point \(x_k = x\), a candidate value \(y\) is generated from \(q(x,\cdot)\) and accepted as \(x_{k+1}\) with probability \(\alpha(x,y)\). If the candidate is rejected, the chain remains at the current location, \(x_{k+1} = x\), and another candidate is drawn from there at the next iteration.

With this, the off-diagonal density of the MCMC kernel can be defined as \(p(x,y) = q(x,y)\alpha(x,y)\) if \(x \neq y\), and \(0\) otherwise; thanks to Equation 7.12, \(\pi\) is then the invariant distribution of the kernel \(P\) [Tie94].
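To make the mechanism concrete, here is a minimal, self-contained sketch of the resulting chain in Python. It assumes a one-dimensional Gaussian random-walk proposal, for which \(q(x,y) = q(y,x)\), so the candidate density cancels in \(\alpha(x,y)\); all names and default values are chosen for the example only:

```python
import numpy as np

def metropolis_hastings(pi, n_steps, x0, proposal_std=1.0, rng=None):
    """Random-walk Metropolis-Hastings sampler (one-dimensional sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    chain = np.empty(n_steps)
    x = x0
    for k in range(n_steps):
        # draw a candidate y from q(x, .), here a Gaussian centred at x
        y = x + proposal_std * rng.standard_normal()
        # symmetric proposal: q(x, y) = q(y, x), so alpha reduces to pi(y)/pi(x)
        alpha = min(pi(y) / pi(x), 1.0) if pi(x) > 0.0 else 1.0
        if rng.uniform() < alpha:
            x = y  # move accepted: x_{k+1} = y
        # otherwise the chain stays at x (x_{k+1} = x) and a new candidate
        # is drawn from there at the next iteration
        chain[k] = x
    return chain
```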

Warning

Two important things to notice here:

  • the obtained sample is obviously not independent, as the \((k+1)\)-th location is generated from the \(k\)-th one. It may therefore be useful to thin the obtained sample, i.e. keep only one state every given number of iterations (the lag hyperparameter), so that the retained samples are less correlated (or ideally uncorrelated) with one another.

  • the very first drawn locations are usually discarded as part of the burn-in (also called warm-up) process. As discussed above, the algorithm needs a certain number of iterations to converge towards the invariant distribution. Both points are illustrated in the short sketch after this list.
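As a rough illustration of these two points, the chain produced by the `metropolis_hastings` sketch above can be post-processed as follows; the target density, burn-in length and lag value are purely illustrative and should be tuned for the problem at hand:

```python
import numpy as np

# unnormalized standard normal density, used only for illustration
pi = lambda x: np.exp(-0.5 * x ** 2)

chain = metropolis_hastings(pi, n_steps=20_000, x0=0.0, proposal_std=1.0)

burn_in = 1_000   # discard the first iterations (warm-up); value is illustrative
lag = 10          # keep one state every `lag` iterations to reduce correlation
sample = chain[burn_in::lag]
```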