7.5.3. Assessing convergence
Assessing convergence is a crucial step when applying MCMC methods, as a lack of convergence can compromise the validity of the inference. In practice, several complementary techniques are used:
a burn-in period is typically discarded to remove the influence of the arbitrary starting point before the chain reaches stationarity;
thinning the chain by keeping only every k-th sample (lag) can reduce autocorrelation between draws;
visual inspection of trace plots is also a simple but powerful way to verify whether the chains have mixed well and stabilized;
monitoring the acceptance ratio provides further information about the efficiency of the algorithm: rates that are too high typically mean the proposal steps are too small, so the chain explores the target slowly, while rates that are too low mean most proposals are rejected and the chain mixes poorly.
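The practical checks above can be illustrated with a minimal random-walk Metropolis sampler. The sketch below targets a standard normal distribution; the function name, step size, and burn-in/thinning values are illustrative choices, not prescribed by the text:

```python
import numpy as np

def random_walk_metropolis(log_target, x0, n_iter, step=1.0, seed=0):
    """Random-walk Metropolis sampler; returns the chain and its acceptance rate."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    x, lp = x0, log_target(x0)
    accepted = 0
    for i in range(n_iter):
        prop = x + step * rng.standard_normal()
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x))
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        chain[i] = x
    return chain, accepted / n_iter

# Target: standard normal, log-density up to an additive constant
chain, acc_rate = random_walk_metropolis(
    lambda x: -0.5 * x**2, x0=5.0, n_iter=20000, step=2.0
)

# Burn-in removes the influence of the starting point x0 = 5;
# thinning keeps every 5th draw to reduce autocorrelation
burn_in, lag = 2000, 5
samples = chain[burn_in::lag]
```

Monitoring `acc_rate` alongside a trace plot of `chain` is usually enough to spot a step size that is badly tuned in either direction.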
Beyond these practical checks, formal diagnostics are often employed. The Effective Sample Size (ESS) quantifies the number of effectively independent samples, accounting for autocorrelation [Gey92]:
\[
\mathrm{ESS} = \frac{N}{1 + 2 \sum_{k=1}^{\infty} \rho_k},
\]
where \(N\) is the total number of draws and \(\rho_k\) is the autocorrelation at lag \(k\). In practice, the sum over \(k\) is truncated once the autocorrelation becomes small enough, typically below \(0.05\). A larger ESS indicates more efficient sampling.
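A direct implementation of this estimator is a short exercise; the sketch below computes the empirical autocorrelations and truncates the sum at the \(0.05\) threshold mentioned above (the function name and cutoff parameter are illustrative):

```python
import numpy as np

def effective_sample_size(chain, cutoff=0.05):
    """ESS = N / (1 + 2 * sum of autocorrelations), truncated once rho_k < cutoff."""
    x = np.asarray(chain, dtype=float)
    n = len(x)
    x = x - x.mean()
    # Empirical autocovariances at all non-negative lags
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    s = 0.0
    for k in range(1, n):
        if rho[k] < cutoff:   # truncate the sum once autocorrelation is small
            break
        s += rho[k]
    return n / (1.0 + 2.0 * s)

rng = np.random.default_rng(0)
iid = rng.standard_normal(5000)      # independent draws: ESS close to N
ar = np.empty(5000)                  # AR(1) chain with strong autocorrelation
ar[0] = 0.0
for t in range(1, 5000):
    ar[t] = 0.9 * ar[t - 1] + rng.standard_normal()
```

For the independent draws the ESS is essentially \(N\), while the strongly autocorrelated AR(1) chain yields a much smaller value, reflecting how little independent information each draw carries.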
The Gelman–Rubin statistic \(\widehat{R}\) compares within-chain and between-chain variances across multiple chains [GR92]:
\[
\widehat{R} = \sqrt{\frac{\frac{N-1}{N}\, W + \frac{1}{N}\, B}{W}},
\]
with \(W\) the within-chain variance and \(B\) the between-chain variance for \(n_C\) chains of length \(N\). Values of \(\widehat{R}\) close to 1 suggest that the chains have converged to the same target distribution.
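The statistic is straightforward to compute from an array of parallel chains. The sketch below assumes draws stored as an \((n_C, N)\) array; the function name is illustrative, and production diagnostics (e.g. split-chain variants) refine this basic estimator:

```python
import numpy as np

def gelman_rubin(chains):
    """Basic Gelman-Rubin R-hat for an (n_chains, N) array of draws."""
    chains = np.asarray(chains, dtype=float)
    n_c, n = chains.shape
    means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * means.var(ddof=1)               # between-chain variance, scaled by N
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
mixed = rng.standard_normal((4, 2000))        # four chains sampling the same target
stuck = mixed + 3.0 * np.arange(4)[:, None]   # chains stuck around different modes
```

Chains drawn from the same distribution give \(\widehat{R} \approx 1\), while chains centred at different values inflate the between-chain variance \(B\) and push \(\widehat{R}\) well above 1.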
Together, these diagnostics and techniques provide complementary evidence that the MCMC algorithm has converged and that its samples can be reliably used for inference.