7.5.2.1. Metropolis-Hastings algorithm
The idea here is to choose a family of candidate-generating densities of the form \(q(x,y)=q_{1}(x-y)\), where \(q_{1}(\cdot)\) is a multivariate density [Muller91]; a classical choice for \(q_1\) is a multivariate normal density. The candidate is thus drawn as the current value plus a noise term, hence the name "random walk".
Once a new candidate configuration has been drawn, call it \(\theta_{T}\), it is compared with the latest retained configuration, denoted \(\theta_{k}\), through the likelihood ratio, in which any constant factors cancel. Since the proposal is symmetric, this ratio reads, in log form:

\[\log\frac{L(\theta_{T})}{L(\theta_{k})} = \log L(\theta_{T}) - \log L(\theta_{k})\]
This result is then compared to the logarithm of a uniform random draw between 0 and 1 to decide whether the candidate configuration should be kept (as is usual in Monte Carlo approaches, see [CG95]).
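As an illustration, the random-walk step and the log-ratio acceptance test described above can be sketched as follows. This is a minimal example, not a library implementation; the name `log_target`, standing for the unnormalized log-density of the target, is an assumption of this sketch:

```python
import numpy as np

def random_walk_metropolis(log_target, theta0, step, n_samples, rng=None):
    """Minimal random-walk Metropolis sampler.

    log_target: callable returning the log of the (unnormalized) target density
    theta0:     initial configuration (1-D array)
    step:       standard deviation of the Gaussian random-walk proposal
    Returns the chain and the empirical acceptance rate.
    """
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    log_p = log_target(theta)
    chain = np.empty((n_samples, theta.size))
    accepted = 0
    for k in range(n_samples):
        # Candidate = current value plus Gaussian noise (symmetric proposal).
        candidate = theta + step * rng.standard_normal(theta.size)
        log_p_cand = log_target(candidate)
        # Log of the likelihood ratio; constant factors and the symmetric
        # proposal density cancel out.
        log_ratio = log_p_cand - log_p
        # Accept when log(U) < log-ratio, with U uniform on (0, 1).
        if np.log(rng.uniform()) < log_ratio:
            theta, log_p = candidate, log_p_cand
            accepted += 1
        chain[k] = theta
    return chain, accepted / n_samples
```

For example, `random_walk_metropolis(lambda x: -0.5 * x @ x, np.zeros(1), 1.0, 20000)` draws an approximate sample from a standard normal distribution.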
This kind of algorithm has a few additional properties, such as the acceptance rate, which may be tuned or used as a validity check depending, for instance, on the dimension of the parameter space [GRG+96, RGG+97], or the lag, sometimes used to thin the resulting sample (a practice that is not always recommended, as discussed in [LE12]). Since these subjects are closely tied to implementation choices, they are not discussed here.
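The thinning just mentioned, and the empirical autocorrelation often used to choose the lag, can be sketched as follows; the helper names `thin` and `autocorrelation` are illustrative, not part of any library:

```python
import numpy as np

def thin(chain, lag):
    """Keep every `lag`-th draw of the chain (simple thinning)."""
    return np.asarray(chain)[::lag]

def autocorrelation(chain, lag):
    """Empirical autocorrelation of a scalar chain at a given lag.

    A lag at which this value is close to zero is one common
    (if debatable, see [LE12]) choice of thinning interval.
    """
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))
```

For instance, `thin(chain, 10)` keeps one draw out of ten, trading sample size for reduced dependence between retained draws.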
Convergence is often difficult to diagnose in practice. Since Markov Chain Monte Carlo (MCMC) methods sample from a probability distribution by simulating a chain of dependent draws, a single chain may get stuck in a local mode or fail to explore the distribution fully. It is therefore advisable to initialize several chains from different starting points: checking whether they converge to the same target distribution increases confidence that the sampling has stabilized, and comparing chains provides diagnostics for mixing and convergence, making the inference more robust.
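One common way to compare several chains is the Gelman-Rubin potential scale reduction factor; a minimal sketch for a scalar quantity, under the simplifying assumption that all chains have the same length, might look like this:

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) from several chains.

    chains: array of shape (m, n) -- m chains of n draws each of one
    scalar quantity. Values close to 1 suggest the chains agree;
    values well above 1 suggest the sampling has not stabilized.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    # Between-chain variance B and mean within-chain variance W.
    b = n * chains.mean(axis=1).var(ddof=1)
    w = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the target variance, then the reduction factor.
    var_hat = (n - 1) / n * w + b / n
    return float(np.sqrt(var_hat / w))
```

Chains started far apart that nonetheless yield an R-hat close to 1 provide some evidence that they explore the same target distribution.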