11.6.4.4. Thinning the chains with ESS

Once convergence of the chains has been assessed using the trace plot (see Drawing the trace), the acceptance ratio plot (see Drawing the acceptation ratio), and the Gelman–Rubin diagnostic (see Checking for convergence with the Gelman–Rubin diagnostic), the next step is to evaluate the autocorrelation of the samples. If samples are highly correlated, the posterior distribution may not be properly explored (e.g., chains could remain trapped in a mode).

A common way to assess this is to compute the Effective Sample Size (ESS), which estimates how many of the available samples can be regarded as effectively independent. The ESS thus indirectly indicates the appropriate lag (the number of iterations to skip to obtain approximately uncorrelated samples). More details are provided in [Bla17].
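To make the link between autocorrelation, ESS and lag more concrete, here is a minimal sketch of one common textbook estimator, ESS = N / (1 + 2 Σ ρ_k), where the sum over the lag-k autocorrelations ρ_k is truncated at the first non-positive value. This is an illustration only: the helper names autocorrelation, estimateESS and thinningLag are hypothetical and not part of Uranie, and the estimator actually used by diagESS may differ (see [Bla17]). The rule of thumb lag ≈ N / ESS is likewise an assumption, not a documented Uranie recommendation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Uranie): sample autocorrelation of x at lag k.
double autocorrelation(const std::vector<double>& x, std::size_t k) {
    assert(k < x.size());
    const std::size_t n = x.size();
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(n);

    double var = 0.0;
    double cov = 0.0;
    for (std::size_t i = 0; i < n; ++i) var += (x[i] - mean) * (x[i] - mean);
    for (std::size_t i = 0; i + k < n; ++i) cov += (x[i] - mean) * (x[i + k] - mean);
    // A constant chain has zero variance; report zero correlation in that case.
    return var > 0.0 ? cov / var : 0.0;
}

// Hypothetical helper: ESS = N / (1 + 2 * sum of positive autocorrelations),
// with the sum truncated at the first non-positive autocorrelation.
double estimateESS(const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(n > 1);
    double sum = 0.0;
    for (std::size_t k = 1; k < n; ++k) {
        const double rho = autocorrelation(x, k);
        if (rho <= 0.0) break;
        sum += rho;
    }
    return static_cast<double>(n) / (1.0 + 2.0 * sum);
}

// Hypothetical rule of thumb: keep roughly one sample every N / ESS iterations.
std::size_t thinningLag(std::size_t n, double ess) {
    assert(ess > 0.0);
    const double ratio = static_cast<double>(n) / ess;
    return ratio <= 1.0 ? 1 : static_cast<std::size_t>(std::ceil(ratio));
}
```

With this sketch, a chain whose autocorrelation drops to zero immediately yields an ESS equal to the number of samples and a lag of 1, whereas a chain that stays in one region for a long time yields a much smaller ESS and a correspondingly larger lag.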

The method diagESS has the following prototype:

std::unordered_map<string, vector<int>> diagESS();

This method takes no arguments, as it is called directly from the TCalibration object. For example, in the macro “calibrationMCMCFlowrate1D.C”, the ESS diagnostic is computed with:

std::unordered_map<string, vector<int>> ESS_values = cal->diagESS();
int ESS_hl = ESS_values["hl"][0]; // ESS of parameter hl computed on the first chain

The returned object ESS_values is an unordered map where each parameter name (e.g., hl) is associated with a vector of ESS values, one per chain. The results are also printed in the console (see Console).
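Since the map holds one ESS value per chain, it can be useful to reduce it to the least favourable value for each parameter. The sketch below shows one way to do so on a map with the same type as the one returned by diagESS. The helpers minEssPerParameter and parametersBelow are hypothetical (they are not Uranie methods), and the threshold is a user choice rather than a value prescribed by the library.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Same type as the object returned by diagESS().
using EssMap = std::unordered_map<std::string, std::vector<int>>;

// Hypothetical helper (not part of Uranie): smallest ESS over all chains, per parameter.
std::unordered_map<std::string, int> minEssPerParameter(const EssMap& ess) {
    std::unordered_map<std::string, int> worst;
    for (const auto& entry : ess) {
        assert(!entry.second.empty());
        worst[entry.first] = *std::min_element(entry.second.begin(), entry.second.end());
    }
    return worst;
}

// Hypothetical helper: sorted names of the parameters whose worst-chain ESS
// falls below a user-chosen threshold.
std::vector<std::string> parametersBelow(const EssMap& ess, int threshold) {
    std::vector<std::string> flagged;
    for (const auto& entry : minEssPerParameter(ess)) {
        if (entry.second < threshold) flagged.push_back(entry.first);
    }
    std::sort(flagged.begin(), flagged.end());
    return flagged;
}
```

In a macro, one could for instance call parametersBelow(ESS_values, 200) after diagESS and inspect the returned names; an empty vector would mean that every chain of every parameter reaches the chosen threshold.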

In this example, the ESS diagnostic confirms that each parameter has several hundred (at least 200) effectively uncorrelated samples. The samples are therefore sufficiently uncorrelated, and it is recommended to set the lag to 1 using the setLag method (see Investigating the quality of the samples through diagnostics and plots).

If the ESS is too small, it may be necessary to run additional iterations (see Running the estimate, exporting and loading chains, and continuing the calculation) or to adjust the initial standard deviation of the proposal distribution (see Initialising the process): if this standard deviation is too small, the chain moves too slowly.