---
myst:
  substitutions:
    sentence1: "These data"
    sentence2: "already introduced previously"
---

(calibration_reminder)=
# Brief reminder of theoretical aspects

```{include} /../core/calibration/introduction.md
:end-line: 4
```

In general, a calibration procedure requires an input dataset, i.e. an existing set of elements (resulting either from simulations or from experiments). This ensemble (of size $n$) can be written as

```{math}
\mathcal{D} = \{ (\mathbf{x}^{i},\mathbf{y}^{i}),\ i=1,\ldots,n\}
```

where $\mathbf{x}^{i}=(x^{i}_1,\ldots,x^{i}_{n_X})$ is the $i$-th input vector and $\mathbf{y}^{i}=(y^{i}_1,\ldots,y^{i}_{n_Y})$ is the $i$-th output vector.

```{include} /../core/calibration/introduction.md
:start-line: 5
:end-line: 15
```

The standard hypothesis for probabilistic calibration is that the observations differ from the predictions of the model by a certain amount, which is assumed to be a random variable:

```{math}
:label: epsilonCalib

\varepsilon = y - f_\theta(\mathbf{x})
```

where $\varepsilon$, called the *residual*, is a random variable whose expectation is equal to 0. It represents the deviation between the model prediction and the observation under investigation, and it may arise from two possible origins, which are not mutually exclusive:

- experimental: the observations themselves are affected by measurement error. For a given observation, this contribution can be written $\varepsilon_{\rm obs} = y - y_{\rm real}$;
- modelling: the chosen model $f_\theta$ is intrinsically not correct. This contribution can be written $\varepsilon_{\rm model} = f^*_\theta - f_\theta$, where $f^*_\theta$ stands for a perfect model.

As the ultimate goal is to have $y_{\rm real} - f^*_\theta = 0$, injecting these two contributions back into {eq}`epsilonCalib` breaks the residual down as

```{math}
y - f_\theta = \varepsilon_{\rm obs} + \varepsilon_{\rm model}
```

The rest of this section introduces two important discussions that will be referenced throughout this module:

- the distance between the observations and the predictions of the model, in [](#calibration_introduction_distance_compare_model);
- the theoretical background and hypotheses (linearity assumption, concept of prior and posterior distributions, the Bayes formulation...), in [](#calibration_reminder_discussing_theoretical).

The former simply describes how statistics are obtained over the $n$ samples of the reference observations when comparing them to a given set of parameters, and how these statistics are computed when $n_Y \neq 1$. The latter is a general introduction, partly reminding elements already introduced in other sections and discussing some assumptions and theoretical foundations needed to understand the methods discussed later on.

On top of this description, several predefined calibration procedures are proposed in the {{uranie}} platform:

```{include} /../core/calibration/introduction.md
:start-line: 16
```

```{toctree}
reminder/distance_compare_model
reminder/discussing_theoretical
```
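The residual definition in {eq}`epsilonCalib` can be illustrated numerically. The snippet below is a minimal, hypothetical sketch: the toy linear model `f_theta`, the parameter values and the noise level are all made up for the example and do not come from the {{uranie}} API; a real calibration study would evaluate an actual simulation code instead.

```python
import numpy as np

# Hypothetical sketch: a toy linear model standing in for a real
# simulation code (this is NOT the Uranie API, just an illustration).
rng = np.random.default_rng(42)

def f_theta(x, theta):
    """Toy model: linear response driven by the parameter vector theta."""
    return theta[0] + theta[1] * x

# Synthetic dataset D = {(x^i, y^i)} of size n = 200, built from an
# assumed "true" parameter set plus Gaussian observation noise (eps_obs).
theta_true = np.array([1.0, 2.5])
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = f_theta(x, theta_true) + rng.normal(0.0, 0.3, size=n)

# Residuals eps = y - f_theta(x) for a candidate parameter set.
theta_candidate = np.array([1.0, 2.5])
eps = y - f_theta(x, theta_candidate)

# For a well-calibrated theta, the residuals should have a mean close to 0
# and a spread driven only by the observation noise.
print(f"mean residual: {eps.mean():.3f}, std: {eps.std():.3f}")
```

With a mis-calibrated `theta_candidate`, the residual mean drifts away from 0, which is exactly the signal the calibration procedures discussed later exploit.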