11.2.1. General introduction to data and model definition
All calibration problems will have at least two TDataServer objects:

- The reference one, usually called tdsRef, contains the observations (both input and output attributes) on which the calibration is to be performed. It is generally read from a simple input file, as done below:

tdsRef = DataServer.TDataServer("reference", "my reference")
tdsRef.fileDataRead("myInputData.dat")

- The parameter one, usually called tdsPar, contains only attributes and must be empty of data. Its purpose is to define the parameters to be tested in the calibration process. Depending on the method chosen, it will contain only TAttribute members (for minimisation, see Minimisation techniques) or only TStochasticAttribute-inheriting objects for all other methods. The latter case gathers the method performing the analytical computation when the chosen priors allow it (see Analytical linear Bayesian estimation) along with all those that require generating one or more designs-of-experiments (see Approximate Bayesian Computation techniques (ABC) but also Markov chain Monte Carlo approach).
This step, which should constitute the first lines of the calibration procedure, goes along with the model definition. The latter is trickier than in the examples provided in Macros Launcher and in Macros Relauncher: as the inputs of the model come from two different TDataServer objects, they can be split into two categories:

- the reference ones will only take values from the reference input file myInputData.dat, which contains \(n\) observations. For each configuration, the model is evaluated \(n\) times: the configuration remains fixed, while the reference values vary across those \(n\) observations;
- the parameter ones, on the other hand, change only when moving to a new configuration and remain constant across all \(n\) reference evaluations.
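This evaluation pattern can be sketched in plain Python (the toy model, input values and variable names below are purely illustrative, not the URANIE API):

```python
# Sketch of the calibration evaluation loop: for every tested parameter
# configuration, the model is run once per reference observation.
# The model and all values here are hypothetical placeholders.

def model(ref_var, par):
    # toy model: linear in the reference input, slope given by the parameter
    return par * ref_var

reference_inputs = [1.0, 2.0, 3.0]   # n = 3 observations (role of tdsRef)
configurations = [0.5, 1.0, 2.0]     # parameter values tested (role of tdsPar)

predictions = []
for par in configurations:           # the parameter stays fixed...
    # ...while the reference values vary across the n observations
    predictions.append([model(x, par) for x in reference_inputs])

# each configuration yields exactly n predictions
assert all(len(p) == len(reference_inputs) for p in predictions)
```

Each tested configuration thus produces one prediction per reference observation, which is what the agreement measure is later computed on.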
Depending on the way the model is coded (and more likely on the parameters the user would like to calibrate) these attributes might not be separated in terms of order, meaning that the list of inputs of a model might look a bit like this:
# Example of input list for a fictive model (whatever the launching solution is chosen)
# ref_var1, ref_var2, ref_var3, ref_var4 are coming from the tdsRef dataserver
# par_1, par_2 are coming from the tdsPar dataserver
sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2"
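To make the ordering constraint concrete, here is a small plain-Python sketch (not the URANIE API; the dictionaries and values are illustrative) of how the input vector must be assembled in exactly the order declared in sinputList, interleaving values from the two sources:

```python
# Sketch: building the model input vector in the declared order, drawing
# each value from the right source (reference data vs calibration parameters).
# All names and values are hypothetical placeholders.

sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2"

reference_row = {"ref_var1": 1.0, "ref_var2": 2.0,
                 "ref_var3": 3.0, "ref_var4": 4.0}   # one observation
parameter_row = {"par_1": 10.0, "par_2": 20.0}       # one configuration

inputs = []
for name in sinputList.split(":"):
    source = reference_row if name in reference_row else parameter_row
    inputs.append(source[name])

print(inputs)  # interleaved order: [1.0, 10.0, 2.0, 3.0, 4.0, 20.0]
```

If the declared order does not match the order the model actually expects, reference values would silently be fed where parameters belong, which is precisely what the warning below guards against.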
Warning
As a result, no implicit declaration is allowed in the calibration classes' constructors, and
particular attention must be paid when defining the model: the user must provide the list of inputs
(for a Launcher-type model) or fill the input and output lists of the TEval-inheriting object
in the correct order (for a Relauncher-type model). This is further discussed in
Common methods of the calibration classes.
Finally, all models considered for calibration should have exactly as many outputs (whatever their
names are) as the number of outputs to be compared with (the output attributes in the tdsRef
TDataServer object). These outputs are the ones used to compute the chosen agreement, meaning
the result of the distance or likelihood function, which is the only quantifiable measurement available between the
reference and the predictions for a given configuration. At the end of a calibration process, the
user can find three different kinds of information (more can be added if needed, see
Running the estimate):
- the resulting calibrated value of the parameters (or values, depending on the chosen method, as some of them provide several samples of calibrated configurations), stored in the parameter TDataServer object: as this object was provided empty and should contain only one attribute per parameter to be calibrated, it is the natural place to store results. This is obviously the expected target, but it should not be considered conclusive without having a look at the two other ones;
- the agreement between the reference data and the model predictions, which is stored in the parameter TDataServer object tdsPar for every calibrated configuration;
- the residuals: the difference between the model predictions and the reference data for the \(n\) observations, using the a priori and a posteriori configurations. These are stored in a dedicated TDataServer object, called the EvaluationTDS (referred to as tdsEval), which is mainly used through the drawResiduals method (discussed in Drawing the residuals). If needed, it can be accessed by calling the getEvaluationTDS() method. The residuals are important to check that the newly performed calibration does not show unexpected and unexplained tendencies with respect to any variable in the defined uncertainty setup; see the dedicated discussion in [Bla17].
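The residual check can be illustrated with a minimal plain-Python sketch (toy model and made-up observations, not the URANIE API): residuals are computed for both the a priori and the a posteriori parameter values, and a successful calibration should shrink them without introducing a trend.

```python
# Sketch of the residual computation: prediction minus reference for the
# n observations, before and after calibration. All values are illustrative.

def model(ref_var, par):
    return par * ref_var

ref_inputs  = [1.0, 2.0, 3.0]
ref_outputs = [2.1, 3.9, 6.2]        # observations (true slope close to 2)

par_prior, par_posterior = 1.0, 2.0  # parameter value before / after calibration

res_prior = [model(x, par_prior) - y for x, y in zip(ref_inputs, ref_outputs)]
res_post  = [model(x, par_posterior) - y for x, y in zip(ref_inputs, ref_outputs)]

# a good calibration should reduce the overall size of the residuals
assert sum(r * r for r in res_post) < sum(r * r for r in res_prior)
```

Beyond this global reduction, the residuals should also be plotted against every variable of the uncertainty setup to spot systematic tendencies, which is what drawResiduals is meant for.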