11.2.1. General introduction to data and model definition
All calibration problems will have at least two TDataServer objects:

- The reference one, usually called tdsRef, contains the observations (both input and output attributes) on which the calibration is to be performed. It is generally read from a simple input file, as done below:

tdsRef = DataServer.TDataServer("reference", "my reference")
tdsRef.fileDataRead("myInputData.dat")

- The parameter one, usually called tdsPar, contains only attributes and must be empty of data. Its purpose is to define the parameters to be tested in the calibration process. Depending on the method chosen, it will contain only TAttribute members (for minimisation, see Minimisation techniques) or only TStochasticAttribute-inheriting objects for all other methods. The latter case gathers the method performing the analytical computation when the chosen priors allow it (see Analytical linear Bayesian estimation) along with all those that require generating one or more designs-of-experiments (see Approximate Bayesian Computation techniques (ABC) but also Markov chain Monte Carlo approach).
This step, which should constitute the first lines of the calibration procedure, goes along with the model definition. The latter is trickier than in the examples provided in Macros Launcher and in Macros Relauncher: as the inputs of the model come from two different TDataServer objects, they can be split into two categories:

- the reference ones will only take values from the reference input file myInputData.dat, which contains \(n\) observations. For each configuration, the model is evaluated \(n\) times: the configuration remains fixed, while the reference values vary across those \(n\) observations;
- the parameter ones, on the other hand, change only when moving to a new configuration and remain constant across all \(n\) reference evaluations.
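This evaluation pattern can be sketched in plain Python (the toy model, input values and variable names below are purely illustrative, not the URANIE API):

```python
# Sketch of the calibration evaluation loop: for every tested parameter
# configuration, the model is run once per reference observation.
# The model and all values here are hypothetical placeholders.

def model(ref_var, par):
    # toy model: linear in the reference input, slope given by the parameter
    return par * ref_var

reference_inputs = [1.0, 2.0, 3.0]   # n = 3 observations (role of tdsRef)
configurations = [0.5, 1.0, 2.0]     # parameter values tested (role of tdsPar)

predictions = []
for par in configurations:           # the parameter stays fixed...
    # ...while the reference values vary across the n observations
    predictions.append([model(x, par) for x in reference_inputs])

# each configuration yields exactly n predictions
assert all(len(p) == len(reference_inputs) for p in predictions)
```

Each tested configuration thus produces one prediction per reference observation, which is what the agreement measure is later computed on.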
Depending on the way the model is coded (and more likely on the parameters the user would like to calibrate) these attributes might not be separated in terms of order, meaning that the list of inputs of a model might look a bit like this:
# Example of input list for a fictive model (whatever the launching solution is chosen)
# ref_var1, ref_var2, ref_var3, ref_var4 are coming from the tdsRef dataserver
# par_1, par_2 are coming from the tdsPar dataserver
sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2"
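To make the ordering constraint concrete, here is a small plain-Python sketch (not the URANIE API; the dictionaries and values are illustrative) of how the input vector must be assembled in exactly the order declared in sinputList, interleaving values from the two sources:

```python
# Sketch: building the model input vector in the declared order, drawing
# each value from the right source (reference data vs calibration parameters).
# All names and values are hypothetical placeholders.

sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2"

reference_row = {"ref_var1": 1.0, "ref_var2": 2.0,
                 "ref_var3": 3.0, "ref_var4": 4.0}   # one observation
parameter_row = {"par_1": 10.0, "par_2": 20.0}       # one configuration

inputs = []
for name in sinputList.split(":"):
    source = reference_row if name in reference_row else parameter_row
    inputs.append(source[name])

print(inputs)  # interleaved order: [1.0, 10.0, 2.0, 3.0, 4.0, 20.0]
```

If the declared order does not match the order the model actually expects, reference values would silently be fed where parameters belong, which is precisely what the warning below guards against.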
Warning
As a result, no implicit declaration is allowed in the calibration classes' constructors, and
particular attention must be paid when defining the model: the user must provide the list of inputs
(for a Launcher-type model) or fill the input and output lists of the TEval-inheriting object
in the correct order (for a Relauncher-type model). This is further discussed in
Common methods of the calibration classes.
Finally, all models considered for calibration should have exactly as many outputs (whatever their
names are) as the number of outputs to be compared with (the output attributes in the tdsRef
TDataServer object). These outputs are the ones used to compute the chosen agreement, meaning
the result of the distance or likelihood function, which is the only quantifiable measurement available between the
reference and the predictions for a given configuration. At the end of a calibration process, the
user can find three different kinds of information (more can be added if needed, see
Running the estimate):
- the resulting calibrated value of the parameters (or values, depending on the chosen method, as some of them provide several samples of calibrated configurations), stored in the parameter TDataServer object: as this object was provided empty and should contain only one attribute per parameter to be calibrated, it is the natural place to store results. This is obviously the expected target, but it should not be considered conclusive without having a look at the two other ones;
- the agreement between the reference data and the model predictions, which is stored in the parameter TDataServer object tdsPar for every calibrated configuration;
- the residuals: the difference between the model predictions and the reference data for the \(n\) observations, using the a priori and a posteriori configurations. These are stored in a dedicated TDataServer object, called the EvaluationTDS (referred to as tdsEval), which is mainly used through the drawResiduals method (discussed in Drawing the residuals). If needed, it can be accessed by calling the getEvaluationTDS() method. The residuals are important to check that the newly performed calibration does not show unexpected and unexplained tendencies with respect to any variable in the defined uncertainty setup; see the dedicated discussion in [Bla17].
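The residual check can be illustrated with a minimal plain-Python sketch (toy model and made-up observations, not the URANIE API): residuals are computed for both the a priori and the a posteriori parameter values, and a successful calibration should shrink them without introducing a trend.

```python
# Sketch of the residual computation: prediction minus reference for the
# n observations, before and after calibration. All values are illustrative.

def model(ref_var, par):
    return par * ref_var

ref_inputs  = [1.0, 2.0, 3.0]
ref_outputs = [2.1, 3.9, 6.2]        # observations (true slope close to 2)

par_prior, par_posterior = 1.0, 2.0  # parameter value before / after calibration

res_prior = [model(x, par_prior) - y for x, y in zip(ref_inputs, ref_outputs)]
res_post  = [model(x, par_posterior) - y for x, y in zip(ref_inputs, ref_outputs)]

# a good calibration should reduce the overall size of the residuals
assert sum(r * r for r in res_post) < sum(r * r for r in res_prior)
```

Beyond this global reduction, the residuals should also be plotted against every variable of the uncertainty setup to spot systematic tendencies, which is what drawResiduals is meant for.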