(calibration_classes_functions_observations_data_model)=
# General introduction to data and model definition

All calibration problems will have at least two {{tds}} objects:

- The reference one, usually called `tdsRef`, contains the observations (both input and output attributes) on which the calibration is to be performed. It is generally read from a simple input file, as done below:

  ````{only} cpp
  ```cpp
  TDataServer *tdsRef = new TDataServer("reference", "my reference");
  tdsRef->fileDataRead("myInputData.dat");
  ```
  ````

  ````{only} py
  ```python
  tdsRef = DataServer.TDataServer("reference", "my reference")
  tdsRef.fileDataRead("myInputData.dat")
  ```
  ````

- The parameter one, usually called `tdsPar`, contains only attributes and must be empty of data. Its purpose is to define the parameters to be tested in the calibration process; depending on the method chosen, it will contain either only `TAttribute` members (for minimisation, see [](#calibration_minimisation)) or only `TStochasticAttribute`-inheriting objects for all other methods. The latter case gathers the method performing the analytical computation when the chosen *priors* allow it (see [](#calibration_linear_bayesian)) along with all those that require generating one or more {{doe}} (see [](#calibration_abc) but also [](#calibration_markov_chain)). A minimal sketch of such a parameter {{tds}} is given after the warning below.

This step, which should represent the first lines of the calibration procedure, goes along with the model definition. The latter is trickier here than in the examples provided in [](#use_cases_macro_launcher) and in [](#use_cases_macro_relauncher): since the inputs of the model come from two different `TDataServer` objects, they can be split into two categories:

- the reference ones take their values only from the reference input file `myInputData.dat`, which contains $n$ observations. For each configuration, the model is evaluated $n$ times: the configuration remains fixed, while the reference values vary across those $n$ observations;
- the parameter ones, on the other hand, change only when moving to a new configuration and remain constant across all $n$ reference evaluations.

Depending on the way the model is coded (and more likely on the parameters the user would like to calibrate), these attributes might not be separated in terms of order, meaning that the list of inputs of a model might look a bit like this:

````{only} cpp
```cpp
// Example of input list for a fictive model (whatever launching solution is chosen)
// ref_var1, ref_var2, ref_var3, ref_var4 come from the tdsRef dataserver
// par_1, par_2 come from the tdsPar dataserver
TString sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2";
```
````

````{only} py
```python
# Example of input list for a fictive model (whatever launching solution is chosen)
# ref_var1, ref_var2, ref_var3, ref_var4 come from the tdsRef dataserver
# par_1, par_2 come from the tdsPar dataserver
sinputList = "ref_var1:par_1:ref_var2:ref_var3:ref_var4:par_2"
```
````

```{warning}
As a result, no implicit declaration is allowed in the calibration classes' constructors, and particular attention must be paid when defining the model: the user must provide the list of inputs (for a **Launcher**-type model) or fill the input and output lists of the `TEval`-inheriting object in the correct order (for a **Relauncher**-type model); a sketch of such an explicit declaration is given below. This is further discussed in [](#calibration_classes_functions_observations_calibration_classes).
```
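To make the parameter {{tds}} definition concrete, here is a minimal Python sketch of the `tdsPar` object described above. The attribute names match `sinputList`, but the bounds and the choice of uniform priors are purely illustrative:

```python
from URANIE import DataServer

# Parameter TDS: attributes only, no data read in
tdsPar = DataServer.TDataServer("parameters", "parameters to be calibrated")

# Minimisation case: plain TAttribute objects (hypothetical bounds)
# tdsPar.addAttribute(DataServer.TAttribute("par_1", 0., 1.))
# tdsPar.addAttribute(DataServer.TAttribute("par_2", -1., 1.))

# All other methods: TStochasticAttribute-inheriting objects acting as priors,
# here uniform distributions with hypothetical bounds
tdsPar.addAttribute(DataServer.TUniformDistribution("par_1", 0., 1.))
tdsPar.addAttribute(DataServer.TUniformDistribution("par_2", -1., 1.))
```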
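As for the explicit declaration mentioned in the warning, here is a minimal Python sketch for a **Relauncher**-type model respecting the order of `sinputList`. The `myModel` function, its body and the output name are placeholders, and the parameter attributes are assumed to be those of the `tdsPar` sketch above:

```python
from URANIE import DataServer, Relauncher

# Placeholder model: unpacks its inputs in the declared order and
# appends its output(s); the formula itself is meaningless
def myModel(inputs, outputs):
    ref_var1, par_1, ref_var2, ref_var3, ref_var4, par_2 = inputs
    outputs.append(par_1 * ref_var1 + par_2 * (ref_var2 + ref_var3 + ref_var4))

fun = Relauncher.TPythonEval(myModel)
# Inputs must be declared in the exact order expected by the model,
# mixing attributes from tdsRef and tdsPar
fun.addInput(tdsRef.getAttribute("ref_var1"))
fun.addInput(tdsPar.getAttribute("par_1"))
fun.addInput(tdsRef.getAttribute("ref_var2"))
fun.addInput(tdsRef.getAttribute("ref_var3"))
fun.addInput(tdsRef.getAttribute("ref_var4"))
fun.addInput(tdsPar.getAttribute("par_2"))
# One output here, matching the single output attribute assumed in tdsRef
fun.addOutput(DataServer.TAttribute("model_output"))
```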
Finally, all models considered for calibration should have exactly as many outputs (whatever their names are) as the number of outputs to be compared with (the output attributes in the `tdsRef` {{tds}} object). These outputs are the ones used to compute the chosen agreement, meaning the result of the distance or likelihood function, which is the only quantifiable measurement available between the reference and the predictions for a given configuration.

At the end of a calibration process, the user can find three different kinds of information (more can be added if needed, see [](#calibration_classes_functions_observations_calibration_classes_running)):

- the resulting calibrated value of the parameters (or values, depending on the chosen method, as some of them provide several samples of calibrated configurations), stored in the parameter {{tds}} object: as this object was provided empty and should contain only one attribute per parameter to be calibrated, it is the natural place to store results. This is obviously the expected target, but it should not be considered conclusive without a look at the two other kinds of information;
- the agreement between the reference data and the model predictions, which is stored in the parameter {{tds}} object `tdsPar` for every calibrated configuration;
- the residuals: the difference between the model predictions and the reference data for the $n$ observations, using the *a priori* and *a posteriori* configurations. These are stored in a dedicated {{tds}} object, called the **EvaluationTDS** (referred to as `tdsEval`), which is mainly used through the `drawResiduals` method (discussed in [](#calibration_classes_functions_observations_calibration_classes_draw_residuals)). If needed, it can be accessed by calling the `getEvaluationTDS()` method, as shown in the sketch after this list. The residuals are important to check that the newly performed calibration does not show unexpected and unexplained tendencies with respect to any variable of the defined uncertainty setup; see the dedicated discussion in {{metho}}.
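As a closing Python sketch, assuming `cal` is one of the calibration objects presented later in this section and that its computation has already been run, these three kinds of information can be inspected as follows:

```python
# 'cal' is assumed to be an already-run calibration object (any of the
# classes discussed in the rest of this section)

# Calibrated values and agreement are stored in the parameter TDS
tdsPar.getTuple().Print()

# A priori / a posteriori residuals live in the evaluation TDS
tdsEval = cal.getEvaluationTDS()
tdsEval.getTuple().Print()

# Graphical check of the residuals; arguments are discussed in the
# drawResiduals section referenced above
# cal.drawResiduals(...)
```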