11.2.2.1. Recommended distance and likelihood functions, construction method
There are several ways to define a distance or a likelihood function. The recommended approach is to use the setDistance and setLikelihood
methods, which are directly available in each calibration class such as TLinearBayesian or TMinimisation. Alternatively, one can call the
setDistanceAndLikelihood method of the TCalibration class, which is inherited by all calibration classes and is actually invoked by the two
previous methods. We nevertheless recommend setDistance and setLikelihood, as they prevent a distance from being mistakenly assigned to a
likelihood-based method, or a likelihood function to a distance-based method. The prototypes discussed here are as follows:
setDistanceAndLikelihood(funcName, tdsRef, input, reference, weight="")
setDistance(funcName, tdsRef, input, reference, weight="")
setLikelihood(funcName, tdsRef, input, reference, weight="")
Each of these methods takes up to five arguments, the last of which is optional:
funcName: the name of the distance or likelihood function, which must be one of those already implemented, as discussed in Distances and likelihoods used to compare observations and model predictions. The possible choices are
“L1” for TL1DistanceFunction;
“LS” for TLSDistanceFunction;
“RelativeLS” for TRelativeLSDistanceFunction;
“WeightedLS” for TWeightedLSDistanceFunction;
“Mahalanobis” for TMahalanobisDistanceFunction;
“log-gauss” for TGaussLogLikelihoodFunction.
tdsRef: the TDataServer in which the observations are stored;
input: the input variables stored in the TDataServer tdsRef, which must be defined as inputs in the code before creating the calibration object. This argument has the usual attribute list format “x:y:z”;
reference: the reference variables stored in the TDataServer tdsRef, against which the output of the code or function will be compared. This argument has the usual attribute list format “out1:out2:out3”;
weight (optional): the name of the (single) variable stored in the TDataServer tdsRef which, in the case of a TWeightedLSDistanceFunction, is used to fill the \(\lbrace \psi_i^j \rbrace_{i \in [1,n]}\), i.e. the coefficients that weight each observation relative to the others and, in the case of a TGaussLogLikelihoodFunction, is used to fill the \(\lbrace \sigma_i^j \rbrace_{i \in [1,n]}\), i.e. the standard deviations of each observation associated with the \(j\)-th variable (see Distances and likelihoods used to compare observations and model predictions).
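The correspondence between funcName and the underlying class, together with the colon-separated attribute list format, can be illustrated with a short standalone Python sketch. It does not call Uranie: the names FUNC_CLASSES, is_likelihood and split_attributes are hypothetical helpers that only mirror the list above, for instance to validate a configuration before the calibration object is built.

```python
# Standalone illustration: none of these helpers belong to the Uranie API.
FUNC_CLASSES = {
    "L1": "TL1DistanceFunction",
    "LS": "TLSDistanceFunction",
    "RelativeLS": "TRelativeLSDistanceFunction",
    "WeightedLS": "TWeightedLSDistanceFunction",
    "Mahalanobis": "TMahalanobisDistanceFunction",
    "log-gauss": "TGaussLogLikelihoodFunction",
}


def is_likelihood(func_name):
    """Return True if func_name designates a likelihood, False for a distance."""
    if func_name not in FUNC_CLASSES:
        raise ValueError("unknown function name: %r" % func_name)
    return FUNC_CLASSES[func_name].endswith("LikelihoodFunction")


def split_attributes(attribute_list):
    """Split an attribute list such as "x:y:z" into its variable names."""
    return [name for name in attribute_list.split(":") if name]
```

With such a check, a name like “log-gauss” would be routed to setLikelihood and the five other names to setDistance, which is the separation the two recommended methods are meant to enforce.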
Warning
The number of variables in the weight list should match the number of outputs of your code used to calibrate the parameters. If one output needs no weight while another requires an uncertainty model, a “one” attribute should be added to the list to handle the unweighted output.
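The consistency rule stated in this warning can be checked before the method is called. The sketch below is a standalone illustration, not Uranie code: it assumes that the weight argument follows the same colon-separated format as the other attribute lists, and the helper check_weights as well as the variable names used with it are hypothetical.

```python
def check_weights(reference, weight=""):
    """Pair each reference output with its weight variable (hypothetical helper).

    Both arguments are assumed to be colon-separated attribute lists.
    An empty weight string means that no weight is used at all.
    """
    outputs = reference.split(":")
    if weight == "":
        return {name: None for name in outputs}
    weights = weight.split(":")
    if len(weights) != len(outputs):
        raise ValueError(
            "%d weight variable(s) given for %d output(s)"
            % (len(weights), len(outputs))
        )
    return dict(zip(outputs, weights))
```

For two outputs where only the first has an uncertainty model, a call such as check_weights("out1:out2", "sigma1:one") pairs the second output with the “one” attribute, whereas check_weights("out1:out2", "sigma1") raises an error because one entry is missing.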
Once this method is called, the distance or likelihood function is created and stored within the calibration object. It may be necessary to access it in order to set certain options; this is discussed in Available options for every distance and likelihood function.
The following lines summarise this construction in a case where an instance cal of the fictitious class
TCalClass (assumed to inherit from the TCalibration class) is created.
# Define the dataservers
tdsRef = DataServer.TDataServer("reference", "myReferenceData")
# Load the data, both inputs (ref_var1 and ref_var2) and a single output (ref_out1).
tdsRef.fileDataRead("myInputData.dat")
...
tdsPar = DataServer.TDataServer("parameters", "myParameters")
tdsPar.addAttribute(DataServer.TNormalDistribution("par1", 0, 1))
# the parameter to calibrate
...
# Define the model
...
# Create the instance of TCalClass
cal = Calibration.TCalClass() # Constructor is discussed later on
# Define the least squares distance
cal.setDistance("LS", tdsRef, "ref_var1:ref_var2", "ref_out1")
In this fictitious example, the distance function is the least squares one; it uses the \(n\) values
of the inputs “ref_var1” and “ref_var2” and of the output “ref_out1” stored in tdsRef to calibrate
the parameter “par1”. No observation weight is needed in this case, as the least squares method does not
require one, and since there is only a single output, no variable weight needs to be defined either.
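To make the role of the \(n\) observations concrete, here is a minimal standalone sketch of a least squares comparison between observed outputs and model predictions. It is not the implementation of TLSDistanceFunction: the model toy_model, the data values and the plain sum of squared residuals are all assumptions chosen for illustration, and the exact definition used by Uranie (normalisation, square root) is the one given in Distances and likelihoods used to compare observations and model predictions.

```python
def toy_model(x1, x2, par1):
    """Hypothetical model: a single parameter scaling the sum of both inputs."""
    return par1 * (x1 + x2)


def sum_of_squares(observations, par1):
    """Sum of squared residuals over all (ref_var1, ref_var2, ref_out1) rows."""
    return sum(
        (out - toy_model(x1, x2, par1)) ** 2 for x1, x2, out in observations
    )


# Made-up observations, each row being (ref_var1, ref_var2, ref_out1).
observations = [(1.0, 2.0, 6.0), (0.5, 0.5, 2.0), (2.0, 1.0, 6.0)]

# A calibration procedure looks for the value of par1 that minimises this quantity.
candidates = [1.0, 1.5, 2.0, 2.5]
best_par1 = min(candidates, key=lambda p: sum_of_squares(observations, p))
```

Each observation contributes one residual, so the inputs and the reference output must be stored with the same number of entries in the dataserver.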