```{include} /../core/calibration/introduction/distance_compare_model.md
```

These definitions are not orthogonal. Indeed, if $\psi_i = \alpha$ for all $i \in [1,n]$, with $\alpha \in \mathbb{R}$, then the least squares function is equivalent to the weighted least squares one. This situation is realistic: it corresponds to a least squares estimation weighted by an uncertainty affecting the observations, when that uncertainty is constant throughout the data (meaning $\alpha = \sigma^{-2}$). This is called the **homoscedasticity** assumption, and it is important for the linear case, as discussed later on. One can also compare the relative and weighted least squares: if $\alpha \in \mathbb{R}$ and $\psi_i = (\alpha\% \times y_i)^{-1}$ for all $i \in [1,n]$, these two forms become equivalent (the relative least squares is useful when the uncertainty on the observations is multiplicative). Finally, if one assumes that the covariance matrix of the observations is the identity (meaning $\Sigma = \mathbf{1}$), the Mahalanobis distance is equivalent to the least squares distance. These equivalences are sketched numerically after the warning below.

```{warning}
It might seem natural to think that the lower the distance, the closer our parameters are to the real values. Taken literally, this would make "reaching a null distance" the ultimate target of calibration, which is actually dangerous. As in the general discussion in [](#models_module), the risk is to overfit the parameters by "learning" the set of observations at our disposal as the "truth", forgetting that the residuals (introduced in {eq}`epsilonCalib`) are there to account for observation uncertainties. In this case, knowing the value of the uncertainty on the observations, the ultimate target of the calibration should rather be the best agreement between observations and model predictions *within* that uncertainty, which translates into the distribution of the reduced residuals (something like $\lbrace (y^i-f_\theta^i)/\sigma_{\varepsilon_i} \rbrace_{i \in [1, n]}$ in a scalar case) behaving like a standard normal distribution. A sketch of this diagnostic is given in the second code block below.
```
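To make the equivalences above concrete, here is a minimal numerical sketch in Python. It assumes common textbook forms for the distances (the exact definitions are those of the included page above, not restated here), and the arrays `y` and `f` stand for purely hypothetical observations and model predictions:

```python
# Minimal sketch of the equivalences discussed above, ASSUMING the common
# textbook forms (the authoritative definitions are in the included page):
#   d_LS(y, f)       = sum_i (y_i - f_i)^2
#   d_WLS(y, f; psi) = sum_i psi_i * (y_i - f_i)^2
#   d_M(y, f; Sigma) = (y - f)^T Sigma^{-1} (y - f)
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(10.0, 1.0, size=50)      # hypothetical observations
f = y + rng.normal(0.0, 0.5, size=50)   # hypothetical model predictions

def d_ls(y, f):
    return np.sum((y - f) ** 2)

def d_wls(y, f, psi):
    return np.sum(psi * (y - f) ** 2)

def d_mahalanobis(y, f, cov):
    r = y - f
    return r @ np.linalg.solve(cov, r)

# 1) Constant weights psi_i = alpha = sigma^{-2} (homoscedasticity): the
#    weighted distance is alpha times the plain one, so both distances are
#    minimised by the same parameter values.
sigma = 0.5
alpha = sigma ** -2
assert np.isclose(d_wls(y, f, np.full_like(y, alpha)), alpha * d_ls(y, f))

# 2) Identity covariance: the Mahalanobis distance collapses to plain
#    least squares.
assert np.isclose(d_mahalanobis(y, f, np.eye(len(y))), d_ls(y, f))
print("equivalences verified numerically")
```

The point of the constant-weight check is that multiplying a distance by a positive constant does not move its minimiser, which is exactly why the two estimations are said to be equivalent.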
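As an illustration of the reduced-residuals diagnostic from the warning, the following sketch builds a synthetic case where the model is correct up to a known homoscedastic observation noise, then checks that the reduced residuals behave like a standard normal sample. All names (`y`, `f_theta`, `sigma_eps`) are hypothetical stand-ins, and the Kolmogorov-Smirnov test is one possible goodness-of-fit check, not one prescribed by the text:

```python
# Sketch of the reduced-residuals check on synthetic data. In a real study,
# y, f_theta and sigma_eps would be the observations, the calibrated model
# predictions and the (known) observation uncertainties.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 200
sigma_eps = np.full(n, 0.3)                # known observation uncertainty
f_theta = np.linspace(0.0, 5.0, n)         # calibrated model predictions
y = f_theta + rng.normal(0.0, sigma_eps)   # observations = model + noise

# Reduced residuals (y_i - f_theta_i) / sigma_eps_i: if the calibration only
# absorbed what the model can explain, these should look standard normal.
reduced = (y - f_theta) / sigma_eps
print(f"mean = {reduced.mean():.3f} (target 0), "
      f"std = {reduced.std(ddof=1):.3f} (target 1)")

# One possible check: a Kolmogorov-Smirnov test against N(0, 1).
stat, p_value = stats.kstest(reduced, "norm")
print(f"KS test vs N(0,1): statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

A large KS p-value here is consistent with the calibrated model agreeing with the observations within their uncertainty, which is the target described in the warning, rather than a literally null distance.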