8.1. Tests based on the Empirical Distribution Function (“EDF tests”)

This part introduces comparison tests, sometimes called “goodness of fit” tests, which are used for hypothesis testing. The idea is to check whether a given variable follows a predefined law among the implemented ones: normal, lognormal and uniform. To do so, three different tests are implemented in Uranie. If one calls \(F_n(x)\) the Empirical Distribution Function of the law \(F(x)\) (i.e. the distribution to be tested) and \(F_{0}(x)\) the reference law one wants to compare to, then, for \(n\) the number of data points in the EDF, these tests are defined as:

Kolmogorov-Smirnov (\(D\)) [Kol33]
\[D = \sup_{i=1,\ldots,n} |F_0(X_i) - F_n(X_i)|\]
Anderson-Darling (\(A^2\)) [AD52]
\[A^2 = n \int \frac{|F_0(x) - F_n(x)|^2}{F_0(x)(1-F_0(x))} dF_0(x) = -n -\frac{1}{n} \sum_{i=1}^{n} (2i-1) \times [\log(F_0(X_i))+\log(1-F_0(X_{n+1-i})) ]\]
Cramér-von Mises (\(W^2\)) [And62]
\[W^2 = n \int |F_0(x) - F_n(x)|^2 dF_0(x) = \frac{1}{12n} + \sum_{i=1}^{n} \left(F_0(X_i) - \frac{2i-1}{2n} \right)^2\]

In these three formulas, the set \((X_i)_{i=1,\dots, n}\) represents the data of the random variable \(x\) sorted in increasing order, which are usually handled through their CDF values \(F_0(X_i)\) for convenience.
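
As an illustration, the three statistics can be sketched in a few lines of plain Python, using their standard computational forms on the sorted CDF values \(F_0(X_i)\). This is not the Uranie implementation or its API: the names `normal_cdf` and `edf_statistics` are hypothetical, the reference law is assumed to be fully specified (no parameter estimated from the data), and the Kolmogorov-Smirnov distance is evaluated on both sides of each jump of the EDF, which is the usual way of obtaining the supremum from a finite sample.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Reference CDF F_0 for a normal law (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def edf_statistics(sample, cdf):
    """Return (D, A2, W2) for a sample against a reference CDF.

    Assumes every CDF value lies strictly between 0 and 1,
    otherwise the logarithms in A2 are undefined.
    """
    x = sorted(sample)            # ordered data X_1 <= ... <= X_n
    n = len(x)
    u = [cdf(v) for v in x]       # u[i-1] = F_0(X_i)

    # Kolmogorov-Smirnov: the EDF jumps from (i-1)/n to i/n at X_i,
    # so both sides of the jump are compared with F_0(X_i).
    d_plus = max(i / n - u[i - 1] for i in range(1, n + 1))
    d_minus = max(u[i - 1] - (i - 1) / n for i in range(1, n + 1))
    d = max(d_plus, d_minus)

    # Anderson-Darling: u[n - i] is F_0(X_{n+1-i}) in 1-based notation.
    a2 = -n - sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    ) / n

    # Cramer-von Mises.
    w2 = 1.0 / (12 * n) + sum(
        (u[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1)
    )
    return d, a2, w2


# Usage: a small made-up sample compared to a standard normal law.
data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
print(edf_statistics(data, normal_cdf))
```

Turning these statistics into an accept or reject decision still requires the critical values (or p-values) of each test, which depend on \(n\) and on whether the parameters of the reference law were estimated from the data; that step is not covered by this sketch.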