---                                                                                                 
myst:                                                                                           
    substitutions:                                                                              
        sentence: " (see {numref}`dataserver_compute_quantile` for illustration purpose)"
--- 

```{include} /../core/dataserver/statistics/quantile_computation/wilks_quantile_computation.md
``` 

The Wilks computation, on the other hand, requires not only a probability value but also a confidence level.
The quantile $x_p^\beta$ is an estimate of the quantile $x_p$ for the probability $p$, this time provided
with a confidence level $\beta$, meaning that in a fraction $\beta$ of the cases (95% for $\beta = 0.95$)
the obtained value is larger than the theoretical quantile. This is a way to be conservative and to quantify
how conservative one wants to be. To do this, the sample size $n$ must satisfy a necessary condition:

```{math}
n > \frac{\ln(1-\beta)}{\ln p}
```
This condition sets the smallest sample size for which an estimate can be obtained and, in most cases,
the accuracy reached (for a given sample size) is better than the one achieved with the simpler solution
provided above. It is also possible to increase the sample size to get a better description of the
quantile estimate.
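
As an illustration of the condition above, the following Python sketch computes the smallest integer $n$ that
strictly satisfies it. The function name `wilks_min_sample_size` is hypothetical and is not part of any
library API; $p$ and $\beta$ are assumed to be given as fractions in $]0, 1[$.

```python
import math


def wilks_min_sample_size(p: float, beta: float) -> int:
    """Smallest integer n such that n > ln(1 - beta) / ln(p).

    Hypothetical helper written for illustration only.
    p    : probability of the quantile, as a fraction (e.g. 0.95)
    beta : confidence level, as a fraction (e.g. 0.95)
    """
    if not (0.0 < p < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("p and beta must lie strictly between 0 and 1")
    # Both logarithms are negative, so the ratio is positive.
    bound = math.log(1.0 - beta) / math.log(p)
    # The inequality is strict: take the first integer above the bound.
    return math.floor(bound) + 1


if __name__ == "__main__":
    for beta in (0.90, 0.95, 0.99):
        print(f"p = 0.95, beta = {beta}: n >= {wilks_min_sample_size(0.95, beta)}")
```

For $p = 0.95$ and $\beta = 0.95$ this gives $n = 59$, which is the sample size commonly quoted for the
95%/95% case.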

{{
    "```{" "figure" "} " + parent_dir +
        "/methodology/statistics/figures/computeQuantileComparisontheo.png\n"
    ":align: center\n"
    ":name: dataserver_compute_quantile\n"
    + figure_scale_reduced + "\n"
    "\n"
    "Illustration of the results of 100000 quantile determinations, applied to a reduced centered 
    gaussian distribution, comparing the usual and Wilks methods. The number of points in the 
    reduced centered gaussian distribution is varied, as well as the confidence level.\n"
    "```"
}}

{numref}`dataserver_compute_quantile` shows a simple case: the estimation of the 95% quantile of a
centered-reduced normal distribution. The theoretical value (red dashed line) is compared to the results
of 100000 empirical estimations, obtained either with the simple recipe (black and blue curves) or with
the Wilks method (red, green and magenta curves). Several conclusions can be drawn:

- The average of the simpler quantile estimate is slightly biased with respect to the theoretical value.
This is due to the choice of k, discussed in [](#dataserver_statistics_compute_quantile), which can
lead to an underestimation or an overestimation of the quantile value. The bias decreases as the
sample size increases.

- The standard deviation of the distributions (whatever method is considered) decreases as the
sample size increases.

- When using the Wilks method, the fraction of estimates below the theoretical value decreases as
the confidence level increases.
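
The last observation can be checked with a small Monte Carlo experiment. The sketch below is not the code
used to produce the figure: it uses far fewer repetitions, relies on the Python standard library only, and
assumes the simplest (first-order) Wilks estimator, namely the maximum of a sample of the minimal size
allowed by the condition above. The names `wilks_size` and `fraction_below` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def wilks_size(p, beta):
    """Smallest integer n with n > ln(1 - beta) / ln(p) (hypothetical helper)."""
    return math.floor(math.log(1.0 - beta) / math.log(p)) + 1


def fraction_below(p, beta, n_trials, rng):
    """Fraction of first-order Wilks estimates falling below the true quantile.

    Assumption: the estimate is the maximum of a standard normal sample
    whose size is the minimal one for the requested (p, beta).
    """
    x_p = NormalDist().inv_cdf(p)  # theoretical quantile of N(0, 1)
    n = wilks_size(p, beta)
    below = 0
    for _ in range(n_trials):
        estimate = max(rng.gauss(0.0, 1.0) for _ in range(n))
        if estimate < x_p:
            below += 1
    return below / n_trials


rng = random.Random(12345)  # fixed seed so the run is reproducible
p = 0.95
fractions = {beta: fraction_below(p, beta, 2000, rng) for beta in (0.90, 0.95, 0.99)}

for beta, frac in fractions.items():
    # The expected fraction is p**n, which stays just below 1 - beta.
    print(f"beta = {beta}: n = {wilks_size(p, beta)}, fraction below = {frac:.4f}")
```

With this estimator the probability that an estimate falls below the theoretical quantile is $p^n$, so the
printed fractions are expected to stay close to, and slightly under, $1 - \beta$, up to statistical
fluctuations.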