2.4.2. Computing the ranking
The ranking of variables is used in many methods that focus on monotonicity rather than on linearity (this is discussed throughout this documentation when dealing with regression, correlation matrices, …). In Uranie it is done as follows: for every attribute considered (all attributes by default, if the function is called without argument), a new attribute is created whose name is the name of the considered attribute with the prefix “Rk_”. For a simple double-precision attribute, the ranking consists in assigning to each attribute entry an integer between 1 and the number of patterns \(N\), following an order relation (in Uranie, 1 is assigned to the smallest value and \(N\) to the largest one).
This method has been modified in order to cope with constant-size vectors, but also to stabilise its
behaviour when going from one compiler version to another. The first modification only consists in
considering every element of a constant-size vector independently from the others, so every element is
in fact treated as if it were a different attribute. The second part is more technical: the sorting
method now uses std::stable_sort, ensuring that all platforms (operating systems and
compiler versions) have the same behaviour. The main problem arose when two patterns had
the same value for the attribute under study. In this case, the ranking was not done in the same way
depending on the version of the compiler. Now ties should be treated consistently: if two or more
patterns have the same value for a specific attribute, the first one met in the array of attribute values
is given the rank \(i\), the second one is assigned \(i+1\), and so on. Here is a small
example of this computation:
"""
Example of rank usage for illustration purpose
"""
from URANIE import DataServer

# Load the geyser data into a data server
tdsGeyser = DataServer.TDataServer("geyser", "poet")
tdsGeyser.fileDataRead("geyser.dat")

# Create the rank attribute "Rk_x1" from "x1", then compute its statistics
tdsGeyser.computeRank("x1")
tdsGeyser.computeStatistic("Rk_x1")

# The ranks should span from 1 to the number of patterns
print("NPatterns="+str(tdsGeyser.getNPatterns())+"; min(Rk_x1)= " +
      str(tdsGeyser.getAttribute("Rk_x1").getMinimum())+"; max(Rk_x1)= " +
      str(tdsGeyser.getAttribute("Rk_x1").getMaximum()))
This macro should return
NPatterns=272; min(Rk_x1)= 1.0; max(Rk_x1)= 272.0
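The tie-breaking rule and the element-wise treatment of constant-size vectors described above can be illustrated with a minimal pure-Python sketch. It is not Uranie's implementation: the helper names stable_rank and stable_rank_vector are hypothetical, and Python's built-in sorted (which is stable) stands in for std::stable_sort.

```python
def stable_rank(values):
    """Return 1-based ranks; tied values are ranked in order of appearance."""
    # sorted() is stable, so indices of equal values keep their original order
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def stable_rank_vector(rows):
    """Rank each element position of constant-size vectors independently."""
    # Transpose so that each element position becomes its own "attribute"
    columns = list(zip(*rows))
    ranked_columns = [stable_rank(list(col)) for col in columns]
    # Transpose back to one vector of ranks per pattern
    return [list(row) for row in zip(*ranked_columns)]


# Scalar attribute with ties: 3.5 and 1.2 both appear twice
print(stable_rank([3.5, 1.2, 3.5, 0.7, 1.2]))
# prints [4, 2, 5, 1, 3]

# Three patterns, each holding a vector of size 2
print(stable_rank_vector([[2.0, 9.0], [1.0, 9.0], [2.0, 4.0]]))
# prints [[2, 2], [1, 3], [3, 1]]
```

In the scalar case, the two entries equal to 1.2 receive ranks 2 and 3 in their order of appearance, and the two entries equal to 3.5 receive ranks 4 and 5, so the ranks still cover 1 to the number of patterns without gaps.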
Summary: computeRank
computeRank(const char* varexp="*", option* option)
Creates a new attribute for every attribute requested (or for all attributes if no argument is provided).
String-type and non-constant-vector-type attributes are disregarded, and a warning is shown to let the user know.