Documentation / User's manual in C++ :
In a first step, to get accustomed with TDataServer
, we propose different macros related to this subject. Since it
constitutes the preliminary and almost mandatory step of a proper use of Uranie, these macros are only for
educational purposes.
The goal of this macro is only to master the objects TAttribute
and TDataServer
of Uranie. Three
attributes will be created and linked to a TDataServer
object, then the log of this object will be
printed to check internal data of this TDataServer
.
{
// Define the attribute "x"
TAttribute *px = new TAttribute("x", -2.0, 4.0);
px->setTitle("#Delta P^{#sigma}");
px->setUnity("#frac{mm^{2}}{s}");
// Define the attribute "y"
TAttribute *py = new TAttribute("y", 0.0, 1.0);
// Define the DataServer of the study
TDataServer *tds = new TDataServer("tds", "my first TDS");
// Add the attributes in the TDataServer
tds->addAttribute(px);
tds->addAttribute(py);
tds->addAttribute(new TAttribute("z", 0.25, 0.50));
tds->printLog();
}
The first attribute "x" is defined on [-2.0, 4.0]; its title is and unity
TAttribute *px = new TAttribute("x", -2.0, 4.0);
px->setTitle("#Delta P^{#sigma}");
px->setUnity("#frac{mm^{2}}{s}");
The second attribute "y" is defined on [0.0,1.0]; it will be set with its name as title but without unity.
TAttribute *py = new TAttribute("y", 0.0, 1.0);
Secondly, a TDataServer
object is created and the two attributes x and y
created before are linked to this one.
TDataServer *tds = new TDataServer("tds", "my first TDS");
tds->addAttribute(px);
tds->addAttribute(py);
Finally, the last attribute z (defined on [0.25,0.50]) is directly added to the TDataServer
(its
title will be its name and it will be set without unity) by creating it. An attribute could, indeed, be added to a
TDataServer
meanwhile creating it, but then no other information than those available in the constructor would be set.
tds->addAttribute(new TAttribute("z", 0.25, 0.50));
Then, the log of the TDataServer
object is printed.
tds->printLog();
Generally speaking, all Uranie objects have the printLog
method which allows to print
internal data of the object.
Processing dataserverAttributes.C... --- Uranie v0.0/0 --- Developed with ROOT (6.32.02) Copyright (C) 2013-2024 CEA/DES Contact: support-uranie@cea.fr Date: Tue Jan 09, 2024 TDataServer::printLog[] Name[tds] Title[my first TDS] Origin[Unknown] _sdatafile[] _sarchivefile[_dataserver_.root] ******************************* ** TDataSpecification::printLog ****** ** Name[uheader__tds] Title[Header of my first TDS] ** relationName[Header of my first TDS] ** attributs[4] **** With _listOfAttributes Attribute[0/4] ******************************* ** TAttribute::printLog ****** ** Name[tds__n__iter__] ** Title[tds__n__iter__] ** unity[] ** type[0] ** share[1] ** Origin[kIterator] ** Attribute[kInput] ** _snote[] -------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------- ** - min[] max[] ** - mean[] std[] ** lowerBound[-1.42387e+64] upperBound[1.42387e+64] ** NOT _defaultValue[] ** NOT _stepValue[] ** No Attribute to substitute level[0] ******************************* Attribute[1/4] ******************************* ** TAttribute::printLog ****** ** Name[x] ** Title[#Delta P^{#sigma}] ** unity[#frac{mm^{2}}{s}] ** type[0] ** share[2] ** Origin[kAttribute] ** Attribute[kInput] ** _snote[] -------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------- ** - min[] max[] ** - mean[] std[] ** lowerBound[-2] upperBound[4] ** NOT _defaultValue[] ** NOT _stepValue[] ** No Attribute to substitute level[0] ******************************* Attribute[2/4] ******************************* ** TAttribute::printLog ****** ** Name[y] ** Title[y] ** unity[] ** type[0] ** share[2] ** Origin[kAttribute] ** Attribute[kInput] ** _snote[] -------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------- ** - min[] max[] ** - mean[] std[] ** lowerBound[0] upperBound[1] ** NOT _defaultValue[] ** NOT _stepValue[] ** No Attribute to substitute level[0] ******************************* Attribute[3/4] ******************************* ** TAttribute::printLog ****** ** Name[z] ** Title[z] ** unity[] ** type[0] ** share[2] ** Origin[kAttribute] ** Attribute[kInput] ** _snote[] -------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------- ** - min[] max[] ** - mean[] std[] ** lowerBound[0.25] upperBound[0.5] ** NOT _defaultValue[] ** NOT _stepValue[] ** No Attribute to substitute level[0] ******************************* ** TDataSpecification::fin de printLog ******************************* fin de TDataServer::printLog[]
The objective of this macro is to merge data contained in a TDataServer
with data contained in another TDataServer
. Both TDataServer
have to contain the same number of patterns. We choose here to merge two TDataServer
, loaded from two ASCII files, each
of which contains 9 patterns.
The first ASCII file "tds1.dat"
defines the four variables x, dy, z,
theta:
#COLUMN_NAMES: x| dy| z| theta #COLUMN_TITLES: x_{n}| "#delta y"| ""| #theta #COLUMN_UNITS: N| Sec| KM/Sec| M^{2} 1 1 11 11 1 2 12 21 1 3 13 31 2 1 21 12 2 2 22 22 2 3 23 32 3 1 31 13 3 2 32 23 3 3 33 33
and the second ASCII file "tds2.dat"
defines the four other variables x2,
y, u, ua:
#COLUMN_NAMES: x2| y| u| ua 1 1 102 11 1 2 104 12 1 3 106 13 2 1 202 21 2 2 204 22 2 3 206 23 3 1 302 31 3 2 304 32 3 3 306 33
The merging operation will be executed in the first TDataServer
tds1
; so it will contain all the
attributes at the end.
{
TDataServer * tds1 = new TDataServer();
TDataServer * tds2 = new TDataServer();
tds1->fileDataRead("tds1.dat");
cout<<"Dumping tds1"<<endl;
tds1->Scan("*");
tds2->fileDataRead("tds2.dat");
cout<<"Dumping tds2"<<endl;
tds2->Scan("*");
tds1->merge(tds2);
cout<<"Dumping merged tds1 and tds2"<<endl;
tds1->Scan("*","","colsize=3 col=9::::::::");
}
Both TDataServers are filled with ASCII data files with the method filedataRead()
.
tds1->fileDataRead("tds1.dat");
tds2->fileDataRead("tds2.dat");
Data of the second dataserver tds2
are then merged into the first one.
tds1->merge(tds2);
Data are then dumped in the terminal:
tds1->Scan("*");
Processing dataserverMerge.C... --- Uranie v0.0/0 --- Developed with ROOT (6.32.02) Copyright (C) 2013-2024 CEA/DES Contact: support-uranie@cea.fr Date: Tue Jan 09, 2024 Dumping tds1 ************************************************************************ * Row * tds1__n__ * x.x * dy.dy * z.z * theta.the * ************************************************************************ * 0 * 1 * 1 * 1 * 11 * 11 * * 1 * 2 * 1 * 2 * 12 * 21 * * 2 * 3 * 1 * 3 * 13 * 31 * * 3 * 4 * 2 * 1 * 21 * 12 * * 4 * 5 * 2 * 2 * 22 * 22 * * 5 * 6 * 2 * 3 * 23 * 32 * * 6 * 7 * 3 * 1 * 31 * 13 * * 7 * 8 * 3 * 2 * 32 * 23 * * 8 * 9 * 3 * 3 * 33 * 33 * ************************************************************************ Dumping tds2 ************************************************************************ * Row * tds2__n__ * x2.x2 * y.y * u.u * ua.ua * ************************************************************************ * 0 * 1 * 1 * 1 * 102 * 11 * * 1 * 2 * 1 * 2 * 104 * 12 * * 2 * 3 * 1 * 3 * 106 * 13 * * 3 * 4 * 2 * 1 * 202 * 21 * * 4 * 5 * 2 * 2 * 204 * 22 * * 5 * 6 * 2 * 3 * 206 * 23 * * 6 * 7 * 3 * 1 * 302 * 31 * * 7 * 8 * 3 * 2 * 304 * 32 * * 8 * 9 * 3 * 3 * 306 * 33 * ************************************************************************ Dumping merged tds1 and tds2 ************************************************************************ * Row * tds1__n__ * x.x * dy. * z.z * the * x2. * y.y * u.u * ua. * ************************************************************************ * 0 * 1 * 1 * 1 * 11 * 11 * 1 * 1 * 102 * 11 * * 1 * 2 * 1 * 2 * 12 * 21 * 1 * 2 * 104 * 12 * * 2 * 3 * 1 * 3 * 13 * 31 * 1 * 3 * 106 * 13 * * 3 * 4 * 2 * 1 * 21 * 12 * 2 * 1 * 202 * 21 * * 4 * 5 * 2 * 2 * 22 * 22 * 2 * 2 * 204 * 22 * * 5 * 6 * 2 * 3 * 23 * 32 * 2 * 3 * 206 * 23 * * 6 * 7 * 3 * 1 * 31 * 13 * 3 * 1 * 302 * 31 * * 7 * 8 * 3 * 2 * 32 * 23 * 3 * 2 * 304 * 32 * * 8 * 9 * 3 * 3 * 33 * 33 * 3 * 3 * 306 * 33 * ************************************************************************
The objective of this macro is to load two TDataServer
objects using two different ways: either with an ASCII file
"pasture.dat"
or with a design-of-experiments. Then, we evaluate the analytic function
ModelPasture
on this two TDataServer
. The data file "pasture.dat"
is
written in the "Salome-table" format of Uranie:
#COLUMN_NAMES: time| yield 9 8.93 14 10.8 21 18.59 28 22.33 42 39.35 57 56.11 63 61.73 70 64.62 79 67.08
#include "TMath.h"
void ModelPasture(Double_t *x, Double_t *y)
{
Double_t theta1=69.95, theta2=61.68, theta3=-9.209, theta4=2.378;
y[0] = theta1;
y[0] -= theta2* TMath::Exp( -1.0 * TMath::Exp( theta3 + theta4 * TMath::Log(x[0])));
}
void dataserverLoadASCIIFilePasture()
{
TCanvas *C = new TCanvas("mycanvas","mycanvas",1);
TDataServer* tds = new TDataServer();
tds->fileDataRead("pasture.dat");
tds->getTuple()->SetMarkerStyle(8);
tds->getTuple()->SetMarkerSize(1.5);
tds->draw("yield:time");
TLauncherFunction *tlf = new TLauncherFunction(tds, ModelPasture,"time","yhat");
tlf->run();
tds->getTuple()->SetMarkerColor(kBlue);
tds->getTuple()->SetLineColor(kBlue);
tds->draw("yhat:time","","lpsame");
TDataServer *tds2 = new TDataServer();
tds2->addAttribute( new TUniformDistribution("time2",9, 80));
TSampling *tsamp = new TSampling(tds2, "lhs", 1000);
tsamp->generateSample();
tds2->getTuple()->SetMarkerColor(kGreen);
tds2->getTuple()->SetLineColor(kGreen);
tlf = new TLauncherFunction(tds2, ModelPasture,"","yhat2");
tlf->run();
tds2->draw("yhat2:time2","","psame");
tds->draw("yhat:time","","lpsame");
gPad->SaveAs("pasture.png");
}
The design ModelPasture
is defined in a function
void ModelPasture(Double_t *x, Double_t *y)
{
Double_t theta1=69.95, theta2=61.68, theta3=-9.209, theta4=2.378;
y[0] = theta1;
y[0] -= theta2* TMath::Exp( -1.0 * TMath::Exp( theta3 + theta4 * TMath::Log(x[0])));
}
The first TDataServer
is filled with the ASCII file "pasture.dat"
through the
fileDataRead
method
tds->fileDataRead("pasture.dat");
The design is evaluated with the function ModelPasture
applied on the input attribute
time, leading to the output attribute named yhat.
TLauncherFunction *tlf = new TLauncherFunction(tds, ModelPasture,"time","yhat");
tlf->run();
A TAttribute
, obeying an uniform law on [9;80] is added to the second TDataServer
which is filled
with a design-of-experiments of 1000 patterns, using the LHS method.
tds2->addAttribute( new TUniformDistribution("time2",9, 80));
TSampling *tsamp = new TSampling(tds2, "lhs", 1000);
tsamp->generateSample();
The design is now evaluated with this TDataServer
on the attribute time2
tlf = new TLauncherFunction(tds2, ModelPasture,"","yhat2");
tlf->run();
Loading data in a TDataServer
, using the "Salome-table" format of Uranie and applying basic visualisation methods on
attributes.
The data file is named "flowrateUniformDesign.dat"
and data correspond to an
Uniform design-of-experiments of 32 patterns for a "code" with 8 inputs (, , , , , , , ) along with a response
("yhat"). The data file "flowrateUniformDesign.dat"
is in the
"Salome-table" format of Uranie.
#NAME: flowrateborehole #TITLE: Uniform design of flow rate borehole problem proposed by Ho and Xu(2000) #COLUMN_NAMES: rw| r| tu| tl| hu| hl| l| kw | ystar #COLUMN_TITLES: r_{#omega}| r | T_{u} | T_{l} | H_{u} | H_{l} | L | K_{#omega} | y^{*} #COLUMN_UNITS: m | m | m^{2}/yr | m^{2}/yr | m | m | m | m/yr | m^{3}/yr 0.0500 33366.67 63070.0 116.00 1110.00 768.57 1200.0 11732.14 26.18 0.0500 100.00 80580.0 80.73 1092.86 802.86 1600.0 10167.86 14.46 0.0567 100.00 98090.0 80.73 1058.57 717.14 1680.0 11106.43 22.75 0.0567 33366.67 98090.0 98.37 1110.00 734.29 1280.0 10480.71 30.98 0.0633 100.00 115600.0 80.73 1075.71 751.43 1600.0 11106.43 28.33 0.0633 16733.33 80580.0 80.73 1058.57 785.71 1680.0 12045.00 24.60 0.0700 33366.67 63070.0 98.37 1092.86 768.57 1200.0 11732.14 48.65 0.0700 16733.33 115600.0 116.00 990.00 700.00 1360.0 10793.57 35.36 0.0767 100.0 115600.0 80.73 1075.71 751.43 1520.0 10793.57 42.44 0.0767 16733.33 80580.0 80.73 1075.71 802.86 1120.0 9855.00 44.16 0.0833 50000.00 98090.0 63.10 1041.43 717.14 1600.0 10793.57 47.49 0.0833 50000.00 115600.0 63.10 1007.14 768.57 1440.0 11419.29 41.04 0.0900 16733.33 63070.0 116.00 1075.71 751.43 1120.0 11419.29 83.77 0.0900 33366.67 115600.0 116.00 1007.14 717.14 1360.0 11106.43 60.05 0.0967 50000.00 80580.0 63.10 1024.29 820.00 1360.0 9855.00 43.15 0.0967 16733.33 80580.0 98.37 1058.57 700.00 1120.0 10480.71 97.98 0.1033 50000.00 80580.0 63.10 1024.29 700.00 1520.0 10480.71 74.44 0.1033 16733.33 80580.0 98.37 1058.57 820.00 1120.0 10167.86 72.23 0.1100 50000.00 98090.0 63.10 1024.29 717.14 1520.0 10793.57 82.18 0.1100 100.00 63070.0 98.37 1041.43 802.86 1600.0 12045.00 68.06 0.1167 33366.67 63070.0 116.00 990.00 785.71 1280.0 12045.00 81.63 0.1167 100.00 98090.0 98.37 1092.86 802.86 1680.0 9855.00 72.5 0.1233 16733.33 115600.0 80.73 1092.86 734.29 1200.0 11419.29 161.35 0.1233 16733.33 63070.0 63.10 1041.43 785.71 1680.0 12045.00 86.73 0.1300 33366.67 80580.0 116.00 1110.00 768.57 1280.0 11732.14 164.78 0.1300 100.00 98090.0 98.37 1110.00 820.00 1280.0 10167.86 121.76 0.1367 50000.00 98090.0 63.10 1007.14 820.00 1440.0 10167.86 76.51 0.1367 33366.67 98090.0 116.00 1024.29 700.00 1200.0 10480.71 164.75 0.1433 50000.00 63070.0 116.00 990.00 785.71 1440.0 9855.00 89.54 0.1433 50000.00 115600.0 63.10 1007.14 734.29 1440.0 11732.14 141.09 0.1500 33366.67 63070.0 98.37 990.00 751.43 1360.0 11419.29 139.94 0.1500 100.00 115600.0 80.73 1041.43 734.29 1520.0 11106.43 157.59
{
// Create a TDataServer
TDataServer * tds = new TDataServer();
// Load the data base in the DataServer
tds->fileDataRead("flowrateUniformDesign.dat");
// Graph
TCanvas *Canvas = new TCanvas("c1", "Graph for the Macro loadASCIIFile",5,64,1270,667);
TPad *pad = new TPad("pad","pad",0, 0.03, 1, 1); pad->Draw();
pad->Divide(2,2);
pad->cd(1); tds->draw("ystar");
pad->cd(2); tds->draw("ystar:rw");
pad->cd(3); tds->drawTufte("ystar:rw");
pad->cd(4); tds->drawProfile("ystar:rw");
tds->startViewer();
}
The objective of this macro is to load the ASCII data file "youngsmodulus.dat"
and to apply
visualisations on the attribute E with different options. The data file
"youngsmodulus.dat"
is in the "Salome-table" format of Uranie.
#NAME: youngsmodulus #TITLE: Young's Modulus E for the Golden Gate Bridge #COLUMN_NAMES: E #COLUMN_TITLES: Young's Modulues #COLUMN_UNITS: ksi 28900 29200 27400 28700 28400 29900 30200 29500 29600 28400 28300 29300 29300 28100 30200 30200 30300 31200 28800 27600 29600 25900 32000 33400 30600 32700 31300 30500 31300 29000 29400 28300 30500 31100 29300 27400 29300 29300 31300 27500 29400
Data are then exported in header file "youngsmodulus.h"
which can be imported in some C file:
// File "youngsmodulus.h" generated by ROOT v5.34/32
// DateTime Tue Nov 3 10:40:13 2015
// DataServer : Name="youngsmodulus" Title="Young's Modulus E for the Golden Gate Bridge" Global Select=""
#define youngsmodulus_nPattern 41
// Attribute Name="E"
Double_t E[youngsmodulus_nPattern] = {
2.890000000e+04,
2.920000000e+04,
2.740000000e+04,
2.870000000e+04,
2.840000000e+04,
2.990000000e+04,
3.020000000e+04,
2.950000000e+04,
2.960000000e+04,
2.840000000e+04,
2.830000000e+04,
2.930000000e+04,
2.930000000e+04,
2.810000000e+04,
3.020000000e+04,
3.020000000e+04,
3.030000000e+04,
3.120000000e+04,
2.880000000e+04,
2.760000000e+04,
2.960000000e+04,
2.590000000e+04,
3.200000000e+04,
3.340000000e+04,
3.060000000e+04,
3.270000000e+04,
3.130000000e+04,
3.050000000e+04,
3.130000000e+04,
2.900000000e+04,
2.940000000e+04,
2.830000000e+04,
3.050000000e+04,
3.110000000e+04,
2.930000000e+04,
2.740000000e+04,
2.930000000e+04,
2.930000000e+04,
3.130000000e+04,
2.750000000e+04,
2.940000000e+04,
};
// End of attribute E
// End of File youngsmodulus.h
{
TDataServer * tds = new TDataServer();
tds->fileDataRead("youngsmodulus.dat");
// gEnv->SetValue("Hist.Binning.1D.x", 10);
// tds->getTuple()->Draw("E>>Attribute E(6, 25000, 34000)","","text");
// tds->getTuple()->Draw("E>>Attribute E(16, 25000, 34000)");
tds->computeStatistic("E");
tds->getAttribute("E")->printLog();
tds->exportDataHeader("youngsmodulus.h");
TCanvas *Canvas = new TCanvas("c1", "Graph for the Macro loadASCIIFile",5,64,1270,667);
TPad *pad = new TPad("pad","pad",0, 0.03, 1, 1); pad->Draw();
pad->Divide(2,2);
pad->cd(1); tds->draw("E");
pad->cd(2); tds->draw("E" ,"", "nclass=sturges");
pad->cd(3); tds->draw("E" ,"", "nclass=scott");
pad->cd(4); tds->draw("E" ,"", "nclass=fd");
}
The TDataServer
is filled with the ASCII data file "youngsmodulus.dat"
with the method
fileDataRead
:
tds->fileDataRead("youngsmodulus.dat");
Variable E is then visualised with different options:
tds->draw("E" ,"", "nclass=sturges");
tds->draw("E" ,"", "nclass=scott");
tds->draw("E" ,"", "nclass=fd");
Characteristic values are computed (maximum, minimum, mean and standard deviation) with:
tds->computeStatistic("E");
Data are exported in a header file with
tds->exportDataHeader("youngsmodulus.h");
Processing loadASCIIFileYoungsModulus.C... --- Uranie v0.0/0 --- Developed with ROOT (6.32.02) Copyright (C) 2013-2024 CEA/DES Contact: support-uranie@cea.fr Date: Tue Jan 09, 2024 ******************************* ** TAttribute::printLog ****** ** Name[E] ** Title[ Young's Modulues] ** unity[ksi] ** type[0] ** share[1] ** Origin[kAttribute] ** Attribute[kInput] ** _snote[] -------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------- ** - min[25900] max[33400] ** - mean[29575.6] std[1506.95] ** lowerBound[-1.42387e+64] upperBound[1.42387e+64] ** NOT _defaultValue[] ** NOT _stepValue[] ** No Attribute to substitute level[0] *******************************
The objective of this macro is to load the ASCII data file ionosphere.dat
which defines 34
input variables and one output variable y and applies visualisation on one of
these variables. The data file ionosphere.dat
is in the "Salome-table" format of Uranie but
is not shown for convenience.
{
TDataServer * tds = new TDataServer();
tds->fileDataRead("ionosphere.dat");
tds->getAttribute("x28")->SetTitle("#Delta P_{e}^{F_{iso}}");
// Graph
TCanvas *Canvas = new TCanvas("c1", "Graph for the Macro loadASCIIFileIonosphere",5,64,1270,667);
tds->draw("x28");
}
The TDataServer
is filled with ionosphere.dat
with the fileDataRead
method
tds->fileDataRead("ionosphere.dat");
A new title is set for the variable x28
tds->getAttribute("x28")->SetTitle("#Delta P_{e}^{F_{iso}}");
This variable is then drawn with its new title
tds->draw("x28");
The objective of this macro is to load the ASCII data file cornell.dat
which defines seven
input variables and one output variable y on twelve patterns. The input file
cornell.dat
is in the "Salome-table" format of Uranie
#NAME: cornell #TITLE: Dataset Cornell 1990 #COLUMN_NAMES: x1 | x2 | x3 | x4 | x5 | x6 | x7 | y #COLUMN_TITLES: x_{1} | x_{2} | x_{3} | x_{4} | x_{5} | x_{6} | x_{7} | y 0.00 0.23 0.00 0.00 0.00 0.74 0.03 98.7 0.00 0.10 0.00 0.00 0.12 0.74 0.04 97.8 0.00 0.00 0.00 0.10 0.12 0.74 0.04 96.6 0.00 0.49 0.00 0.00 0.12 0.37 0.02 92.0 0.00 0.00 0.00 0.62 0.12 0.18 0.08 86.6 0.00 0.62 0.00 0.00 0.00 0.37 0.01 91.2 0.17 0.27 0.10 0.38 0.00 0.00 0.08 81.9 0.17 0.19 0.10 0.38 0.02 0.06 0.08 83.1 0.17 0.21 0.10 0.38 0.00 0.06 0.08 82.4 0.17 0.15 0.10 0.38 0.02 0.10 0.08 83.2 0.21 0.36 0.12 0.25 0.00 0.00 0.06 81.4 0.00 0.00 0.00 0.55 0.00 0.37 0.08 88.1
{
TDataServer * tds = new TDataServer();
tds->fileDataRead("cornell.dat");
TMatrix matCorr = tds->computeCorrelationMatrix("");
matCorr.Print();
}
The TDataServer
is filled with cornell.dat
with the fileDataRead
method:
tds->fileDataRead("cornell.dat");
Then the correlation matrix is computed on all attributes:
TMatrix matCorr = tds->computeCorrelationMatrix("");
Processing loadASCIIFileCornell.C... --- Uranie v0.0/0 --- Developed with ROOT (6.32.02) Copyright (C) 2013-2024 CEA/DES Contact: support-uranie@cea.fr Date: Tue Jan 09, 2024 8x8 matrix is as follows | 0 | 1 | 2 | 3 | 4 | ---------------------------------------------------------------------- 0 | 1 0.1042 0.9999 0.3707 -0.548 1 | 0.1042 1 0.1008 -0.5369 -0.2926 2 | 0.9999 0.1008 1 0.374 -0.5482 3 | 0.3707 -0.5369 0.374 1 -0.2113 4 | -0.548 -0.2926 -0.5482 -0.2113 1 5 | -0.8046 -0.1912 -0.8052 -0.6457 0.4629 6 | 0.6026 -0.59 0.6071 0.9159 -0.2744 7 | -0.8373 -0.07082 -0.838 -0.7067 0.4938 | 5 | 6 | 7 | ---------------------------------------------------------------------- 0 | -0.8046 0.6026 -0.8373 1 | -0.1912 -0.59 -0.07082 2 | -0.8052 0.6071 -0.838 3 | -0.6457 0.9159 -0.7067 4 | 0.4629 -0.2744 0.4938 5 | 1 -0.6564 0.9851 6 | -0.6564 1 -0.7411 7 | 0.9851 -0.7411 1
The objective of this macro is to test the classical quantile estimation and compare it to the Wilks estimation for a dummy gaussian distribution. Four different estimations of the 95% quantile value are done (along with one estimation of th 99% quantile) for illustration purposes:
- using the usual method with a 200-points sample.
- using the usual method with a 400-points sample.
- using the Wilks method with a 95% confidence level (with 59-points sample).
- using the Wilks method with a 95% confidence level (with 400-points sample).
- using the Wilks method with a 99% confidence level (with 90-points sample).
{
//Create a DataServer
TDataServer *tds = new TDataServer("foo","pouet");
tds->addAttribute("x"); //With one attribute
//Create Histogram to store the quantile values
TH1F *Q200 = new TH1F("quantile200","",60,1,4); Q200->SetLineColor(1); Q200->SetLineWidth(2);
TH1F *Q400 = new TH1F("quantile400","",60,1,4); Q400->SetLineColor(4); Q400->SetLineWidth(2);
TH1F *QW95 = new TH1F("quantileWilks95","",60,1,4); QW95->SetLineColor(2); QW95->SetLineWidth(2);
TH1F *QW95400 = new TH1F("quantileWilks95400","",60,1,4); QW95400->SetLineColor(8); QW95400->SetLineWidth(2);
TH1F *QW99 = new TH1F("quantileWilks99","",60,1,4); QW99->SetLineColor(6); QW99->SetLineWidth(2);
//Defining the sample size
int nb[4] = {200,400,59,90};
double quant=0.95; //Quantile value
double CL[2] = {0.95, 0.99};//Confidence level value for the two Wilks computation
//Loop over the number of estimation (2 usual with different number of sample and 1 with Wilks estimation)
for(unsigned int iq=0; iq<4; iq++)
{
//Produce 10000 drawing to get smooth distributio,
for(unsigned int itest=0; itest < 10000; itest++)
{
tds->createTuple(); // Create the tuple to store value
// Fill it with random drawing of centered gaussian
for(unsigned int ival=0; ival < nb[iq]; ival++)
tds->getTuple()->Fill(ival+1, gRandom->Gaus(0,1) );
// Estimate the quantile...
double value;
if (iq<2)
{
// ... with usual methods...
tds->computeQuantile("x",quant, value);
if(iq==0) Q200->Fill(value); // ... on a 200-points sample
else
{
Q400->Fill(value); // ... on a 400-points sample
tds->estimateQuantile("x", quant, value, CL[iq-1]);
QW95400->Fill(value); // compute the quantile at 95% CL
}
}
else
{
// ... with the wilks optimised sample
tds->estimateQuantile("x", quant, value, CL[iq-2]);
if(iq==2) QW95->Fill(value); // compute the quantile at 95% CL
else QW99->Fill(value); // compute the quantile at 99% CL
}
// Delete the tuple
tds->deleteTuple();
}
}
//Produce the plot with requested style
gStyle->SetOptStat(0);
TCanvas *can = new TCanvas("Can","Can",10,10,1000,1000);
Q400->GetXaxis()->SetTitle("Quant_{95%}(Gaus_{(0,1)})");
Q400->GetXaxis()->SetTitleOffset(1.2);
Q400->Draw();
Q200->Draw("same");
QW95->Draw("same");
QW95400->Draw("same");
QW99->Draw("same");
//Add the theoretical estimation
TLine *lin = new TLine(); lin->SetLineStyle(3);
lin->DrawLine(1.645,0,1.645, Q400->GetMaximum());
//Add a block of legend
TLegend *leg = new TLegend(0.4,0.6,0.8,0.85);
leg->AddEntry(lin,"Theoretical quantile","l");
leg->AddEntry(Q200, "Usual quantile (200 pts)","l");
leg->AddEntry(Q400, "Usual quantile (400 pts)","l");
leg->AddEntry(QW95, "Wilks quantile CL=95% (59 pts)","l");
leg->AddEntry(QW95400, "Wilks quantile CL=95% (400 pts)","l");
leg->AddEntry(QW99, "Wilks quantile CL=99% (90 pts)","l");
leg->SetBorderSize(0);
leg->Draw();
}
In this macro, a dummy dataserver is created with a single attribute named "x". Four histograms are prepared to store the resulting value. Then the same loop will be used to computed 10000 values of every quantile with different switchs to use one method instead of the other, or to change the number of points in the sample and/or the confidence level. All this is defined in the small part before the loop:
//Defining the sample size
int nb[4] = {200,400,59,90};
double quant=0.95; //Quantile value
double CL[2] = {0.95, 0.99};//Confidence level value for the two Wilks computation
Then the computation is performed, first for the usual method (first two iterations of iq
), then for
the Wilks estimation (last two iterations of iq
). Every computational result is stored in the
corresponding histogram which is finally displayed and shown in the following subsection.
This part shows the complete code used to produce the console display in Section II.4.3.
{
TDataServer *tdsGeyser =new TDataServer("geyser","poet");
tdsGeyser->fileDataRead("geyser.dat");
tdsGeyser->computeStatistic("x1");
cout<<"min(x1)= "<<tdsGeyser->getAttribute("x1")->getMinimum()<<"; max(x1)= "<<tdsGeyser->getAttribute("x1")->getMaximum()
<<"; mean(x1)= "<<tdsGeyser->getAttribute("x1")->getMean()<<"; std(x1)= "<<tdsGeyser->getAttribute("x1")->getStd()<<endl;
}
This part shows the complete code used to produce the console display in Section II.4.2.
{
TDataServer *tdsGeyser =new TDataServer("geyser","poet");
tdsGeyser->fileDataRead("geyser.dat");
tdsGeyser->computeRank("x1");
tdsGeyser->computeStatistic("Rk_x1");
cout<<"NPatterns="<<tdsGeyser->getNPatterns()<<"; min(Rk_x1)= "<<tdsGeyser->getAttribute("Rk_x1")->getMinimum()
<<"; max(Rk_x1)= "<<tdsGeyser->getAttribute("Rk_x1")->getMaximum()<<endl;
}
This part shows the complete code used to produce the console display in Section II.4.1.
{
TDataServer *tdsop =new TDataServer("foo","pouet");
tdsop->fileDataRead("tdstest.dat");
//Compute a global normalisation of v, CenterReduced
tdsop->normalize("v","GCR",TDataServer::kCR,true);
//Compute a normalisation of v, CenterReduced (not global but entry by entry)
tdsop->normalize("v","CR",TDataServer::kCR,false);
//Compute a global normalisation of v, Centered
tdsop->normalize("v","GCent",TDataServer::kCentered);
//Compute a normalisation of v, Centered (not global but entry by entry)
tdsop->normalize("v","Cent",TDataServer::kCentered,false);
//Compute a global normalisation of v, ZeroOne
tdsop->normalize("v","GZO",TDataServer::kZeroOne);
//Compute a normalisation of v, ZeroOne (not global but entry by entry)
tdsop->normalize("v","ZO",TDataServer::kZeroOne,false);
//Compute a global normalisation of v, MinusOneOne
tdsop->normalize("v","GMOO",TDataServer::kMinusOneOne,true);
//Compute a normalisation of v, MinusOneOne (not global but entry by entry)
tdsop->normalize("v","MOO",TDataServer::kMinusOneOne,false);
tdsop->scan("v:vGCR:vCR:vGCent:vCent:vGZO:vZO:vGMOO:vMOO","","colsize=4 col=2:5::::::::");
}
This macro should result in this output in console:
************************************************************************************* * Row * Instance * v * vGCR * vCR * vGCe * vCen * vGZO * vZO * vGMO * vMOO * ************************************************************************************* * 0 * 0 * 1 * -1.46 * -1 * -4 * -3 * 0 * 0 * -1 * -1 * * 0 * 1 * 2 * -1.09 * -1 * -3 * -3 * 0.12 * 0 * -0.7 * -1 * * 0 * 2 * 3 * -0.73 * -1 * -2 * -3 * 0.25 * 0 * -0.5 * -1 * * 1 * 0 * 4 * -0.36 * 0 * -1 * 0 * 0.37 * 0.5 * -0.2 * 0 * * 1 * 1 * 5 * 0 * 0 * 0 * 0 * 0.5 * 0.5 * 0 * 0 * * 1 * 2 * 6 * 0.365 * 0 * 1 * 0 * 0.62 * 0.5 * 0.25 * 0 * * 2 * 0 * 7 * 0.730 * 1 * 2 * 3 * 0.75 * 1 * 0.5 * 1 * * 2 * 1 * 8 * 1.095 * 1 * 3 * 3 * 0.87 * 1 * 0.75 * 1 * * 2 * 2 * 9 * 1.460 * 1 * 4 * 3 * 1 * 1 * 1 * 1 * *************************************************************************************
This part shows the complete code used to produce the console display in Section II.4.3.1.
{
TDataServer *tdsop =new TDataServer("foo","poet");
tdsop->fileDataRead("tdstest.dat");
//Considering every element of a vector independent from the others
tdsop->computeStatistic("x");
TAttribute *px = tdsop->getAttribute("x");
cout<<"min(x[0])= "<<px->getMinimum(0)<<"; max(x[0])= "<<px->getMaximum(0)
<<"; mean(x[0])= "<<px->getMean(0)<<"; std(x[0])= "<<px->getStd(0)<<endl;
cout<<"min(x[1])= "<<px->getMinimum(1)<<"; max(x[1])= "<<px->getMaximum(1)
<<"; mean(x[1])= "<<px->getMean(1)<<"; std(x[1])= "<<px->getStd(1)<<endl;
cout<<"min(x[2])= "<<px->getMinimum(2)<<"; max(x[2])= "<<px->getMaximum(2)
<<"; mean(x[2])= "<<px->getMean(2)<<"; std(x[2])= "<<px->getStd(2)<<endl;
cout<<"min(xtot)= "<<px->getMinimum(3)<<"; max(xtot)= "<<px->getMaximum(3)
<<"; mean(xtot)= "<<px->getMean(3)<<"; std(xtot)= "<<px->getStd(3)<<endl;
//Statistic for a single realisation of a vector, not considering other events
tdsop->addAttribute("Min_x","Min$(x)");
tdsop->addAttribute("Max_x","Max$(x)");
tdsop->addAttribute("Mean_x","Sum$(x)/Length$(x)");
tdsop->scan("x:Min_x:Max_x:Mean_x","","colsize=5 col=2::::");
}
This macro should result in this output in console, split in two parts, the first one being from Uranie's method
min(x[0])= 1; max(x[0])= 7; mean(x[0])= 3; std(x[0])= 3.4641 min(x[1])= 2; max(x[1])= 8; mean(x[1])= 4.66667; std(x[1])= 3.05505 min(x[2])= 3; max(x[2])= 9; mean(x[2])= 6.66667; std(x[2])= 3.21455 min(xtot)= 1; max(xtot)= 9; mean(xtot)= 4.77778; std(xtot)= 3.23179
The second on the other hand results from ROOT's methods (the second part of the code shown above):
***************************************************** * Row * Instance * x * Min_x * Max_x * Mean_x * ***************************************************** * 0 * 0 * 1 * 1 * 3 * 2 * * 0 * 1 * 2 * 1 * 3 * 2 * * 0 * 2 * 3 * 1 * 3 * 2 * * 1 * 0 * 7 * 7 * 9 * 8 * * 1 * 1 * 8 * 7 * 9 * 8 * * 1 * 2 * 9 * 7 * 9 * 8 * * 2 * 0 * 1 * 1 * 8 * 4.3333 * * 2 * 1 * 4 * 1 * 8 * 4.3333 * * 2 * 2 * 8 * 1 * 8 * 4.3333 * *****************************************************
This part shows the complete code used to produce the console display in Section II.4.5.1.
{
TDataServer *tdsop =new TDataServer("foo","poet");
tdsop->fileDataRead("tdstest.dat");
// Consider a and x attributes (every element of the vector)
TMatrixD globalOne = tdsop->computeCorrelationMatrix("x:a");
globalOne.Print();
// Consider a and x attributes (cherry-picking a single element of the vector)
TMatrixD focusedOne = tdsop->computeCorrelationMatrix("x[1]:a");
focusedOne.Print();
}
This macro should result in this output in console.
4x4 matrix is as follows | 0 | 1 | 2 | 3 | --------------------------------------------------------- 0 | 1 0.9449 0.6286 0.189 1 | 0.9449 1 0.8486 0.5 2 | 0.6286 0.8486 1 0.8825 3 | 0.189 0.5 0.8825 1 2x2 matrix is as follows | 0 | 1 | ------------------------------- 0 | 1 0.5 1 | 0.5 1
This part shows the complete code used to produce the console display in Section II.4.4.1.
{
TDataServer *tdsvec = new TDataServer("foo", "bar");
tdsvec->fileDataRead("aTDSWithVectors.dat");
double probas[3]={0.2, 0.6, 0.8}; double quants[3];
tdsvec->computeQuantile("rank", 3, probas, quants);
TAttribute *prank = tdsvec->getAttribute("rank");
int nbquant;
prank->getQuantilesSize(nbquant); // (1)
cout << "nbquant = " << nbquant << endl;
double aproba=0.8; double aquant;
prank->getQuantile(aproba, aquant); // (2)
cout << "aproba = " << aproba << ", aquant = " <<
aquant << endl;
double theproba[3], thequant[3];
prank->getQuantiles(theproba, thequant); // (3)
for(int i_quant=0; i_quant<nbquant; ++i_quant) {
cout << "(theproba, thequant)[" << i_quant << "] = "
<< "(" << theproba[i_quant] << ", " <<
thequant[i_quant] << ")" << endl;
}
vector<double> allquant;
prank->getQuantileVector(aproba, allquant); // (4)
cout << "aproba = " << aproba << ", allquant = ";
for(double quant_i: allquant)
cout << quant_i << " ";
cout << endl;
}
This macro is an example of how to produce QQ-plot for a certain number of randomly-drawn samples, providing the correct parameter values along with modified versions to illustrate the impact.
{
// Create a TDS with 8 kind of distributions
double p1=1.3, p2=4.5, p3=0.9, p4=4.4; // Fixed values for parameters
TDataServer *tds0 = new TDataServer();
tds0->addAttribute(new TNormalDistribution("norm", p1, p2));
tds0->addAttribute(new TLogNormalDistribution("logn", p1, p2));
tds0->addAttribute(new TUniformDistribution("unif", p1, p2));
tds0->addAttribute(new TExponentialDistribution("expo", p1, p2));
tds0->addAttribute(new TGammaDistribution("gamm", p1, p2, p3));
tds0->addAttribute(new TBetaDistribution("beta", p1, p2, p3, p4));
tds0->addAttribute(new TWeibullDistribution("weib", p1, p2, p3));
tds0->addAttribute(new TGumbelMaxDistribution("gumb", p1, p2));
// Create the sample
TBasicSampling *fsamp = new TBasicSampling(tds0, "lhs", 200);
fsamp->generateSample();
// Define number of laws, their name and numbers of parameters
unsigned int nLaws=8;
string laws[8]={"normal", "lognormal", "uniform", "gamma", "weibull", "beta", "exponential", "gumbelmax"}; // number of parameters to put in () for the corresponding law
int npar[8]={2, 2, 2, 3, 3, 4, 2, 2};
//Create the canvas
TCanvas *c = new TCanvas("c1","",800,1000);
//Create the 8 pads
TPad *apad = new TPad("apad","apad",0, 0.03, 1, 1); apad->Draw(); apad->cd();
apad->Divide(2,4);
// Number of points to compare theoretical and empirical values
int nS=1000;
double mod=0.8; // Factor used to artificially change the parameter values
TString opt=""; //option of the drawQQPlot method
stringstream sstr;
for(unsigned int ilaw=0; ilaw<nLaws; ilaw++)
{
// Clean sstr
sstr.str("");
// Add nominal configuration
sstr << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par1
sstr << ":" << laws[ilaw] << "("<<p1*mod<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par2
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2*mod<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par3
if(npar[ilaw] >=3 )
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3*mod):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par4
if(npar[ilaw] >=4 )
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4*mod):"")<<")";
//cout<<sstr.str()<<endl;
apad->cd(ilaw+1);
// Produce the plot
tds0->drawQQPlot( laws[ilaw].substr(0,4).c_str(), sstr.str().c_str(), nS, opt);
}
}
The very first step of this macro is to create a sample that will contain a design-of-experiments filled with 200 locations, using
various statistical laws. All the tested laws, are those available in the drawQQPlot
method and they might depend on 2 to 4 parameters, defined a but randomly at the beginning of this piece of code.
// Create a TDS with 8 kind of distributions
double p1=1.3, p2=4.5, p3=0.9, p4=4.4; // Fixed values for parameters
TDataServer *tds0 = new TDataServer();
tds0->addAttribute(new TNormalDistribution("norm", p1, p2));
tds0->addAttribute(new TLogNormalDistribution("logn", p1, p2));
tds0->addAttribute(new TUniformDistribution("unif", p1, p2));
tds0->addAttribute(new TExponentialDistribution("expo", p1, p2));
tds0->addAttribute(new TGammaDistribution("gamm", p1, p2, p3));
tds0->addAttribute(new TBetaDistribution("beta", p1, p2, p3, p4));
tds0->addAttribute(new TWeibullDistribution("weib", p1, p2, p3));
tds0->addAttribute(new TGumbelMaxDistribution("gumb", p1, p2));
Once done, the sample is generated using TBasicSampling
object with an LHS algorithm. On top
of this, despite the plot preparation with canvas and pad generation, several variables are set to prepare the
tests, as shown below
// Define number of laws, their name and numbers of parameters
unsigned int nLaws=8;
string laws[8]={"normal", "lognormal", "uniform", "gamma", "weibull", "beta", "exponential", "gumbelmax"};
int npar[8]={2, 2, 2, 3, 3, 4, 2, 2}; // number of parameters to put in () for the corresponding law
// Number of points to compare theoretical and empirical values
int nS=1000;
double mod=0.8; // Factor used to artificially change the parameter values
Finally, after the line of hypothesis to be tested is constructed (the first paragraph in the for loop) the
drawQQPlot
method is called for every empirical law in the following line.
tds0->drawQQPlot( laws[ilaw].substr(0,4).c_str(), sstr.str().c_str(), nS, opt);
For the first case, when one wants to test the TNormalDistribution
"norm" with the known
parameters and a variation of each, it resumes as if this line was run:
tds0->drawQQPlot( "norm", "normal(1.3,4.5):normal(1.04,4.5):normal(1.3,3.6)", nS);
The first field is the attribute to be tested, while the second one provides the three hypothesis with which our attribute under investigation will be compared. The third argument is the number of steps to be computed for quantiles. The result of this macro is shown below.
This macro is an example of how to produce PP-plot for a certain number of randomly-drawn samples, providing the correct parameter values along with modified versions to illustrate the impact.
{
// Create a TDS with 8 kind of distributions
double p1=1.3, p2=4.5, p3=0.9, p4=4.4; // Fixed values for parameters
TDataServer *tds0 = new TDataServer();
tds0->addAttribute(new TNormalDistribution("norm", p1, p2));
tds0->addAttribute(new TLogNormalDistribution("logn", p1, p2));
tds0->addAttribute(new TUniformDistribution("unif", p1, p2));
tds0->addAttribute(new TExponentialDistribution("expo", p1, p2));
tds0->addAttribute(new TGammaDistribution("gamm", p1, p2, p3));
tds0->addAttribute(new TBetaDistribution("beta", p1, p2, p3, p4));
tds0->addAttribute(new TWeibullDistribution("weib", p1, p2, p3));
tds0->addAttribute(new TGumbelMaxDistribution("gumb", p1, p2));
// Create the sample
TBasicSampling *fsamp = new TBasicSampling(tds0, "lhs", 200);
fsamp->generateSample();
// Define number of laws, their name and numbers of parameters
unsigned int nLaws=8;
string laws[8]={"normal", "lognormal", "uniform", "gamma", "weibull", "beta", "exponential", "gumbelmax"}; // number of parameters to put in () for the corresponding law
int npar[8]={2, 2, 2, 3, 3, 4, 2, 2};
//Create the canvas
TCanvas *c = new TCanvas("c1","",800,1000);
//Create the 8 pads
TPad *apad = new TPad("apad","apad",0, 0.03, 1, 1); apad->Draw(); apad->cd();
apad->Divide(2,4);
// Number of points to compare theoretical and empirical values
int nS=1000;
double mod=0.8; // Factor used to artificially change the parameter values
TString opt=""; //option of the drawPPPlot method
stringstream sstr;
for(unsigned int ilaw=0; ilaw<nLaws; ilaw++)
{
// Clean sstr
sstr.str("");
// Add nominal configuration
sstr << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par1
sstr << ":" << laws[ilaw] << "("<<p1*mod<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par2
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2*mod<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par3
if(npar[ilaw] >=3 )
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3*mod):"")<<((npar[ilaw]>=4)?Form(",%g",p4):"")<<")";
// Changing par4
if(npar[ilaw] >=4 )
sstr << ":" << laws[ilaw] << "("<<p1<<","<<p2<<((npar[ilaw]>=3)?Form(",%g",p3):"")<<((npar[ilaw]>=4)?Form(",%g",p4*mod):"")<<")";
cout<<sstr.str()<<endl;
apad->cd(ilaw+1);
// Produce the plot
tds0->drawPPPlot( laws[ilaw].substr(0,4).c_str(), sstr.str().c_str(), nS, opt);
}
}
The macro is based on the one discussed in Section XIV.2.15. The only difference is this line
tds0->drawPPPlot( laws[ilaw].substr(0,4).c_str(), sstr.str().c_str(), nS, opt);
The call of the drawing method above can be resume, for the first case, like this:
tds0->drawPPPlot( "norm", "normal(1.3,4.5):normal(1.04,4.5):normal(1.3,3.6)", nS);
The first field is the attribute to be tested, while the second one provides the three hypothesis with which our attribute under investigation will be compared. The third argument is the number of steps to be computed for probabilities. The result of this macro is shown below.
The goal of this macro is to show how to handle a PCA analysis. It is not much discussed here, as a large description of both methods and concepts is done in Section II.6.
{
// Read the database
TDataServer * tdsPCA = new TDataServer("tdsPCA", "my TDS");
tdsPCA->fileDataRead("Notes.dat");
// Create the PCA object precising the variables of interest
TPCA * tpca = new TPCA(tdsPCA, "Maths:Physics:French:Latin:Music");
tpca->compute();
bool graphical=true; // do graphs
bool dumponscreen=true; //or dumping results
bool showcoordinate=false; // show the coordinate of the points while dumping results
if(graphical)
{
// Draw all point in PCA planes
TCanvas *cPCA = new TCanvas("cpca", "PCA",800,800);
TPad *apad1 = new TPad("apad1","apad1",0, 0.03, 1, 1); apad1->Draw(); apad1->cd();
apad1->Divide(2,2);
apad1->cd(1); tpca->drawPCA(1,2,"Pupil");
apad1->cd(3); tpca->drawPCA(1,3,"Pupil");
apad1->cd(4); tpca->drawPCA(2,3,"Pupil");
// Draw all variable weight in PC definition
TCanvas *cLoading = new TCanvas("cLoading", "Loading Plot",800,800);
TPad *apad2 = new TPad("apad2","apad2",0, 0.03, 1, 1); apad2->Draw(); apad2->cd();
apad2->Divide(2,2);
apad2->cd(1); tpca->drawLoading(1,2);
apad2->cd(3); tpca->drawLoading(1,3);
apad2->cd(4); tpca->drawLoading(2,3);
// Draw the eigen values in different normalisation
TCanvas *c = new TCanvas("cEigenValues", "Eigen Values Plot",1100,500);
TPad *apad3 = new TPad("apad3","apad3",0, 0.03, 1, 1); apad3->Draw(); apad3->cd();
apad3->Divide(3,1);
TNtupleD *ntd = tpca->getResultNtupleD();
apad3->cd(1); ntd->Draw("eigen:i","","lp");
apad3->cd(2); ntd->Draw("eigen_pct:i","","lp"); gPad->SetGrid();
apad3->cd(3); ntd->Draw("sum_eigen_pct:i","","lp"); gPad->SetGrid();
}
if(dumponscreen)
{
int nPCused=5; // 3 to see only the meaningful ones
TString PCname="", Cosname="", Contrname="", Variable="Pupil";
for (unsigned int iatt=1; iatt<=nPCused; iatt++)
{
PCname+=Form("PC_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
Cosname+=Form("cosca_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
Contrname+=Form("contr_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
}
cout<<endl<<"========= EigenValues ====================="<<endl;
tpca->getResultNtupleD()->Scan("*");
if(showcoordinate)
{
cout<<endl<<"========= EigenVectors ====================="<<endl;
tpca->_matEigenVectors.Print();
cout<<endl<<"========= New Coordinates ====================="<<endl;
tdsPCA->scan( (Variable+":"+PCname).Data() );
}
TDSNtupleD *varRes = tpca->getVariableResultNtupleD();
cout<<endl<<"=============== Looking at variables: Quality of representation ====================="<<endl;
varRes->Scan(("Variable:"+Cosname).Data());
cout<<endl<<"==================== Looking at variables: Contribution to axis ====================="<<endl;
varRes->Scan(("Variable:"+Contrname).Data());
cout<<endl<<"================== Looking at events: Quality of representation ===================="<<endl;
tdsPCA->scan((Variable+":"+Cosname).Data());
cout<<endl<<"===================== Looking at events: Contribution to axis ======================="<<endl;
tdsPCA->scan((Variable+":"+Contrname).Data());
}
}
This first part of this macro is described in Section II.6. We will
focus here on the second part, where all the numerical results are dumped on screen. These results are stored in
the dataserver for the points and in a dedicated ntuple for the variable that can be retrieved by calling the method
getVariableResultNtupleD
. In both cases, the results can be split into two kinds:
the quality of the representation: it is called "cosca_X" as it is a squared cosinus of the projection of the source under study (point or subject) on the X-th PC.
the contribution to axis: it is called "contr_X" as it is the contribution of the source under study (point or subject) to the definition of the X-th PC.
We start by defining the list of variables that one might want to display
int nPCused=5; // 3 to see only the meaningful ones
TString PCname="", Cosname="", Contrname="", Variable="Pupil";
for (unsigned int iatt=1; iatt<=nPCused; iatt++)
{
//list of PC: PC_1:PC_2:PC_3:PC_4:PC_5
PCname+=Form("PC_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
//list of quality coeff: cosca_1:cosca_2:cosca_3:cosca_4:cosca_5
Cosname+=Form("cosca_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
//list of contribution: contr_1:contr_2:contr_3:contr_4:contr_5
Contrname+=Form("contr_%d",iatt)+((iatt!=nPCused)?TString(":"):TString(""));
}
From there, once the variable ntuple is retrieved, one can dump both the quality and contribution coefficients for the variable, here the subjects (it leads to the second and third block in the output shown in Section XIV.2.17.3).
TDSNtupleD *varRes = tpca->getVariableResultNtupleD();
cout<<endl<<"=============== Looking at variables: Quality of representation ====================="<<endl;
varRes->Scan(("Variable:"+Cosname).Data());
cout<<endl<<"==================== Looking at variables: Contribution to axis ====================="<<endl;
varRes->Scan(("Variable:"+Contrname).Data());
Finally, one can do the same for the data points, with the same Scan
method (which lead to
the fourth and fifth block in the output shown in Section XIV.2.17.3).
cout<<endl<<"================== Looking at events: Quality of representation ===================="<<endl;
tdsPCA->scan((Variable+":"+Cosname).Data());
cout<<endl<<"===================== Looking at events: Contribution to axis ======================="<<endl;
tdsPCA->scan((Variable+":"+Contrname).Data());
========= EigenValues ===================== ************************************************************ * Row * i.i * eigen.eig * eigen_pct * sum_eigen * ************************************************************ * 0 * 1 * 2.8618175 * 57.236350 * 57.236350 * * 1 * 2 * 1.1506811 * 23.013622 * 80.249973 * * 2 * 3 * 0.9831407 * 19.662814 * 99.912787 * * 3 * 4 * 0.0039371 * 0.0787424 * 99.991530 * * 4 * 5 * 0.0004234 * 0.0084696 * 100 * ************************************************************ =============== Looking at variables: Quality of representation ===================== ************************************************************************************ * Row * Variable * cosca_1 * cosca_2 * cosca_3 * cosca_4 * cosca_5 * ************************************************************************************ * 0 * Maths * 0.649485 * 0.32645 * 0.0235449 * 0.0003614 * 0.0001582 * * 1 * Physics * 0.804627 * 0.185581 * 0.0086212 * 0.0010515 * 0.0001193 * * 2 * French * 0.574697 * 0.373378 * 0.0509446 * 0.0008976 * 8.252e-05 * * 3 * Latin * 0.828562 * 0.157999 * 0.0117556 * 0.0016205 * 6.335e-05 * * 4 * Music * 0.0044470 * 0.107272 * 0.888274 * 5.976e-06 * 8.299e-08 * ************************************************************************************ ==================== Looking at variables: Contribution to axis ===================== ************************************************************************************ * Row * Variable * contr_1 * contr_2 * contr_3 * contr_4 * contr_5 * ************************************************************************************ * 0 * Maths * 0.226948 * 0.283702 * 0.0239486 * 0.0917987 * 0.373602 * * 1 * Physics * 0.281159 * 0.16128 * 0.0087691 * 0.267082 * 0.28171 * * 2 * French * 0.200815 * 0.324485 * 0.0518182 * 0.228004 * 0.194878 * * 3 * Latin * 0.289523 * 0.137309 * 0.0119572 * 0.411598 * 0.149613 * * 4 * Music * 0.0015539 * 0.0932252 * 0.903507 * 0.0015180 * 0.0001959 * ************************************************************************************ ================== Looking at events: Quality of representation ==================== ************************************************************************************ * Row * Pupil * cosca_1 * cosca_2 * cosca_3 * cosca_4 * cosca_5 * ************************************************************************************ * 0 * Jean * 0.8854534 * 0.0522119 * 0.0619429 * 0.0002655 * 0.0001260 * * 1 * Aline * 0.7920409 * 0.0542262 * 0.1530381 * 0.0006354 * 5.916e-05 * * 2 * Annie * 0.4784294 * 0.4813342 * 0.0384099 * 0.0018007 * 2.560e-05 * * 3 * Monique * 0.8785990 * 0.0024790 * 0.1180158 * 0.0009035 * 2.557e-06 * * 4 * Didier * 0.8515216 * 0.1382946 * 0.0079754 * 0.0021718 * 3.640e-05 * * 5 * Andre * 0.2465355 * 0.3961581 * 0.3567663 * 9.471e-05 * 0.0004451 * * 6 * Pierre * 0.0263090 * 0.7670958 * 0.2060832 * 0.0004625 * 4.925e-05 * * 7 * Brigitte * 0.1876629 * 0.5897686 * 0.2211409 * 0.0013390 * 8.836e-05 * * 8 * Evelyne * 0.0583185 * 0.3457931 * 0.5953786 * 0.0004600 * 4.957e-05 * ************************************************************************************ ===================== Looking at events: Contribution to axis ======================= ************************************************************************************ * Row * Pupil * contr_1 * contr_2 * contr_3 * contr_4 * contr_5 * ************************************************************************************ * 0 * Jean * 0.3012931 * 0.0441856 * 0.0613538 * 0.0656820 * 0.2897335 * * 1 * Aline * 0.0618830 * 0.0105370 * 0.0348056 * 0.0360894 * 0.0312404 * * 2 * Annie * 0.0401366 * 0.1004284 * 0.0093797 * 0.1098075 * 0.0145176 * * 3 * Monique * 0.3784615 * 0.0026558 * 0.1479781 * 0.2828995 * 0.0074454 * * 4 * Didier * 0.1484067 * 0.0599446 * 0.0040461 * 0.2751439 * 0.0428726 * * 5 * Andre * 0.0348742 * 0.1393737 * 0.1469047 * 0.0097384 * 0.4255425 * * 6 * Pierre * 0.0041001 * 0.2973223 * 0.0934888 * 0.0524003 * 0.0518702 * * 7 * Brigitte * 0.0157710 * 0.1232678 * 0.0540974 * 0.0817989 * 0.0501849 * * 8 * Evelyne * 0.0150734 * 0.2222843 * 0.4479453 * 0.0864396 * 0.0865925 * ************************************************************************************