English Français

Documentation / Manuel utilisateur en C++ : PDF version

VI.5. The Sobol method

VI.5. The Sobol method

VI.5.1. Introduction to Sobol's sensitivity indices

The method described in the [metho] is, in Uranie, said to be "� la Saltelli" in contrast with the other implementation (previously used as default) which is said to be "� la Sobol". The difference between the two being the number of assessment used to get a certain precision: for the same results, the implementation "� la Sobol" was requesting and was offering more numerical results as five algorithms were used. The new implementation "� la Saltelli" requests only estimation but only three algorithms are run. This is summarised as follow where the bold name is the default stored in --first-- and --total--:

a la Saltelli (default method, option="saltelli")

First order: Saltelli02 [Saltelli02], Sumo10 [Monod06], Martinez11 [Martinez11]

Total order: Homma96 [Homma96], Sumo10 [Monod06], Martinez11 [Martinez11]

a la Sobol (option="sobol")

First order: Sobol93 [Sobol93], Saltelli02 [Saltelli02], Jansen99 [Jansen99], Sumo10 [Monod06], Martinez11 [Martinez11]

Total order: Homma96 [Homma96], Saltelli02 [Saltelli02], Jansen99 [Jansen99], Sumo10 [Monod06], Martinez11 [Martinez11]

Tip

The Martinez11 algorithm is the recommended one, as it provides an estimation of the 95% confidence interval for every coefficient determined.

VI.5.2. Computation of Sobol's sensitivity indices

The TSobol class computes the total and first order sobol indices using the so-called Saltelli & Tarantola method. It's a Monte-Carlo method which needs only model evaluations to compute the indices. In terms of implementation the TSobol class differs slightly from the other classes, even though it can be constructed with the three usual cases: a function, an external code or an already filled TDataServer. At this stage, computeIndexes's options can be specified to define:

  • the way to generate the sample, if needed. Possible choices are: "SRS", "LHS", "QMC=sobol" and "QMC=halton"

  • the chosen implementation, to generate the and matrices (as discussed above and in [metho]), possible choices being "sobol" and "saltelli".

If the object is created with an empty TDataServer and a code or function, the procedure is the following:

  1. generateSample: produce the design-of-experiments

  2. Run a TLauncher object. This has nothing to do with the TSobol and is a needed extra step to compute the output corresponding to every design-of-experiments data points.

  3. computeIndexes: get the indices value.

The following subsections will show an example, detailing every steps one-by-one.

Tip

Given the cost of this method (particularly once run with a real code), from version 4.6.0, the way to use and re-use data have be clarified and new methods have been implemented:
  • One can re-run an already done estimation. When running TSobol with default configuration meaning letting the class generates the design-of-experiments and launching the estimations, there are two files created: _sobol_sampling_.dat and _sobol_launching_.dat. Both contains input value, internal variable to figure out from what matrices every configuration is taken out of (see [metho] for more explanation), but only the latter contains the output values (after estimations). To re-use one of these _sobol_launching_.dat, one should simply use the constructor whose only mandatory arguments are the dataserver pointer, the list of input variables (second argument) and the list of output variables (third argument) in the usual form, once the file has been loaded into the dataserver thanks to fileDataRead). An example can be found in Section XIV.5.13.

  • One can use the "WithData" option which allows to shorten a bit the sobol indices estimation by providing already done estimations. This means that if one has already a design-of-experiments (LHS, SRS or QMC for instance) these computations are done already and will be used as the first estimation corresponding to both the and matrices (see [metho] for more details). The class will still have to create all the cross configurations (the matrices) and launch their corresponding estimations. This new (from v4.6.0) option should not be used with a constructor that would not provided a way to perform the estimation. An example can be found in Section XIV.5.14.

  • If the statistical accuracy is not sufficient, up to v4.6.0, one has to re-run the macro requesting more computations but losing the ones done previously. A new method has been created in order to be able to use all estimations done in a previous attempt by passing the file _sobol_launching_.dat, already introduced above. This method is called loadOtherSobolFile and it takes as only argument the name of the input file that will contains just as much attributes as needed for this analysis (inputs and outputs, but also the internal variable to details from which matrices the configurations are coming from). The important matter here is to be sure that the seed used to generate the design-of-experiments in the imported file and the one used to re-generate a second step are not the same. An example can be found in Section XIV.5.15.

VI.5.2.1. Macro computing the Sobol sensitivity indices

The example script below uses the TSobol class to compute and display Sobol sensitivity indices:

{
  gROOT->LoadMacro("UserFunctions.C");

  // Define the DataServer
  TDataServer *tds = new TDataServer("tdsflowrate", "DataBase flowrate");
  tds->addAttribute( new TUniformDistribution("rw", 0.05, 0.15));
  tds->addAttribute( new TUniformDistribution("r", 100.0, 50000.0));
  tds->addAttribute( new TUniformDistribution("tu", 63070.0, 115600.0));
  tds->addAttribute( new TUniformDistribution("tl", 63.1, 116.0));
  tds->addAttribute( new TUniformDistribution("hu", 990.0, 1110.0));
  tds->addAttribute( new TUniformDistribution("hl", 700.0, 820.0));
  tds->addAttribute( new TUniformDistribution("l", 1120.0, 1680.0));
  tds->addAttribute( new TUniformDistribution("kw", 9855.0, 12045.0));
  
  Int_t nComp = 3000;
  TSobol * tsobol = new TSobol(tds, "flowrateModel", nComp, "rw:r:tu:tl:hu:hl:l:kw", "flowrateModel", "pouet");

  tsobol->computeIndexes();
  
  TCanvas *cc = new TCanvas("c1", "histgramme",5,64,1270,667);
  tsobol->drawIndexes("Flowrate", "", "nonewcanv,hist,all");
  
  TCanvas *ccc = new TCanvas("c2", "pie",5,64,1270,667);
  TPad *apad = new TPad("apad","apad",0, 0.03, 1, 1); apad->Draw(); apad->cd();
  tsobol->drawIndexes("Flowrate", "", "nonewcanv,pie");
  
  }

The figures resulting from this script are shown below:

Figure VI.5. Histogram of Sobol's indices

Histogram of Sobol's indices

Figure VI.6. Pie chart of Sobol's indices

Pie chart of Sobol's indices

VI.5.2.2. Specifying the input parameters

First, define the uncertain parameters and add them to a TDataServer object:

// Define the DataServer
TDataServer *tds = new TDataServer("tdsflowrate", "DataBase flowrate");
tds->addAttribute( new TUniformDistribution("rw", 0.05, 0.15));
tds->addAttribute( new TUniformDistribution("r", 100.0, 50000.0));
tds->addAttribute( new TUniformDistribution("tu", 63070.0, 115600.0));
tds->addAttribute( new TUniformDistribution("tl", 63.1, 116.0));
tds->addAttribute( new TUniformDistribution("hu", 990.0, 1110.0));
tds->addAttribute( new TUniformDistribution("hl", 700.0, 820.0));
tds->addAttribute( new TUniformDistribution("l", 1120.0, 1680.0));
tds->addAttribute( new TUniformDistribution("kw", 9855.0, 12045.0));

VI.5.2.3. TSobol constructor

There are four different constructors to build a TSobol object, each corresponding to a different problem:

  • the model is an analytic function run by Uranie,
  • the model is a code run by Uranie,
  • the outputs of the model are already computed and saved in a TDataServer object.
  • the model is either a function or a code and the problem is specified through a Relauncher architecture.

Warning

The two first request a parameter called : it represents the number of estimation to be done by the code or the function in total. This value is used to estimate the size of the matrices and introduced in [metho], using the formula

VI.5.2.3.1. TSobol constructor for an analytic function

The constructor prototype used with an analytic function is:

// Create a TSobol object with an analytic function
TSobol(TDataServer *tds,  void *fcn,  const char *inp,  const char *out, Int_t nComp,  Option_t *option="")
TSobol(TDataServer *tds,  const char *fcn,  Int_t nComp,  const char *inp,  const char *out, Option_t *option="")

This constructor takes five arguments:

  • a pointer to a TDataServer object,
  • a pointer to an analytic function (either a void or a const char that represents the function's name when it has been loaded in ROOT's memory),
  • an integer to specify the number of computation to be performed,
  • a string to specify the names of the input factors separated by ':' (ex. "rw:r:tu:tl:hu:hl:l:kw"), its default value is the empty string "",
  • a string to specify the names of the output variables of the model, its default value is the empty string "".

Here is an example of how to use the constructor with an analytic function:

Int_t ncomp = 100;
TSobol * tsobol = new TSobol(tds, "flowrateModel", ncomp, "rw:r:tu:tl:hu:hl:l:kw", "flowrateModel");  
VI.5.2.3.2. TSobol constructor for a code

The constructor prototype used with a code is:

// Create a TSobol object with a code
TSobol(TDataServer *tds, TCode *fcode,  Int_t nComp, Option_t *option="")

This constructor takes three arguments:

  • a pointer to a TDataServer object,
  • a pointer to a TCode,
  • an integer to specify the number of computations to be performed.

Here is an example of the use of this constructor on the flowrate case:

// The reference input file
TString sJDDReference = TString("flowrate_input_with_keys.in");

// Set the reference input file and the key for each input attributes
tds->getAttribute("rw")->setFileKey(sJDDReference, "Rw");
tds->getAttribute("r")->setFileKey(sJDDReference, "R");
tds->getAttribute("tu")->setFileKey(sJDDReference, "Tu");
tds->getAttribute("tl")->setFileKey(sJDDReference, "Tl");
tds->getAttribute("hu")->setFileKey(sJDDReference, "Hu");
tds->getAttribute("hl")->setFileKey(sJDDReference, "Hl");
tds->getAttribute("l")->setFileKey(sJDDReference, "L");
tds->getAttribute("kw")->setFileKey(sJDDReference, "Kw");

// The output file of the code
TOutputFileRow *fout = new TOutputFileRow("_output_flowrate_withRow_.dat");
// The attribute in the output file
fout->addAttribute(new TAttribute("yhat"));

// Instanciation de mon code
TCode *mycode = new TCode(tds, "flowrate -s -k");
// Adding the outputfile to the tcode object
mycode->addOutputFile( fout );

Int_t nComp = 100;
TSobol * tsobolC = new TSobol(tds, mycode, nComp); 
VI.5.2.3.3. TSobol constructor for a runner

The constructor prototype used with a runner is:

// Create a TSobol object with a runner
TSobol(TDataServer *tds, TRun *run,  Int_t nComp, Option_t *option="")

This constructor takes three arguments:

  • a pointer to a TDataServer object,
  • a pointer to a TRun,
  • an integer to specify the number of computations to be performed.

Here is an example of the use of this constructor on the flowrate code in a sequential mode:

// The input file
TKeyScript infile("flowrate_input_with_keys.in");
// provide the input and their key
infile.setInputs(8, tds->getAttribute("rw"), "Rw", tds->getAttribute("r"), "R",
tds->getAttribute("tu"), "Tu", tds->getAttribute("tl"), "Tl", tds->getAttribute("hu"), "Hu",
tds->getAttribute("hl"), "Hl", tds->getAttribute("l"), "L", tds->getAttribute("kw"), "Kw");

TAttribute yhat("yhat");
// The output file of the code
TKeyResult outfile("_output_flowrate_withKey_.dat");
// The attribute in the output file
outfile.addOutput(&yhat, "yhat");

// Instanciation de mon code
TCodeEval code("flowrate -s -k");
// Adding the intput/output file to the code
code.addInputFile(&infile);
code.addOutputFile(&outfile);

TSequentialRun run(&code);
run.startSlave();
if(run.onMaster())
{
    Int_t nComp = 100;
    TSobol * tsobolR = new TSobol(tds, &run, nComp);
    // ....
}
VI.5.2.3.4. TSobol constructor using a filled TDataServer object

The constructor prototype used with a TDataServer object already containing the simulations is:

//  Create a TSobol object with already filled TDS
TSobol(TDataServer *tds,  const char *inp,  const char *out,  Option_t *option="")

This constructor takes three arguments:

  • a pointer to a TDataServer object filled,
  • a string to specify the names of the input factors separated by ':' (ex. "rw:r:tu:tl:hu:hl:l:kw"),
  • a string to specify the names of the output variables of the model.

Below is an example of the constructor with a TDataServer object filled:

TDataServer *tds = new TDataServer();
tds->fileDataRead("_sobol_launching.dat");
TSobol * tsobol = new TSobol(tds, "rw:r:tu:tl:hu:hl:l:kw", "flowrateModel"); 

Warning

This constructor uses a TDataServer object already filled with a specific internal variable (sobol__n__iter__tdsflowreate) and a specific sample!

There are several conditions to use it:

  • use the constructor without argument for the TDataServer;
  • the input factors sample must have been generated with the method TSobol::generateSample.

VI.5.2.4. Generating the sample

To generate the Sobol sample, use the generateSample method:

 tsobol->generateSample(); 

Then, the sample generated can be exported in a file and used outside of Uranie to compute the simulations associated.

VI.5.2.5. Computing the indices

To compute the indices, run the method computeIndexes:

 tsobol->computeIndexes();

Note that this method is all inclusive: it constructs the sample, launches the simulations and computes the indices.

VI.5.2.6. Displaying the indices

To display graphically the coefficients, use the method drawIndexes:

void TSensitivity::drawIndexes(TString sTitre, const char *select, Option_t * option)

The method needs:

  • a TString containing the title of the figure,
  • a string containing a selection (empty if no selection),
  • a string containing the options of the graphics separated by commas.

Some of the options available are:

  • "nonewcanv": to not create a new canvas,
  • "pie": to display a pie chart,
  • "hist": to display a histogram,
  • "first": to display the indices of the first order (with "hist" only),
  • "total": to display the total indices (with "hist" only),
  • "all": to display the total and first order indices (with "hist" only),

In our example the use of this method is:

TCanvas *cc = new TCanvas("c1", "histgramme",5,64,1270,667);
cc->Divide(2,1);
cc->cd(1);
tsobol->drawIndexes("Flowrate", "", "nonewcanv,hist,all");
cc->Print("appliUranieFlowrateSobol100Histogram.png");

cc->cd(2);
tsobol->drawIndexes("Flowrate", "", "nonewcanv,pie");
cc->Print("appliUranieFlowrateSobol100Pie.png"); 

VI.5.2.7. Extracting the coefficients

The coefficients, once computed, are stored in a TTree. To get this TTree, use the method TSensitivity::getResultTuple():

TTree * results = tsobol->getResultTuple();

Several methods exist in ROOT to extract data from a TTree, it is advised to look for them into the ROOT documentation. We propose two ways of extracting the value of each coefficient from the TTree.

VI.5.2.7.1. First method of extraction

The first method use the method getValue of the TSobol object specifying the order of the extract value ("First" ou "Total"), the related input and possibly more selected options (for example "Algo==\"homa96\"").

double hl_first_index = tsobol->getValue("First","hl");
double hl_total_index = tsobol->getValue("Total","hl");
VI.5.2.7.2. Second method of extraction

The seond method uses 3 steps to extract an index:

  • scan the TTree for the chosen input variable (with a selection) in order to obtain its row number. In our example, if we chose the variable "hl", we'll use the command:
     results->Scan("*","Inp==\"hl\""); 
    and in the resulting figure below, we see that the first order index of "hl" is in the row 40:
    *************************************************************************************
    *    Row   *   Out.Out * Inp. * Order. * Method * Algo.Algo * Value * CILow * CIUpp *
    *************************************************************************************
    *       40 * flowrateM *   hl *  First *  Sobol * --first-- * 0.023 *     0 * 0.136 *
    *       41 * flowrateM *   hl *  First *  Sobol * 02saltell * 0.045 *    -1 *    -1 *
    *       42 * flowrateM *   hl *  First *  Sobol *    sumo10 * 0.020 *    -1 *    -1 *
    *       43 * flowrateM *   hl *  First *  Sobol * martinez1 * 0.023 *     0 * 0.136 *
    *       44 * flowrateM *   hl *  Total *  Sobol * --total-- * 0.058 * 0.046 * 0.072 *
    *       45 * flowrateM *   hl *  Total *  Sobol *   homma96 * 0.078 *    -1 *    -1 *
    *       46 * flowrateM *   hl *  Total *  Sobol *    sumo10 * 0.071 *    -1 *    -1 *
    *       47 * flowrateM *   hl *  Total *  Sobol * martinez1 * 0.058 * 0.046 * 0.072 *
    *************************************************************************************
    
  • set the entry of the TTree on this row with the method GetEntry;
  • get the value of the index with GetValue method on the "Value" leaf of the TTree.

Below is an example of extraction of the index for "hl" in our flowrate case:

results->Scan("*","Inp==\"hl\"");
results->GetEntry(40);
Double_t S_Rw_Indexe =  results->GetLeaf("Value")->GetValue(); 
VI.5.2.7.3. Third method of extraction

The third method uses 2 steps to extract an index:

  • use the Draw method with a selection to select the index, for example the selection for the first order index of "rw" is "Inp==\"rw\" && Algo==\"--first--\"";
  • get the pointer on the value of the index with GetV1 method on the TTree.

Below is another example of extraction of the first order index for "rw" in our flowrate case:

results->Draw("Value","Inp==\"rw\" && Algo==\"--first--\"","goff");
Double_t S_Rw_IndexeS = results->GetV1()[0]; 
/language/en