13.4.14. Macro “samplingSingularCorrelationCase.py

13.4.14.1. Objective

This macro shows the usage of the SVD decomposition for the specific case where the target correlation matrix is singular. The idea is to provide a tool that will allow the user to compare quickly both the TSampling and TBasicSampling implementation, with a singular correlation matrix, or not. In order to do that, a toy random correlation matrix generator is provided.

The resulting design-of-experiments is presented side-by-side with the residual of the obtained correlation matrix.

13.4.14.2. Macro Uranie

"""
Example of DoE generation when the correlation matrix is singular
"""
from math import sqrt
from URANIE import DataServer, Sampler
import ROOT

# What classes to be used (true==TSampling, false==TBasicSampling)
ImanConover = False
SamplingType = "srs"
# Remove svd to use Cholesky
SamplerOption = "svd"

# Generating randomly a singular correlation matrix
# dimension of the  problem in total.
n = 10
# Number of dimension self explained.
# SHOULD BE SMALLER (singular) OR EQUAL TO (full-rank) n.
p = 6
# number of location in the doe
m = 300

# =======================================================================
#           Internal recipe to get this correlation matrix
# =======================================================================
A = ROOT.TMatrixD(n, p)
Rand = ROOT.TRandom3()
for i in range(n):
    for j in range(p):
        A[i][j] = Rand.Gaus(0., 1.)

Gamma = ROOT.TMatrixD(n, n)
Gamma.MultT(A, A)
Gamma *= 1./n

Sig = ROOT.TMatrixD(n, n)
for i in range(n):
    Sig[i][i] = 1./sqrt(Gamma[i][i])

Corr = ROOT.TMatrixD(Sig, ROOT.TMatrix.kMult, Gamma)
Corr *= Sig
# =======================================================================
#                    Corr is our correlation matrix.
# =======================================================================


# Creating the TDS
tds = DataServer.TDataServer("pouet", "pouet")

# Adding attributes
for i in range(n):
    tds.addAttribute(DataServer.TNormalDistribution("n"+str(i), 0., 1.))

# Create the sampler and generate the doe
if ImanConover:
    sam = Sampler.TSampling(tds, SamplingType, m)
    sam.setCorrelationMatrix(Corr)
    sam.generateSample(SamplerOption)
else:
    sam = Sampler.TBasicSampling(tds, SamplingType, m)
    sam.setCorrelationMatrix(Corr)
    sam.generateCorrSample(SamplerOption)

# Compute the empirical correlation matrix
ResultCorr = tds.computeCorrelationMatrix()
# Change it into a residual matrix
ResultCorr -= Corr

ROOT.gStyle.SetOptStat(1110)
# Plot the results
c = ROOT.TCanvas("c", "c", 1800, 900)
apad = ROOT.TPad("apad", "apad", 0, 0.03, 1, 1)
apad.Draw()
apad.cd()
apad.Divide(2, 1)
apad.cd(1)
# Residual distribution
h = ROOT.TH1F("h", ";#Delta_{#rho}", 100, -0.2, 0.2)
for i in range(n):
    for j in range(i, n):
        h.Fill(ResultCorr[i][j])
h.Draw()

# all variables
apad.cd(2)
tds.drawPairs()

This macro starts by a bunch of variables provided to offer many possible configuration, among which:

  • use Iman and Conover method or the more simple one to get the correlation (see [Bla17] for a complete description of their respective correlation handling);

  • produce a stratified or fully random sample;

  • use Cholesky or SVD decomposition to decompose the target correlation matrix;

  • use a full-rank or singular correlation matrix. This is allowed thanks to the toy random correlation matrix generator: one has to define the number of variable (n) and the number of dimension that should be self-explained (p). If the latter is strickly lower than the former, then the correlation matrix generated will be singular, whereas it will be a full-rank one if both quantities are equal.

This is the main part of this macro

# What classes to be used (true==TSampling, false==TBasicSampling)
ImanConover = False
SamplingType = "srs"
# Remove svd to use Cholesky
SamplerOption = "svd"

# Generating randomly a singular correlation matrix
# dimension of the  problem in total.
n = 10
# Number of dimension self explained.
# SHOULD BE SMALLER (singular) OR EQUAL TO (full-rank) n.
p = 6
# number of location in the doe
m = 300

Once this is settled, the correlation matrix is created and the dataserver is created and n centered-reduced normal attributes are added. The chosen design-of-experiments is generated with the chosen options:

# Create the sampler and generate the doe
if ImanConover:
    sam = Sampler.TSampling(tds, SamplingType, m)
    sam.setCorrelationMatrix(Corr)
    sam.generateSample(SamplerOption)
else:
    sam = Sampler.TBasicSampling(tds, SamplingType, m)
    sam.setCorrelationMatrix(Corr)
    sam.generateCorrSample(SamplerOption)

Finally the obtained design-of-experiments is shown along with the residual of all the correlation coefficients (difference between the target values and the obtained ones). This is shown in Figure 13.17 and can be used to compare the performance of the samplers.

13.4.14.3. Graph

../../_images/samplingSingularCorrelationCase.png

Figure 13.17 Graph of the macro “samplingSingularCorrelationCase.py”