3.2.2.2. Correlation / de-correlation
Let’s assume we have a random drawing \(\mathbf{X}(n_{S},n_{X})\), where every column is the drawing, of size \(n_{S}\), of a given random variable. One can then compute the matrix \(\mathbf{T} = n_{S}^{-1}\mathbf{X}^{T}\mathbf{X}\), which is the correlation matrix (respectively the covariance matrix) of our sample if the columns have been centered and reduced (respectively only centered). If \(n_{S}\) were infinite, the resulting empirical correlation matrix of the drawn marginals would asymptotically be the identity matrix of dimension \(n_{X}\), noted \(\mathbf{1}_{n_{X}}\).
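As a minimal illustration (a plain numpy sketch, not the implementation of the classes discussed below), \(\mathbf{T}\) can be estimated as:

```python
import numpy as np

rng = np.random.default_rng(42)

n_S, n_X = 10_000, 3               # number of drawings, number of random variables
X = rng.normal(size=(n_S, n_X))    # one column per random variable

# Center and reduce every column so that T is a correlation matrix
X = (X - X.mean(axis=0)) / X.std(axis=0)

T = (X.T @ X) / n_S                # empirical correlation matrix
print(np.round(T, 2))              # close to the 3x3 identity for large n_S
```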
The next step is then to correlate the variables so that \(\mathbf{T}\) is no longer the identity but the target correlation matrix \(\mathbf{C}^{*}\). When a matrix of random samples is multiplied by a matrix \(\mathbf{W}\) to get \(\mathbf{Y}=\mathbf{W}\mathbf{X}\), the resulting variance is estimated as [PP12]
\[
\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \mathbf{W}\,\mathbf{T}\,\mathbf{W}^{T}\\
\mathbf{T} &= \mathbf{1}_{n_{X}}\\
\mathbf{C}_{\mathbf{Y}} &= \mathbf{W}\mathbf{W}^{T}
\end{aligned}
\]
Equation 3.2: Simple correlation / de-correlation principle
This leads to the fact that the transformation matrix providing the target correlation matrix should satisfy \(\mathbf{C}^{*}=\mathbf{W}\mathbf{W}^{T}\), the last line of the previous equation, which is the definition of the Cholesky decomposition of a Hermitian positive-definite matrix (\(\mathbf{W}\) being a lower triangular matrix):
\[
\mathbf{C}^{*} = \mathbf{W}\mathbf{W}^{T}
\]
Equation 3.3: General form of a Cholesky decomposition with lower triangular matrix
These steps are the ones used to correlate the variables in both the TBasicSampling and TGaussianSampling classes. Let’s call this method the simple decomposition; a minimal sketch of it follows.
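This sketch uses an arbitrary (assumed) \(3\times3\) target matrix; since the samples are stored here as rows of \(\mathbf{X}\), the transformation reads \(\mathbf{Y}=\mathbf{X}\mathbf{W}^{T}\) instead of \(\mathbf{Y}=\mathbf{W}\mathbf{X}\):

```python
import numpy as np

rng = np.random.default_rng(42)
n_S, n_X = 10_000, 3
X = rng.normal(size=(n_S, n_X))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Arbitrary target correlation matrix C* (must be positive-definite here)
C_star = np.array([[1.0, 0.5, 0.2],
                   [0.5, 1.0, 0.3],
                   [0.2, 0.3, 1.0]])

W = np.linalg.cholesky(C_star)       # C* = W W^T, with W lower triangular
Y = X @ W.T                          # samples as rows, hence the transposition

print(np.round((Y.T @ Y) / n_S, 2))  # close to C*, up to sampling fluctuations
```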
The implementation done in the TSampling class is far trickier to understand, and the aim here is not to explain the full concept of the method, called the Iman and Conover method [IC82].
The rest of this paragraph will just provide insight into what is done specifically to deal with the correlation part, which differs from what has been explained up to now. The main difference comes from the underlying hypothesis, written in the second line of Equation 3.2: in a perfect world, for a given random drawing of uncorrelated variables, the correlation matrix should satisfy the relation \(\mathbf{T}=\mathbf{1}_{n_{X}}\). This is obviously not the case[1], so one proposal to overcome this is to perform a second Cholesky decomposition, this time on the correlation matrix of the drawn sample, to get the decomposition \(\mathbf{T} = \mathbf{K}\mathbf{K}^{T}\). As \(\mathbf{K}\) is lower triangular, it is rather easy to invert; one can then transform the generated sample using the relation \(\mathbf{Y} = \mathbf{X}(\mathbf{K}^{-1})^{T}\mathbf{W}^{T}\). Considering that these multiplications do not change the fact that the columns are centered and reduced, one can write the following equations:
\[
\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \frac{1}{n_{S}}\mathbf{Y}^{T}\mathbf{Y}
= \frac{1}{n_{S}}\,\mathbf{W}\mathbf{K}^{-1}\mathbf{X}^{T}\mathbf{X}\,(\mathbf{K}^{-1})^{T}\mathbf{W}^{T}\\
&= \mathbf{W}\mathbf{K}^{-1}\,\mathbf{T}\,(\mathbf{K}^{-1})^{T}\mathbf{W}^{T}
= \mathbf{W}\mathbf{K}^{-1}\mathbf{K}\mathbf{K}^{T}(\mathbf{K}^{T})^{-1}\mathbf{W}^{T}
= \mathbf{W}\mathbf{W}^{T} = \mathbf{C}^{*}
\end{aligned}
\]
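This correction step can be sketched as below. Only the linear algebra is shown: the actual Iman and Conover method works on ranks in order to preserve the marginal distributions, which this naive transformation does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_S, n_X = 200, 3                  # small n_S: spurious correlations are visible
X = rng.normal(size=(n_S, n_X))
X = (X - X.mean(axis=0)) / X.std(axis=0)

C_star = np.array([[1.0, 0.5, 0.2],
                   [0.5, 1.0, 0.3],
                   [0.2, 0.3, 1.0]])
W = np.linalg.cholesky(C_star)     # first Cholesky, on the target matrix

# Second Cholesky decomposition, on the correlation matrix of the drawn sample
T = (X.T @ X) / n_S
K = np.linalg.cholesky(T)          # T = K K^T

# Y = X (K^{-1})^T W^T: de-correlate the sample, then apply the target
Y = X @ np.linalg.inv(K).T @ W.T

print(np.round((Y.T @ Y) / n_S, 3))  # reproduces C* up to numerical precision
```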
Thanks to this procedure (and many more technicalities such as, for instance, working with the Spearman coefficient to be able to handle correlation with stratified samples), the resulting correlation matrix is designed to be as close as possible to the target one.
The final part of this discussion concerns a limitation of both methods: their reliance on the Cholesky decomposition of the target correlation matrix. If one considers the case where \(\mathbf{C}^{*}\) is a singular matrix, then two important points can be raised:
this case means that one or more variables can be completely determined from the others. The number of properly defined variables can then be estimated by the rank of the correlation matrix. This situation can occur, as in some complicated problems the variables can be highly intricated, leading to this kind of configuration.
with this kind of correlation matrix, the Cholesky decomposition is no longer feasible, so both methods are bound to stop abruptly, as illustrated below.
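For instance, with a hypothetical rank-2 target in which the third variable is a plain copy of the first one:

```python
import numpy as np

# Singular target: the third variable duplicates the first one
C_star = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])

print(np.linalg.matrix_rank(C_star))    # 2: only two properly defined variables

try:
    np.linalg.cholesky(C_star)
except np.linalg.LinAlgError as err:
    print("Cholesky fails:", err)       # the matrix is not positive definite
```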
In order to overcome this situation, we propose a workaround based on the Singular Value Decomposition (SVD) which, knowing that \(\mathbf{C}^*\) is real symmetric, leads to \(\mathbf{C}^{*} = \mathbf{U} \boldsymbol\Sigma \mathbf{U}^{T}\). This writing emphasises the connection between the SVD and the eigenvalue decomposition (for a more general form of the SVD, see for instance Equation 2.1). In this context, \(\mathbf{U}(n_{X},n_{X})\) is a unitary matrix while \(\boldsymbol\Sigma(n_{X},n_{X})\) is a diagonal matrix storing the singular values of \(\mathbf{C}^*\) in decreasing order. In our case, where the correlation matrix is singular, one or more of the singular values are very close or equal to 0. By rewriting our decomposition as
\[
\mathbf{C}^{*} = \mathbf{U}\boldsymbol\Sigma\mathbf{U}^{T} = \left(\mathbf{U}\boldsymbol\Sigma^{1/2}\right)\left(\mathbf{U}\boldsymbol\Sigma^{1/2}\right)^{T},
\]
one can redefine the matrix \(\mathbf{W} = \mathbf{U}\boldsymbol\Sigma^{1/2}\) and recover the usual formula discussed above (see Equation 3.3). This decomposition can then be used instead of the Cholesky decomposition in both methods (either on its own, in the simple form, or along with the second Cholesky decomposition of the drawn sample’s correlation matrix, in our modified Iman and Conover algorithm).
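A sketch of this workaround, reusing the singular target above (again plain numpy, not the actual implementation of the classes discussed here):

```python
import numpy as np

rng = np.random.default_rng(42)
n_S, n_X = 10_000, 3
X = rng.normal(size=(n_S, n_X))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Singular target: the third variable duplicates the first one
C_star = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(C_star)   # C* = U Sigma V^T; here s = (2, 1, 0)
W = U @ np.diag(np.sqrt(s))        # W W^T = U Sigma U^T = C*

Y = X @ W.T                        # same correlation step as before
print(np.round((Y.T @ Y) / n_S, 2))  # close to C*, although C* is singular
```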
The usage of an SVD instead of a Cholesky decomposition for the target correlation matrix relies on the underlying hypothesis that the left singular vectors (\(\mathbf{U}\)) can be used instead of the right singular vectors (\(\mathbf{V}\)) in the general SVD formula shown for instance in Equation 2.1. This holds even in the singular case, as the only differences between the two singular vector bases arise for the singular values close to zero. Since this method always uses the singular vector matrix weighted by the square roots of the singular values, these differences vanish by construction.
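This can be checked numerically on the singular target used above (a sketch; the signs of the singular vectors returned by numpy are implementation-dependent):

```python
import numpy as np

C_star = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(C_star)
V = Vt.T

# U and V may differ (up to a sign) only in the columns associated with the
# (near-)zero singular values...
print(np.round(U - V, 12))

# ...and those columns are suppressed once weighted by sqrt(Sigma), so both
# choices rebuild the very same target matrix
W_u = U @ np.diag(np.sqrt(s))
W_v = V @ np.diag(np.sqrt(s))
print(np.allclose(W_u @ W_u.T, C_star))  # True
print(np.allclose(W_v @ W_v.T, C_star))  # True
```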