2.4.5. Correlation matrix
The correlation matrix can be computed either on the values (leading to the Pearson
coefficients) or on the ranks (leading to the Spearman coefficients). Both computations are performed by the computeCorrelationMatrix method.
tdsGeyser = DataServer.TDataServer("tdsgeyser", "Geyser DataSet")
tdsGeyser.fileDataRead("geyser.dat")
tdsGeyser.addAttribute("y", "sqrt(x2) * x1")
matCorr = tdsGeyser.computeCorrelationMatrix("x2:x1")
print("Computing correlation matrix ...")
matCorr.Print()
Computing correlation matrix ...
2x2 matrix is as follows
     |      0    |      1    |
-------------------------------
   0 |          1      0.9008
   1 |     0.9008           1
The same method computes the correlation matrix on ranks when the option "rank" is given:
matCorrRank = tdsGeyser.computeCorrelationMatrix("x2:x1", "", "rank")
print("Computing correlation matrix on ranks ...")
matCorrRank.Print()
Computing correlation matrix on ranks ...
2x2 matrix is as follows
     |      0    |      1    |
-------------------------------
   0 |          1      0.7778
   1 |     0.7778           1
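The difference between the two matrices above can be illustrated outside of the TDataServer with a minimal plain-Python sketch. It is not the library's implementation: the helper names pearson, ranks and spearman are hypothetical, the data are made up, and giving tied values their average rank is an assumption about how ties are handled.

```python
import math


def pearson(xs, ys):
    """Pearson coefficient: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(xs, ys):
    """Spearman coefficient: Pearson coefficient computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up data: y grows monotonically with x, but not linearly
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
print("Pearson :", pearson(x, y))
print("Spearman:", spearman(x, y))
```

On such monotonic but non-linear data the rank-based coefficient reaches 1 while the value-based one stays below it, which is why the two options can give noticeably different matrices for the same attributes.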
Summary: Correlation matrix
computeCorrelationMatrix(const char* varexp="", const char* select="", Option_t* option="")
Computes the correlation matrix of the attributes given by varexp, applying the filter contained in select. When varexp is empty, the correlation matrix is computed on all the attributes in the TDataServer. The filter select is added to the permanent selection of the TDataServer. By default, when option is empty, the correlation matrix is computed on the values (Pearson matrix).
Tip
The possible values of the argument option are:
- rank: the correlation is computed on the ranks (Spearman matrix).