This section introduces indices whose main purpose is to provide good estimators of the Shapley values defined in Section V.1.1.4. The underlying assumption is that the model can be considered linear, in which case the results can be regarded as proper estimates of the Shapley indices, with or without correlation between the input variables.
The idea is very similar to that of the standard regression coefficients introduced in Section V.1.1.1, except that an orthogonal transformation is used to represent the data when the inputs are dependent. The method was introduced by Johnson in [johnson2000heuristic] and its principle can be split into three steps:
transform the dependent input variables $X$ through a linear transformation into $Z$ so that the components of $Z$ are orthogonal, $Z^T Z = I$;
compute the sensitivity indices of $Y$ with respect to $Z$;
reconstruct the sensitivity indices of $Y$ with respect to the components of $X$.
In practice, the method proposed in [johnson2000heuristic] relies on the singular value decomposition of the input matrix $X$, written as $X = P \Delta Q^T$, in which $P$ contains the eigenvectors of $X X^T$, $Q$ contains the eigenvectors of $X^T X$ and $\Delta$ is a diagonal matrix containing the singular values of $X$. From there, the best-fitting orthogonal approximation of $X$ can be obtained (cf. [johnson1966minimal] for the proof) as
$$Z = P Q^T$$
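The decomposition and the orthogonal approximation can be sketched with NumPy as follows. This is only an illustration: the data, the variable names and the standardisation convention (columns centred and scaled to unit norm, so that $X^T X$ is the sample correlation matrix) are assumptions, not part of the original method description.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples of 3 correlated inputs.
cov = np.array([[1.0, 0.8, 0.3],
                [0.8, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
raw = rng.multivariate_normal(np.zeros(3), cov, size=200)

# Centre each column and scale it to unit norm, so that X^T X is the
# sample correlation matrix (assumed convention).
X = raw - raw.mean(axis=0)
X /= np.linalg.norm(X, axis=0)

# Thin SVD: X = P @ diag(delta) @ Qt, where Qt stands for Q^T.
P, delta, Qt = np.linalg.svd(X, full_matrices=False)

# Best-fitting orthogonal approximation of X.
Z = P @ Qt

# The columns of Z are orthonormal: Z^T Z is the identity matrix.
print(np.allclose(Z.T @ Z, np.eye(3)))
```

The thin decomposition (`full_matrices=False`) is used here so that `P` only keeps the eigenvectors of $X X^T$ associated with non-zero singular values, which is all the product $P Q^T$ needs.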
The second step consists in regressing the output $Y$ onto the orthogonal variables $Z$, which gives the weights
$$\beta^* = (Z^T Z)^{-1} Z^T Y = Z^T Y = Q P^T Y$$
The squared elements of $\beta^*$ represent the proportion of predictable variance in $Y$ accounted for by the $Z_k$, but when two or more original variables $X_j$ are highly correlated, the $Z_k$ variables are not a close representation of the $X_j$ ones. To take this into account, Johnson proposed to regress $X$ onto $Z$, leading to other weights defined as:
$$\Lambda^* = (Z^T Z)^{-1} Z^T X = Z^T X = Q \Delta Q^T$$
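The two regressions onto the orthogonal variables can be sketched as below. The data, the helper `standardize` and all variable names are hypothetical; inputs and output are assumed to be centred and scaled to unit norm. Because the columns of the orthogonal variables are orthonormal, both least-squares solutions reduce to a product with their transpose.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: two strongly correlated inputs and a linear output.
n = 500
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
raw_X = np.column_stack([x1, x2])
raw_y = 2.0 * x1 + 1.0 * x2 + 0.5 * rng.normal(size=n)

def standardize(a):
    """Centre and scale to unit norm along axis 0 (assumed convention)."""
    a = a - a.mean(axis=0)
    return a / np.linalg.norm(a, axis=0)

X = standardize(raw_X)
y = standardize(raw_y)

# Orthogonal approximation Z = P Q^T from the thin SVD of X.
P, delta, Qt = np.linalg.svd(X, full_matrices=False)
Z = P @ Qt

# Z^T Z = I, so the normal equations simplify to products with Z^T.
beta_star = Z.T @ y      # weights of the output on the Z_k
lambda_star = Z.T @ X    # weights of the X_j on the Z_k

# lambda_star coincides with Q diag(delta) Q^T, a symmetric matrix.
print(np.allclose(lambda_star, Qt.T @ np.diag(delta) @ Qt))
```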
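From there, the variance of $Y$ explained by $X_j$ can be written as $\epsilon_j$ and estimated from the following formula
$$\epsilon_j = \sum_{k} \lambda^{*2}_{jk} \beta^{*2}_{k}$$
where $\lambda^*_{jk}$ is the weight of $X_j$ on $Z_k$ and $\beta^*_k$ the weight of $Y$ on $Z_k$, and bearing in mind that
$$\sum_{j} \epsilon_j = R^2$$
with $R^2$ the coefficient of determination of the linear regression of $Y$ on $X$.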
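The fact that these contributions add up to the coefficient of determination can be checked in a few lines. The sketch below assumes that the columns of $X$ and the output $Y$ are centred and scaled to unit norm, so that $X^T X$ is the correlation matrix of the inputs with a unit diagonal. It writes $\Lambda^* = Q \Delta Q^T$ for the weights of $X$ on $Z$, with elements $\lambda^*_{jk}$, and $\beta^*_k$ for the weights of $Y$ on $Z$.

```latex
\begin{align*}
\Lambda^{*}\Lambda^{*T} &= Q \Delta Q^T Q \Delta Q^T = Q \Delta^2 Q^T = X^T X
  \quad\Rightarrow\quad \sum_{k} \lambda^{*2}_{jk} = (X^T X)_{jj} = 1, \\
\sum_{j} \lambda^{*2}_{jk} &= 1
  \quad\text{as well, since } \Lambda^{*} \text{ is symmetric}, \\
\sum_{j} \epsilon_j &= \sum_{k} \beta^{*2}_{k} \sum_{j} \lambda^{*2}_{jk}
  = \sum_{k} \beta^{*2}_{k} = R^2 .
\end{align*}
```

The last equality holds because the columns of $Z$ are orthonormal and span the same space as those of $X$ (for a full-rank $X$), so the sum of the squared $\beta^*_k$ is the part of the unit variance of $Y$ explained by the linear model.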
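The resulting sensitivity index of the input $X_j$ is its relative weight $\epsilon_j$, the estimated part of the variance of $Y$ explained by $X_j$; dividing it by $R^2$ expresses it as a share of the explained variance.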
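The whole procedure can be put together in a short NumPy sketch. The function name `johnson_relative_weights`, the test data and the standardisation convention (centred columns scaled to unit norm) are assumptions made for illustration; this is not the implementation used by any particular library.

```python
import numpy as np

def johnson_relative_weights(raw_X, raw_y):
    """Relative weights of each input column (hypothetical helper)."""
    # Centre and scale inputs and output to unit norm (assumed convention).
    X = raw_X - raw_X.mean(axis=0)
    X = X / np.linalg.norm(X, axis=0)
    y = raw_y - raw_y.mean()
    y = y / np.linalg.norm(y)

    # Step 1: orthogonal approximation Z = P Q^T of X.
    P, _, Qt = np.linalg.svd(X, full_matrices=False)
    Z = P @ Qt

    # Step 2: weights of y on Z (Z^T Z = I, so no matrix inversion).
    beta_star = Z.T @ y

    # Step 3: weights of X on Z, then recombination per input.
    # lambda_star is symmetric, so the row/column convention is immaterial.
    lambda_star = Z.T @ X
    return (lambda_star ** 2) @ (beta_star ** 2)

# Hypothetical example: two correlated inputs and one independent input.
rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + np.sqrt(1 - 0.7**2) * rng.normal(size=n)
x3 = rng.normal(size=n)
raw_X = np.column_stack([x1, x2, x3])
raw_y = x1 + x2 + 0.5 * x3 + 0.3 * rng.normal(size=n)

eps = johnson_relative_weights(raw_X, raw_y)
shares = eps / eps.sum()   # weights rescaled as shares of R^2
print("relative weights:", eps)
print("sum of weights (R^2):", eps.sum())
print("shares of R^2:", shares)
```

The sum of the returned weights can be compared with the coefficient of determination of an ordinary least-squares fit of the output on the inputs; under the linear-model assumption the two should agree up to rounding.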