Partial Least Squares
The Partial Least Squares method simply says that there is a weighting
matrix :math:`W` that can be used to transform the spectra matrix
into the valence matrix, and seeks to find :math:`W` using a
least-squares approach.
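In principle, :math:`W` could need as many weighting components as there
are spectra, but we know from PCA that we can probably get by with many
fewer. That is, it finds the :math:`W` that minimizes the sum of squared
differences between the known valences and the valences predicted from
the spectra.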
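As a sketch of what this relation and its least-squares objective might look like when written out: the symbols :math:`S` (spectra matrix, assumed here to hold one spectrum per row) and :math:`V` (the corresponding valence values) are names chosen for illustration, and only :math:`W` comes from the text above. With spectra stored as columns instead, the order of the product would be reversed.

```latex
V \approx S\,W,
\qquad
W = \underset{W}{\arg\min}\; \lVert V - S\,W \rVert^{2}
```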

The Partial Least Squares method finds a weighting matrix to satisfy this
relation. It is essentially a linear regression of the spectra against the
value of valence. We can also determine how many weighting
components are needed to explain the variance in our external variable.
With :math:`W` in hand, we can predict the valence of an unknown
spectrum without any other analysis.
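To make the regression step concrete, here is a minimal sketch of single-response PLS (often called PLS1) written with NumPy only. It is not the implementation used by any particular package. The helper names `pls1_fit` and `pls1_predict`, the toy `edge` shape, and the synthetic spectra (linear mixtures of two made-up edges plus noise) are all assumptions for illustration. The loop prints how much of the variance in valence is explained as components are added, then predicts valence for spectra held back as "unknowns".

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) using NIPALS-style deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores: one number per spectrum
        tt = t @ t
        p = Xk.T @ t / tt             # spectral loading for this component
        c = yk @ t / tt               # regression of y on the scores
        Xk = Xk - np.outer(t, p)      # remove what this component explained
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # weights acting on centered spectra
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    """Predict the external variable for spectra stored one per row."""
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef


# Synthetic data (an assumption, not real measurements).
rng = np.random.default_rng(0)
energy = np.linspace(7100.0, 7180.0, 161)


def edge(e0):
    """Toy absorption edge: a smooth step centered at e0."""
    return 0.5 * (1.0 + np.tanh((energy - e0) / 2.0))


frac = rng.uniform(0.0, 1.0, 30)      # fraction of the higher-valence end member
valence = 2.0 + frac
spectra = (np.outer(1.0 - frac, edge(7120.0))
           + np.outer(frac, edge(7124.0))
           + rng.normal(0.0, 0.01, (30, energy.size)))

train, unknown = slice(0, 25), slice(25, 30)

# How many components are needed to explain the variance in valence?
total = np.sum((valence[train] - valence[train].mean()) ** 2)
for k in (1, 2, 3, 4):
    model = pls1_fit(spectra[train], valence[train], k)
    resid = valence[train] - pls1_predict(model, spectra[train])
    print(f"{k} component(s): explained variance = {1.0 - resid @ resid / total:.4f}")

# Predict valence for the held-back spectra with a 2-component model.
model = pls1_fit(spectra[train], valence[train], 2)
predicted = pls1_predict(model, spectra[unknown])
print("predicted:", np.round(predicted, 3))
print("actual:   ", np.round(valence[unknown], 3))
```

Because the toy spectra are exact linear mixtures, very few components are needed here; real data would generally call for checking this on spectra that were not used in the fit.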
This machine learning method can be a very powerful analysis approach.
We can try many permutations to cross-validate the weights: use some spectra to find a set of weights (train the model) and then test how well this works with other spectra left out of the training set.
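The train-and-test idea can be sketched as repeated random splits. The example below is a self-contained illustration under assumptions: the helper names `pls1_train_predict` and `cv_rmse`, the compact NumPy PLS1 routine, the split sizes, and the synthetic spectra are all hypothetical and are not taken from any particular package.

```python
import numpy as np


def pls1_train_predict(X_train, y_train, X_test, n_components):
    """Train PLS1 (NIPALS-style deflation) and predict for the test spectra."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xk, yk = X_train - x_mean, y_train - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w
        tt = t @ t
        p = Xk.T @ t / tt
        c = yk @ t / tt
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P = np.array(W).T, np.array(P).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return y_mean + (X_test - x_mean) @ coef


# Synthetic spectra (an assumption): mixtures of two toy edges plus noise.
rng = np.random.default_rng(42)
energy = np.linspace(7100.0, 7180.0, 161)
low = 0.5 * (1.0 + np.tanh((energy - 7120.0) / 2.0))
high = 0.5 * (1.0 + np.tanh((energy - 7124.0) / 2.0))
n_spectra = 24
frac = rng.uniform(0.0, 1.0, n_spectra)
valence = 2.0 + frac
spectra = (np.outer(1.0 - frac, low) + np.outer(frac, high)
           + rng.normal(0.0, 0.01, (n_spectra, energy.size)))


def cv_rmse(n_components, n_splits=50, n_test=4):
    """RMS prediction error over many random train/test permutations."""
    errors = []
    for _ in range(n_splits):
        order = rng.permutation(n_spectra)
        test, train = order[:n_test], order[n_test:]
        pred = pls1_train_predict(spectra[train], valence[train],
                                  spectra[test], n_components)
        errors.append(pred - valence[test])
    return float(np.sqrt(np.mean(np.concatenate(errors) ** 2)))


rmse_by_ncomp = {k: cv_rmse(k) for k in (1, 2, 3, 4)}
for k, rmse in rmse_by_ncomp.items():
    print(f"{k} component(s): cross-validated RMSE = {rmse:.4f}")
```

The error on left-out spectra, rather than the quality of the fit to the training spectra, is the number to watch when choosing how many components to keep.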
Warnings and Caveats
Like other machine learning methods, this is powerful but not without concerns:
1. It has no idea that this is X-ray absorption data. No physics/chemistry at all.
2. The energy array has no meaning. The energy order does not matter, and the energy resolution does not matter.
3. Like all linear regression methods, it will find a linear relationship between spectra and “external variable” (valence), even if one does not actually exist.
That is, things can go badly. But when it works, it can seem like magic…
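Two of the caveats above (items 2 and 3) can be illustrated with a small experiment. This sketch uses ordinary least squares via `np.linalg.lstsq` as a stand-in for PLS, since both points apply to linear regression in general. The "spectra" are pure random noise and the "valences" are random numbers unrelated to them; the helper name `fit_predict` and all array sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_spectra, n_energy = 10, 50

spectra = rng.normal(size=(n_spectra, n_energy))   # pure noise, no physics
fake_valence = rng.uniform(2.0, 3.0, n_spectra)    # unrelated to the "spectra"
unknown = rng.normal(size=n_energy)                # one more noise "spectrum"


def fit_predict(X, y, x_new):
    """Least-squares weights (minimum-norm solution), then predictions."""
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ weights, x_new @ weights


# Caveat 3: with more energy points than spectra, a linear model reproduces
# the training "valences" essentially exactly, although no relationship exists.
train_fit, prediction = fit_predict(spectra, fake_valence, unknown)
train_residual = float(np.max(np.abs(train_fit - fake_valence)))
print("largest training residual:", train_residual)

# Caveat 2: shuffle the energy points the same way in every spectrum.
# The regression neither knows nor cares; the prediction is unchanged.
shuffle = rng.permutation(n_energy)
_, prediction_shuffled = fit_predict(spectra[:, shuffle], fake_valence,
                                     unknown[shuffle])
print("prediction, original order:", prediction)
print("prediction, shuffled order:", prediction_shuffled)
```

A perfect fit to the training set therefore says nothing about whether the model is meaningful, which is why testing on left-out spectra matters.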