# [Ifeffit] How to calculate F-value for XANES PCA results

Wayne Lukens wwlukens at lbl.gov
Mon Jan 11 11:08:05 CST 2010

```
Hi Andrew,

I don't completely understand your question, but I will try to answer
it. Sam Webb should be able to say exactly how F is calculated in
SixPack; I can tell you how F is calculated in general. The number in
the Excel spreadsheet is probably not F itself but the probability of F.
In general, if the probability of F is greater than 5%, that component
is part of the noise. This is the same as saying that the component is
within 2-sigma of the noise.

F is easy to calculate for least-squares fitting, and is nicely
explained by Wikipedia in the regression problems section of the F-test
article:

http://en.wikipedia.org/wiki/F-test

In the formula given there, RSS is chi-square and n is the number of
independent data points, which is the lesser of the number of data
points and the spectral range divided by the resolution. In your
example, you have 18 independent data points (36 eV range divided by a
2 eV resolution).

You then need to calculate the probability of that value of F given the
number of parameters and number of independent data points, which is not
explained by the wiki article but can easily be done in Excel.

In your example, you have two components with a probability of F less
than 5%; these are the components that you would retain.

Sincerely,

Wayne
--
Wayne Lukens
Staff Scientist
Lawrence Berkeley National Laboratory
email: wwlukens at lbl.gov
phone: (510) 486-4305
FAX: (510) 486-5596

Andrew wrote:
>
>
> Hi everyone,
>
>
>
> I was looking through the literature on how to handle PC analysis data
> and saw that there are several different methods you can use for
> determining how many components there are in the series of scans.
> Included in SixPack are the indicator function, scree test, and the
> ability to quickly do the reduced eigenvalue ratios. I’ve been digging
> through the literature as to how to calculate the F-values. The closest
> to an answer that I got was:
>
>
>
> “The above-mentioned reduction of the body of experimental data, that
> is, the decision of what components correspond to the noise and what are
> the principal components, is now made on the basis of an F test of the
> variance associated with eigenvalue k and the summed variance associated
> with noise eigenvalues (k+1, ..., c). The null hypothesis is that a
> given factor k is a member of the pool of noise factors. The
> probability that an F value would be higher than the current value
> is given by %SL (percentage of significance level). Thus, the kth
> factor is accepted as a principal component if %SL is lower than some
> test level.” (Garcia 1995).
>
>
>
> We ran the PCA on the reduction of iron while scanning at increased
> temperatures. I checked the foil standard but did not see any shift in
> the max at 7112 eV. We scanned at 0.5 eV intervals (2 eV resolution at
> the beam). I thought I understood what that statement was saying, but
> I’m almost certain I’m doing something wrong. I have attached the .xlsx
> file that I was working on and hope someone can point me in the right
> direction. The file includes the components of the PCA and some of the
> variances that I was calculating. If there is a paper in which someone
> shows an actual calculation of this in the supplemental materials, that
> would be exactly what I am looking for!
>
>
>
> Thanks for the help!
>
> Andrew Campos
>
>
>
> Fernandez-Garcia, M., C. Marquez Alvarez, and G.L. Haller, The Journal
> of Physical Chemistry, 1995, *99*(33), 12565-12569.
>
>

```
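The steps described in the reply (count the independent data points, form F from the chi-square values of two nested fits, then convert F to a probability) can be sketched in Python as follows. This is only a minimal illustration of the generic regression F-test from the Wikipedia article, not necessarily how SixPack computes its values. The function names and the RSS values in the usage lines are hypothetical, and the probability is taken as the upper tail of the F distribution, which is what Excel's FDIST function returns.

```python
import math


def independent_points(n_points, energy_range_ev, resolution_ev):
    """Lesser of the number of measured points and range / resolution."""
    return min(n_points, math.floor(energy_range_ev / resolution_ev))


def f_statistic(rss_small, rss_large, p_small, p_large, n_indep):
    """F for two nested least-squares fits (regression form of the F-test).

    rss_small / p_small: chi-square and parameter count of the simpler fit.
    rss_large / p_large: chi-square and parameter count of the larger fit.
    """
    numerator = (rss_small - rss_large) / (p_large - p_small)
    denominator = rss_large / (n_indep - p_large)
    return numerator / denominator


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd steps of the continued fraction.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_probability(f_value, dof_num, dof_den):
    """Upper-tail probability P(F > f_value) for (dof_num, dof_den)."""
    x = dof_den / (dof_den + dof_num * f_value)
    return regularized_beta(dof_den / 2.0, dof_num / 2.0, x)


# Usage with the numbers from the thread: 36 eV range, 2 eV resolution,
# 0.5 eV steps (73 measured points), giving 18 independent points.
n_indep = independent_points(73, 36.0, 2.0)

# Hypothetical chi-square values for fits with 1 and 2 components.
f_val = f_statistic(rss_small=10.0, rss_large=4.0,
                    p_small=1, p_large=2, n_indep=n_indep)
prob = f_probability(f_val, dof_num=1, dof_den=n_indep - 2)
print(f"n = {n_indep}, F = {f_val:.3f}, probability of F = {prob:.4g}")
```

With this convention, a component whose probability of F comes out below 0.05 would be retained, matching the 5% threshold mentioned in the reply. The incomplete beta function is written out by hand only to keep the sketch free of third-party dependencies.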