[Ifeffit] Breaking down correlationships between parameters

Sat Mar 21 10:41:33 CDT 2015

Hi Matt,

Thanks a lot for your prompt reply. The method I am referring to is not the multiple k-weight fit with N*S02 constrained. My apologies for not being clear enough; let me try again. I am referring to an approach that takes advantage of the different k-dependence of various parameters to break down the correlations between them. Take S02 and sigma2, for example: S02 is k-independent, whereas sigma2 enters with a k^2 dependence.

In this case, to break down the correlation between S02 and sigma2, one can assume a series of S02 values and, for each of them, perform fits using a single k-weight at a time (say k-weight 1, 2, and 3), recording the corresponding sigma2 values. For k-weight = 1, say, the series of preset S02 values yields a series of refined sigma2 values, which can be plotted as a straight line in a sigma2 vs. S02 plot. Similar straight lines can be obtained for fits using k-weight = 2 and then 3. These three lines may intersect at or near some point, which determines the "true" values of the parameters, independent of k-weight. One can then set S02 to the value at the point of intersection of the three lines and vary sigma2 in a fit. In this particular case, however, the advantage is that S02 does not depend on changes inside the sample and we have a very good estimate of its range (say 0.7 - 1.0).
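
To make the line-intersection idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the (S02, sigma2) pairs are synthetic, generated from a toy amplitude model S02*exp(-2*k_eff^2*sigma2) with a hypothetical effective wavenumber k_eff for each k-weight, and the helper names fit_line and common_point are made up for this example. With real data, the pairs would come from the fits performed at each preset S02.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = m*x + c through the points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, my - m * mx

def common_point(lines):
    """Point (x, y) minimizing the summed squared vertical distance to all lines."""
    n = len(lines)
    sm = sum(m for m, _ in lines)
    smm = sum(m * m for m, _ in lines)
    sc = sum(c for _, c in lines)
    smc = sum(m * c for m, c in lines)
    det = sm * sm - n * smm  # zero only if all slopes are equal
    x = (n * smc - sm * sc) / det
    y = (sm * smc - smm * sc) / det
    return x, y

# Synthetic stand-in for fit results (NOT real data).
S02_TRUE, SIGMA2_TRUE = 0.85, 0.005
K_EFF = {1: 6.0, 2: 8.0, 3: 10.0}  # hypothetical effective k (1/Angstrom) per k-weight
s02_grid = [0.70 + 0.05 * i for i in range(7)]  # preset S02 values, 0.70 to 1.00

lines = []
for kw, k_eff in K_EFF.items():
    # sigma2 that keeps S02*exp(-2*k_eff^2*sigma2) unchanged at this k_eff
    sigma2 = [SIGMA2_TRUE + math.log(s / S02_TRUE) / (2 * k_eff ** 2) for s in s02_grid]
    lines.append(fit_line(s02_grid, sigma2))

s02_est, sigma2_est = common_point(lines)
print(f"intersection: S02 = {s02_est:.3f}, sigma2 = {sigma2_est:.5f}")
```

Because the toy relation is logarithmic rather than exactly linear, the fitted lines meet close to, but not exactly at, the S02 value used to generate the data.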

Now suppose that instead of S02 (which I now set to a reasonable value), I am interested in determining N, which is highly correlated with sigma2. Whenever disorder in the sample increases, sigma2 increases and, because of the high correlation, N is overestimated as well. When the disorder decreases, on the other hand, sigma2 decreases and I can obtain a "true" estimate of N. Can I still apply the approach described above to break the correlation between N and sigma2 and get a "true" estimate of N, even if disorder is high in my samples? Or is it simply not possible, because both N and sigma2 vary with changes inside the sample?

Best regards,
Jatin

________________________________________
From: ifeffit-bounces at millenia.cars.aps.anl.gov [ifeffit-bounces at millenia.cars.aps.anl.gov] on behalf of ifeffit-request at millenia.cars.aps.anl.gov [ifeffit-request at millenia.cars.aps.anl.gov]
Sent: Saturday, March 21, 2015 3:14 PM
To: ifeffit at millenia.cars.aps.anl.gov
Subject: Ifeffit Digest, Vol 145, Issue 38

Today's Topics:

   1. Re: amplitude parameter S02 larger than 1 (Scott Calvin)
   2. Breaking down correlationships between parameters
      (Rana, Jatinkumar Kantilal)
   3. Re: Breaking down correlationships between parameters
      (Matt Newville)

----------------------------------------------------------------------

Message: 1
Date: Fri, 20 Mar 2015 18:53:06 -0400
From: Scott Calvin <scalvin at sarahlawrence.edu>
To: XAFS Analysis using Ifeffit <ifeffit at millenia.cars.aps.anl.gov>
Subject: Re: [Ifeffit] amplitude parameter S02 larger than 1
Message-ID: <D2ADF788-9BC6-41E2-BD6B-794BD2E59DBC at slc.edu>
Content-Type: text/plain; charset="utf-8"

Hi Yanyun,

Good. So here's the procedure for a Hamilton test.

We're comparing the fit with S02 guessed to the one with S02 set to 0.90, because that was your a priori best guess at S02.

I take the ratio of the first R-factor to the second. You didn't actually say the R-factor for the fit with S02 guessed, but it's clearly around 0.0055 based on the other information you gave. The R-factor for the 0.90 fit is 0.021. So the ratio is 0.0055/0.021 = 0.26, which we'll call x.

For the first fit the degrees of freedom is 31.2 - 24 = 7.2. Take half of that and call that a. So a is 3.6.

The first fit guesses 1 parameter that the second one doesn't. Take half of 1 and call that b. So b is 0.5.

Find a regularized lower incomplete beta function calculator, like this one: http://www.danielsoper.com/statcalc3/calc.aspx?id=37

Enter x, a, and b.

The result is about 0.003. This means that there is roughly a 0.3% chance that the fits are actually consistent, and that the difference is just due to noise in the data.
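
For anyone who would rather script this step than use an online calculator, here is a minimal Python sketch of the same recipe. It evaluates the regularized lower incomplete beta function from its power series using only the standard library; the function names are illustrative and do not come from any EXAFS package (scipy.special.betainc(a, b, x) should give the same number if SciPy is installed). Taking the degrees of freedom as independent points minus variables (31.2 - 24 = 7.2, so a = 3.6), it returns a probability of a few tenths of a percent.

```python
import math

def reg_inc_beta(x, a, b, terms=200):
    """Regularized lower incomplete beta function I_x(a, b), valid for 0 <= x < 1.

    Uses the series B_x(a, b) = x^a * sum_n (1-b)_n / (n! * (a+n)) * x^n.
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("series form requires 0 <= x < 1")
    total = 0.0
    coeff = 1.0  # (1-b)_n / n!, starting at n = 0
    for n in range(terms):
        total += coeff * x ** n / (a + n)
        coeff *= (1.0 - b + n) / (n + 1)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return x ** a * total / math.exp(log_beta)

def hamilton_test(r_more, r_fewer, n_idp, n_var_more, n_extra):
    """Probability that the fit with more variables is better only by chance."""
    x = r_more / r_fewer              # ratio of R-factors, smaller over larger
    a = (n_idp - n_var_more) / 2.0    # half the degrees of freedom of the larger fit
    b = n_extra / 2.0                 # half the number of extra guessed parameters
    return reg_inc_beta(x, a, b)

# Numbers quoted in this thread: R = 0.0055 (S02 guessed) vs R = 0.021 (S02 = 0.90),
# 31.2 independent points, 24 variables, 1 extra guessed parameter.
p = hamilton_test(0.0055, 0.021, 31.2, 24, 1)
print(f"Hamilton test probability: {p:.4f}")
```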

So in this case, we can't just explain away the high S02 as insignificant.

Of course, you could pretty much eyeball that once you gave me the uncertainties; since your fit said 1.45 +/- 0.14, that's likely to be quite incompatible with S02 = 0.9. Still, it's nice to put that on a firmer statistical basis, and I've personally found the Hamilton test quite helpful for answering "do I need to worry about [X]?" type questions.

But in your case, you do need to worry about it. This discussion has generated several suggestions; hopefully one of them is a good lead!

--Scott Calvin
Sarah Lawrence College

> On Mar 20, 2015, at 4:30 PM, huyanyun at physics.utoronto.ca wrote:
>
> Hi Scott,
>
> In all situations, 31.2 independent data points and 24 variables were
> used. In the case of setting S02 to a value, 23 variables were used.
>
> Let me know if there is any other info needed.
>
> Best,
> Yanyun
>
>
> Quoting Scott Calvin <scalvin at sarahlawrence.edu>:
>
>> Hi Yanyun,
>>
>> To actually do a Hamilton test, the one other thing I need to know is
>> the number of degrees of freedom in the fit...if you provide that,
>> I'll walk you through how to actually do a Hamilton test--it's not
>> that bad, with the aid of an online calculator, and I think it might
>> be instructive for some of the other people reading this list who
>> are trying to learn EXAFS.
>>
>> --Scott Calvin
>> Sarah Lawrence College
>>
>>
>>> On Mar 20, 2015, at 3:46 PM, huyanyun at physics.utoronto.ca wrote:
>>>
>>> Hi Scott,
>>>
>>> Thank you so much for giving me your thought again. It is very helpful
>>> to know how you and other XAFS experts deal with unusual situations.
>>>
>>> The floating S02 is fitted to be 1.45+/-0.14, which just means the fit
>>> doesn't like the idea of an S02 in a typical range. Instead of setting
>>> S02 to 0.9, I have to figure out why it happens and what it might
>>> indicate.
>>>
>>> I guess a Hamilton test is done by adjusting one parameter (i.e., S02)
>>> while keeping the other conditions and the model the same. Is that right?  So
>>> I record this test as follows:
>>>
>>> 1) Floating S02: S02 fits to 1.45+/-0.14, R=0.0055, reduced
>>> chi^2=17.86, Percentage=0.53+/-0.04
>>> 2) Set S02=0.7, R=0.044, reduced chi^2=120.6, percentage=0.81+/-0.2
>>> 3) set S02=0.8, R=0.030, reduced chi^2=86.10, percentage=0.77+/-0.07
>>> 4) set S02=0.9, R=0.021, reduced chi^2=60.16, percentage=0.72+/-0.06
>>> 5) set S02=1.0, R=0.017, reduced chi^2=49.5, percentage=0.67+/-0.05
>>> 6) set S02=1.1, R=0.012, reduced chi^2=35.1, percentage=0.62+/-0.03
>>> 7) set S02=1.2, R=0.009, reduced chi^2=24.9, percentage=0.59+/-0.02
>>> 8) set S02=1.3, R=0.007, reduced chi^2=18.9, percentage=0.57+/-0.02
>>> 9) set S02=1.4, R=0.0057, reduced chi^2=16.1, percentage=0.55+/-0.02
>>> 10) Floating S02 to be 1.45+/-0.14
>>> 11) set S02=1.6, R=0.006, reduced chi^2=17.8, percentage=0.53+/- 0.02
>>> 12) set S02=2.0, R=0.044, reduced chi^2=120.7, percentage=0.37+/-0.06.
>>>
>>> Therefore, I would say that S02 in the range 1.2~1.6 gives a
>>> statistically improved fit, but S02=0.9 is not terrible either. I
>>> agree with you that I could always be confident in saying the percentage
>>> is 0.64+/-0.15, but I do want to shrink the uncertainty and think
>>> about other possibilities that could cause a large S02.
>>>
>>> I did double-check the data-reduction and normalization process. I
>>> don't think I can improve anything in this step. By the way, I have a
>>> series of similar samples and their fits all show a floating S02
>>> larger than one based on the same two-site model.
>>>
>>> Best,
>>> Yanyun
>>>
>>>
>>>
>>>
>>> Quoting Scott Calvin <scalvin at sarahlawrence.edu>:
>>>
>>>> Hi Yanyun,
>>>>
>>>> Lots of comments coming in now, so I'm editing this as I write it!
>>>>
>>>> One possibility for why you're getting a high best-fit S02 is that
>>>> the fit doesn't care all that much about what the value of S02 is; i.e.
>>>> there is a broad range of S02's compatible with describing the fit as
>>>> "good." That should be reflected in the uncertainty that Artemis
>>>> reports. If S02 is 1.50 +/- 0.48, for example, that means the fit
>>>> isn't all that "sure" what S02 should be. That would mean we could
>>>> just shrug our shoulders and move on, except that it correlates with
>>>> a parameter you are interested in (in this case, site occupancy). So
>>>> in such a case, I think you can cautiously fall back on what might
>>>> be called a "Bayesian prior"; i.e., the belief that the S02 should
>>>> be "around" 0.9, and set the S02 to 0.9. (Or perhaps restrain S02 to
>>>> 0.9; then you're really doing something a bit more like the notion
>>>> of a Bayesian prior.)
>>>>
>>>> On the other hand, if the S02 is, say, 1.50 +/- 0.07, then the fit
>>>> really doesn't like the idea of an S02 in the typical range. An S02
>>>> that high, with that small an uncertainty, suggests to me that
>>>> something is wrong--although it could be as simple as a normalization
>>>> issue during data reduction. In that case, I'd be more skeptical of
>>>> just setting S02 to 0.90 and going with that result; the fit is
>>>> trying to tell you something, and it's important to track down what
>>>> that something is.
>>>>
>>>> Of course, once in a while, a fit will find a local minimum, while
>>>> there's another good local minimum around a more realistic value.
>>>> That would be reflected by a fit that gave similarly good
>>>> quantitative measures of fit quality (e.g. R-factors) when S02 is
>>>> fit (and yields 1.50 +/- 0.07) as when it's forced to 0.90. That's
>>>> somewhat unusual, however, particularly with a global parameter like
>>>> S02.
>>>>
>>>> A good way to defend setting S02 to 0.90 is to use the Hamilton test
>>>> to see if floating S02 yields a statistically significant
>>>> improvement over forcing it to 0.90. If not, using your prior best
>>>> estimate for S02 is reasonable.
>>>>
>>>> If you did that, though, I'd think that it would be good to mention
>>>> what happened in any eventual publication or presentation; it might
>>>> provide an important clue to someone who follows up with this or a
>>>> similar system. It would also be good to increase your reported
>>>> uncertainty for site occupancy (and indicate in the text what you've
>>>> done). I now see that your site occupancies are 0.53 +/- 0.04 for
>>>> the floated S02, and 0.72 +/- 0.06 for the S02 = 0.90. That's not so
>>>> bad, really. It means that you're pretty confident that the site
>>>> occupancy is 0.64 +/- 0.15, which isn't an absurdly large
>>>> uncertainty as these things go.
>>>>
>>>> To be concrete, if the Hamilton test does not show statistically
>>>> significant improvement by floating S02, then I might write
>>>> something like this in any eventual paper: "The site occupancy was
>>>> highly correlated with S02 in our fits, making it difficult to
>>>> determine the site occupancy with high precision. If S02 is
>>>> constrained to 0.90, a plausible value for element [X] [ref], then
>>>> the site occupancy is 0.72 +/- 0.06. If constrained to 1.0, the site
>>>> occupancy is [whatever it comes out to be]. To reflect the increased
>>>> uncertainty associated with the unknown value for S02, we are
>>>> adopting a value of 0.72 +/- [enough uncertainty to cover the
>>>> results found for S02 = 1.0]."
>>>>
>>>> Of course, if you do that, I'd also suggest tracking down as many
>>>> other possibilities for why your fit is showing high values of S02
>>>> as you can; e.g., double-check your normalization during data
>>>> reduction.
>>>>
>>>> If, on the other hand, the Hamilton test does show the floated S02
>>>> is yielding a statistically significant improvement, I think you
>>>> have a bigger issue. Looking at, e.g., whether you may have
>>>> constrained coordination numbers incorrectly becomes more critical.
>>>>
>>>> --Scott Calvin
>>>> Sarah Lawrence College
>>>>
>>>>
>>
>>
>> _______________________________________________
>> Ifeffit mailing list
>> Ifeffit at millenia.cars.aps.anl.gov
>> http://millenia.cars.aps.anl.gov/mailman/listinfo/ifeffit

------------------------------

Message: 2
Date: Sat, 21 Mar 2015 13:06:39 +0000
From: "Rana, Jatinkumar Kantilal"
        <jatinkumar.rana at helmholtz-berlin.de>
To: "ifeffit at millenia.cars.aps.anl.gov"
        <ifeffit at millenia.cars.aps.anl.gov>
Subject: [Ifeffit] Breaking down correlationships between parameters
Message-ID: <DA4C7D90F90BC249A03505C87F2EB8AB230EE6D1 at didag1>
Content-Type: text/plain; charset="iso-8859-1"

Dear All,

I have stumbled upon a question regarding correlations between various parameters in EXAFS fitting. As we know, the parameters S02*N and sigma2 are highly correlated (where N is the number of nearest neighbors).

I would like to determine the number of nearest neighbors for a series of samples subjected to some treatment. I can do this by simply setting S02 to a value for a given absorber (based on the literature or my own measurements of reference compounds) and letting N and sigma2 vary in a fit. The problem, however, is that the physical process which changes the number of nearest neighbors also introduces structural disorder in the samples. Thus, I always get overestimated values of N due to its correlation with sigma2.

I know of a method that can be used to break down the correlation between S02 and sigma2 by setting a series of S02 values at different k-weights and refining the corresponding sigma2, as discussed in several papers. However, the explicit assumption in this approach is that S02 is a property of the absorbing atom and is thus independent of changes occurring inside the sample. In my case, however, both sigma2 and N vary with changes inside the samples. Is there any way to break this correlation?

I look forward to your valuable suggestions and comments.

Best regards,
Jatin

________________________________

Helmholtz-Zentrum Berlin für Materialien und Energie GmbH

Member of the Hermann von Helmholtz Association of German Research Centres (Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V.)

Supervisory Board: Chair Prof. Dr. Dr. h.c. mult. Joachim Treusch, Deputy Chair Dr. Beatrix Vierkorn-Rudolph
Management: Prof. Dr. Anke Rita Kaysser-Pyzalla, Thomas Frederking

Registered office: Berlin, AG Charlottenburg, 89 HRB 5583

Postal address:
Hahn-Meitner-Platz 1
D-14109 Berlin

http://www.helmholtz-berlin.de

------------------------------

Message: 3
Date: Sat, 21 Mar 2015 09:13:30 -0500
From: Matt Newville <newville at cars.uchicago.edu>
To: XAFS Analysis using Ifeffit <ifeffit at millenia.cars.aps.anl.gov>
Subject: Re: [Ifeffit] Breaking down correlationships between
        parameters
Message-ID:
        <CA+7ESboCH5VkEpC7gAFGe2bzUc8zrEsrkCu+nAAY+BivcwHLdA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Jatin,

On Sat, Mar 21, 2015 at 8:06 AM, Rana, Jatinkumar Kantilal <
jatinkumar.rana at helmholtz-berlin.de> wrote:

>  Dear All,
>
>  I have stumbled upon a question regarding correlations between
> various parameters in EXAFS fitting. As we know, the parameters S02*N and
> sigma2 are highly correlated (where N is the number of nearest neighbors).
>
>  I would like to determine the number of nearest neighbors for a series
> of samples subjected to some treatment. I can do this by simply setting S02
> to a value for a given absorber (based on the literature or my own
> measurements of reference compounds) and letting N and sigma2 vary in
> a fit. The problem, however, is that the physical process which changes the
> number of nearest neighbors also introduces structural disorder in the
> samples. Thus, I always get overestimated values of N due to its
> correlation with sigma2.
>
>
By itself, the size of the correlation between any 2 variables should not
bias the best-fit results.  So, the high correlation of N and sigma2 should
not always overestimate N.   If you're consistently seeing N overestimated,
it is probably not because sigma2 is also overestimated, but more likely to
be due to some other reason.  Like, if N is consistently too high, perhaps
S02 is set artificially low.

>  I know of a method that can be used to break down the correlation
> between S02 and sigma2 by setting a series of S02 values at different
> k-weights and refining the corresponding sigma2, as discussed in several
> papers. However, the explicit assumption in this approach is that S02 is
> a property of the absorbing atom and is thus independent of changes
> occurring inside the sample. In my case, however, both sigma2 and N vary
> with changes inside the samples. Is there any way to break this
> correlation?
>
>
The idea of setting N*S02 and using different k-weights is sort of cheating.
By setting N*S02 you're purposely ignoring the correlation.  Using
multiple k-weights in a fit can lower the correlation between N*S02 and sigma2
somewhat, but it certainly does not break it.  I've not seen a case where
it makes a substantial reduction (say, below 0.5, and rarely below 0.75).
That is, if you just check using a k-weight of 1, 2, and 3 in Artemis,
you'll likely see the correlation drop from something like 0.95 to 0.90 in
the best cases -- hardly "breaking the correlation".  Extending the
k-range as much as possible (including to low-k) can also reduce the
correlation, but again, only by small amounts.  As with R and E0,
N*S02 and sigma2 will be highly correlated even if you measure EXAFS to
very high k and fit.  The correlation is basically endemic.
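
To give a feel for why the correlation is so stubborn, here is a toy Python calculation. It is a sketch under strong assumptions, not what Artemis computes: a single shell, a constant backscattering amplitude, a fit linearized in ln(N*S02) and sigma2, and a hypothetical function name corr_amp_sigma2. Under those assumptions the correlation depends only on the k-range, the k-weight, and the damping, so it can be evaluated without any data. The numbers should not be compared quantitatively with real fit output.

```python
import math

def corr_amp_sigma2(kmin, kmax, kweight, sigma2=0.005, dk=0.05):
    """Correlation between ln(N*S02) and sigma2 in a linearized one-shell toy model."""
    s0 = s2 = s4 = 0.0
    nsteps = int(round((kmax - kmin) / dk))
    for i in range(nsteps + 1):
        k = kmin + i * dk
        # Squared, k-weighted envelope of chi(k): k^(2w) * (1/k)^2 * exp(-4*sigma2*k^2),
        # assuming a constant backscattering amplitude.
        w = k ** (2 * kweight - 2) * math.exp(-4.0 * sigma2 * k * k)
        s0 += w
        s2 += w * k ** 2
        s4 += w * k ** 4
    # Off-diagonal over the root of the diagonals of the inverse normal matrix.
    return s2 / math.sqrt(s0 * s4)

results = {}
for krange in [(3.0, 12.0), (2.0, 16.0)]:
    for kw in (1, 2, 3):
        c = corr_amp_sigma2(krange[0], krange[1], kw)
        results[(krange, kw)] = c
        print(f"k = {krange[0]:.0f}-{krange[1]:.0f}, k-weight {kw}: correlation = {c:.2f}")
```

By the Cauchy-Schwarz inequality this ratio can reach 1 only if all the weight sits at a single k, and it moves away from 1 only as far as the spread of k^2 across the fitted range allows, which is why changing the k-weight or k-range shifts it but does not remove it.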

But, correlation does not imply a bias.  It can *allow* some bias to skew
the results substantially, and increases the uncertainties in these values,
but it is really not the ultimate cause of the results.
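
The distinction between correlation and bias can be checked with a small simulation. The sketch below makes several simplifying assumptions that are mine, not part of the original discussion: a log-amplitude model ln A(k) = ln(N*S02) - 2*sigma2*k^2, additive Gaussian noise in the log, made-up "true" values, and an ordinary least-squares fit. Over many noisy trials the fitted parameters are strongly correlated with each other, yet their averages sit on the true values.

```python
import math
import random

def ols(xs, ys):
    """Least-squares slope and intercept of y against x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson(us, vs):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)

random.seed(0)
LN_A_TRUE = math.log(6 * 0.9)  # hypothetical N = 6, S02 = 0.9
SIGMA2_TRUE = 0.005            # hypothetical sigma2 (Angstrom^2)
NOISE = 0.05                   # noise level in ln(amplitude), arbitrary
x = [(3.0 + 0.5 * i) ** 2 for i in range(19)]  # k^2 for k = 3.0 ... 12.0

ln_a_fits, sigma2_fits = [], []
for _ in range(2000):
    y = [LN_A_TRUE - 2 * SIGMA2_TRUE * xi + random.gauss(0.0, NOISE) for xi in x]
    slope, intercept = ols(x, y)
    ln_a_fits.append(intercept)
    sigma2_fits.append(-slope / 2)

mean_ln_a = sum(ln_a_fits) / len(ln_a_fits)
mean_sigma2 = sum(sigma2_fits) / len(sigma2_fits)
corr = pearson(ln_a_fits, sigma2_fits)
print(f"mean ln(N*S02) = {mean_ln_a:.4f} (true {LN_A_TRUE:.4f})")
print(f"mean sigma2    = {mean_sigma2:.5f} (true {SIGMA2_TRUE:.5f})")
print(f"correlation between fitted values = {corr:.2f}")
```

In this toy setting a systematic shift in N would have to come from something else, such as an amplitude factor held at the wrong value.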

--Matt

------------------------------

End of Ifeffit Digest, Vol 145, Issue 38
****************************************
