Re: [Ifeffit] Breaking down correlationships between parameters
Hi Matt,
Thanks a lot for your prompt reply. The method I am referring to is not the multiple-k-weight fit with N*S02 constrained. My apologies for not being clear enough; let me try again. I am actually referring to an approach that takes advantage of the different k-dependence of various parameters to break down the correlations between them. For example, S02 and sigma2: S02 is k-independent, while sigma2 enters with a k^2 dependence.
In this case, to break down the correlation between S02 and sigma2, one can assume a series of S02 values and perform fits using a single k-weight each time (say k-weight 1, 2, and 3), recording the corresponding sigma2 values. For k-weight = 1, a series of preset S02 values will yield a series of refined sigma2 values, which can be plotted as a straight line in a sigma2 vs. S02 plot. Similar straight lines can be obtained for fits using k-weight = 2 and 3. These three lines may intersect at or near some point, which will determine the "true" values of the parameters independent of k-weight. One can then constrain S02 to the value at the intersection of the three lines and vary sigma2 in a fit. In this particular case the advantage is that S02 does not depend on changes inside the sample, and we have a very good estimate of its range (say 0.7 - 1.0).
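To make the procedure concrete, here is a small numerical sketch. This is a toy single-path amplitude model written for illustration, not Artemis/IFEFFIT code, and every value in it (k-range, S02, sigma2) is invented:

```python
# Toy sketch of the line-crossing idea: for one path, the EXAFS amplitude
# goes roughly like S02 * exp(-2*sigma2*k^2). For each preset S02 and each
# k-weight we refine sigma2, fit a straight line to sigma2 vs. S02, and
# look at where the lines for different k-weights cross.
import numpy as np

k = np.linspace(3.0, 12.0, 200)          # k grid (1/Angstrom)
S02_TRUE, SIG2_TRUE = 0.85, 0.006        # "unknown" truth used to fake data
data = S02_TRUE * np.exp(-2.0 * SIG2_TRUE * k**2)

def best_sigma2(s02, kweight):
    """Grid-search the sigma2 minimizing the k-weighted residual for fixed S02."""
    sig2 = np.linspace(0.0, 0.02, 2001)
    model = s02 * np.exp(-2.0 * sig2[:, None] * k[None, :]**2)
    resid = (((data - model) * k**kweight) ** 2).sum(axis=1)
    return sig2[np.argmin(resid)]

s02_grid = np.linspace(0.7, 1.0, 7)
lines = {}                               # k-weight -> (slope, intercept)
for kw in (1, 2, 3):
    sig2_vals = [best_sigma2(s, kw) for s in s02_grid]
    lines[kw] = np.polyfit(s02_grid, sig2_vals, 1)

# Intersection of the k-weight 1 and k-weight 3 lines:
(m1, b1), (m3, b3) = lines[1], lines[3]
s02_cross = (b3 - b1) / (m1 - m3)
sig2_cross = m1 * s02_cross + b1
print(s02_cross, sig2_cross)             # lands near the true 0.85, 0.006
```

With noiseless data the lines cross essentially at the true values, which is the best case for the method; with real data each refined sigma2 carries an uncertainty and the "lines" have finite width.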
Now suppose that instead of S02 (which I now set to a reasonable value), I am interested in determining N, but it is highly correlated with sigma2. Each time the disorder in the sample increases, sigma2 increases and, due to the high correlation, N is also overestimated. On the other hand, when the disorder in the sample decreases, sigma2 decreases and I can get a "true" estimate of N. Can I still apply the above-mentioned approach to break the correlation between N and sigma2 and get a "true" estimate of N, even if the disorder in my samples is high? Or is it simply not possible, because both N and sigma2 vary with changes inside the sample?
Best regards,
Jatin
________________________________________
From: ifeffit-bounces@millenia.cars.aps.anl.gov [ifeffit-bounces@millenia.cars.aps.anl.gov] on behalf of ifeffit-request@millenia.cars.aps.anl.gov [ifeffit-request@millenia.cars.aps.anl.gov]
Sent: Saturday, March 21, 2015 3:14 PM
To: ifeffit@millenia.cars.aps.anl.gov
Subject: Ifeffit Digest, Vol 145, Issue 38
Today's Topics:
1. Re: amplitude parameter S02 larger than 1 (Scott Calvin)
2. Breaking down correlationships between parameters
(Rana, Jatinkumar Kantilal)
3. Re: Breaking down correlationships between parameters
(Matt Newville)
----------------------------------------------------------------------
Message: 1
Date: Fri, 20 Mar 2015 18:53:06 -0400
From: Scott Calvin
On Mar 20, 2015, at 4:30 PM, huyanyun@physics.utoronto.ca wrote:
Hi Scott,
In all situations, 31.2 independent data points and 24 variables were used. In the case of setting S02 to a value, 23 variables were used.
Let me know if there is any other info needed.
Best, Yanyun
Quoting Scott Calvin:
Hi Yanyun,
To actually do a Hamilton test, the one other thing I need to know is the number of degrees of freedom in the fit. If you provide that, I'll walk you through how to do a Hamilton test--it's not that bad, with the aid of an online calculator, and I think it might be instructive for some of the other people reading this list who are trying to learn EXAFS.
--Scott Calvin Sarah Lawrence College
On Mar 20, 2015, at 3:46 PM, huyanyun@physics.utoronto.ca wrote:
Hi Scott,
Thank you so much for giving me your thought again. It is very helpful to know how you and other XAFS experts deal with unusual situations.
The floating S02 fits to 1.45+/-0.14; this just means the fit doesn't like the idea of an S02 in a typical range. Instead of setting S02 to 0.9, I have to figure out why this happens and what it might indicate.
I guess a Hamilton test is done by adjusting one parameter (i.e., S02) while keeping the other conditions and the model the same. Is that right? I record the test as follows:
1) Floating S02: S02 fits to 1.45+/-0.14, R=0.0055, reduced chi^2=17.86, percentage=0.53+/-0.04
2) Set S02=0.7: R=0.044, reduced chi^2=120.6, percentage=0.81+/-0.2
3) Set S02=0.8: R=0.030, reduced chi^2=86.10, percentage=0.77+/-0.07
4) Set S02=0.9: R=0.021, reduced chi^2=60.16, percentage=0.72+/-0.06
5) Set S02=1.0: R=0.017, reduced chi^2=49.5, percentage=0.67+/-0.05
6) Set S02=1.1: R=0.012, reduced chi^2=35.1, percentage=0.62+/-0.03
7) Set S02=1.2: R=0.009, reduced chi^2=24.9, percentage=0.59+/-0.02
8) Set S02=1.3: R=0.007, reduced chi^2=18.9, percentage=0.57+/-0.02
9) Set S02=1.4: R=0.0057, reduced chi^2=16.1, percentage=0.55+/-0.02
10) Floating S02 (1.45+/-0.14) falls here, as in 1)
11) Set S02=1.6: R=0.006, reduced chi^2=17.8, percentage=0.53+/-0.02
12) Set S02=2.0: R=0.044, reduced chi^2=120.7, percentage=0.37+/-0.06
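As an aside for readers following along: once the degrees of freedom are known, this kind of comparison can be run without an online calculator. Below is a sketch in Python using the numbers quoted in this thread (31.2 independent points; 24 variables with S02 floated, 23 with S02 fixed at 0.9). Framing the Hamilton comparison as an extra-parameter F-test on chi-square is a common approximation, not necessarily the exact recipe Scott has in mind, so treat it as illustrative:

```python
# Sketch: compare the floated-S02 fit to the S02=0.9 fit with an F-test
# (in the spirit of the Hamilton test). Fit statistics are taken from the
# list above; the test itself is a standard extra-parameter F-test.
from scipy.stats import f as f_dist

n_idp = 31.2                          # independent points in the fit
nu_free = n_idp - 24                  # dof with S02 floated
nu_fixed = n_idp - 23                 # dof with S02 fixed
chi2_free = 17.86 * nu_free           # reduced chi^2 -> chi^2, S02 floated
chi2_fixed = 60.16 * nu_fixed         # S02 fixed at 0.9

b = 1                                 # one extra floated parameter (S02)
F = ((chi2_fixed - chi2_free) / b) / (chi2_free / nu_free)
p = f_dist.sf(F, b, nu_free)          # chance the improvement is accidental
print(F, p)
```

A small p (say, below 0.05) would mean floating S02 improves the fit significantly, which per Scott's later comments is the case where the high S02 needs to be tracked down rather than overridden.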
Therefore, I would say S02 falling in the range 1.2-1.6 gives a statistically improved fit, but S02=0.9 is not terrible either. I agree with you that I could always be confident saying the percentage is 0.64+/-0.15, but I do want to shrink the uncertainty and think about other possibilities that could cause a large S02.
I did double-check the data-reduction and normalization process. I don't think I can improve anything in this step. By the way, I have a series of similar samples, and their fits all show a floating S02 larger than one based on the same two-site model.
Best, Yanyun
Quoting Scott Calvin:
Hi Yanyun,
Lots of comments coming in now, so I'm editing this as I write it!
One possibility for why you're getting a high best-fit S02 is that the fit doesn't care all that much about the value of S02; i.e., there is a broad range of S02's compatible with describing the fit as "good." That should be reflected in the uncertainty that Artemis reports. If S02 is 1.50 +/- 0.48, for example, the fit isn't all that "sure" what S02 should be. That would mean we could just shrug our shoulders and move on, except that it correlates with a parameter you are interested in (in this case, site occupancy). So in such a case, I think you can cautiously fall back on what might be called a "Bayesian prior"; i.e., the belief that S02 should be "around" 0.9, and set S02 to 0.9. (Or perhaps restrain S02 to 0.9; then you're really doing something a bit more like the notion of a Bayesian prior.)
On the other hand, if the S02 is, say, 1.50 +/- 0.07, then the fit really doesn't like the idea of an S02 in the typical range. An S02 that high, with that small an uncertainty, suggests to me that something is wrong--although it could be as simple as a normalization issue during data reduction. In that case, I'd be more skeptical of just setting S02 to 0.90 and going with that result; the fit is trying to tell you something, and it's important to track down what that something is.
Of course, once in a while a fit will find a local minimum while there's another good local minimum around a more realistic value. That would be reflected by a fit that gave similarly good quantitative measures of fit quality (e.g. R-factors) when S02 is floated (yielding 1.50 +/- 0.07) as when it's forced to 0.90. That's somewhat unusual, however, particularly with a global parameter like S02.
A good way to defend setting S02 to 0.90 is to use the Hamilton test to see if floating S02 yields a statistically significant improvement over forcing it to 0.90. If not, using your prior best estimate for S02 is reasonable.
If you did that, though, I'd think it would be good to mention what happened in any eventual publication or presentation; it might provide an important clue to someone who follows up on this or a similar system. It would also be good to increase your reported uncertainty for site occupancy (and indicate in the text what you've done). I now see that your site occupancies are 0.53 +/- 0.04 for the floated S02, and 0.72 +/- 0.06 for S02 = 0.90. That's not so bad, really. It means that you're pretty confident that the site occupancy is 0.64 +/- 0.15, which isn't an absurdly large uncertainty as these things go.
To be concrete, if the Hamilton test does not show a statistically significant improvement from floating S02, then I might write something like this in any eventual paper: "The site occupancy was highly correlated with S02 in our fits, making it difficult to determine the site occupancy with high precision. If S02 is constrained to 0.90, a plausible value for element [X] [ref], then the site occupancy is 0.53 +/- 0.04. If constrained to 1.0, the site occupancy is [whatever it comes out to be]. To reflect the increased uncertainty associated with the unknown value for S02, we are adopting a value of 0.53 +/- [enough uncertainty to cover the results found for S02 = 1.0]."
Of course, if you do that, I'd also suggest tracking down as many other possibilities for why your fit is showing high values of S02 as you can; e.g., double-check your normalization during data reduction.
If, on the other hand, the Hamilton test does show the floated S02 is yielding a statistically significant improvement, I think you have a bigger issue. Looking at, e.g., whether you may have constrained coordination numbers incorrectly becomes more critical.
--Scott Calvin Sarah Lawrence College
_______________________________________________ Ifeffit mailing list Ifeffit@millenia.cars.aps.anl.gov http://millenia.cars.aps.anl.gov/mailman/listinfo/ifeffit
------------------------------
Message: 2
Date: Sat, 21 Mar 2015 13:06:39 +0000
From: "Rana, Jatinkumar Kantilal"
Hi Jatin,

On Sat, Mar 21, 2015 at 10:41 AM, Rana, Jatinkumar Kantilal <jatinkumar.rana@helmholtz-berlin.de> wrote:
Hi Matt,
Thanks a lot for your prompt reply. The method I am referring to is not the multiple-k-weight fit with N*S02 constrained. My apologies for not being clear enough; let me try again. I am actually referring to an approach that takes advantage of the different k-dependence of various parameters to break down the correlations between them. For example, S02 and sigma2: S02 is k-independent, while sigma2 enters with a k^2 dependence.
Yes, I am familiar with this approach, and I understand that this is what you are using. What I am saying is that it does not work nearly as well as (sometimes) claimed, and is sort of cheating: it ignores the measures of statistical significance.

In this case, to break down the correlation between S02 and sigma2,

The correlation between N*S02 and sigma2 is inherent to the finite k-range of the EXAFS signal. It cannot be "broken", though it might be reduced.
one can assume a series of S02 values and perform fits using a single k-weight each time (say k-weight 1,2 and 3) and record corresponding sigma2 values.
Let us say for k-weight = 1, a series of preset S02 values will result in a series of corresponding sigma2 values refined in fits, which can be plotted as a straight line in a sigma2 vs. S02 plot.
OK, one can fit sigma2 with a series of preset values of N*S02. That's fine. But it does NOT lead to an infinitely thin line of sigma2 vs. N*S02: each sigma2 value on that line has a width, corresponding to its uncertainty. In fact, the line you produce nicely demonstrates and measures the correlation of N*S02 and sigma2, as the slope of this line.
Similar straight lines can be obtained for fits using k-weight = 2 and then 3.
Now, these three lines may intersect at or near some point, which will determine the "true" value of parameters independent of k-weight.
The different lines (each with finite thickness) will give a *range of values* for N*S02 and sigma2, not a single value.

The biggest problem with this approach is that it ignores the relative goodness-of-fits (let's just assume that is 'chi-square' for the purpose of this discussion) for the fits along these lines. Some fits are better than others, and this approach completely ignores that fact; equally importantly, it ignores the fact that there is a range of values for chi-square that are consistent with "good". If you include these values, your linear plot will become contours of chi-square as a function of N*S02 and sigma2. And, yes, by using different k-weights and k-ranges and so on you can get overlapping contour plots, which may reduce the correlation a small amount when looked at as an ensemble. You can then find a best set of values for N*S02 and sigma2, but *each* of these will have an uncertainty. So, you can use this approach to find a good value for N*S02, but it is not breaking the correlation.

You can do this by hand. Or you can just do a fit with datasets with different k-weights and k-ranges. When you do this as a fit, you will see that the correlation is still fairly large.

Also, just to be clear, this is absolutely not a "true" value. It is a measured value. Not at all the same thing.

One can then constrain S02 to a value obtained from the point of intersection of three lines and vary sigma2 in a fit.
Well, one can certainly set N*S02 to some value and fit sigma2. As I said earlier, this ignores the correlation of N*S02 and sigma2, but does not remove that correlation.
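The contour picture is easy to reproduce numerically. Below is a toy single-path amplitude model invented for illustration (amp stands in for N*S02; nothing here is real data or IFEFFIT code); the parameter correlation is read off the inverse of the chi-square curvature (Hessian) at the minimum:

```python
# Toy numerical version of the contour argument: map chi-square over
# (amp, sigma2) and read the parameter correlation off the inverse of
# the curvature (Hessian) at the minimum. All numbers are invented.
import numpy as np

k = np.linspace(3.0, 12.0, 200)
AMP_TRUE, SIG2_TRUE = 0.85, 0.006
data = AMP_TRUE * np.exp(-2.0 * SIG2_TRUE * k**2)

def chi2(amp, sig2, kweight=2):
    model = amp * np.exp(-2.0 * sig2 * k**2)
    return np.sum(((data - model) * k**kweight) ** 2)

# Finite-difference Hessian of chi2 at the (known) minimum:
ha, hs = 1e-4, 1e-6
c0 = chi2(AMP_TRUE, SIG2_TRUE)
H = np.empty((2, 2))
H[0, 0] = (chi2(AMP_TRUE + ha, SIG2_TRUE) - 2 * c0
           + chi2(AMP_TRUE - ha, SIG2_TRUE)) / ha**2
H[1, 1] = (chi2(AMP_TRUE, SIG2_TRUE + hs) - 2 * c0
           + chi2(AMP_TRUE, SIG2_TRUE - hs)) / hs**2
H[0, 1] = H[1, 0] = (chi2(AMP_TRUE + ha, SIG2_TRUE + hs)
                     - chi2(AMP_TRUE + ha, SIG2_TRUE - hs)
                     - chi2(AMP_TRUE - ha, SIG2_TRUE + hs)
                     + chi2(AMP_TRUE - ha, SIG2_TRUE - hs)) / (4 * ha * hs)

cov = np.linalg.inv(H)               # covariance matrix, up to overall scale
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(corr)                          # strongly positive: amp up, sigma2 up
```

The chi-square contours in the (amp, sigma2) plane are elongated ellipses whose elongation reflects this correlation; changing the k-weight tilts them slightly, which is why combining k-weights reduces (but never removes) the correlation.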
In this particular case, however, the advantage is that S02 does not depend on changes inside the sample, and we have a very good estimate of its range (say 0.7 - 1.0).

Now suppose that instead of S02 (which I now set to a reasonable value), I am interested in determining N, but it is highly correlated with sigma2. Each time the disorder in the sample increases, sigma2 increases and, due to the high correlation, N is also overestimated. On the other hand, when the disorder in the sample decreases, sigma2 decreases and I can get a "true" estimate of N. Can I still apply the above-mentioned approach to break the correlation between N and sigma2 and get a "true" estimate of N, even if the disorder in my samples is high? Or is it simply not possible, because both N and sigma2 vary with changes inside the sample?
N and S02 are always 100% correlated (mathematically, not merely by the finite k range). So, to the extent that the approach works at all, you can use it for "N" or "S02". Really, the approach is comparing N*S02 and sigma2; in one case you asserted a value of "N" and projected all changes onto "S02" -- you could equally assert "S02" and project all changes onto "N".

To be clear, this is not going to find the "true" value of anything, because no analysis is ever going to find the "true" value -- it is going to find a measured value.

Finally, the correlation of N*S02 and sigma2 does not imply a bias in the values for N*S02. N*S02 is NOT overestimated because it is highly correlated with sigma2.

Hope that helps,
--Matt
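The "100% correlated" statement can be seen directly: for a single path the model only ever sees the product N*S02, so different (N, S02) pairs with the same product produce the same signal. A toy amplitude model with invented numbers:

```python
# For a single path, only the product N*S02 enters the amplitude, so a
# fit cannot distinguish (N, S02) pairs with the same product.
import numpy as np

k = np.linspace(3.0, 12.0, 200)

def amplitude(n, s02, sig2=0.006):
    # toy single-path amplitude; only n * s02 matters
    return n * s02 * np.exp(-2.0 * sig2 * k**2)

a = amplitude(6.0, 0.70)   # N = 6,   S02 = 0.70  -> product 4.2
b = amplitude(4.2, 1.00)   # N = 4.2, S02 = 1.00  -> product 4.2
print(np.allclose(a, b))   # True: identical signals, 100% correlation
```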
On Sun, Mar 22, 2015 at 12:44 PM, Scott Calvin wrote:
One side-comment from me:
On Mar 22, 2015, at 12:52 PM, Matt Newville wrote:
N and S02 are always 100% correlated (mathematically, not merely by the finite k range).
Matt is saying that N and S02 are always 100% correlated *for a single path*. But in some situations you might know N for one path but not others. For example, you might know that the absorbing atom is octahedrally coordinated to oxygen but not be as certain as to next-nearest neighbors, or that there are copper atoms on the corners of a simple cubic lattice with a mixture of atoms at other positions. In cases like that, both N for all paths but one and S02 can be fit without 100% correlation.
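This point can be illustrated with a toy two-shell signal (invented path functions and numbers, not FEFF output). For a single shell, the Jacobian columns for S02 and N are perfectly parallel (correlation magnitude 1); fixing N for one shell makes them separable:

```python
# Toy two-shell model: chi(k) = S02 * (N1*f1 + N2*f2), with N1 known.
# The magnitude of the correlation between two fit parameters equals the
# cosine between the corresponding Jacobian columns; with the N1*f1 term
# present, the S02 and N2 columns are no longer parallel.
import numpy as np

k = np.linspace(3.0, 12.0, 200)
# invented path functions for shells near R = 2.0 and 3.0 Angstrom:
f1 = np.sin(2 * 2.0 * k) * np.exp(-2 * 0.005 * k**2) / k
f2 = np.sin(2 * 3.0 * k) * np.exp(-2 * 0.008 * k**2) / k

N1, S02, N2 = 6.0, 0.9, 4.0            # N1 treated as known
J_s02 = N1 * f1 + N2 * f2              # d(model)/d(S02)
J_n2 = S02 * f2                        # d(model)/d(N2)

cos = np.dot(J_s02, J_n2) / (np.linalg.norm(J_s02) * np.linalg.norm(J_n2))
print(abs(cos))                        # well below 1: S02 and N2 separable
```

Dropping the N1*f1 term makes both columns proportional to f2 and the cosine exactly 1, which is the single-path case Matt described.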
Yes, I completely agree with Scott -- this is a good point that I neglected. In addition to looking at multiple shells, one might also consider using temperature or pressure dependence to separate N*S02 and sigma2. Those aren't without assumptions, and still don't remove the inherent correlation, but they are useful approaches.

The degeneracy of multiple-scattering paths can often be constrained in terms of the coordination numbers for direct-scattering paths, which can further reduce (not “break”) the correlation.
In terms of the main question, I agree with Matt: I don’t think there’s much point in using the line-crossing technique nowadays; fitting using multiple k-weights simultaneously accomplishes the same thing but is a bit easier to interpret statistically.
—Scott Calvin Sarah Lawrence College
--Matt
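A last numerical footnote on the simultaneous multi-k-weight fit discussed above: the same toy Jacobian arithmetic shows Matt's earlier point that the correlation remains fairly large even when k-weights are combined. The model and all numbers below are invented for illustration:

```python
# Toy check that the N*S02 / sigma2 correlation stays large even for a
# simultaneous multi-k-weight fit of one path (amp stands in for N*S02).
# The correlation magnitude equals the cosine between Jacobian columns;
# each k-weighted copy of the data is normalized so no single k-weight
# dominates the stack.
import numpy as np

k = np.linspace(3.0, 12.0, 200)
AMP, SIG2 = 0.85, 0.006
env = np.exp(-2.0 * SIG2 * k**2)

def corr_for(weights):
    ja, js = [], []
    for w in weights:
        scale = 1.0 / np.linalg.norm(AMP * env * k**w)  # per-k-weight scaling
        ja.append(env * k**w * scale)                   # d(model)/d(amp)
        js.append(-2 * k**2 * AMP * env * k**w * scale) # d(model)/d(sigma2)
    ja, js = np.concatenate(ja), np.concatenate(js)
    return np.dot(ja, js) / (np.linalg.norm(ja) * np.linalg.norm(js))

for w in ([1], [2], [3], [1, 2, 3]):
    print(w, abs(corr_for(w)))   # large in every case, combined included
```

Unlike the line-crossing construction, though, the combined fit reports this correlation and honest uncertainties directly, which is the advantage Scott points to.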
participants (3)
- Matt Newville
- Rana, Jatinkumar Kantilal
- Scott Calvin