Reduced chi-square values versus F-tests for second-shell fits
Dear All,

Would it be recommended to compare reduced chi-square (RCS) values from a fit of the second shell only? When fitting the second shell only, the differences between the RCS values of different models can be greater than when both the first and second shells are included in the fit. For example, the RCS values of model 1 and model 2 are 45 and 18 for fits of the first and second shells together, and 60 and 17 for fits of just the second shell. This indicates to me that fitting the second shell separately from the first gave a slightly better description of the misfit in that R-range. Regardless of the fitting range, the RCS values say that model 2 improves the fit significantly.

However, if I use an F-test to compare the models while fitting the second shell only, I find that the difference between the models is actually less significant than when both the first and second shells are included in the fit. For example, the F-test results for model 1 versus model 2 are P = 0.047 for fits of the first and second shells together and P = 0.113 for fits of just the second shell. This is perhaps caused by the small R-range, and hence the decreased Nind, when fitting the second shell only. It also suggests that, according to the F-test, one fitting model is not as much of an improvement as I had thought based purely on the decrease in the RCS value. I take a significant decrease in RCS to be roughly >2x, and significance for the F-test to be P < 0.05.

Should RCS values and F-tests for comparing fitting models only be used when fitting two shells together, or can they also be used when fitting just the second shell?

Kind regards,
Matt Siebecker
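For concreteness, here is a minimal sketch of how a nested-model F-test turns chi-square statistics into a P value. It assumes the nested-model form of the test (as in the references cited later in this thread); the Nind and floated-parameter counts are invented placeholders, not the values behind the numbers quoted above.

    # Nested-model F-test from reduced chi-square (RCS) values.
    # Nind, parameter counts, and the resulting P value are
    # illustrative placeholders only.
    from scipy.stats import f as f_dist

    def ftest_p(rcs_simple, nu_simple, rcs_complex, nu_complex):
        """P value that the extra parameters of the more complex
        (nested) model improve the fit by more than chance would.
        nu = Nind - n_varied_parameters for each model."""
        chi2_simple = rcs_simple * nu_simple      # un-reduce the chi-squares
        chi2_complex = rcs_complex * nu_complex
        dfn = nu_simple - nu_complex              # number of extra parameters
        F = ((chi2_simple - chi2_complex) / dfn) / (chi2_complex / nu_complex)
        return f_dist.sf(F, dfn, nu_complex)      # survival function = P value

    # e.g., supposing Nind = 12, model 1 floats 6 parameters, model 2 floats 7:
    print(ftest_p(rcs_simple=45.0, nu_simple=6, rcs_complex=18.0, nu_complex=5))
    # ~0.025 with these made-up inputs

Note that P depends on the degrees of freedom as well as on the chi-square values themselves: shrinking the R-range shrinks Nind, which weakens the statistical evidence even when the RCS ratio between the models grows.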
Statistical parameters can be used with any kind of fit. I would consider that a sort of tautology: given that a fit is performed within a particular statistical framework, the tools standard to that framework are the tools to use. Our software does not police whether your fitting model is defensible, only whether it is expressed correctly in a numerical sense. Thus RCS and other statistical parameters are always well defined.

That said, I am very skeptical of any analysis of only the second shell. I explain my reasoning in some detail here:

http://www.mail-archive.com/ifeffit@millenia.cars.aps.anl.gov/msg03563.html

If I understand you correctly, you are doing something similar to what that other person was doing. By not including the first shell in the fit, you are artificially excluding correlations between the two spectral regions and between the parameters you use to model them. I have no doubt that you are better able to model the Fourier components of the second shell, in the limited sense of evaluating misfit, when you exclude the first shell from the fit. You made the rather arbitrary decision to exclude a major source of uncertainty from your fitting model. That does not make your fit more defensible.

Of course, there is always the possibility that I didn't quite understand the question....

B
--
Bruce Ravel ------------------------------------ bravel@bnl.gov
National Institute of Standards and Technology
Synchrotron Methods Group at NSLS --- Beamlines U7A, X24A, X23A2
Building 535A, Upton NY, 11973
Homepage: http://xafs.org/BruceRavel
Software: https://github.com/bruceravel
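A toy least-squares illustration of the precision-versus-accuracy point above; it has nothing to do with EXAFS specifically and assumes nothing about the actual fits under discussion:

    # Fixing one parameter of a correlated pair shrinks the reported
    # uncertainty on the other (better precision) without bringing the
    # answer any closer to the truth (no better accuracy).
    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 2.0, 50)
    y = 3.0 * np.exp(-1.5 * x) + 0.05 * rng.standard_normal(x.size)

    # both parameters float: the a-b correlation inflates sigma(a)
    (a1, b1), cov = curve_fit(lambda x, a, b: a * np.exp(-b * x), x, y, p0=(1.0, 1.0))
    print("a with b floated:", a1, "+/-", np.sqrt(cov[0, 0]))

    # b frozen at its best-fit value: sigma(a) drops, but the uncertainty
    # contributed by b has merely been hidden, not removed
    (a2,), cov2 = curve_fit(lambda x, a: a * np.exp(-b1 * x), x, y, p0=(1.0,))
    print("a with b fixed:  ", a2, "+/-", np.sqrt(cov2[0, 0]))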
Hello Bruce,

Thanks for your response. By "second-shell fits" I mean that the best values for the first shell were fitted and then fixed, and the fitting R-range was then moved to the second shell, similar to the approach Matt Newville describes here:

http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2009-April/008779.html

He describes this as an acceptable approach, although others in the thread disagree.

Essentially, my question is: why do the F-test and the RCS results not agree with each other? The F-test indicates that model 2 may not be a statistical improvement over model 1, while the RCS values show that model 2 is definitely an improvement over model 1. If I consider a reduction in the RCS value of >2x as significant, then I would pick model 2. However, can I apply this logic when fitting the second shell, with the best parameters for the first shell fixed and the R-range set over the second shell?

Thank you again,
Matt S
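In generic least-squares terms, the two-stage procedure looks roughly like the sketch below. The real fits here would be done in Artemis or Larch, where the equivalent move is switching first-shell guess parameters to set; the lmfit parameter names (n1, delr1, ss1) and the residual functions are hypothetical.

    # Stage 1 fits the first shell; stage 2 freezes those best-fit values
    # and floats only the second-shell parameters over the second-shell
    # R-range. All names are invented for illustration.
    import lmfit

    def freeze_first_shell(params, stage1_result, first_shell=("n1", "delr1", "ss1")):
        """Copy stage-1 best-fit values into params and stop them varying."""
        for name in first_shell:
            best = stage1_result.params[name].value
            params[name].set(value=best, vary=False)
        return params

    # usage, given residual functions defined over each R-range:
    # stage1 = lmfit.minimize(first_shell_residual, params)   # first-shell range
    # params = freeze_first_shell(params, stage1)
    # stage2 = lmfit.minimize(second_shell_residual, params)  # second-shell range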
In another message in that thread, Matt calls that "a reasonable way to start" and makes the valid point that small details can "often get lost in the fit to the more dominant features of the spectra." What I am saying is that the reason small details "often get lost" is that there are correlations in any data. If you fix one part of the fit and float another part, you have made a very specific decision to ignore an entire set of correlations. That may make the results on those small details much more *precise*, but it most certainly does not guarantee that the results will be more *accurate*.

So yes, you can do what you want to do. I said the same in my last email. But you have to be prepared to defend against the completely valid criticism that doing so arbitrarily removes correlations from the fitting model. That can have an impact on the accuracy of your result. Neither the RCS nor the F-test addresses that problem.

Of course, if you dig through all the stuff that I have written over the years, I frequently report (in the case of publications) or recommend (in the case of teaching material) doing things that fall into the category of improving precision while risking accuracy. Indeed, any use of constraints in Artemis could engender this criticism -- and I gas on and on and on about the virtues of constraints whenever I do EXAFS training courses. But I always try to emphasize the importance of honestly assessing the consequences of these actions, both with myself and with my readership. As an example, in http://dx.doi.org/10.1016/j.radphyschem.2009.05.024 I spend a rather long paragraph clearly stating the most egregious approximations I made in the analysis presented in that paper. The remainder of the paper justifies all that using both the XANES and other published work on the system.

I guess that none of that answered your specific question about the nominal disagreement between the RCS and the F-test. I might be exposing a weakness in my own understanding of the F-test right now, but I can suggest something to think about. The F-test result may be saying something about the normality of the parameters that you actually used in the fit. Try varying some of the procedural parameters of the fit: try limiting or expanding the ranges in k or R by a bit; try different k-weightings; try adding a bit of artificial noise to your chi(k) data -- anything that slightly changes the conditions of the fit without actually changing the details of the model or the information content of the data. Doing so might help clarify what is going on with your statistical tests.

HTH,
B
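A minimal sketch of that suggestion, assuming a hypothetical run_fit callable that performs the actual fit (a Larch feffit run, an Artemis batch script, or similar) and returns the reduced chi-square; the grid of k-ranges, k-weights, and noise levels is illustrative:

    # Re-run the same model under slightly perturbed fit conditions and
    # watch how the fit statistics move. eps_k is an estimate of the
    # measurement uncertainty in chi(k).
    import itertools
    import numpy as np

    def perturbation_scan(k, chik, run_fit, eps_k=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        results = []
        for kmin, kmax, kw in itertools.product((2.0, 2.5), (10.0, 10.5), (1, 2, 3)):
            for noise in (0.0, 0.5, 1.0):
                # artificial Gaussian noise, scaled to the data uncertainty
                noisy = chik + noise * eps_k * rng.standard_normal(chik.size)
                rcs = run_fit(k, noisy, kmin=kmin, kmax=kmax, kweight=kw)
                results.append({"kmin": kmin, "kmax": kmax, "kweight": kw,
                                "noise": noise, "rcs": rcs})
        return results

If the ranking of the two models survives these perturbations, that is some evidence the RCS comparison is not an artifact of one particular choice of fit window.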
Thanks again, Bruce. Yes, that helps. What I want to do is defend against criticism of my fits by honestly assessing their quality, i.e., not overstating results based on fits that compromise accuracy for precision. As you describe in the 2009 RP&C paper on mercury binding to DNA, when dealing with a limited number of independent data points and several paths that contribute significantly to the EXAFS signal, logically creative methods to constrain fitting parameters are necessary.

I have found the method Matt Newville describes in that thread useful during the data analysis process. However, I have found it very important to re-open the fit to both shells after fitting the second shell. I use this method of fitting the second shell while fixing the first shell to filter out models with similar first- and second-shell structure but different second-shell composition, e.g., a mixed metal versus a single metal in the second shell. For both fitting models, the first shell is very similar (e.g., circa 6 oxygens). If both models have similar first-shell values and correlations, perhaps it is reasonable to ignore them during part of the fitting routine, in order to focus on the small details of the second shell that can get lost in the more dominant features of the first shell. Essentially, I compare the RCS values of the second-shell fits of two models that have the same first shell to choose which model is better.

I will test varying some of the procedural parameters of the fit (k-weighting, k- and R-ranges) as you suggested. I've seen some discussion about this on the mailing list, and it is strongly recommended there.

For those interested in F-tests, I've found the following resources helpful:

Michalowicz, A.; Provost, K.; Laruelle, S.; et al. F-test in EXAFS fitting of structural models. Journal of Synchrotron Radiation 6 (1999) 233-235. DOI: 10.1107/S0909049599000734

Klementev, K. V. Statistical evaluations in fitting problems. Journal of Synchrotron Radiation 8 (2001) 270-272. DOI: 10.1107/S0909049500015351

Downward, L.; Booth, C. H.; Lukens, W. W.; Bridges, F. A Variation of the F-Test for Determining Statistical Relevance of Particular Parameters in EXAFS Fits. Lawrence Berkeley National Laboratory (2006). http://www.escholarship.org/uc/item/5p60c864

And also here on the mailing list:

http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2009-August/008983.html

When reproducing the numbers in the Michalowicz paper, I found values slightly different from his. I use some adequate free web-based F-test calculators, or Excel, but if anyone has a preferred web-based F-test calculator, that information would also be appreciated.

Kind regards,
Matt S.
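As an alternative to a web-based calculator or Excel, scipy exposes the F distribution directly; a minimal sketch with illustrative numbers:

    # Convert an F statistic and its degrees of freedom into a P value,
    # or get the critical F at a chosen confidence level -- the same
    # numbers a web F-test calculator or Excel's F.DIST.RT returns.
    from scipy.stats import f as f_dist

    F, dfn, dfd = 10.0, 1, 5                 # illustrative values
    print(f_dist.sf(F, dfn, dfd))            # P value for this F
    print(f_dist.ppf(0.95, dfn, dfd))        # critical F at the 95% level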