Gustavo de Medeiros Azevedo said:

GdMA> I've recently started using IFEFFIT, and I'm a bit confused
GdMA> about the way it estimates the error bars. In the IFEFFIT
GdMA> Reference guide (page 38) you mention the variables epsilon_k
GdMA> and epsilon_r, where the measurement uncertainty is stored. On
GdMA> that same page, you remark that changing the values of
GdMA> epsilon_k and epsilon_r will change chi-square and reduced
GdMA> chi-square, but won't affect the R-factor and the error bars.
GdMA> As far as I understand, the error bars should be given by the
GdMA> diagonal elements of the covariance matrix, multiplied by the
GdMA> square root of reduced chi-square. In this way, I would expect
GdMA> that varying epsilon_k or epsilon_r should affect the
GdMA> estimated uncertainties.
GdMA>
GdMA> Could you, please, clarify this point?

Gustavo,

Well, I live one time zone to the east of Matt, so I guess I'm seeing
this before he does ;-)  I'll take a stab at your question.

epsilon_r is measured from the data by performing the Fourier
transform of chi(k) and then computing the root-mean-square value of
the magnitude of chi(R) between 15 and 25 Angstroms. The assumption is
that, because of disorder and mean-free-path effects, any finite
spectral content in that R range can only be due to white noise in the
data. The relationship between epsilon_k and epsilon_r is established
by the normalization of the Fourier transform.
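In python-ish pseudo-code (a sketch of the idea only -- the names are
made up, and the details of ifeffit's actual implementation differ),
that estimate looks something like this:

    import numpy as np

    def estimate_eps_r(r, chir_mag, rmin=15.0, rmax=25.0):
        """Sketch of the white-noise estimate of epsilon_R: the rms
        of |chi(R)| in a region (15-25 Angstroms) where disorder and
        mean-free-path effects should have wiped out any real
        signal, so anything left there is taken to be noise."""
        mask = (r >= rmin) & (r <= rmax)
        return np.sqrt(np.mean(chir_mag[mask] ** 2))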
Because epsilon_r enters into the fitting metric, chi-square, when the
fit is done in R-space (or epsilon_k for a fit in k-space), you are
correct that the values of chi-square and reduced chi-square are
affected by the value of epsilon_r. The R-factor is just a percentage
misfit and so is independent of epsilon. The error bars, as reported
by ifeffit, are also independent of epsilon, but the reason for this
requires some explanation.

The first question to ask at this point is whether it is valid to
presume that the dominant error in the EXAFS measurement is the shot
noise measured from the high-R portion of the Fourier spectrum. In
most situations it is not -- that is, in most situations the shot
noise is much smaller than detector non-linearity, sample
inhomogeneity, errors in the theoretical fitting standards, errors in
the empirical fitting standards, or a whole host of other problems you
might have in a real experiment. Particularly at 3rd generation light
sources, photon flux and the shot noise associated with it is the
least of your problems. By a long stretch.

Thus, the way ifeffit estimates epsilon is manifestly wrong. It is way
too small, and so chi-square is very large. Even for a fit that is
clearly a good fit, such as the common example of the copper foil that
comes with Artemis, the reduced chi-square is much larger than 1. By
the formalism of Gaussian statistics, we expect a good fit to have a
reduced chi-square of about 1. So what's going on? Well, the problem
is that ifeffit makes no attempt (since it operates within a Gaussian
framework and not a Bayesian framework) to estimate all the other
sources of error besides shot noise. Thus it is almost always doomed
to have a reduced chi-square much larger than 1 (except in the odd
case when the shot noise is quite large).

So what about the error bars? The diagonal elements of the covariance
matrix (i.e. the error bars) are much too small, for the same reason
that the reduced chi-square is too large: because epsilon has been
vastly underestimated. Ifeffit then makes the assumption that the fit
it just finished was, in fact, a good fit. It assumes that the reason
reduced chi-square is not near 1 is NOT that it is a bad fit, but
rather that epsilon has been significantly underestimated. Multiplying
the error bars by the square root of reduced chi-square is
mathematically equivalent to re-evaluating the covariance matrix with
epsilon set to the value it must have for reduced chi-square to equal
1. Thus, it is the mathematical equivalent of assuming that the fit
is, in fact, a good fit.

So finally, you asked how setting epsilon to some other value will
affect the fit. The answer is "Not much." It will have some small
effect on how the fit is evaluated numerically, because chi-square may
be of a different order of magnitude. However, because ifeffit
rescales the error bars under the assumption that the fit was good,
manually changing epsilon will have little effect on the results.

One more thing needs to be said. Because ifeffit *always* assumes that
the fit is a good fit and reports statistics accordingly, it is the
RESPONSIBILITY OF THE USER to actually evaluate the quality of the
fit. Are the best-fit values physically reasonable? Are the parameters
highly correlated? Are the rescaled error bars of a reasonable size?
Ifeffit cannot answer those questions. You must. And that, finally, is
why a human needs to do data analysis. Sadly, we cannot train a
computer or a monkey to decide if the fit is, in fact, the one you
want to publish.

Hope that helps,
Bruce

--
 Bruce Ravel  ---------------------------- ravel@phys.washington.edu
 Code 6134, Building 3, Room 222
 Naval Research Laboratory                   phone: (1) 202 767 5947
 Washington DC 20375, USA                      fax: (1) 202 767 1697

 NRL Synchrotron Radiation Consortium (NRL-SRC)
 Beamlines X11a, X11b, X23b, X24c, U4b
 National Synchrotron Light Source
 Brookhaven National Laboratory, Upton, NY 11973

 My homepage:    http://feff.phys.washington.edu/~ravel
 EXAFS software: http://feff.phys.washington.edu/~ravel/software/exafs/
Hi everyone,

I agree 99.99% with Bruce's explanation, and just want to clarify a
couple of points and go on about a few other aspects.

As Bruce points out, the estimated error bars are rescaled because the
automated estimate of the uncertainty in the data is almost always too
small, so the reported reduced chi-square for a good fit often exceeds
20 (whereas it should be close to 1). Bruce also points out (again,
correctly) that we have very little experience training monkeys.
Sorry, I couldn't resist.

Chi-square, reduced chi-square, and the R-factor can be used to
determine whether a fit is 'good'. They can certainly be used to
compare two fits to decide which is 'better'. In Statistics 101 (i.e.,
a first pass at discussing data and error analysis), chi-square is
used to estimate uncertainties in fitted parameters, provided the
uncertainties in the data are known. That is, in Statistics 101,
chi-square has two very different purposes: 1) is the fit good? and
2) what are the uncertainties in the parameters? Since we don't know
the uncertainties in the data (which Statistics 101 happily ignores),
we have to do the best we can. The approach used by ifeffit is common,
and it is carefully critiqued in Numerical Recipes by Press et al.
This approach definitely leads to the caution Bruce expressed about
using some judgment about whether to trust fit results. The rescaling
means that the reported uncertainties in the parameters are valid *IF*
the fit is "good". That seems reasonable, because if the fit is not
"good", you probably don't care that the reported uncertainties are
not good either.

Incidentally, the Statistics 101 view of chi-square also glosses over
the notion of what counts as a 'data point'. That leads to the whole
idea of the number of independent points, N_idp, which sets a maximum
number of fittable parameters, goes into the chi-square equation, and
has led to lots of discussion in the EXAFS community. If non-rescaled
chi-square were used to estimate uncertainties, N_idp would also
affect the error bars. Rescaling the error bars to assert that the fit
is good (that is, scaling epsilon so that chi-square = N_idp -
N_parameters) actually lessens the dependence of the error bars on
N_idp.
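To make those relations concrete, here is a python-style sketch (the
names and details are mine, not ifeffit's actual code) of how N_idp
enters chi-square, and of what the rescaling does:

    import numpy as np

    def fit_statistics(residual, eps, n_idp, n_varys, raw_errors):
        """Sketch: residual is (data - fit) at n_data points, eps is
        the estimated uncertainty in the data, and raw_errors are the
        square roots of the diagonal of the covariance matrix."""
        n_data = len(residual)
        # chi-square, weighted so that n_idp (not n_data) counts as
        # the number of independent measurements
        chi_square = (n_idp / (n_data * eps**2)) * np.sum(residual**2)
        # reduced chi-square: ~1 for a good fit with the right eps
        chi_reduced = chi_square / (n_idp - n_varys)
        # rescaled error bars: mathematically equivalent to scaling
        # eps so that chi_square = n_idp - n_varys, i.e. to asserting
        # that the fit is good
        errors = raw_errors * np.sqrt(chi_reduced)
        return chi_square, chi_reduced, errors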
The 'white-noise' estimate of epsilon_R that ifeffit (and Feffit)
makes is very easy, and is usually not too far off for white noise
(that is, the portion of the noise that is independent of R). It
actually works reasonably well for very noisy data. We all do the best
we can to avoid that situation!!

Bruce gave the normal argument that the white-noise estimate doesn't
include systematic errors. I would put a slight variation on this: it
does include systematic and statistical errors that are 'white', but
doesn't include statistical or systematic errors that are not white.
There has been a lot of speculation in the EXAFS community about the
importance of systematic errors. Many have suggested that systematic
errors are dominated by bad background removal. I'm not sure I agree,
but this would certainly count as a non-white, systematic error.
Glitches are systematic errors that have a fairly large component that
is white. I don't think anyone really has a complete handle on this
topic, or indeed on why EXAFS fits tend to be much worse than white
noise would predict. Blaming the Feff calculations is another popular
option!!

In principle, a Bayesian approach could help, but I don't think it
would magically give better error bar estimates. In a Bayesian
approach, we would need to put uncertainties on the Feff calculations
too -- a good idea, but not trivial to do.

That being said, if anyone has any ideas for a better approach, or
even a robust alternative, please let me know!!

--Matt