RE: [Ifeffit] Re: Ifeffit Digest, Vol 5, Issue 1
Hello everyone,

> > I would second Matt's reservations about averaging I0 for different
> > XAFS scans. The main reason we record I0 is to compensate for
> > fluctuations in I0, which are mirrored in If (statistically they are
> > highly correlated), so the ratio If/I0 eliminates the variation (as
> > long as the detector chain is linear).
>
> I totally agree. This can lead to big problems in the data. We
> accidentally tried it once and then made sure that the average was
> performed instead.

I think the best way to do it statistically is to weight each individual XAS spectrum with 1/s^2, where s is an estimate of the noise in the spectrum. This makes sure that a noisy spectrum does not degrade the overall quality of the fit. Of course, a simpler way to do it is to retain only the spectra of overall similar - and hopefully good - quality, and discard the noisy ones.

I think an evaluation of the spectrum noise may be obtained from the FT contributions peaking at high 'distances' (e.g., > 10 Å). Hence, with a correct script, we should be able to:
- extract chi(k) functions from individual XAS spectra,
- perform FTs for these spectra,
- perform inverse FTs on the spectra above a given limit, and calculate s from that,
- use the calculated s values to weight-average the individual XAS spectra.

As I'm a very lousy IFEFFIT user (meaning that I've barely gone beyond hitting buttons on Athena and Artemis), I would humbly ask Bruce if there's any chance to see such a methodology implemented in Athena one of these days (not in a hurry, though).

Best regards,
Michel Schlegel
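A minimal numpy sketch of the noise-estimation part of this recipe (the weighting step itself is sketched after Bruce's reply below). This is not Ifeffit or Athena code: the chi(k) array, the FT convention, the k-weight, and the 10-25 Å noise window are all assumptions made for illustration.

```python
import numpy as np

def high_r_noise(k, chi, kweight=2, rmin=10.0, rmax=25.0, nfft=2048):
    """Estimate a per-scan noise figure s from the high-R part of the
    Fourier transform of chi(k), assumed to be free of structural signal."""
    dk = k[1] - k[0]                                  # assumes a uniform k grid
    chir = np.fft.fft(chi * k**kweight, nfft) * dk / np.sqrt(np.pi)
    r = np.abs(np.pi * np.fft.fftfreq(nfft, d=dk))    # distance conjugate to 2k
    # note: rmax must stay below the maximum distance pi / (2 * dk)
    window = (r >= rmin) & (r <= rmax)
    return np.sqrt(np.mean(np.abs(chir[window])**2))  # rms |chi(R)| in the window
```

Only the relative s values between scans matter for 1/s^2 weights, so the overall normalization of this transform does not need to match Ifeffit's exactly.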
On Friday 04 July 2003 02:48 am, SCHLEGEL Michel 177447 wrote:

> I think an evaluation of the spectrum noise may be obtained from the FT
> contributions peaking at high 'distances' (e.g., > 10 Å). Hence, with a
> correct script, we should be able to:
> - extract chi(k) functions from individual XAS spectra,
> - perform FTs for these spectra,
> - perform inverse FTs on the spectra above a given limit, and calculate s from that,
> - use the calculated s values to weight-average the individual XAS spectra.
>
> As I'm a very lousy IFEFFIT user (meaning that I've barely gone beyond
> hitting buttons on Athena and Artemis), I would humbly ask Bruce if
> there's any chance to see such a methodology implemented in Athena one
> of these days (not in a hurry, though).

The short answer is yes.

Currently Athena sets the noise to 1 for all scans involved in the merge. However, Ifeffit offers the very handy chi_noise function, which computes the noise in a manner similar to what Michel suggests. (In fact, it computes the average of the chi(R) spectrum between 15 and 25 Å.) It would be a simple matter to weight each spectrum by that number and to give the user the option of choosing how to do the weighting.

I'll put it on my list of things to do, which seems to grow faster than I can shrink it!

Thanks,
B

--
Bruce Ravel ----------------------------------- ravel@phys.washington.edu
Code 6134, Building 3, Room 222
Naval Research Laboratory          phone: (1) 202 767 5947
Washington DC 20375, USA             fax: (1) 202 767 1697

NRL Synchrotron Radiation Consortium (NRL-SRC)
Beamlines X11a, X11b, X23b, X24c, U4b
National Synchrotron Light Source
Brookhaven National Laboratory, Upton, NY 11973

My homepage:    http://feff.phys.washington.edu/~ravel
EXAFS software: http://feff.phys.washington.edu/~ravel/software/exafs/
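A hedged sketch of the weighting step Bruce describes, with a switch for uniform versus 1/s^2 weighting. The function merge_scans and its arguments are hypothetical names for illustration, not Athena's or Ifeffit's interface; the s values would come from something like the high_r_noise sketch above, or from chi_noise.

```python
import numpy as np

def merge_scans(chis, noises=None, scheme="noise"):
    """Merge several chi(k) arrays measured on the same k grid.

    scheme="uniform" -> plain average (what Athena currently does)
    scheme="noise"   -> weight each scan by 1/s**2, s from a noise estimate
    """
    chis = np.asarray(chis)                       # shape (nscans, npoints)
    if scheme == "uniform" or noises is None:
        w = np.ones(len(chis))
    else:
        w = 1.0 / np.asarray(noises) ** 2
    w = w / w.sum()                               # normalize the weights
    merged = (w[:, None] * chis).sum(axis=0)
    # noise of the merged spectrum, assuming the scans are independent
    eps = None if noises is None else np.sqrt(np.sum(w**2 * np.asarray(noises)**2))
    return merged, eps
```

As long as the FT parameters are the same for every scan, only the relative s values matter, so it makes no difference whether s is quoted in chi(R) or chi(k) units.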
Hi all, Michel Schlegel said:
I think an evaluation of the spectrum noise may be obtained from the FT contributions peaking at high 'distances' (e.g., > 10 Å).
Although I think the scheme outlined by Michel is basically a good one, I would not advise using the high-R FT as a measure of the spectrum's overall noise, at least on unfamiliar beamlines. It is my experience that on some beamlines there are spurious effects, such as those related to monochromator lock-in, pumps, etc., that introduce high-frequency (i.e., high-R) oscillations without affecting oscillations in the 1-10 Angstrom range. The method IFEFFIT uses for estimating errors in fitted parameters is fortunately not dependent on this error estimate, although some statistical information, such as the value of chi-squared, is. But using high-R noise to weight spectra during merges may depend more on the behavior of a pump or a feedback mechanism than on the actual noise within the FT's region of interest. Unless I have a better way of estimating noise, I prefer to throw out obviously screwy scans and count the rest equally.

--Scott Calvin
On Fri, 4 Jul 2003, SCHLEGEL Michel 177447 wrote:

> > I totally agree. This can lead to big problems in the data. We
> > accidentally tried it once and then made sure that the average was
> > performed instead.
>
> I think the best way to do it statistically is to weight each individual
> XAS spectrum with 1/s^2, where s is an estimate of the noise in the
> spectrum. This makes sure that a noisy spectrum does not degrade the
> overall quality of the fit. Of course, a simpler way to do it is to
> retain only the spectra of overall similar - and hopefully good -
> quality, and discard the noisy ones.

Just to be clear on the original point: setting mu = <If>/<I0> is a very bad idea. Getting <I0> to look at the variations in I0 itself might be useful, however. I do agree that weighted averages of data based on the noise in the data could be helpful, but getting a good estimate for the noise in data can be tricky.

> I think an evaluation of the spectrum noise may be obtained from the FT
> contributions peaking at high 'distances' (e.g., > 10 Å). Hence, with a
> correct script, we should be able to:
> - extract chi(k) functions from individual XAS spectra,
> - perform FTs for these spectra,
> - perform inverse FTs on the spectra above a given limit, and calculate s from that,
> - use the calculated s values to weight-average the individual XAS spectra.
>
> As I'm a very lousy IFEFFIT user (meaning that I've barely gone beyond
> hitting buttons on Athena and Artemis), I would humbly ask Bruce if
> there's any chance to see such a methodology implemented in Athena one
> of these days (not in a hurry, though).

As Bruce pointed out, Ifeffit's chi_noise() command will estimate the noise in chi(k) with essentially this procedure (averaging chi(R) between 15 and 25 Å and using Parseval's theorem to give the noise in chi(k)). For noisy data with known counting statistics, and in tests with known random noise added to spectra, this works well.

This method will miss noise that depends strongly on R. These "non-white" portions are usually interpreted as "systematic errors". Whether these are really systematic measurement errors is not well established. Nearly everyone (including me) believes these non-white terms to dominate the noise, though there's not a lot of hard evidence for this either.

Of course, what you want is the noise in chi that is at the same frequencies as the signal you're interested in. That is *very* hard to estimate. Scan-to-scan variations can be useful to look at, but are, by themselves, fairly poor measures of uncertainty in the data.

Scott Calvin wrote:

> It is my experience that on some beamlines there are spurious effects,
> such as those related to monochromator lock-in, pumps, etc., that
> introduce high-frequency (i.e., high-R) oscillations without affecting
> oscillations in the 1-10 Angstrom range. The method IFEFFIT uses for
> estimating errors in fitted parameters is fortunately not dependent on
> this error estimate, although some statistical information, such as the
> value of chi-squared, is. But using high-R noise to weight spectra
> during merges may depend more on the behavior of a pump or a feedback
> mechanism than on the actual noise within the FT's region of interest.

I'd assumed that vibrations would actually cause fairly white noise, though feedback mechanisms could skew towards high frequency. Other effects (temperature/pressure/flow fluctuations in ion chamber gases and optics) might skew toward low-frequency noise.
I have not seen many studies of vibrations, feedback mechanisms, or other beamline-specific effects on data quality, and none discussing the spectral weight of the beamline-specific noise.

On the other hand, all data interpolation schemes do some smoothing, which suppresses high frequency components. And it usually appears that the high-frequency estimate of the noise from chi_noise() or Feffit gives an estimate that is significantly *low*.

Anyway, I think using the 'epsilon_k' that chi_noise() estimates as the noise in chi(k) is a fine way to do a weighted average of data. It's not perfect, but neither is anything else.

--Matt
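A quick numerical illustration of that smoothing effect, assuming nothing about any particular beamline or rebinning scheme: pure white noise loses part of its high-frequency content (and hence part of its apparent rms) when linearly interpolated onto a coarser grid, which is one reason the high-R estimate can come out low.

```python
import numpy as np

rng = np.random.default_rng(0)
fine = np.arange(0.0, 500.0, 0.5)              # hypothetical "as collected" grid
noise = rng.normal(0.0, 1.0, fine.size)        # white noise with rms = 1

coarse = np.arange(0.0, 499.0, 1.3)            # hypothetical interpolation grid
interp = np.interp(coarse, fine, noise)        # linear interpolation = mild smoothing

print(f"rms before interpolation: {noise.std():.2f}")   # about 1.0
print(f"rms after  interpolation: {interp.std():.2f}")  # about 0.8
```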
On Monday 07 July 2003 12:45 pm, Matt Newville wrote:
Anyway, I think using the 'epsilon_k' that chi_noise() estimates as the noise in chi(k) is a fine way to do a weighted average of data. It's not perfect, but neither is anything else.
Exactly right!

As Matt explained, there are reasons to believe that the measurement uncertainty is dominated by things that Parseval's theorem doesn't address and that measuring those problems is hard. As Matt said, weighting by 1/chi_noise() is OK. As Scott said, weighting uniformly is OK. Those two choices are the most OK I can think of, so Athena will let you choose between them.

While I am part of the chorus of people agreeing with Grant that mu is not equal to <If>/<I0>, I can see no reason why Athena should actively prevent the user from looking at <If> or <I0>, if that is what he wants to do. The *current* version of Athena prevents this, but that's a bug, not a feature! ;-)

B

--
Bruce Ravel ----------------------------------- ravel@phys.washington.edu
Code 6134, Building 3, Room 222
Naval Research Laboratory          phone: (1) 202 767 5947
Washington DC 20375, USA             fax: (1) 202 767 1697

NRL Synchrotron Radiation Consortium (NRL-SRC)
Beamlines X11a, X11b, X23b, X24c, U4b
National Synchrotron Light Source
Brookhaven National Laboratory, Upton, NY 11973

My homepage:    http://feff.phys.washington.edu/~ravel
EXAFS software: http://feff.phys.washington.edu/~ravel/software/exafs/
In the anecdotal category, I've seen some fairly bizarre high-R behavior on beamline X23B at the NSLS, which I tentatively attribute to feedback problems. That line, as many of you know, can be a little pathological at times. I've also collected some data to examine this issue on X11A, a more conventional beamline, but have never gotten around to looking at it--I hope to soon.

In any case, I don't really understand the logic of averaging scans based on some estimate of the noise. For that to be appropriate, you'd have to believe there was some systematic difference in the noise between scans. What's causing that difference, if they're collected on the same beamline on the same sample? (Or did I misunderstand Michel's comment--was he talking about averaging data from different beamlines or something?) If there are no systematic changes during data collection, then the noise level should be the same, and any attempt to weight by some proxy for the actual noise will actually decrease the statistical content of the averaged data by overweighting some scans (i.e., random fluctuations in the quantity being used to estimate the uncertainty will cause some scans to dominate the average more heavily, which is not ideal if the actual noise level is the same). If, on the other hand, there is a systematic difference between subsequent scans, it is fairly unlikely to be "white," and thus will not be addressed by this scheme anyway. Perhaps one of you can give me examples where this kind of variation in data quality is found.

So right now I don't see the benefit to this method. Particularly if it's automated, I hesitate to add hidden complexity to my data reduction without a clear rationale for it.

--Scott Calvin
Naval Research Lab Code 6344

Matt Newville wrote:
I'd assumed that vibrations would actually cause fairly white noise, though feedback mechanisms could skew towards high frequency. Other effects (temperature/pressure/flow fluctuations in ion chamber gases and optics) might skew toward low-frequency noise. I have not seen many studies of vibrations, feedback mechanisms, or other beamline-specific effects on data quality, and none discussing the spectral weight of the beamline-specific noise.
On the other hand, all data interpolation schemes do some smoothing, which suppresses high frequency components. And it usually appears that the high-frequency estimate of the noise from chi_noise() or Feffit gives an estimate that is significantly *low*.
Anyway, I think using the 'epsilon_k' that chi_noise() estimates as the noise in chi(k) is a fine way to do a weighted average of data. It's not perfect, but neither is anything else.
--Matt
Hi Scott, On Mon, 7 Jul 2003, Scott Calvin wrote:
In any case, I don't really understand the logic of averaging scans based on some estimate of the noise. For that to be appropriate, you'd have to believe there was some systematic difference in the noise between scans. What's causing that difference, if they're collected on the same beamline on the same sample? (Or did I misunderstand Michel's comment--was he talking about averaging data from different beamlines or something?) If there are no systematic changes during data collection, then the noise level should be the same, and any attempt to weight by some proxy for the actual noise will actually decrease the statistical content of the averaged data by overweighting some scans (i.e., random fluctuations in the quantity being used to estimate the uncertainty will cause some scans to dominate the average more heavily, which is not ideal if the actual noise level is the same). If, on the other hand, there is a systematic difference between subsequent scans, it is fairly unlikely to be "white," and thus will not be addressed by this scheme anyway. Perhaps one of you can give me examples where this kind of variation in data quality is found.
Using a solid-state detector with low-concentration samples, it's common to do a couple scans counting for a few seconds per point, then more scans counting for longer time (say, first 3sec/pt then 10sec/pt). The data is typically better with longer counting time (not always by square-root-of-time), but you want to use all the noisy data you have. In such a case, a weighted average based on after-the-fact data quality would be useful.
So right now I don't see the benefit to this method. Particularly if it's automated, I hesitate to add hidden complexity to my data reduction without a clear rationale for it.
I can't see a case where it would be obviously better to do the simple average than to average using weights from the estimated noise. For very noisy data (such as the 3 sec/pt scan and the 10 sec/pt scan), the simple average is almost certainly worse. Anyway, it seems simple enough to allow either option and/or overriding the high-R estimate of the noise.

Maybe I'm just not understanding your objection to using the high-R estimate of data noise, but I don't see how a weighted average would "actually decrease the statistical content of the averaged data" unless the high-R estimate of noise is pathologically *way* off, which I don't think it is (I think it's generally a little low, but reasonable). If two spectra give different high-R estimates of noise, saying they actually have the same noise seems pretty bold.

Of course, one can assert that the data should be evenly weighted, or assert that they should be weighted by their individually estimated noise. Either way, something is being asserted about different measurements in order to treat them as one. Whatever average is used, it is that assertion that is probably the most questionable, complex, and in need of a rationale. So maybe it is better to make the assertion more explicit and provide options for how it is done.

--Matt
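A small worked example of that case, with made-up numbers standing in for the noise of a 3 sec/pt scan and a 10 sec/pt scan (the factor of two between them is purely illustrative), using the same 1/s^2 weighting sketched earlier in the thread:

```python
import numpy as np

s = np.array([2.0e-3, 1.0e-3])     # assumed noise: short-count scan, long-count scan

w = 1.0 / s**2
w = w / w.sum()
print(w)                           # [0.2, 0.8]: the quieter scan carries more weight

# noise of the merged spectrum, assuming the two scans are independent
eps_weighted = np.sqrt(np.sum(w**2 * s**2))      # ~8.9e-4 for the 1/s^2 merge
eps_simple = np.sqrt(np.sum((0.5 * s) ** 2))     # ~1.1e-3 for the plain average
print(eps_weighted, eps_simple)
```

With these assumed numbers the weighted merge is noticeably quieter than the simple average; if the per-scan noise were equal, the two schemes would coincide.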
Matt Newville writes:
Using a solid-state detector with low-concentration samples, it's common to do a couple scans counting for a few seconds per point, then more scans counting for longer time (say, first 3sec/pt then 10sec/pt). The data is typically better with longer counting time (not always by square-root-of-time), but you want to use all the noisy data you have. In such a case, a weighted average based on after-the-fact data quality would be useful.
This was the kind of example I was looking for. I agree that in this case it makes sense to use the high-R noise estimate for averaging, and thus it's useful to have this option implemented in software. --Scott