RE: [Ifeffit] A basic question about data collection
I am probably missing the point, but it is not immediately obvious to me why the following is equivalent in terms of improving the signal-to-noise: a) constant E-space increment and b) constant k-space increment combined with k-dependent integration time. In a), the data cluster at high E, but each data point in E corresponds to a different final state and thus is unique. Averaging the E-space data over the small interval Delta E, (1/Delta E)*Int [xmu(E) dE], is not equivalent to the time average of xmu(E) collected at a fixed E, (1/T)*Int [xmu(E) dt]. Thus, k^n-weighted integration time, to my mind, is the only proper way of reducing statistical noise.

Anatoly

-----Original Message-----
From: ifeffit-bounces@millenia.cars.aps.anl.gov on behalf of Carlo Segre
Sent: Thu 8/25/2005 5:13 PM
To: XAFS Analysis using Ifeffit
Subject: Re: [Ifeffit] A basic question about data collection

Matt and Scott:

On Thu, 25 Aug 2005, Matt Newville wrote:
This is important for QEXAFS (which typically does sample on a very fine energy grid). I've been told by people doing QEXAFS that a simple box-car average is good enough for binning QEXAFS data. That's what Ifeffit's rebin() function does. I'd think that a more sophisticated rolling average (convolution) would be better (and not screw up energy resolution), but apparently it's not an issue.
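A minimal sketch of the two operations on synthetic, finely sampled data (plain NumPy, not Ifeffit's actual rebin() code; the step sizes, window width, and noise level below are arbitrary assumptions):

import numpy as np

def boxcar_rebin(energy, mu, grid):
    """Box-car rebin: average all raw points falling into each output bin."""
    edges = np.concatenate([[grid[0] - (grid[1] - grid[0]) / 2],
                            (grid[:-1] + grid[1:]) / 2,
                            [grid[-1] + (grid[-1] - grid[-2]) / 2]])
    idx = np.digitize(energy, edges) - 1
    out = np.full(len(grid), np.nan)
    for i in range(len(grid)):
        sel = idx == i
        if sel.any():
            out[i] = mu[sel].mean()
    return out

def rolling_average(mu, width=5):
    """Rolling (box) average: convolution with a normalized window."""
    kernel = np.ones(width) / width
    return np.convolve(mu, kernel, mode='same')

# Synthetic QEXAFS-like scan: 0.1 eV raw steps, rebinned onto a 2 eV grid.
rng = np.random.default_rng(0)
e_raw = np.arange(7000.0, 7500.0, 0.1)
mu_raw = np.exp(-e_raw / 5000.0) + 0.02 * rng.normal(size=e_raw.size)
e_grid = np.arange(7000.0, 7500.0, 2.0)

mu_rebinned = boxcar_rebin(e_raw, mu_raw, e_grid)
mu_smoothed = rolling_average(mu_raw)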
I have been playing with the Athena smoothing and rebinning functionalities, and I think that I prefer the rebinning because smoothing tends to attenuate sharply peaked structure. A rolling average might be good too, but I haven't tried it much. My guess is that for gentle features such as those in the EXAFS region, rebinning, rolling averages, and smoothing will all give statistically indistinguishable results. I may be wrong.

Carlo

--
Carlo U. Segre -- Professor of Physics
Associate Dean for Special Projects, Graduate College
Illinois Institute of Technology
Voice: 312.567.3498 Fax: 312.567.3494
Carlo.Segre@iit.edu http://www.iit.edu/~segre
Hi Anatoly,

I agree--they are not equivalent, and the constant k-space increment with k-dependent integration time is formally "more proper." But if the spacing is small compared to the size of an EXAFS oscillation, then there isn't a lot of difference between the two. It could even be argued that sampling over a range of k (or E) and binning is less susceptible to artifacts than choosing fewer points and spending longer on them, although, as was pointed out earlier, the former takes longer because of mono settling time.

Unfortunately, the beam lines I work on don't have software implemented to use a k^n-weighted integration time, so I'd have to define a scan with a lot of segments that gradually increase the integration time. Constant energy increment is a lazier way to move things in that direction. The real solution is to think about getting the k-weighted integration time implemented in the software...

Question: you say k^n-weighted integration time. Shouldn't it ideally be k^(2n), since the noise might be expected to decrease as the square root of the number of counts?

--Scott Calvin
Sarah Lawrence College
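A minimal numerical sketch of the two averages being compared (synthetic data in NumPy; the oscillation period, bin width, and noise level are arbitrary assumptions). Averaging xmu over a small interval Delta E attenuates the oscillation itself by roughly a sinc factor, while averaging repeated counts at a fixed E only reduces the noise; for a bin that is small compared with the oscillation period, the difference is far smaller than the noise being removed:

import numpy as np

rng = np.random.default_rng(0)
period = 50.0          # eV, a slow EXAFS-like oscillation
bin_width = 2.0        # eV, the interval Delta E being averaged over
n_samples = 100        # raw samples per bin, or repeated counts at fixed E
noise = 0.05

E0 = period / 4.0      # center of the bin, at a crest of the oscillation
e = E0 + np.linspace(-bin_width / 2, bin_width / 2, n_samples)

# a) average of xmu over the interval Delta E centered at E0
bin_average = np.mean(np.sin(2 * np.pi * e / period)
                      + noise * rng.normal(size=n_samples))

# b) time average of xmu collected at the fixed energy E0
time_average = np.mean(np.sin(2 * np.pi * E0 / period)
                       + noise * rng.normal(size=n_samples))

# The bin average is attenuated by sinc(bin_width / period) relative to the
# true value at E0; for a 2 eV bin and a 50 eV period that factor is ~0.997,
# so the two averages differ by much less than the noise they suppress.
attenuation = np.sinc(bin_width / period)
print(bin_average, time_average, attenuation)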
Hi Scott,

Not exactly related to my last posting, here is an argument against any tampering with the integration time, whether k^n or k^(2n) weighting...

The Fourier transform of white noise is easy to calculate: its magnitude does not depend on r, and this property is used as a measure of statistical errors in experimental data, by using the FT magnitude at higher r, as described in this document (page 5, Eq. (6)):

http://ixs.iit.edu/subcommittee_reports/sc/err-rep.pdf

Since a k-weighted integration time alters the noise, it is no longer white. That means its FT at high r is not indicative of the noise at low r, because it varies with r. Moreover, according to the literature, this behavior is strongly dependent on the Fourier window function used. Thus, Eq. (6), which is implemented in IFEFFIT, is no longer valid for statistical error analysis if a variable integration time is used. Eqs. (7) and (8) in this summary, which can also be used to estimate statistical errors, do not allow one to account for a measurement with variable integration times either.

In summary, in my opinion, one may improve the data quality by varying the integration time, but it is not straightforward to quantify such an improvement in terms of the effect of such a trick on the statistical errors in the data.

Anatoly

******************
Anatoly Frenkel, Ph.D.
Associate Professor
Physics Department
Yeshiva University
245 Lexington Avenue
New York, NY 10016
(YU) 212-340-7827
(BNL) 631-344-3013
(Fax) 212-340-7788
anatoly.frenkel@yu.edu
http://www.yu.edu/faculty/afrenkel
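A minimal sketch of the white-noise property in question (plain NumPy rather than the Eq. (6) machinery in IFEFFIT; the k grid, FFT size, and noise level are arbitrary assumptions): the FT magnitude of white noise in chi(k) is, on average, flat in r, so its level at high r can stand in for the noise at all r.

import numpy as np

rng = np.random.default_rng(1)
dk = 0.05
k = np.arange(0.0, 20.0, dk)          # k grid, inverse Angstroms
eps_k = 0.002                         # white (k-independent) noise level
noise = eps_k * rng.normal(size=k.size)

# Discrete FT of the noise onto an r grid (EXAFS-style normalization)
nfft = 2048
chi_r = np.fft.fft(noise, n=nfft) * dk / np.sqrt(np.pi)
r = np.pi * np.arange(nfft // 2) / (dk * nfft)
mag = np.abs(chi_r[:nfft // 2])

# For white noise the mean |FT| is roughly independent of r, so its level
# between 15 and 25 Angstroms (well above any structural signal) estimates
# the noise everywhere -- the idea behind Eq. (6) of the error report.
high_r = (r > 15) & (r < 25)
low_r = (r > 1) & (r < 10)
print(mag[high_r].mean(), mag[low_r].mean())   # comparable magnitudes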
Anatoly,

On Fri, 26 Aug 2005, Anatoly Frenkel wrote:
Hi Scott,
Not exactly related to my last posting, here is an argument against any tampering with the integration time, whether k^n or k^(2n) weighting...
The Fourier transform of white noise is easy to calculate: its magnitude does not depend on r, and this property is used as a measure of statistical errors in experimental data, by using the FT magnitude at higher r, as described in this document (page 5, Eq. (6)):
http://ixs.iit.edu/subcommittee_reports/sc/err-rep.pdf
Since a k-weighted integration time alters the noise, it is no longer white. That means its FT at high r is not indicative of the noise at low r, because it varies with r.
Err, well, what we care about is the noise in chi(R), which is related by Parseval's theorem to the noise in chi(k)*k^n, the k-weighted chi(k). We assume the noise in chi(R) is white, and so also that the noise in chi(k)*k^n is white. If you want white noise (i.e., independent of k) in k-weighted chi(k), then k-weighting the collection time is a good approach. It's not perfect, but then the noise won't really be white or dominated by statistical noise unless the spectra are really bad.
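A minimal sketch of that statement for purely shot-noise-limited data (synthetic numbers in NumPy; the k range, base noise level, and k-weight are arbitrary assumptions): if the per-point noise in chi(k) falls off as 1/sqrt(counting time) and the counting time grows as k^(2w), then the noise in k^w * chi(k) comes out roughly independent of k.

import numpy as np

rng = np.random.default_rng(2)
w = 2                                  # k-weight used in the FT
k = np.linspace(2.0, 16.0, 200)        # inverse Angstroms
t = 1.0 * (k / k[0]) ** (2 * w)        # counting time weighted as k^(2w)

# Shot-noise limit: sigma(chi) at each point ~ 1/sqrt(counts) ~ 1/sqrt(t)
sigma_chi = 0.01 / np.sqrt(t)
noise_chi = sigma_chi * rng.normal(size=k.size)

# Noise in the k^w-weighted chi(k): the k^w growth is cancelled by the
# k^(-w) falloff of sigma_chi, so this is (statistically) white in k.
noise_weighted = (k ** w) * noise_chi
print(np.std(noise_weighted[:50]), np.std(noise_weighted[-50:]))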
Moreover, according to the literature, this behavior is strongly dependent on the Fourier window function used.
What literature is that? The window function should have very little effect on the estimated noise.
Thus, Eq. (6), which is implemented in IFEFFIT, is no longer valid for statistical error analysis if a variable integration time is used. Eqs. (7) and (8) in this summary, which can also be used to estimate statistical errors, do not allow one to account for a measurement with variable integration times either.
Well, we want the noise in chi(R) (assuming we're fitting in R-space). Section 3 of the Error Report tries to use 'chi' where it may mean 'chi(R)' or 'chi(k)' or 'chi(E)'. Eq. 6 gives the estimated uncertainty in chi(k) [un-weighted] from the estimated uncertainty in chi(R), assuming the noise in chi(R) is white (independent of R). You read the report, right?

Eq. 7 (epsilon^2 = 1/N0 + 1/N, for N0 counts in I0 and N in If or I) and Eq. 8 (a variation on Eq. 7) assume the noise is dominated by shot noise, and are most useful for estimating the noise in mu(E), as the report states. But they can also be applied point-by-point, and thus vary with counting time. The report also says, right after Eq. 8:

  "The average statistical error should be estimated from the r.m.s. value of Eq. (7) or Eq. (8) over data segments with similar statistical weight, e.g., over segments with a constant integration time."

which does imply that there may be varying integration time. Of course, the people writing the report knew about k-weighting the collection time.
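A minimal sketch of applying Eq. (7) point by point (the count rates and the k^2-weighted time schedule below are illustrative assumptions, not values from the report):

import numpy as np

def shot_noise_epsilon(n0, n):
    """Point-by-point shot-noise estimate from Eq. (7): eps^2 = 1/N0 + 1/N."""
    n0 = np.asarray(n0, dtype=float)
    n = np.asarray(n, dtype=float)
    return np.sqrt(1.0 / n0 + 1.0 / n)

# With a k^2-weighted integration time the counts per point grow with k, so
# the point-by-point epsilon shrinks; an r.m.s. average over segments of
# constant integration time (as the report suggests) stays meaningful.
k = np.linspace(2.0, 16.0, 8)
t = (k / k[0]) ** 2                     # relative integration time per point
n0 = 1.0e6 * t                          # counts in I0
n = 2.0e5 * t                           # counts in If (or I)
print(shot_noise_epsilon(n0, n))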
In summary, in my opinion, one may improve the data quality by varying the integration time, but it is not straightforward to quantify such an improvement in terms of the effect of such a trick on the statistical errors in the data.
It is straightforward to take data with different collection times and different k-weightings, do a bunch of FTs, and compare the values of 'epsilon_r' (the estimated noise in chi(R)), 'epsilon_k' (the estimated noise in un-weighted chi(k)), and 'kmax_suggest' (the estimated k above which the signal is smaller than the noise).

--Matt
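A minimal sketch of that comparison in plain NumPy rather than Ifeffit itself (the k grid, FFT normalization, and the 15-25 Angstrom range used for the noise estimate are arbitrary assumptions; no window function is applied):

import numpy as np

def ft_magnitude(k, chi, w=2, dk=0.05, nfft=2048):
    """k^w-weight chi(k), interpolate onto a uniform grid, and FT to r-space."""
    kg = np.arange(0.0, 20.0 + dk, dk)
    chig = np.interp(kg, k, chi * k ** w, left=0.0, right=0.0)
    chir = np.fft.fft(chig, n=nfft) * dk / np.sqrt(np.pi)
    r = np.pi * np.arange(nfft // 2) / (dk * nfft)
    return r, np.abs(chir[:nfft // 2])

def epsilon_r(r, mag, rmin=15.0, rmax=25.0):
    """Noise estimate: mean FT magnitude well above any structural distance."""
    sel = (r >= rmin) & (r <= rmax)
    return mag[sel].mean()

# chi_a and chi_b would be two scans of the same sample taken with different
# integration-time schemes; for the same k-weight and range, whichever gives
# the smaller epsilon_r is the quieter measurement, and the k where the
# signal envelope drops to that level plays the role of kmax_suggest.
# r, mag_a = ft_magnitude(k, chi_a); print(epsilon_r(r, mag_a))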