Hi Scott,
Sorry, I read epsilon as "noise in chi(k)". This is the most
meaningful physical/statistical measure: epsilon_r surely depends on
k-weight and can depend on k-range as it samples different portions of
the spectra. Like you say, it will tend to increase as you increase
the k-range.
On Fri, May 13, 2011 at 11:58 AM, Scott Calvin wrote:
Matt,
On May 13, 2011, at 8:39 AM, Matt Newville wrote:
After all, the epsilon should be different for different k-ranges, as your signal to noise ratio probably changes as a function of k. Using the same epsilon doesn't reflect that.
Without seeing the data in question, this seems like speculation to me. I'm not at all sure why epsilon (the variation in chi(k)) should depend strongly on the k-range. In my experience, it usually does not. The S/N ratio will surely change with k, but that would surely be dominated by the rapid decay in |chi(k)|, rather than a change in epsilon.
I'm confused. We Fourier transform k-weighted data. Since Ifeffit uses the high-R amplitude to estimate uncertainty, it seems to me that what matters is signal-to-noise, not just noise in the original unweighted chi(k). Am I wrong in that? I may be misunderstanding how epsilon_r is calculated. And epsilon_r is the relevant epsilon for a fit in R space, right?
I think your assumption that epsilon will depend strongly on k may not be correct. Do you have evidence for this? I would say that it is not strongly dependent on k, and that reduced chi-square is useful in comparing fits with different k-ranges.
I just tried it on the FeC2O4 chi(k) attached to this post. It's a good example of data where it's not immediately clear to me what the "best" value for kmax is, so it would be tempting to use RCS to compare fits over different k-ranges. I used k-weight 3, and Hanning windows with dk = 1. I chose kmin as 2 and stepped kmax by 0.5, recording epsilon_r for each:
kmax    epsilon_r
 7      0.034840105
 7.5    0.041843848
 8      0.082627337
 8.5    0.087550367
 9      0.086032007
 9.5    0.085996216
10      0.088679339
10.5    0.090364699
11      0.092509939
11.5    0.108103081
There's a general trend of increasing epsilon_r with an increase in k. There's also a jump of a factor of 2 between 7.5 and 8. Why? Because there's a glitch there, and the glitch adds high-R structure.
Well, except for that jump (which I would say is appropriate, as the spike adds weight at all frequencies), I'd say epsilon_r is pretty constant, varying by 10% (not bad for a crude estimate) up to k=11. |chi(k)| drops considerably over that range, possibly to well below the noise level by k=10. So the higher end there is clearly not going to help the fit -- all you're adding is noise.
To make sure there wasn't something odd about this particular chi(k), I took one of the data sets included with the horae distribution: the file y300.chi in the ybco folder. I followed the same procedure as before, except I stepped by 1 inverse angstrom each time, because of the greater data range.
kmax    epsilon_r
 7      0.012866125
 8      0.073383695
 9      0.078255772
10      0.080016040
11      0.091634572
12      0.105419473
13      0.164341701
14      0.195266957
15      0.224727593
16      0.411139882
17      0.480293296
If anything, the trend is more clear here.
Between 8 and 12 Ang^-1 there is what I would call a small change. You're certainly adding more noise and progressively less signal as you increase k, even for a noise level in chi(k) that does not depend on k. There are sharp features that could easily be considered "white noise". But I don't strongly disagree either -- epsilon_r does definitely increase as you increase the k-range.
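The point that epsilon_r can grow with kmax even when the noise in chi(k) is independent of k can be illustrated with a small numerical sketch. It takes pure white noise, applies k^3 weighting and a Hanning-type window with dk = 1, Fourier transforms it, and takes the rms of the transform at high R. This only follows the spirit of the high-R estimate discussed here and is not Ifeffit's actual code: the function names, the 15-25 Ang high-R range, the noise level, the window shape and the FT normalization are all assumptions made for illustration.

```python
import numpy as np

def hanning_window(k, kmin, kmax, dk):
    """Window that rises over [kmin-dk/2, kmin+dk/2] and falls over
    [kmax-dk/2, kmax+dk/2] (an assumed Hanning-style sill shape)."""
    win = np.zeros_like(k)
    lo1, lo2 = kmin - dk / 2, kmin + dk / 2
    hi1, hi2 = kmax - dk / 2, kmax + dk / 2
    rise = (k >= lo1) & (k < lo2)
    flat = (k >= lo2) & (k <= hi1)
    fall = (k > hi1) & (k <= hi2)
    win[rise] = np.sin(0.5 * np.pi * (k[rise] - lo1) / dk) ** 2
    win[flat] = 1.0
    win[fall] = np.cos(0.5 * np.pi * (k[fall] - hi1) / dk) ** 2
    return win

def high_r_rms(k, chi, kmin, kmax, kweight=3, dk=1.0,
               rmin=15.0, rmax=25.0, nfft=2048):
    """rms of |chi(R)| between rmin and rmax for k-weighted, windowed chi(k).
    The normalization is an assumed convention; only ratios matter here."""
    kstep = k[1] - k[0]
    weighted = chi * k ** kweight * hanning_window(k, kmin, kmax, dk)
    chir = kstep / np.sqrt(np.pi) * np.fft.fft(weighted, n=nfft)
    # The conjugate variable of R is 2k, so the R grid spacing is pi/(kstep*nfft).
    r = np.pi * np.arange(nfft) / (kstep * nfft)
    sel = (r >= rmin) & (r <= rmax)
    return np.sqrt(np.mean(np.abs(chir[sel]) ** 2))

# White noise with the same standard deviation at every k (no signal at all).
rng = np.random.default_rng(0)
k = np.arange(0.0, 20.0, 0.05)
noise = 1e-3 * rng.standard_normal(k.size)

eps_r = {kmax: high_r_rms(k, noise, kmin=2.0, kmax=kmax)
         for kmax in (7, 8, 9, 10, 11, 12)}
for kmax, value in eps_r.items():
    print(f"kmax = {kmax:4.1f}   high-R rms = {value:.3e}")
```

With k^3 weighting the summed squared noise scales roughly as the integral of k^6 over the window, so the high-R rms is expected to rise with kmax although the underlying noise level never changes.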
I find it confusing that you expect the noise in the data to depend (strongly, even) on k, but not on R. The general wisdom is the estimate of epsilon from the high-R components is too low, suggesting that the R dependence is significant. Every time I've looked, I come to the conclusion that noise in data is either consistent with "white" or so small as to be difficult to measure. I believe Corwin Booth's recent work supports the conventional wisdom that epsilon decreases with R, but I don't recall it suggesting a significant k dependence.
I'm not making any claims as to whether, in general, the noise in the data depends on R. I can speculate about circumstances where low R noise is greater (due, for instance, to temperature fluctuations in cooling water, which are likely to be fairly slow), or where high R noise is greater (an example here would be if whatever system is keeping the beam on the sample vertically as the mono scans is tending to overshoot). But Ifeffit's estimation of epsilon_r demonstrably does not depend on the R-range used for fitting, regardless of the distribution of noise in R. That's a very different thing. Thus, changing the R-range of a fit is completely safe as far as comparing RCS goes.
Ah, OK, I think I see what you were getting at. But I think the epsilon_r and epsilon_k are still roughly good for using reduced chi-square to compare fits of different k- and R-ranges. If anything, the estimate of the number of independent points is a much cruder estimate than the estimate of epsilon.
--Matt
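For reference, the two quantities being weighed against each other here can be written out in a few lines. This is a minimal sketch assuming the common Nyquist-style estimate N_idp = 2 * delta_k * delta_R / pi (some conventions add 1 or 2) and the usual reduced chi-square definition. The function names and the sample numbers are hypothetical, and none of this is taken from Ifeffit's source.

```python
import math

def n_independent(kmin, kmax, rmin, rmax):
    """Nyquist-style estimate of the number of independent points."""
    return 2.0 * (kmax - kmin) * (rmax - rmin) / math.pi

def reduced_chi_square(residual, epsilon, n_idp, n_var):
    """chi-square scaled to n_idp points, divided by degrees of freedom."""
    n = len(residual)
    chi2 = (n_idp / n) * sum((r / epsilon) ** 2 for r in residual)
    return chi2 / (n_idp - n_var)

# Hypothetical fit: k from 2 to 10, R from 1 to 3, 4 variables.
n_idp = n_independent(2.0, 10.0, 1.0, 3.0)
residual = [0.1] * 20  # made-up misfit values, for illustration only
print(n_idp)
print(reduced_chi_square(residual, epsilon=0.1, n_idp=n_idp, n_var=4))
print(reduced_chi_square(residual, epsilon=0.2, n_idp=n_idp, n_var=4))
```

Because epsilon enters squared, doubling it cuts the reduced chi-square by a factor of four, which is why a k-range dependence in epsilon_r matters when comparing fits over different k-ranges.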