Hi Everyone,
Thanks for the discussion and the example data and code! Out of the box,
Larch was definitely not handling these datasets gracefully, and now it
does a much better job. Lots of different topics were discussed, but I'd
like to make a few comments:
First, I do not understand the objection to rebinning QXAFS data that is
heavily oversampled in energy. I think this is necessary. When I do
continuous XAFS scans (still at ~1 sec per energy point, so maybe not
"Quick" anymore), I set up a "normally gridded XAFS scan" and then use that
to set triggers and scan at approximately constant energy velocity, so that
each energy bin is centered on its target energy. That is, I don't get 8000
values, only ~400 on a nearly-normal EXAFS grid. To be clear, we do this
for convenience, so that the binning is done in the motor controller and
detector hardware, not in post-processing software. But there should be
no difference, and as Carlo and Edmund point out, doing it in software
does add flexibility in how the data is treated. As long as the final
energy grid is fine enough relative to the energy resolution and/or EXAFS
resolution (so, ~0.05 Ang^-1), I think this is fine.
But, whether done in hardware or software, binning *will* happen with
QXAFS.
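To be concrete about what a "bin" means here, whether done in hardware or
software: the boundaries sit at the midpoints between target energies, so
each bin is centered on its target. A tiny sketch (the function name is
mine, for illustration only):

    import numpy as np

    def bin_edges(targets):
        """Bin boundaries centered on the target energies: midpoints
        between neighbors, with the end bins mirrored outward."""
        mid = 0.5 * (targets[1:] + targets[:-1])
        first = targets[0] - (mid[0] - targets[0])
        last = targets[-1] + (targets[-1] - mid[-1])
        return np.concatenate(([first], mid, [last]))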
Interpolation and smoothing could work, but these introduce questions of
how much and which data to use for the interpolation or smoothing. For
sure, smoothing with Savitzky-Golay followed by simple interpolation could
work. But when Larch/Ifeffit/Athena treat data by default, they use a
simple interpolation and do not work to use all the data points in finely
spaced data. Instead, they use only the 2 or 3 nearest energy values and
do linear or quadratic interpolation with those limited data, assuming the
data are accurate. That does not work so well for heavily oversampled
data, and so results in artificially noisy data, as shown below (see
especially the Ru example).
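To see the size of this effect, here is a toy demonstration (the edge
shape, noise level, and smoothing parameters are all invented, not taken
from any of the attached data): point-by-point interpolation of noisy,
oversampled data keeps essentially all of the point-to-point noise, while
smoothing first (here with scipy's Savitzky-Golay filter) removes most of
it:

    import numpy as np
    from scipy.signal import savgol_filter

    rng = np.random.default_rng(42)
    en = np.linspace(7000., 7400., 8000)           # heavily oversampled energies
    ideal = np.tanh((en - 7112.) / 4.)             # smooth toy "mu" with an edge
    mu = ideal + rng.normal(0., 0.02, en.size)     # plus point-to-point noise

    enew = np.linspace(7005., 7395., 400)          # nearly-normal grid
    truth = np.tanh((enew - 7112.) / 4.)

    # (a) simple interpolation: uses only the nearest points, keeps the noise
    noisy = np.interp(enew, en, mu)
    # (b) Savitzky-Golay smoothing first, then the same interpolation
    smooth = np.interp(enew, en, savgol_filter(mu, 21, 2))

    print("interp only:   rms = %.4f" % np.std(noisy - truth))   # close to 0.02
    print("savgol+interp: rms = %.4f" % np.std(smooth - truth))  # a few times smaller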
I'm still digesting Matthew's code. I think it is close in spirit to what I
have, but I have not tested it numerically.
For rebinning in Larch and its XAS_Viewer (in active development), here's
what I have so far (full code at
https://github.com/xraypy/xraylarch/blob/master/plugins/xafs/rebin_xafs.py#L...),
based on playing around with data from Carlo and Edmund:
Step 1: make a "standard XAFS grid" array of energies, with XANES steps of
~0.5 eV or better and EXAFS steps of 0.05 Ang^-1.
Step 2: identify the segment of the original energy array that falls in
each bin of the new energy array.
Step 3: for each energy value in the new array:
   a) if the segment has 2 or fewer energy values, do linear
      interpolation.
   b) otherwise, take either the mean value ("boxcar") or the
      centroid of mu for the segment.
   c) estimate the uncertainty as the standard deviation of mu
      for that segment.
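Here is a minimal, self-contained sketch of those three steps (a
simplified stand-in for the real code linked above; the grid boundaries
and defaults are illustrative, not exactly what Larch uses):

    import numpy as np

    ETOK = 0.2624682917   # 2m/hbar^2 in eV^-1 Ang^-2: k = sqrt(ETOK*(E-E0))

    def make_xafs_grid(e0, xanes_step=0.5, kmax=15.0, dk=0.05):
        """Step 1: a 'standard XAFS grid' -- coarse pre-edge, fine XANES,
        constant-k EXAFS steps converted back to energy."""
        en_pre = np.arange(e0 - 50.0, e0 - 10.0, 5.0)       # pre-edge: 5 eV steps
        en_xanes = np.arange(e0 - 10.0, e0 + 25.0, xanes_step)
        k = np.arange(np.sqrt(ETOK*25.0), kmax, dk)         # EXAFS: 0.05 Ang^-1
        return np.concatenate((en_pre, en_xanes, e0 + k*k/ETOK))

    def rebin(energy, mu, enew):
        """Steps 2 and 3: boxcar average within each bin of the new grid,
        falling back to linear interpolation for bins with fewer than 3
        points.  Assumes energy is sorted and duplicate-free."""
        bounds = np.concatenate(([enew[0] - 1.0],             # bin boundaries at
                                 0.5*(enew[1:] + enew[:-1]),  # midpoints of the
                                 [enew[-1] + 1.0]))           # new grid
        ibin = np.digitize(energy, bounds)
        mu_out, sigma = np.zeros(enew.size), np.zeros(enew.size)
        for i in range(enew.size):
            seg = mu[ibin == i + 1]
            if seg.size > 2:
                mu_out[i] = seg.mean()   # "boxcar" mean for this segment
                sigma[i] = seg.std()     # uncertainty estimate from the scatter
            else:                        # too few points: linear interpolation
                mu_out[i] = np.interp(enew[i], energy, mu)
        return mu_out, sigma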
I know most of the discussion here was about EXAFS, but I am also slightly
concerned about rebinning introducing systematic shifts at the edge, just
in case this data is used as XANES. Because of this, I use a hybrid
solution: rebinning when a segment has enough original data points (3 or
more), and linear interpolation when it has fewer. I think Matthew's code
does that too....
I thought that using the centroid might improve the noise, but it seemed
to have a tiny (1.e-7) effect, at least on the data I've looked at so far.
Also, so far I'm calculating the uncertainty in mu due to the rebinning,
but not doing anything with it yet.
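For reference, the centroid variant amounts to something like this (my
reading of "centroid" as an energy-spacing-weighted mean; see
rebin_xafs.py at the link above for the exact definition used there):

    import numpy as np

    def bin_centroid(e_seg, mu_seg):
        """Energy-weighted mean of mu over one segment: each point is
        weighted by the local energy spacing it covers, so unevenly
        spaced points contribute in proportion to that spacing."""
        de = np.gradient(e_seg)          # local spacing around each point
        return np.sum(mu_seg * de) / np.sum(de)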
With this and other fixes for sorting and removing duplicate data, Larch
XAS Viewer now does an OK job with Carlo's and Edmund's data. Attached are
two plots showing data as originally imported without rebinning, and after
rebinning. The unbinned data look really, really noisy. For Carlo's
fluorescence data at the Fe K edge (1 scan only), rebinning helps a lot.
For Edmund's 8000+ point dataset for Ru in transmission, not rebinning is
a complete disaster, and rebinning is a huge improvement.
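The sorting and duplicate-removal amounts to something like this (a
sketch; the function name and tolerance here are mine, not Larch's):

    import numpy as np

    def sort_and_dedup(energy, mu, tiny=1.e-6):
        """Sort by energy, then drop any point closer than `tiny` eV to
        the previous one -- repeated energies would otherwise break
        interpolation and rebinning."""
        order = np.argsort(energy)
        e, m = energy[order], mu[order]
        keep = np.concatenate(([True], np.diff(e) > tiny))
        return e[keep], m[keep]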
Let me know if you have any more suggestions on how to do this better or
questions about this,
On Thu, Jun 28, 2018 at 8:19 AM Edmund Welter wrote:
Dear Matt,
attached you will find a recent data file which shows the problem, and two plots of the data in the file (a RuO2 powder sample on Scotch tape). One plot shows chi(k) with and without rebinning, the other just mu(E) with and without rebinning. It seems it is no good to convert previously rebinned data to k-space: the rebinned mu(E) looks okay, and the original data converted to k-space also looks okay.
I produced the plots with Athena 0.9.24 (I know, not the most recent version) under WIN 7 with IFEFFIT as backend. I haven't yet been able to reproduce this on my office desktop using Linux and the most recent dathena and Larch, because of some trouble with reading the data file... No idea yet; it might be worth another thread, or might be something stupid on my side.
Cheers,
Edmund
On 27.06.2018 20:31, Matt Newville wrote:
Hi Ilya, Edmund, Carlo,
Ilya and/or Carlo: can you post some example unbinned data? As it turns out, I am adding a rebinning feature to the Larch XAS Viewer GUI, which should be in a ready-to-try release very soon (for the IIT XAFS School and XAFS2018).
This seems like a good chance to test these procedures out.
My approach for this is to make a "normal XAFS energy grid" -- roughly 5 eV steps in the pre-edge, 0.25 eV steps through the XANES, and 0.05 Ang^-1 steps in the EXAFS -- that the downstream processing needs, and then do one of two strategies (maybe there should be more?): a) do a straight interpolation onto this array -- that is probably the "noisy" result; b) assign each energy point in the original data to one of these energy bins, and take the average of all the points in each bin.
I'd also like to try using an energy-weighted mean (centroid). Probably most of the data are so finely spaced that this won't make much difference, but it might be a good option. It might help compensate for energy jitter, assuming that the recorded energy (probably from an encoder) is more accurate than the requested energy.
It's also interesting to think about Savitzky-Golay smoothing, though that might require knowing whether the data points are actually uniform in mono angle or mono energy. It also makes it easy to over-do the smoothing, and so is a little trickier to keep from giving bad results.
Do you (or anyone else) have any suggestions for how to best re-bin this kind of data?
--Matt
On Wed, Jun 27, 2018 at 10:15 AM Carlo Segre wrote:
Yes, we measure fast and have taken as many as 20000 points. The problem is not in the shifts that you mention; that is normal and expected. The problem is specifically in the rebinning algorithm in Demeter. It seems to be different from the one in the old Horae package. I have done a test of this and I attach a couple of figures that show the difference.
I have used 10 continuous scans for this test. The data were taken at the MRCAT beamline, Sector 10 at the APS. The data are for the Fe K-edge, and there are about 3400 points per scan with a point density of about 0.35 eV/step. I used both versions of Athena and performed the following steps to give the data groups shown in the plots:
new_athena.png:
    Fe_new_rebin_merge (blue):  all 10 scans rebinned at input, then merged
    Fe_new_merge (red):         all 10 scans merged only
    Fe_new_merge_rebin (green): all 10 scans merged, then rebinned
old_athena.png:
    Fe_old_rebin_merge (blue):  all 10 scans rebinned at input, then merged
    Fe_old_merge (red):         all 10 scans merged only
    Fe_old_merge_rebin (green): all 10 scans merged, then rebinned
comp_athena.png:
    Fe_old_rebin_merge (blue)
    Fe_new_rebin_merge (red)
It is clear that the new Athena (Demeter) is not rebinning the same way as the old one (Horae). The contrast is particularly evident in the last plot. The new rebinning algorithm is introducing more noise. For the moment, I recommend only merging, and perhaps smoothing if you can tolerate a bit of amplitude reduction.
I have been thinking that it might even be better to have the data acquisition software do the rebinning on the fly so the data does not have to be manipulated in Athena. I am not sure if this is a good idea yet but I think it would help my users.
Carlo
On Wed, 27 Jun 2018, Edmund Welter wrote:
Dear Carlo,
do you also measure as fast as possible, in the sense that for two consecutive scans the points on the energy axis are not at the same positions? This is what happens at my beamline. The differences are typically very small, but there are differences, and one should not just add all the first points, all the second points, and so on, because they are not necessarily at exactly the same energy. Sometimes the beamline computer is doing something else in parallel (whatever that might be) and the distance between points A and B is significantly larger than the distance between B and C.
So the problem is: at what point does it make sense to merge several spectra of the same sample? I presume that Athena takes care of this when I use it to merge spectra, but it can only do so by interpolating the points in each spectrum onto a common grid before summing the spectra, as in the sketch below.
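(A sketch of that idea, not Athena's actual code:)

    import numpy as np

    def merge_scans(scans, egrid):
        """Interpolate each (energy, mu) scan onto a common grid, then
        average -- the only sensible way to sum scans whose energy points
        never line up exactly.  Assumes each scan's energies are sorted."""
        return np.mean([np.interp(egrid, e, m) for e, m in scans], axis=0)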
The best solution might be to rebin/interpolate the spectra onto a fixed grid before they are imported into Athena (or any other program); it depends on what exactly Athena does when it rebins data.
Another aspect is that Athena is not very happy about 8600 points/spectrum anyway, at least as long as it is using Ifeffit.
Cheers,
Edmund
On 27.06.2018 15:14, Carlo Segre wrote:
Hi Ilya:
We always take data in this mode at APS Sector 10, and I have also found
that the rebinning function is not working satisfactorily at this time. I find that with the current version of the software it is better to merge your data and let IFEFFIT interpolate onto the dk=0.05 grid that it uses.
Carlo
On Wed, 27 Jun 2018, Ilya Sinev wrote:
Hi all,
I have a question regarding the chi(k) isolation and rebinning processes. I have some data recorded in "quasi channel-cut" mode, i.e. with the mono constantly moving and the data points collected at the highest possible rate. A 180 sec measurement yields a spectrum of ca. 8600 points, which obviously needs to be rebinned. The rebinned data, however, do not look good in k-space, even if multiple scans are merged. Moreover, I have the impression that the raw spectrum in k-space no longer has those
8000+ points, but significantly fewer. Is there some reduction in the number of data points that is not visible (e.g. as a preparation step for the FT)? Since the unbinned data has higher quality, does it then make more sense to keep using it for EXAFS analysis?
Thank you
Ilya Sinev
-- --Matt Newville <newville at cars.uchicago.edu> 630-252-0431