Matt,

From my experience at the ESRF, many users dealing with XAFS data collected in continuous scan mode (in angle, energy, or time) apply the following workflow: 1) average many scans with PyMca; 2) import them into Athena. PyMca does a linear interpolation using the grid of the first scan in the list:

https://github.com/vasole/pymca/blob/b9cb5209674bf0c3617b3488a75184d4f841264...

I have never run a study on which merging scheme is best, but I would recommend a "histogram-like" approach, as you propose: 1) make a "normal XAFS energy grid"; 2) assign each energy point in the original data to one of these energy bins and take the average of all the points in each bin. This approach is commonly employed in ray-tracing applications, where the number of rays is huge and one needs to make a histogram for plotting (e.g. ray intensity versus a given variable).

Mauro

On 2018-06-27 20:31, Matt Newville wrote:
This seems like a good chance to test these procedures out.
My approach for this is to make a "normal XAFS energy grid" (~5 eV steps, 0.25 eV steps, 0.05 Ang^-1 steps) that the downstream processing needs, and then follow one of two strategies -- maybe there should be more?: a) do a straight interpolation onto this array -- that is probably the "noisy" result; b) assign each energy point in the original data to one of these energy bins, and take the average of all the points in each bin.
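For what it's worth, the two strategies could be sketched with NumPy like this (a hypothetical `rebin` helper of my own, not code from PyMca, Athena, or any other package):

```python
import numpy as np

def rebin(energy, mu, grid):
    """Sketch of the two strategies: (a) straight interpolation onto
    the target grid, (b) average of all measured points per grid bin."""
    # (a) straight interpolation -- probably the "noisy" result
    mu_interp = np.interp(grid, energy, mu)

    # (b) bin edges halfway between grid points (extrapolated at the ends)
    edges = np.concatenate([[1.5 * grid[0] - 0.5 * grid[1]],
                            0.5 * (grid[:-1] + grid[1:]),
                            [1.5 * grid[-1] - 0.5 * grid[-2]]])
    inside = (energy >= edges[0]) & (energy < edges[-1])
    idx = np.digitize(energy[inside], edges) - 1   # bin index per point
    counts = np.bincount(idx, minlength=len(grid))
    sums = np.bincount(idx, weights=mu[inside], minlength=len(grid))
    with np.errstate(invalid='ignore'):
        mu_binavg = sums / counts    # NaN where a bin caught no points
    return mu_interp, mu_binavg

# toy check: finely sampled linear data should re-bin onto the grid values
energy = np.linspace(0.05, 9.95, 100)
grid = np.arange(1.0, 10.0)
mu_interp, mu_binavg = rebin(energy, energy, grid)
```

Empty bins come back as NaN here; a real implementation would have to decide whether to drop those grid points or fall back to interpolation for them.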
I'd also like to try using an energy-weighted mean (centroid). Probably most of the data is so finely spaced that this won't make much difference, but it might be a good option. It could help compensate for energy jitter, assuming that the recorded energy (probably from an encoder) is more accurate than the requested energy.
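The centroid idea could be a small extension of the bin-average: report each bin's mean recorded energy rather than the nominal grid point. A sketch (the `bin_centroids` name and the edge handling are my own assumptions, not an established implementation):

```python
import numpy as np

def bin_centroids(energy, mu, grid):
    """Per-bin averages of both mu and the *recorded* energy, so each
    output point sits at its bin's energy centroid rather than at the
    nominal grid energy."""
    # bin edges halfway between grid points (extrapolated at the ends)
    edges = np.concatenate([[1.5 * grid[0] - 0.5 * grid[1]],
                            0.5 * (grid[:-1] + grid[1:]),
                            [1.5 * grid[-1] - 0.5 * grid[-2]]])
    inside = (energy >= edges[0]) & (energy < edges[-1])
    idx = np.digitize(energy[inside], edges) - 1
    counts = np.bincount(idx, minlength=len(grid)).astype(float)
    e_cen = np.bincount(idx, weights=energy[inside], minlength=len(grid)) / counts
    mu_avg = np.bincount(idx, weights=mu[inside], minlength=len(grid)) / counts
    return e_cen, mu_avg

# finely sampled linear data: the centroids land on the grid points
energy = np.linspace(0.05, 9.95, 100)
grid = np.arange(1.0, 10.0)
e_cen, mu_avg = bin_centroids(energy, 2.0 * energy + 1.0, grid)
```

Downstream code would then use (e_cen, mu_avg) pairs as the merged scan, which is where the compensation for encoder-vs-requested energy jitter would come from.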
It's also interesting to think about Savitzky-Golay smoothing, though that might require knowing whether the data points are actually uniform in mono angle or mono energy. It also makes it easy to over-smooth, and so is a little trickier to guard against bad results.
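If Savitzky-Golay were tried, one way around the uniformity issue might be to interpolate onto a uniform grid first and smooth there, since scipy.signal.savgol_filter assumes evenly spaced samples. The window and order below are purely illustrative -- as noted above, it is easy to over-smooth:

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_smooth(energy, mu, window=11, order=2):
    """Interpolate onto a uniform energy grid, then apply a
    Savitzky-Golay filter (which assumes uniform spacing)."""
    e_uniform = np.linspace(energy[0], energy[-1], len(energy))
    mu_uniform = np.interp(e_uniform, energy, mu)
    return e_uniform, savgol_filter(mu_uniform, window, order)

# toy check: smoothing noisy, unevenly sampled sine data
rng = np.random.default_rng(0)
energy = np.sort(rng.uniform(0.0, 10.0, 200))
mu_noisy = np.sin(energy) + 0.1 * rng.standard_normal(200)
e_u, mu_s = sg_smooth(energy, mu_noisy)
```

Whether the data are uniform in mono angle or mono energy would determine which axis to regrid on before filtering.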
Do you (or anyone else) have any suggestions for how to best re-bin this kind of data?
--Matt