Michael,

Let's start by talking about where chi(k) comes from in the software. mu(E) is measured on some grid. Eventually, we want chi(k) to be on a clear, specified, reasonably (but not excessively) dense grid in k. chi(E) = (mu(E) - mu0(E)) / mu0(E0), which places chi(E) on the same grid as the original data. To prepare for the Fourier transform and the comparison with theory, chi(E) needs to be converted to k and put onto the specified k grid. In our software, this is done by interpolation.

The interpolation is local and does not consider the density of the grid in E. So long as the grid in E is not very dense compared to the wavelengths represented in the data [1], this works fine. However, with a very dense E grid, it is possible to come up with a situation where this local interpolation -- i.e. using a few points from the original grid which surround the point on the target k-grid -- will yield undesirable results.

How do I know it's possible for the interpolation to go all wonky like this? You contrived just such a situation :)

Your deglitching step removed about 40 eV worth of data, which corresponds to about 0.5/Ang in the region of 11.5/Ang in k -- about 10 points on the k-grid. This is a BIG gap in the data. The normal presumption when deglitching is that you are only removing a few points of data. If you are removing a huge gash from your data -- as you did -- you need to think hard about how you fill that gash back in. Your data set #2 is a pretty bad solution. Your data set #4 seems like a much better solution. Let's examine why.

In data set #2 you contrived exactly the problem I described above. You left a gash in the data which was about 10 grid steps wide. When going from chi(E) to chi(k), the interpolation was done. At the first data point in the gap, the previous few data points were used in the interpolation. Since those few points were pointing down-ish, the first point in the gap was lower than the baseline. The next point interpolated even lower, the next even lower. Eventually, the data had to hook back up with the other side of the gap, so the interpolation rose back up. By cutting a big gash out of the data and doing a simple interpolation, you introduced the weird, downward-pointing feature at 11.5.

Data set #4 is a much more sensible solution because you rebinned before deglitching. Rebinning is implemented in Athena as a convolution with a square kernel -- this is often called a boxcar average. The convolution kernel is passed over the data and evaluated at the target grid points. The data before rebinning were on a very dense grid. After rebinning, they are on the "conventional" energy grid that was used back in the day when we all did step scans and for which the background removal algorithm was originally written and optimized. By rebinning first, the data are smoother and sparser and less susceptible to the interpolation effect that you saw in data set #2.

Soooooo ..... is Athena broken? An argument could be made that the interpolation is the problem and the solution should be a better algorithm. There is merit to that, but I am going to argue something else. I think the problem is that beamlines are implementing quick scanning without thinking about all the needs of their users. Your original data consist of about 4000 points. They are rebinned to about 600 points -- about the size of a conventional step scan. Last week, one of my colleagues here at BNL showed me a quick scan which had almost 200,000 points in it. Yowzers!
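If it helps to see the two scenarios side by side, here is a rough numpy sketch of the idea. To be clear, this is not Athena's code: the edge energy, the cartoon chi(E), the noise level, the three-point local quadratic, and the boxcar-then-subsample rebinning are all stand-ins I am assuming for illustration. The only point is that the same gash, bridged by a local interpolation onto the 0.05/Ang k-grid, behaves very differently depending on whether the data are still on the dense quick-scan grid or have been rebinned first.

# A rough numpy sketch of the two scenarios, NOT Athena's actual code.
# Edge energy, chi(E) model, noise level, the local quadratic interpolation
# and the boxcar-then-subsample rebinning are all illustrative stand-ins.
import numpy as np

E0 = 7112.0                                    # hypothetical edge energy, eV
e = np.arange(E0 + 20.0, E0 + 800.0, 0.25)     # very dense quick-scan grid

def k_of(en):                                  # k (1/Ang) from E - E0 (eV)
    return np.sqrt(0.2625 * (en - E0))

# fake, already-normalized chi(E): damped oscillation plus a little noise
rng = np.random.default_rng(0)
chi_e = (0.5 * np.sin(2 * 2.3 * k_of(e)) * np.exp(-0.02 * k_of(e)**2)
         + rng.normal(0.0, 0.002, e.size))

def local_quad(x, y, targets):
    # quadratic through the two grid points at/below each target plus the
    # next point above it -- a generic local interpolation, standing in for
    # (but not identical to) the one Athena uses
    out = np.empty_like(targets)
    for i, t in enumerate(targets):
        j = np.clip(np.searchsorted(x, t) - 1, 1, x.size - 2)
        out[i] = np.polyval(np.polyfit(x[j-1:j+2] - t, y[j-1:j+2], 2), 0.0)
    return out

kgrid = np.arange(2.5, 14.0, 0.05)             # target k-grid, 0.05/Ang steps

# data set #2 scenario: cut a ~40 eV gash (k ~ 11.3-11.7/Ang) from the dense
# data, then interpolate straight onto the k-grid.  Inside the gash the local
# fit is steered by a couple of closely spaced, noisy points at the gap edge,
# so chi(k) sails off before hooking back up at the other side of the gap.
gash = (e > E0 + 485.0) & (e < E0 + 525.0)
chi_k_bad = local_quad(k_of(e[~gash]), chi_e[~gash], kgrid)

# data set #4 scenario: rebin first (boxcar kernel, evaluated every 10th
# point, i.e. a conventional ~2.5 eV grid), then remove the same window.
width = 10
e_rb = e[::width]
chi_rb = np.convolve(chi_e, np.ones(width) / width, mode='same')[::width]
keep = ~((e_rb > E0 + 485.0) & (e_rb < E0 + 525.0))
chi_k_good = local_quad(k_of(e_rb[keep]), chi_rb[keep], kgrid)

Plot kgrid against chi_k_bad and chi_k_good and the first should show a spurious swing near 11.5/Ang where the interpolation bridged the gash, while the second bridges it much more gently -- the same contrast as between your data sets #2 and #4.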
My question to the beamline scientists out there is this: what do your users want and need? While it is possible that you might have a power user with a good reason to examine the data file with 4000 or 200,000 points [2], most of your users want the data rebinned onto a conventional grid. So why are beamlines sending their users home with data in a format that is not what they want? That's just bad practice.

Had you simply gone home [3] with your quick scan data rebinned onto a sensible grid, you would never have even noticed this problem because you would have naturally fallen into the case of data set #4. You should demand that from your beamline scientist.

B

[1] I mean the part of the data that is EXAFS, not the part of the data that is unnormalized monochromator glitch.

[2] ... and it is certainly true that the beamline scientist needs to examine the large, dense, raw data file ...

[3] "But what about the sanctity of raw data" is something that I am sure someone is sputtering right now. Well, to take the example of Diamond, /all/ the data are streamed into an HDF5 file. Column data files are then written for the convenience of the user. That's a great solution. The HDF5 file has all the things but the user can interact with the salient representation of the measurement.

On 08/12/2016 07:56 AM, Michael Gaultois wrote:
Dear members of the Ifeffit list,
I recently collected some EXAFS data with some significant monochromator glitches that I am looking to remove. I have used a python script graciously written by the beamline scientist to remove the offending regions, but when I import the data into Athena, Athena does some funny business in an attempt to join together the regions outside of the data gap. (See the bending away in the dataset and/or attached image.) I have confirmed by plotting with other software that the strange step-like behaviour in the mu(E) is present only after importing into Athena (the raw data is fine).
I have looked through the mailing list archives and also the user manual, but can't seem to find anything that explains it, or other people who have experienced this problem in the past. From what I can determine, Athena seems to join the segments together with a linear interpolation in norm(E), which leads to a warping in mu(E). ==How does Athena try to treat this data?==
I was wondering if other people have had similar issues, and what steps can be taken to remedy the problem. For example, replacing removed data points with artificial points along a linear interpolation would be possible, but the act of adding artificial points that don't exist is concerning to me. ==What is the best way to treat data with mono glitches to reduce spurious features not intrinsic to the sample?==
If you are interested, I have included links to .prj datasets and images to highlight these problems below.
With thanks for your time, Michael
---------- .prj file with 4 ways of working up the same data: http://bit.ly/2bnfNZ5
1) raw data
2) mono glitches removed
3) rebinned data
4) rebinned and manually removed points (This leads to some strange-looking features in k-space, and this would be less than desirable on the many datasets we have collected)
images to highlight these problems: a) mu(E) http://bit.ly/2aYaJIf
b) norm(E) http://bit.ly/2aYaX27
--
Bruce Ravel ------------------------------------ bravel@bnl.gov

National Institute of Standards and Technology
Synchrotron Science Group at NSLS-II
Building 743, Room 114
Upton NY, 11973

Homepage: http://bruceravel.github.io/home/
Software: https://github.com/bruceravel
Demeter: http://bruceravel.github.io/demeter/