[Ifeffit] Athena interpolation when removing mono glitches

Fri Aug 12 08:22:22 CDT 2016

Michael,

Let's start by talking about where chi(k) comes from in the software.

mu(E) is measured on some grid.  Eventually, we want chi(k) to be on a
clear, specified, reasonably (but not excessively) dense grid in k.
chi(E) is (mu(E) - mu0(E)) / delta_mu0(E0), where delta_mu0(E0) is the
edge step.  This places chi(E) on the same grid as the original data.

To prepare for the Fourier transform and the comparison with theory,
chi(E) needs to be converted to k and put onto the specified k grid.
In our software, this is done by interpolation.  The interpolation is
local and does not consider the density of the grid in E.  So long as
the grid in E is not very dense compared to the wavelengths
represented in the data [1], this works fine.  However, with a very
dense E-grid, it is possible to come up with a situation where this
local interpolation -- i.e. using a few points from the original grid
which surround the point on the target k-grid -- will yield
undesirable results.
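
As a minimal sketch of that conversion (an illustration in Python with
numpy, not the actual Ifeffit code): E is converted to k using
k^2 = 0.2625 * (E - E0), and chi is then interpolated onto an even
k-grid.  The function name, the 0.05/Ang step, and the choice of plain
linear interpolation via np.interp are all assumptions made for this
example.

```python
import numpy as np

ETOK = 0.2624682917   # 2m/hbar^2 in 1/(eV Ang^2): k^2 = ETOK * (E - E0)

def chi_e_to_k(energy, chi_e, e0, kstep=0.05):
    """Put chi(E) onto a uniform k-grid by local interpolation.

    Hypothetical helper for illustration only; kstep is an assumed value.
    """
    energy = np.asarray(energy, dtype=float)
    chi_e = np.asarray(chi_e, dtype=float)
    above = energy > e0                       # k is only defined above the edge
    k_data = np.sqrt(ETOK * (energy[above] - e0))
    k_grid = np.arange(0.0, k_data.max(), kstep)
    # np.interp uses only the two samples that bracket each target point.
    # It knows nothing about how dense or sparse the input grid is.
    chi_k = np.interp(k_grid, k_data, chi_e[above])
    return k_grid, chi_k
```

The point of the sketch is the last step: each value on the k-grid is
built from a handful of neighboring samples and nothing else.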

How do I know it's possible for the interpolation to go all wonky like
this?  You contrived just such a situation :)

Your deglitching step removed about 40 eV worth of data, which
corresponded to about 0.5/Ang in the region of 11.5/Ang in k -- about
10 points on the k-grid.  This is a BIG gap in the data.
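
Those numbers can be checked with a few lines of Python.  This is
only a back-of-the-envelope sketch: it assumes the gap is centered on
11.5/Ang and that the k-grid step is 0.05/Ang, and the function name
is made up for the example.

```python
ETOK = 0.2624682917   # 2m/hbar^2 in 1/(eV Ang^2): k^2 = ETOK * (E - E0)

def gap_width_in_ev(k_low, k_high):
    """Energy width (eV) of the interval between two k values (1/Ang)."""
    return (k_high**2 - k_low**2) / ETOK

k_low, k_high = 11.25, 11.75      # assumed: 0.5/Ang centered on 11.5/Ang
width_ev = gap_width_in_ev(k_low, k_high)
n_grid_points = round((k_high - k_low) / 0.05)   # assumed 0.05/Ang k-grid

print(f"{width_ev:.1f} eV wide, {n_grid_points} k-grid points")
```

Under those assumptions the gap comes out a little over 40 eV wide
and spans 10 steps of the k-grid.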

The normal presumption when deglitching is that you are only removing
a few points of data.  If you are removing a huge gash from your data
-- as you did -- you need to think hard about how you fill that gash
back in.

Your data set #2 is a pretty bad solution.  Your data set #4 seems
like a much better solution.  Let's examine why.

In data set #2 you contrived exactly the problem I described above.
You left a gash in the data which was about 10 grid steps wide.  When
going from chi(E) to chi(k), the interpolation had to cross that gash.
At the first k-grid point in the gap, the previous few data points were
used in the interpolation.  Since those few points were pointing
down-ish, the first point in the gap was lower than the baseline.  The
next point interpolated even lower, the next even lower.  Eventually,
the data had to hook back up with the other side of the gap, so the
interpolation rose back up.  By cutting a big gash out of the data and
doing a simple interpolation, you introduced the weird, downward
pointing feature at 11.5/Ang.
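
The effect is easy to reproduce with toy numbers.  The sketch below
is NOT Athena's interpolation routine.  It assumes a three-point
parabola through the two samples just below each target point and
the one just above, which is one simple kind of "local"
interpolation.  The data are contrived: a flat chi = 0 on a dense
grid, with the last point before the gap nudged down by 0.004.

```python
import numpy as np

def local_quadratic(x_new, x, y):
    """Interpolate with a parabola through three neighboring samples:
    the two just below each target point and the one just above it."""
    out = np.empty(len(x_new))
    for n, xv in enumerate(x_new):
        j = int(np.searchsorted(x, xv))      # first sample at or above xv
        j = min(max(j, 2), len(x) - 1)       # keep j-2 .. j inside the array
        x0, x1, x2 = x[j - 2], x[j - 1], x[j]
        y0, y1, y2 = y[j - 2], y[j - 1], y[j]
        # Lagrange form of the parabola through the three samples
        out[n] = (y0 * (xv - x1) * (xv - x2) / ((x0 - x1) * (x0 - x2))
                  + y1 * (xv - x0) * (xv - x2) / ((x1 - x0) * (x1 - x2))
                  + y2 * (xv - x0) * (xv - x1) / ((x2 - x0) * (x2 - x1)))
    return out

# Contrived dense data: flat chi = 0 sampled every 0.002/Ang, k = 10 to 13.
idx = np.arange(1501)
k_all = 10.0 + 0.002 * idx
chi_all = np.zeros(len(k_all))
chi_all[625] = -0.004     # last point kept before the gap sits a hair low

# "Deglitch" by cutting out everything strictly between k = 11.25 and 11.75.
keep = (idx <= 625) | (idx >= 875)
k_cut, chi_cut = k_all[keep], chi_all[keep]

# Interpolate onto a 0.05/Ang grid, as one would for chi(k).
k_grid = 10.0 + 0.05 * np.arange(60)
chi_quadratic = local_quadratic(k_grid, k_cut, chi_cut)
chi_linear = np.interp(k_grid, k_cut, chi_cut)   # straight line across gap

deepest = int(np.argmin(chi_quadratic))
print(k_grid[deepest], chi_quadratic[deepest], chi_linear.min())
```

With these made-up numbers the two closely spaced points at the edge
of the gap set a steep downward slope, and the parabola follows it
down to roughly -0.25 near k = 11.5 before climbing back to meet the
far side.  That is some sixty times the size of the 0.004 offset that
caused it.  A straight line across the same gap never goes below
-0.004.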

Data set #4 is a much more sensible solution because you rebinned
before deglitching.  Rebinning is implemented in Athena as a
convolution with a square kernel -- this is often called a boxcar
average.  The kernel is passed over the data and the result is
evaluated at the target grid points.  The data before rebinning were
on a very dense
grid.  After rebinning, they are on the "conventional" energy grid
that was used back in the day when we all did step scans and for which
the background removal algorithm was originally written and optimized.
By rebinning first, the data are smoother and sparser and less
susceptible to the interpolation effect that you saw in data set #2.
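
For anyone who wants to see the idea in code, here is a minimal
boxcar rebinning sketch in Python.  It is an illustration of the
technique, not a copy of what Athena does: the function name is made
up, and placing the box edges halfway between neighboring target
points is an assumption of this example.

```python
import numpy as np

def boxcar_rebin(x, y, x_target):
    """Average y over a box centered on each point of x_target.

    Box edges sit halfway between neighboring target points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_target = np.asarray(x_target, dtype=float)
    mid = 0.5 * (x_target[:-1] + x_target[1:])
    first = 2.0 * x_target[0] - mid[0]       # mirror the first half-box
    last = 2.0 * x_target[-1] - mid[-1]      # mirror the last half-box
    edges = np.concatenate(([first], mid, [last]))
    out = np.full(len(x_target), np.nan)     # NaN marks an empty box
    for i in range(len(x_target)):
        inside = (x >= edges[i]) & (x < edges[i + 1])
        if inside.any():
            out[i] = y[inside].mean()
    return out

# Toy example: 4001 dense points at a level of 3.0 with alternating
# +/-0.1 "noise", rebinned onto 17 coarse points.
dense_x = np.linspace(0.0, 10.0, 4001)
dense_y = 3.0 + 0.1 * (-1.0) ** np.arange(4001)
coarse_x = np.linspace(1.0, 9.0, 17)
coarse_y = boxcar_rebin(dense_x, dense_y, coarse_x)
```

Each output point is the mean of a couple of hundred input points, so
the point-to-point scatter that can steer a local interpolation is
largely averaged away.  A box that contains no data at all (for
example, one that falls entirely inside a deglitched gap) is returned
as NaN here rather than being filled in silently.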

Soooooo ..... is Athena broken?  An argument could be made that the
interpolation is the problem and the solution should be to make a
better algorithm.  There is merit to that, but I am going to argue
something else.

I think the problem is that beamlines are implementing quick scanning
without thinking about all the needs of their users.  Your original
data consist of about 4000 points.  They are rebinned to about 600
points -- about the size of a conventional step scan.

Last week, one of my colleagues here at BNL showed me a quick scan
which had almost 200,000 points in it.  Yowzers!

My question to the beamline scientists out there is this: what do your
users want and need?  While it is possible that you might have a power
user with a good reason to examine the data file with 4000 or 200,000
points [2], most of your users want the data rebinned onto a
conventional grid.  So why are beamlines sending their users home with
data in a format that is not what they want?  That's just bad
practice.

Had you simply gone home [3] with your quick scan data rebinned onto a
sensible grid, you would never have even noticed this problem because
you would have naturally fallen into the case of data set #4.

You should demand that from your beamline scientist.

B

[1] I mean the part of the data that is EXAFS, not the part of the
     data that is unnormalized monochromator glitch.

[2] ... and it is certainly true that the beamline scientist needs to
     examine the large, dense, raw data file ...

[3] "But what about the sanctity of raw data" is something that I am
     sure someone is sputtering right now.  Well, to take the example
     of Diamond, /all/ the data are streamed into an HDF5 file.  Column
     data files are then written for the convenience of the user.
     That's a great solution.  The HDF5 file has all the things but the
     user can interact with the salient representation of the
     measurement.

On 08/12/2016 07:56 AM, Michael Gaultois wrote:
> Dear members of the Ifeffit list,
>
> I recently collected some EXAFS data with some significant monochromator
> glitches that I am looking to remove. I have used a python script
> graciously written by the beamline scientist to remove the offending
> regions, but when I import the data into Athena, Athena does some funny
> business in an attempt to join together the regions outside of the data
> gap. (See the bending away in the dataset and/or attached image.) I have
> confirmed by plotting with other software that the strange step-like
> behaviour in the mu(E) is present only after importing into Athena (the
> raw data is fine).
>
> I have looked through the mailing list archives and also the user
> manual, but can't seem to find anything that explains it, or other
> people who have experienced this problem in the past. From what I can
> determine, Athena joins together the segments to obtain a linear
> interpolation in the norm(E)? This leads to a warping in the mu(E).
> ==How does Athena try to treat this data?==
>
> I was wondering if other people have had similar issues, and what steps
> can be taken to remedy the problem. For example, replacing removed data
> points with artificial points along a linear interpolation would be
> possible, but the act of adding artificial points that don't exist is
> concerning to me.
> ==What is the best way to treat data with mono glitches to reduce
> spurious features not intrinsic to the sample?==
>
> If you are interested, I have included links to .prj datasets and images
> to highlight these problems below.
>
> With thanks for your time,
> Michael
>
> ----------
> .prj file with 4 ways of working up the same data:
> http://bit.ly/2bnfNZ5
>
> 1) raw data
> 2) mono glitches removed
> 3) rebinned data
> 4) rebinned and manually removed points (This leads to some
> strange-looking features in k-space, and this would be less than
> desirable on the many datasets we have collected)
>
> images to highlight these problems:
> a) mu(E)
> http://bit.ly/2aYaJIf
>
> b) norm(E)
> http://bit.ly/2aYaX27
>
> c) k
> http://bit.ly/2baWh0h
>

-- 
  Bruce Ravel  ------------------------------------ bravel at bnl.gov

  National Institute of Standards and Technology
  Synchrotron Science Group at NSLS-II
  Building 743, Room 114
  Upton NY, 11973

  Homepage:    http://bruceravel.github.io/home/
  Software:    https://github.com/bruceravel
  Demeter:     http://bruceravel.github.io/demeter/