[Ifeffit] How to do with diffraction peak

Matthew Marcus mamarcus at lbl.gov
Thu Dec 13 23:28:09 CST 2012

> You might argue (I would agree) that with multiple measures of
> fluorescence you can determine outliers more rigorously.  But I would
> also say that then the best thing to do then is to not simply remove
> outliers but to take the average AND standard deviation of the samples
> and include both in the analysis.  A glitch would then not be removed,
> but would be reflected by having a less certain value for that energy

But how do you know what the standard deviation should be for a point you had to exclude?  I suppose you could
try to come up with some plasuibiity measure which says that it can't be more different from the results of
interpolation than some amount, a hieuristic measure.  The idea there would be that you do a best-effort
deglitch and then assign an uncertainty based on how badly you may have done at it.  This still makes me queasy.
In a typocal case, the glitch dominates over the real signal so you actually have *no* information about what the signal was
there except for what you know from the surrounding points.  Thus, attempting to use the glitched-out region is
really an attempted double-use of the surrounding points.  Better to adapt your methods so that they can deal with
non-uniform tabulation, which you have anyway.
> point.

> I agree with your larger point that we can definitely afford the CPU
> cycles to do better analysis than use something simply because it is
> Fast.  I'm not sure that a slow discrete FT would be significantly
> different than an FFT, though.  I think (but am listening...) that the
> continuous v discrete FT is like classic audiophile snobbery about the
> superiority of analog over digital music.  With sufficient sampling
> and resolution and dynamic range in the signal, the difference has to
> be trivial.  I'm willing to believe that a digital recording has less
> dynamic range, but I'm also willing to believe that this is swamped by
> my stereo and ears (and I mostly listen to mp3s through cheap
> headphones these days, so I clearly am not buying a tube amplifier
> anytime soon...).
You're right that the DFT is exactly equivalent to the FFT.  However, the VSFT is different in that
it works on non-uniformly tabulated data with no interpolation.
I've heard of an AB test in which Golden Ears were challenged with two amps, one of which was solid-state
and the other also solid-state but with some 2nd-harmonic distortion and (I think) noise added to simulate
a tube amp.  Guess which won?  Somebody once wrote a parody in which mercury-filled speaker cables were
advertised for their "liquid, shimmering sound", but I digress.
> But for EXAFS, we know the limits of the signals in k and R (say, 50
> Ang^-1 and 25 Ang for extrema), and so can choose samplings that
> definitely over-sample enough to not cause significant artifacts.
> There can be dynamic range issues (especially with standard
> pre-amplifiers and VtoF converters), but I think this is a slightly
> separate issue from "the glitch problem".

Are you suggesting changes to the methods used to pick points at which to take data?  The original discussion
had to do with data which had already been taken.  An example of what you're talking about - the QXAS setup on my
beamline scans the mono at a constant angular speed and samples on-the-fly, like an on-the-fly fluorescence map
and using the same hardware.  Thus, at the end of the spectrum, to get longer dwell times, you have to pick larger
energy steps, which means that you're smearing each point because each point is the average over, say, 5eV, rather than
being taken at one enetgy.  However, a simple calculation shows that with reasonable stepsizes, you don't lose
anything significant even out to 10A.
> Interpolation methods are definitely worth discussing and studying.
> I'm not sure that the energy values that happened to have been where
> the monochromator was stopped for a measurement of mu(E) are more
> sacred or a better sampling of the first shell of an EXAFS spectra
> than a set of energies interpolated from those measurements to be on
> uniform k grid, provided the the original data isn't woefully
> undersampled.  With a typical step size of 0.05Ang^1 you can easily
> miss several points and still accurately measure distances up to 5 or
> probably even 10 Angstroms.

That's kind of the idea behind the histogram interpolation used in at least some FT programs.  Let's say that the
data need to be interpolated on a 0.05A grid, and there happen to be points at 3.01, 3.03 and 3.04A^-1.  Then,
chi_out(3.00A^-1) = (4/5)*chi_in(3.01A^-1)+(2/5)*chi_in(3.03A^-1)+(1/5)*chi_in(3.04A^-1)+ contributions from points to the left of 3.00, and
similar for chi_out(3.05A^-1).  If there are no data points between 3.00 and 3.05, then values are assigned by linear interpolation
from the ones on either side of the 3.00-3.05 interval.

Getting back to the original problem, suppose you have a Bragg glitch whose wings go out far enough that you have to lop off the top
of an EXAFS wiggle.  If you try to interpolate over the gap, you will probably screw up the amplitude and maybe the phase in that area.
This may not affect a 1st-shell measurement with only one kind of neighbor, but it could do nasty things to anything more complex.
>>> I do think that having richer window functions and spectral weightings
>>> would be very useful.   You could view chi(k) data with glitches ias
>>> wanting a window function that had a very small value (possibly zero:
>>> remove this point) exactly at the glitch, but had a large value (~1)
>>> at nearby k-values.
>>> Another (and possibly best) approach is to assign an uncertainty to
>>> each k value of chi.  At the glitch, the uncertainty needs to go way
>>> up, so that not matching that point does not harm the overall fit.
>> A window function which went to 0 in the glitch region is tantamount to an
>> assertion that chi(k)=0 there, which is generally not so
>> and is probably a lot worse than the error you make by even the lamest
>> interpolation through the glitch. The uncertainty makes more
>> sense, except that if you filter data, the noise spreads out in a correlated
>> manner over adjacent regions.  Also, the tacit assumption
>> of an uncertainty-based method is that the noise is uncorrelated within the
>> high-uncertainty region.  That said, I can imagine that
>> this is one of those methods that works better than it has any right to.
> I think of the window as weighting of measurements to consider, and
> that a weight of zero is applied to data with infinite uncertainty.  I
> don't think of chi(k) as zero outside the window range (say, below
> kmin or above kmax), just that we don't want to consider that in the
> analysis.  Again, I think the better thing would be to have an
> uncertainty for each value of chi(k), and propagate that through the
> analysis.

That's not what's usually meant by windowing.  Usually, a window is a multiplying factor.
Now, that said, if you apply the same multiplying factor to the results of a fit, you're effectively
doing a weighting.
	Sumsq = sum{W[i]*(y[i]-yfit[i])^2) = sum((w[i]*y[i]-w[i]*yfit[i])^2) for W[i]=w[i]^2 .
Also, the issue of weighting gets into some deep and murky waters.  The idealist would say that one
should weight according to some measure of noise statistics, such as sqrt(N) noise.  However, it's
not usual that the spectrum be as good as one would expect from sqrt(N), and how do you define N
when you're using analog counters like ion chambers or Lytle detectors anyway?  Also, systematic
noise is often point-to-point correlated so not white.  Further, people like to play with varying
k-weight exponents in order to make certain features show up.  Now add the popular practice of
fitting in R-space, wherein every point you fit is actually a linear combination of the input data so
the noise in Re(chi~(R)) and Im(chi~(R)) is definitely not white.  That's one can of worms I don't plan
to open, not even to go fishing with.
>> Now, in my proposed system, how would you fit filtered (q-space) data?  The
>> first step is to do the very slow FT (VSFT) on the input
>> data with no interpolation.  This leads to a set of sines and cosines which
>> replicate the data.  I would then apply the window and
>> do the back-summation of all those sines and cosines at the given points.
>> This will undoubtedly introduce some artifacts near
>> the edges of the deleted regions simply due to our ignorance of what the data
>> really do in there.  Next, in each fit iteration, I would
>> evaluate the model function at each point, then do the VSFT and filtering
>> exactly as had been done on the data, which causes the fit to
>> have the same artifacts.  Since the VSFT and filter operations are linear, it
>> should be possible to pre-compute a kernel which relates the
>> values of unfiltered data to those of filtered data.  This would be a matrix
>> of size Np*Np with Np=# of points.  Using the kernel instead
>> of evaluating gazillions of trig functions would be reasonably fast; I
>> suspect that ifeffit does a similar trick.  Similarly, fitting in
>> R-space would be accomplished by computing the VSFT on data and model
>> function.
>> Yes, I know I talk a good game.  Unfortunately, I'm too swamped to attempt
>> implementation.  Also, I'd have to reinvent a number of
>> very well-engineered wheels to do it.
> I hear you!  I'm interested in pursuing "more advanced weightings and
> transforms", but don't have much time to spend on such things these
> days, so development is slow....
>>> Larch has a separate Transform class used for each dataset in a fit -
>>> this includes the standard FT parameters and fit ranges.  It is
>>> intended to be able to do this sort of advanced windowing (and in
>>> principle, advanced weighting too).  Currently (I'd be ready for a
>>> release but am commissioning our new beamline and am hoping to post a
>>> windows installer by the end of the month)  this has "the standard
>>> XAFS window functions", but one can create their own window functions
>>> (for k- and/or R-space) with suppressed points/regions, etc, and use
>>> them as well.  I have to admit I haven't tried this, but it was
>>> definitely part of the intention, and should work.   My hope is to
>>> extend this to include some wavelet-like weightings as well.
>> What is Larch?  Is it a replacement for ifeffit?  By 'class' do you mean
>> object as in OOP?
>> "And now for something completely different.  The larch." - Monty Python
>> 	mam
> Yes, Mostly yes, and, Yes, you get the reference exactly ("How to
> recognize different types of trees from quite a long way away, Number
> 1: The Larch").
>    http://xraypy.github.com/xraylarch/
> Another wide-open appeal:  I'd love help in implementing improvements
> like these (better windows, comparing continuous v discrete
> transforms, using wavelets, etc), and think work like this would be
> publishable.   There are several other improvements I'd like to see,
> such as generating Feff Paths on-the-fly, and so forth.    Essentially
> all of Larch is in python so that it's very easy to modify and add
> features like this.  The available numerical libraries for python are
> plentiful and well-maintained.  It's also easy to add wrappers for C
> and Fortran code if that is needed.....

Sigh.  If I had infinite time or multiple copies of myself, I'd learn Python.  It seems to be the way
to go for this kind of work.  For the last 12 years or so, I've been working almost exclisively in
> --Matt
> _______________________________________________
> Ifeffit mailing list
> Ifeffit at millenia.cars.aps.anl.gov
> http://millenia.cars.aps.anl.gov/mailman/listinfo/ifeffit

More information about the Ifeffit mailing list