Re: [Ifeffit] Large XAFS data sets

16 Apr 2012

Hi Matt,

I solved it this way: in my code, only the 1-5 reference component
spectra for the LCF are kept permanently in IFEFFIT, and one
variable is reserved for the spectra to be fitted, which are
processed consecutively. Each time a spectrum is fitted, the
results are exported into arrays of my software and sent to the plot
pane. After that, everything except the reference component spectra is
reset in IFEFFIT and the fitted spectrum is overwritten by the next
one. This way the IFEFFIT memory is never exceeded, neither when
fitting 3000 spectra nor, in principle, with 3*10^6 spectra (although
that would take more than a week, I guess). Of course it takes some
time to code, but it works quite well. Nevertheless, a new IFEFFIT
with faster fit routines would be most welcome, of course :).
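
The loop described above can be sketched roughly as follows. This is not my actual code and it does not call IFEFFIT; it is a minimal NumPy illustration under the assumption that all spectra share one energy grid and that an unconstrained least-squares fit stands in for the LCF. The names fit_lcf and fit_all and the synthetic tanh "references" are made up for the example.

```python
import numpy as np

def fit_lcf(references, spectrum):
    """Least-squares weights of the reference spectra for one measured spectrum.

    references: array of shape (n_points, n_refs), one reference per column.
    spectrum:   array of shape (n_points,).
    """
    weights, *_ = np.linalg.lstsq(references, spectrum, rcond=None)
    return weights

def fit_all(references, spectra_iter, n_spectra):
    """Fit spectra one at a time; only references and results stay in memory."""
    results = np.empty((n_spectra, references.shape[1]))
    for i, spectrum in enumerate(spectra_iter):
        results[i] = fit_lcf(references, spectrum)
        # 'spectrum' is rebound on the next pass, so the old one can be freed
    return results

# Synthetic demo (assumed data): two edge-like references, five exact mixtures
energy = np.linspace(0.0, 1.0, 200)
refs = np.column_stack([np.tanh(10 * (energy - 0.4)),
                        np.tanh(10 * (energy - 0.6))])
true_w = np.array([[w, 1.0 - w] for w in np.linspace(0.0, 1.0, 5)])
spectra = (refs @ w for w in true_w)  # generator: yields one spectrum at a time
fitted = fit_all(refs, spectra, len(true_w))
```

Because the spectra arrive through a generator, memory use depends on the number of reference spectra and on the size of the result array, not on how many spectra are fitted in total.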

Regards,

Jan

Quoting Matt Newville:
...
Hi Edmund, Jan, Matthew,
On Thu, Apr 12, 2012 at 8:51 AM, Jan Stötzel wrote:
...
Hi Edmund,
in my opinion, evaluating 3000 spectra is neither a heroic attempt nor brute
force. The LCF of 3000 spectra yields one beautiful plot with 3000 x
(number of components) points, which condenses the information of an enormous
gigabyte-sized data set - for me that is indeed very economical. With such a
plot each reader can decide (or be convinced) which spectra are "boring", and
if you want to study high-order kinetics during your reaction you will be
happy to have enough points to perform the required fits. Of course, one can
also write a program that preselects a few spectra by evaluating the
differences between the spectra. But then I guess you need at least one
(avoidable) parameter to adjust the tolerance that distinguishes differences
caused by noise from those due to real variations in sample composition.
That's why I'd prefer fitting all spectra and highlighting the interesting
parts afterwards. That might take 2-3 hours for a few thousand spectra
(an extended lunch break), but it is still much more convenient than finding
the "interesting" spectra manually by going through all of them (or than
finding the most suitable tolerance parameter). And the resulting plots are
worth the time, imo. Instead, I am going to check how much faster I can get by
parallelizing the processes across multiple cores...
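
One possible way to spread such fits over several cores is sketched below. It is an assumption-laden illustration and not a measured result: it uses threads from Python's concurrent.futures rather than separate processes, splits the spectra into chunks, and solves each chunk with one np.linalg.lstsq call. Whether this gives a real speedup depends on the NumPy build and on the fit routine. fit_chunk, fit_parallel and the synthetic data are hypothetical names and values.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fit_chunk(references, chunk):
    """Fit every spectrum (row) of one chunk against the shared references."""
    # lstsq accepts a 2-D right-hand side: one column per spectrum
    weights, *_ = np.linalg.lstsq(references, chunk.T, rcond=None)
    return weights.T

def fit_parallel(references, spectra, n_workers=4):
    """Split the spectra into chunks and fit the chunks in worker threads."""
    chunks = [c for c in np.array_split(spectra, n_workers) if len(c)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: fit_chunk(references, c), chunks))
    return np.vstack(parts)  # pool.map keeps the original chunk order

# Synthetic demo (assumed data): 3000 noise-free mixtures of two references
rng = np.random.default_rng(0)
energy = np.linspace(0.0, 1.0, 200)
refs = np.column_stack([np.tanh(10 * (energy - 0.4)),
                        np.tanh(10 * (energy - 0.6))])
true_w = rng.random((3000, 2))
spectra = true_w @ refs.T
fitted = fit_parallel(refs, spectra)
```

For fit routines that hold the interpreter lock, a process pool would be the closer match to "one process per core", at the cost of copying the data to each worker.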
Best regards,
Jan Stötzel
Quoting Edmund Welter:
...
Dear XAFS users,
Recently I read several mails on this list dealing with the
problem of large data sets, as produced by Q-EXAFS scans or by
dispersive XAFS. The question was whether there are tools available to
handle data sets of several thousand spectra and perform a linear
combination fit or even a full EXAFS evaluation on each of them.
Evaluating 3000 spectra is a heroic attempt, but I wonder if it is also
economical.
In most (that means not in ALL!) cases, the vast majority of these
spectra are boring, because spectrum number X looks exactly like
spectrum number X-1 did and like spectrum number X+1 will, and so on.
Evaluating all these (basically identical) spectra is in principle a
waste of time and working memory.
The interesting spectra are those which were measured when something was
happening in the sample. Since we do not always know at which time, or
temperature or reactant concentration etc. interesting things will
happen it is without any doubt justified to measure x-thousand spectra,
but after that we should use a more sophisticated approach than brute
force.
I think that it would be much more useful to find procedures (that means
develop computer programs) that search for the (usually relatively small
number of) interesting spectra. The most obvious parameter is how
similar a particular spectrum is to the spectra measured before and
after it. The next step would probably be to identify clusters of related
spectra using statistical methods. This problem has already had to be
solved in other areas, such as the automated analysis of images, and
a solution should also be possible with our kind of data.
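
A minimal sketch of the first step, comparing each spectrum with its predecessor, might look like this. It assumes the spectra are already normalized, share one energy grid and are stored in measurement order; the RMS difference and the name select_changing are illustrative choices, and the tolerance has to be chosen by the user so that it separates noise from real change.

```python
import numpy as np

def select_changing(spectra, tolerance):
    """Indices of spectra that differ from their predecessor by more than tolerance.

    spectra: array of shape (n_spectra, n_points), in measurement order.
    The difference measure is the RMS of the point-wise difference.
    """
    rms = np.sqrt(np.mean(np.diff(spectra, axis=0) ** 2, axis=1))
    return np.flatnonzero(rms > tolerance) + 1  # +1: index of the later spectrum

# Synthetic demo (assumed data): ten flat spectra with one jump at index 5
demo = np.zeros((10, 100))
demo[5:] = 1.0
changed = select_changing(demo, tolerance=0.1)
```

Clustering of related spectra would then operate on the same array, for example on the difference values or on a low-dimensional projection of the spectra.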
Anyway, how to handle thousands of XAFS spectra will become a very
important problem in the future. With all these beamlines that provide
10^12 photons per second we can measure a factor of 100 to 1000 faster
than we did with 10^9 photons per second. So, I wonder if anything beyond
the brute force approach is going on in the EXAFS software universe to make
effective and economical use of the measured data.
Best regards,
Edmund Welter
--
--------------------------------------------------------
Dr. Edmund Welter      Deutsches Elektronen-Synchrotron
DESY FS-Do
Notkestr. 85            Email: edmund.welter@desy.de
D-22607 Hamburg         Phone: +49 40 8998 4510
Germany                 Fax  : +49 40 8998 2787
--------------------------------------------------------
Sorry for the delay, but I just wanted to add my agreement to this
need to be able to work with larger data sets.  Handling ~3000 XANES
spectra for linear analysis is definitely something that we need to be
able to do well, and it is not something that Ifeffit can really do at
all.  This is a definite weakness.  For reference, 3000 EXAFS spectra
can fit in well under 500 MB of memory, and there really should be no
problem handling this amount of data as a multi-dimensional array in
memory on any modern machine.  In fact, expecting an order of
magnitude more is not really that absurd.   The main problem is
simply that the tools we have are not built with this in mind, and the
hardware + data collection have outpaced analysis software.
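
As a rough check of that memory estimate, here is the arithmetic as a short sketch. The numbers are assumptions for illustration only: 2000 points per spectrum, two columns (energy and absorption) and 8-byte floats.

```python
def dataset_megabytes(n_spectra, n_points, n_columns=2, bytes_per_value=8):
    """Raw size in MB of a dense array holding every spectrum."""
    return n_spectra * n_points * n_columns * bytes_per_value / 1e6

# Assumed: 3000 spectra, 2000 points each, energy + mu(E) stored as float64
size_mb = dataset_megabytes(3000, 2000)  # 96.0
```

Even with ten times as many spectra, the same assumptions give about 1 GB, which still fits in memory on an ordinary workstation.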
I believe the Ifeffit2 approach will be much better suited for
handling such large datasets, and can easily use any of the linear
analysis tools from scipy (http://scipy.org/).  If you or anyone else
has some suggested algorithms to include (PCA, non-negative
least-squares, etc.), that would be very helpful.
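
To make the two examples named above concrete, here is a small sketch using scipy.optimize.nnls for the non-negative fit and a plain NumPy SVD as a stand-in for PCA. It does not show any Ifeffit2 interface; the helper names and the synthetic data are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import nnls

def nnls_fit_all(references, spectra):
    """Non-negative least-squares weights for each spectrum (row of spectra)."""
    return np.array([nnls(references, s)[0] for s in spectra])

def pca_singular_values(spectra):
    """Singular values of the mean-centered data; large ones mark real components."""
    centered = spectra - spectra.mean(axis=0)
    return np.linalg.svd(centered, compute_uv=False)

# Synthetic demo (assumed data): 50 noise-free mixtures of two references
rng = np.random.default_rng(1)
energy = np.linspace(0.0, 1.0, 200)
refs = np.column_stack([np.tanh(10 * (energy - 0.4)),
                        np.tanh(10 * (energy - 0.6))])
true_w = rng.random((50, 2))
spectra = true_w @ refs.T

weights = nnls_fit_all(refs, spectra)
sv = pca_singular_values(spectra)
n_components = int(np.sum(sv > 1e-8 * sv[0]))
```

With noisy data the cutoff on the singular values is no longer this clear, so the number of components has to be judged from how quickly the values fall off.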
--Matt