Hi,

Following up on my previous mail: my question is which option I should choose, the "guess" option or the "set" option. Or did I make a mistake somewhere? Was the procedure I followed incorrect? Please advise.
John,

Well, one guesses the parameters that one wants to guess and one sets the parameters one wants to set. But that's probably not quite the answer you are looking for.

You are not asking your questions very well. When you ask a vague, open-ended question, you tend to get a vague, open-ended answer. How are any of us supposed to know whether your "procedure" was "right"? We don't know what you did. How are we supposed to make suggestions? We don't know what the question is. Here's a web page I wrote that tries to help people ask useful questions: http://bruceravel.github.com/demeter/pods/help.pod.html

B

--
Bruce Ravel ------------------------------------ bravel@bnl.gov
National Institute of Standards and Technology
Synchrotron Methods Group at NSLS --- Beamlines U7A, X24A, X23A2
Building 535A, Upton, NY 11973
My homepage: http://xafs.org/BruceRavel
EXAFS software: http://cars9.uchicago.edu/ifeffit/Demeter
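As a concrete illustration of the guess/set distinction (a hedged sketch, not Ifeffit's actual syntax; the model and parameter names here are made up): a "set" parameter is pinned to a chosen value, while "guess" parameters are varied by the fit engine.

```python
import numpy as np
from scipy.optimize import curve_fit

S02 = 0.9  # a "set" parameter: held fixed at a chosen value, never varied

def model(k, amp, sigma2):
    # amp and sigma2 play the role of "guess" parameters: the fit varies them
    return S02 * amp * np.exp(-2.0 * sigma2 * k**2)

k = np.linspace(2.0, 12.0, 200)
rng = np.random.default_rng(0)
data = model(k, 1.0, 0.003) + 0.01 * rng.standard_normal(k.size)

popt, pcov = curve_fit(model, k, data, p0=[1.0, 0.003])
print("fitted guess parameters (amp, sigma2):", popt)
```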
Dear XAFS users,

Recently I read several mails on this list dealing with the problem of large data sets, as produced by Q-EXAFS scans or by dispersive XAFS. The question was whether there are tools available to handle data sets of several thousand spectra and perform a linear combination fit, or even a full EXAFS evaluation, on each of them. Evaluating 3000 spectra is a heroic attempt, but I wonder if it is also economical.

In most (that means not in ALL!) cases, the vast majority of these spectra are boring, because spectrum number X looks exactly like spectrum number X-1 looked and like spectrum number X+1 will look, and so on. Evaluating all these (basically identical) spectra is in principle a waste of time and working memory. The interesting spectra are those measured while something was happening in the sample. Since we do not always know at which time, temperature, reactant concentration, etc. interesting things will happen, it is without any doubt justified to measure x-thousand spectra; but after that we should use a more sophisticated approach than brute force.

I think it would be much more useful to find procedures (that means develop computer programs) that search for the (usually relatively small number of) interesting spectra. The most obvious criterion is how similar a particular spectrum is to the spectra measured before and after it. The next step would probably be to identify clusters of related spectra using statistical methods. This is a problem that had to be solved before in other areas, like the automated analysis of images, and it should also be possible with our kind of data.

Anyway, how to handle thousands of XAFS spectra will become a very important problem in the future. With all these beamlines that provide 10^12 photons per second, we can measure a factor of 100 -- 1000 faster than we did with 10^9 photons per second. So I wonder if anything beyond the brute-force approach is going on in the EXAFS software universe to make effective and economical use of the measured data.

Best regards,
Edmund Welter

--
Dr. Edmund Welter
Deutsches Elektronen-Synchrotron DESY, FS-Do
Notkestr. 85, D-22607 Hamburg, Germany
Email: edmund.welter@desy.de
Phone: +49 40 8998 4510
Fax: +49 40 8998 2787
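The neighbor-similarity criterion Edmund proposes is straightforward to prototype. A minimal sketch, assuming the spectra are already interpolated onto a common energy grid and stacked in a numpy array (the function name and threshold rule are assumptions, not an existing tool):

```python
import numpy as np

def flag_interesting(spectra, nsigma=3.0):
    """Flag spectra that differ from their predecessor by more than
    nsigma times the median spectrum-to-spectrum distance; for a mostly
    static sample the median distance estimates the noise level."""
    diffs = np.sqrt(((spectra[1:] - spectra[:-1]) ** 2).mean(axis=1))
    return np.where(diffs > nsigma * np.median(diffs))[0] + 1

# usage: indices = flag_interesting(all_spectra)   # all_spectra: (N, n_points)
```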
This is where some sort of automatic classifier or dimensionality reducer would come in handy. You would first have to have auto-reduction into some standard form like k^n*chi(k) (EXAFS) or pre-edge-subtracted, post-edge-normalized (XANES). After that, you might use PCA or some other such tool to express each spectrum as a point in some high-dimensional space, then find a projection in that space that allows you to see interesting features. For instance, spectra along a reaction sequence might plot out as a 1D curve twisting through a higher-dimensional space. I've done something like that for XANES spectra of inhomogeneous samples, identifying clusters of 'alike' spectra. Projection pursuit methods might be a way to go for finding 'interesting' projections.

mam
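A rough illustration of the PCA step Matthew describes (a sketch only, assuming normalized spectra stacked in an (n_spectra, n_points) array):

```python
import numpy as np

def pca_scores(spectra, ncomp=3):
    """Project each spectrum onto the first ncomp principal components,
    turning each spectrum into a point in an ncomp-dimensional space."""
    centered = spectra - spectra.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    return U[:, :ncomp] * s[:ncomp]   # scores, shape (n_spectra, ncomp)

# Plotting scores[:, 0] against scores[:, 1] shows a reaction sequence as a
# curve; clusters of 'alike' spectra appear as dense groups of points.
```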
Hi Edmund,
in my opinion, evaluating 3000 spectra is neither a heroic attempt nor brute force. The LCF of 3000 spectra yields one beautiful plot with 3000 x (number of components) points, in which the information of an enormous multi-gigabyte data set is condensed; for me that is indeed very economical. With such a plot, each reader can decide (or be convinced) which spectra are "boring", and if you want to study higher-order kinetics during your reaction you will be happy to have enough points to perform the required fits.

Of course, one can also write a program that preselects a few spectra by evaluating the differences between the spectra. But then I guess you need at least one (avoidable) parameter to adjust the tolerance that distinguishes differences caused by noise from those due to real variations in sample composition.

That's why I'd prefer fitting all spectra and afterwards highlighting the interesting parts. That might take 2-3 hours for a few thousand spectra (an extended lunch break), but it is still much more convenient than finding the "interesting" spectra manually by going through all of them (or than finding the most suitable tolerance parameter). And the resulting plots are worth the time, in my opinion. Next, I am going to check how much faster I can get by parallelizing the processing across multiple cores (a sketch of this follows below)...
Best regards,
Jan Stötzel
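A minimal sketch of the bulk-LCF workflow Jan describes, illustrative only (the file names and array shapes are assumptions): every spectrum is fit as a non-negative linear combination of reference standards with scipy's nnls, parallelized across cores with multiprocessing.

```python
import numpy as np
from multiprocessing import Pool
from scipy.optimize import nnls

# Hypothetical inputs: standards and spectra share one energy grid.
standards = np.loadtxt("standards.dat")   # shape (n_points, n_components)
spectra = np.load("all_spectra.npy")      # shape (n_spectra, n_points)

def lcf_one(spectrum):
    """Non-negative LCF of one spectrum; returns normalized fractions."""
    weights, rnorm = nnls(standards, spectrum)
    return weights / max(weights.sum(), 1e-12)

if __name__ == "__main__":
    with Pool() as pool:                  # one worker per core by default
        fractions = np.array(pool.map(lcf_one, spectra))
    # Plotting each column of 'fractions' against spectrum number gives the
    # single overview plot (3000 x n_components points) Jan refers to.
```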
Hi Edmund, Jan, Matthew,
Sorry for the delay, but I just wanted to add my agreement about the need to work with larger data sets. Handling ~3000 XANES spectra for linear analysis is definitely something we need to be able to do well, and it is not something that Ifeffit can really do at all. This is a definite weakness.

For reference, 3000 EXAFS spectra can fit in well under 500 MB of memory, and there really should be no problem handling this amount of data as a multi-dimensional array in memory on any modern machine -- in fact, expecting an order of magnitude more is not really that absurd. The main problem is simply that the tools we have were not built with this in mind, and the hardware and data collection have outpaced the analysis software.

I believe the Ifeffit2 approach will be much better suited for handling such large datasets, and can easily use any of the linear analysis tools from scipy (http://scipy.org/). If you or anyone else has some suggested algorithms to include (PCA, non-negative least-squares, etc.), that would be very helpful.

--Matt
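Matt's memory figure is easy to verify with quick arithmetic (the grid size here is an assumption chosen for illustration): 3000 spectra of 2048 double-precision points occupy 3000 x 2048 x 8 bytes, roughly 49 MB, so even ten times as much data stays near 500 MB.

```python
import numpy as np

# 3000 spectra, 2048 energy points each, 8 bytes per float64 value.
spectra = np.zeros((3000, 2048))
print(spectra.nbytes / 1e6, "MB")   # ~49 MB; 10x the data is ~0.5 GB
```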
Hi Matt,

Regarding ifeffit2, another application that would benefit from linear combination fitting on thousands or tens of thousands of spectra is in micro/nano-XANES (micro/nano-EXAFS) spatial imaging. In the case of heterogeneous samples (e.g., geochemical samples), principal component analysis (PCA) and target factor analysis (TFA) would be valuable statistical tools to apply to such large datasets.

Dean

--
DEAN HESTERBERG
Professor, Dept. of Soil Science
College of Agriculture and Life Sciences
Box 7619, 3235 Williams Hall
NC State University, Raleigh, NC 27695-7619
voice: (919) 513-3035   fax: (919) 515-2167
email: dean_hesterberg@ncsu.edu
www.soil.ncsu.edu/programs/molecular/
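A rough sketch of the target test at the heart of TFA (illustrative only; the function and variable names are assumptions): project a candidate reference spectrum onto the subspace spanned by the first few principal components and check how well it is reproduced.

```python
import numpy as np

def target_test(spectra, target, ncomp):
    """Reconstruct 'target' from the first ncomp principal components of
    'spectra' (an (n_spectra, n_points) array); a small residual suggests
    the target is a plausible member of the mixture."""
    mean = spectra.mean(axis=0)
    U, s, Vt = np.linalg.svd(spectra - mean, full_matrices=False)
    basis = Vt[:ncomp]                           # orthonormal rows
    recon = basis.T @ (basis @ (target - mean)) + mean
    return recon, np.linalg.norm(target - recon)
```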
Hi Dean,
Thanks -- I completely agree. We absolutely need to analyze spatial micro-XANES spectra with linear algebra tools. I think the basic tools for this are available, and that doing the actual coding would not be too difficult. I'm hopeful that this will happen soon!

Cheers,
--Matt
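A hedged sketch of what that could look like for a spectral image (the cube shape, standards matrix, and function name are assumptions, not an existing API): reshape the map to pixels-by-energy, run a non-negative LCF on each pixel, and reshape the fractions into per-component maps.

```python
import numpy as np
from scipy.optimize import nnls

def fraction_maps(cube, standards):
    """cube: (ny, nx, n_energy) XANES map; standards: (n_energy, n_comp).
    Returns (ny, nx, n_comp) maps of component fractions per pixel."""
    ny, nx, ne = cube.shape
    pixels = cube.reshape(-1, ne)
    frac = np.array([nnls(standards, p)[0] for p in pixels])
    frac /= frac.sum(axis=1, keepdims=True).clip(min=1e-12)
    return frac.reshape(ny, nx, -1)
```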
participants (7)
- Bruce Ravel
- Dean Hesterberg
- Edmund Welter
- Jan Stötzel
- John Farell
- Matt Newville
- Matthew Marcus