weighting multi-data fit
Dear all,

I have a question regarding the weighting in a fit to multiple data sets. I found that Artemis has the parameter Epsilon for this purpose, but the help file does not describe how to use it. Which value do I have to use to set the weighting to (for example) double or half?

Best regards
Jörg

--
Dr. Jörg Haug
Martin-Luther-Universität Halle-Wittenberg
Institut für Physik
Anorganisch-Nichtmetallische Materialien
Friedemann-Bach-Platz 6
D-06108 Halle
Telefon: 0345 55 25 529
Fax: 0345 55 27 159
e-mail: joerg.haug@physik.uni-halle.de
http://www.physik.uni-halle.de/Fachgruppen/Glas/arbeitsgruppen/anw.htm
On Tuesday 19 June 2007, Jörg Haug wrote:
> Dear all,
> I have a question regarding the weighting for a multiple data set fitting. I found that there is the parameter Epsilon to do this in Artemis. But in the help file there is no description how to use this parameter. Which value did I have to use to set the weighting to (for example) double or half? Best regards
Jörg,

This is an aspect of Artemis that is neither well designed nor well explained. Fortunately, it's possible to figure it out if you know where to look.

There are two ways of finding the value of epsilon(k): it is written to the log file, and it is written to the echo area when you select "What is epsilon_k?" from the Data menu.

If you set epsilon to the reported value of epsilon(k) for each data group and rerun the fit, everything should be the same. That is, setting epsilon to 0 in Artemis means "use the value of epsilon(k) measured from the data."

From there, you can change the weightings of the groups by changing the values of epsilon.

B

--
Bruce Ravel ---------------------------------------------- bravel@anl.gov
Molecular Environmental Science Group, Building 203, Room E-165
MRCAT, Sector 10, Advanced Photon Source, Building 433, Room B007
Argonne National Laboratory, phone and voice mail: (1) 630 252 5033
Argonne IL 60439, USA, fax: (1) 630 252 9793
My homepage: http://cars9.uchicago.edu/~ravel
EXAFS software: http://cars9.uchicago.edu/~ravel/software/exafs/
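As a rough illustration of "changing the values of epsilon", the sketch below converts a desired relative weight into the epsilon one might enter. It assumes that a data set's contribution to chi-square scales as 1/epsilon^2, so a weight factor w corresponds to epsilon = epsilon_measured / sqrt(w). The function name and the numerical value of epsilon_k are hypothetical; the real value is whatever Artemis reports for your data.

```python
import math


def epsilon_for_weight(eps_measured, weight):
    """Return the epsilon that multiplies a data set's chi-square
    contribution by `weight` relative to using eps_measured.

    Assumption: the contribution scales as 1/epsilon**2, so
    epsilon = eps_measured / sqrt(weight).
    """
    if eps_measured <= 0 or weight <= 0:
        raise ValueError("eps_measured and weight must be positive")
    return eps_measured / math.sqrt(weight)


# Hypothetical value standing in for what "What is epsilon_k?" reports.
eps_k = 2.0e-4

print(epsilon_for_weight(eps_k, 1.0))  # unchanged weighting
print(epsilon_for_weight(eps_k, 2.0))  # double the weight: smaller epsilon
print(epsilon_for_weight(eps_k, 0.5))  # half the weight: larger epsilon
```

Note that under this assumption "double the weight" means dividing epsilon by sqrt(2), not by 2; halving epsilon would multiply the contribution by four.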
Hi Jörg,
It's a little bit convoluted -- probably more so than it should be -- to modify the weightings for different data sets in a fit using multiple data sets. It's also poorly documented. It is discussed in some detail in the older Feffit documentation that's on the Ifeffit web site.

The approach is to specify epsilon_r or epsilon_k for each data set. The two values are related, and specifying either one is sufficient for the feffit() command to calculate the appropriate one. For fits in R space, epsilon_r is used as the weighting factor. For a single data set, the quantity minimized in the best fit is

    chi_square = (1/N) Sum_{i=1}^{N} { [Data_i - Fit_i]^2 / epsilon_r^2 }

where the sum is over N R-space points. The data spacing for chi(R) is set by the Fourier transform range and k-grid, so that points in R space are ~0.03 Å apart. Since we really ought to be careful about how much information is in our data, the sum might be better (though less stable) if the spacing were given by the spectral resolution (see other recent discussions!) of pi/(2*Delta k), which would make the sum go to N_idp instead of N.

We take that into account, and also assume that there is no R dependence of epsilon_r, so that chi_square becomes

    chi_square = [N_idp / (N epsilon_r^2)] Sum_{i=1}^{N} [Data_i - Fit_i]^2

for one data set. For multiple data sets, there's an outer sum, so the weighting factors applied actually take N_idp and N (for each data set!) into account. For a fit with M data sets,

    chi_square = Sum_{j=1}^{M} { [N_idp_j / (N_j epsilon_r_j^2)] Sum_{i=1}^{N_j} [Data_ij - Fit_ij]^2 }

The upshot is that if you want "equal weighting" for spectra with different k- and R-ranges and/or different noise levels, you might have to be a little careful in how you set epsilon_r (or, equivalently, epsilon_k) for each data set.

For the simplest case of the different data sets having equal k- and R-ranges, you can just double an epsilon_k for a particular data set to mean "accept a fit that's twice as bad for this data set".

Cheers,
--Matt
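The multiple-data-set formula above can be sketched numerically as follows. This is only an illustration under stated assumptions, not the feffit() implementation: the residuals are treated as real arrays, N_idp is taken as 2*Delta_k*Delta_R/pi (which follows from the pi/(2*Delta k) resolution mentioned above), and the k- and R-ranges, noise level, and dictionary layout are all hypothetical.

```python
import math

import numpy as np


def n_idp(k_min, k_max, r_min, r_max):
    """Number of independent points: R-range divided by pi/(2*Delta_k)."""
    return 2.0 * (k_max - k_min) * (r_max - r_min) / math.pi


def chi_square(datasets):
    """Sum over data sets of N_idp/(N*eps_r^2) * Sum_i (Data_i - Fit_i)^2."""
    total = 0.0
    for d in datasets:
        resid = np.asarray(d["data"], dtype=float) - np.asarray(d["fit"], dtype=float)
        total += d["n_idp"] / (resid.size * d["eps_r"] ** 2) * np.sum(resid ** 2)
    return total


# Toy residuals standing in for chi(R) data minus fit (not real measurements).
rng = np.random.default_rng(0)
fit = np.zeros(100)
data = rng.normal(scale=0.01, size=100)
nidp = n_idp(3.0, 12.0, 1.0, 3.0)  # hypothetical k- and R-ranges

base = {"data": data, "fit": fit, "n_idp": nidp, "eps_r": 0.01}
loose = dict(base, eps_r=0.02)                  # epsilon doubled
worse = dict(base, data=2 * data, eps_r=0.02)   # misfit and epsilon both doubled

print(chi_square([base]), chi_square([loose]), chi_square([worse]))
```

With equal ranges, doubling epsilon cuts that data set's contribution by a factor of four, and a misfit twice as large with the doubled epsilon contributes the same as the original. That is one way to read "accept a fit that's twice as bad".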
In reply to Matt Newville's message of Tuesday, 19 June 2007, 20:38:
Thanks, Matt and Bruce, for the answers. I will now try playing with this parameter a little.

--
Dr. Jörg Haug
EXAFS Divination Dataset

Hi all,

The EXAFS Divination Dataset is here: http://www.xafs.org/EXAFS_Divination_Set

You may well be asking, "What is the EXAFS Divination Dataset?"

It's a set of data collected on mixtures of iron compounds, mostly oxides of one sort or another. My research assistants have mixed random amounts of the compounds together--they know how much they used of each, but I don't, you don't, and (if you're a mentor to those learning XAFS) your students don't either.

Thus, if you wish, you can try your hand at analyzing the data, knowing that when you're done, you can find out what the honest-to-goodness right answer is (how refreshing!). Or you can make your students try it, either as practice or to assess their current skills.

My own motivation is to find out how accurate XAFS analysis really is for this kind of problem. Some researchers have attacked that question from the bottom up, evaluating the uncertainties inherent in each step of analysis. I'd like to complement that by looking at the issue from the top down, using these double-blind conditions to determine how much accuracy we can get in practice. So although you're free to use the dataset as you will (and there are a bunch of standards in there), if you'd like to know the correct phase identifications and answers, then I'd like to know the answers you got (including things like nearest-neighbor bond length if you got them), your level of expertise, an estimate of the time it took, and a brief (or not brief, if you prefer) description of the methods you used. (Note: If you do that, please don't respond to this email, as your findings will be posted to the entire list! Email to me directly at SCalvin.mailaps.org or SCalvin.slc.edu. Likewise, please don't use this list to discuss your attempts to fit samples in the dataset, as that will compromise my experiment.) If I end up publishing this study, I won't do so in a way that allows people to identify which analysis was done by whom. I will include you in the acknowledgments if you so choose.

Some details of the dataset:

The set consists of raw data files from X-11B at the NSLS. There are multiple scans for each sample and standard, and the data quality varies from moderate to good. Each sample and standard was measured at a different random temperature between 303 and 403 K; this reduces (but does not eliminate) the utility of methods like linear combinations of standards.

There are seven standards: iron metal, Fe2O3, Fe3O4, FeO, alpha-FeOOH, gamma-FeOOH, iron(II) oxalate hydrate. Matt--you can put these standards in your library if you'd like, although temperature information will necessarily be missing until my study is complete (I don't even know the temperatures they were measured at yet).

The first part of the dataset consists of mixtures where the constituents are known, but the fractions aren't. These problems are presumably pretty easy. In fact, if you or a student of yours wanted to take a very quick stab at these using linear combination methods, that's fine--just let me know that's what you did when you send the request for the answers. It will be interesting to see how far off linear combination methods are when the temperatures of the samples and standards are significantly different.

The second part consists of 2-3 standards mixed together, but you don't know which ones or how much of each.

The third part consists of 1-2 standards that are specified, and one mystery compound that is not, except to say that it is a fairly simple organic compound (salt, probably) of iron. I don't yet know what it is; I had a colleague in the chemistry department pick something and order it for me, and then my students prepped it.

The fourth part is 1-2 standards and the mystery compound, none of which are specified.

Feel free to attack any parts of this you want, in any order, with any degree of seriousness. If you tell me that you eyeballed sample B5 and it looks like 60% iron metal and 40% Fe2O3, that's fine. :) I want a sense of how well various techniques work, not necessarily everyone's best effort. On the other hand, if you consider it a matter of personal pride to do as well as you can, then by all means...

--Scott Calvin
Sarah Lawrence College
Hi Scott,

This sounds interesting. Can you tell me something about the energy calibration of the references and the samples? Are the reference compounds all correctly energy calibrated? I don't really care about the absolute calibration, just that the spectra are correct relative to each other.

Can you say something about the resolution of the spectra? Were they all recorded at the same beamline during one run or are they from different beamlines and/or run at different times?

Thanks!

Sincerely,
Wayne Lukens
_______________________________________________ Ifeffit mailing list Ifeffit@millenia.cars.aps.anl.gov http://millenia.cars.aps.anl.gov/mailman/listinfo/ifeffit
Hi Wayne,

All spectra were recorded on the same beamline during the same run (X-11B). Each has the same iron foil in the reference channel. The reference channel isn't great (the samples were on the thick side), but I'd judge that the energy calibration drifted less than an eV over the course of the run; probably less than half an eV. At any rate, the reference channel is in the data, so you can look for yourself.

As for the resolution of the spectra, it's probably dominated by the usual core-hole broadening at the iron K edge. The mono was detuned at least 25% on all samples, but as I said, some samples were on the thick side, so it's possible there's a bit of harmonic leakage. Looking at the iron powder standard is probably a good way to judge.

--Scott Calvin
Sarah Lawrence College

At 05:09 PM 6/22/2007, you wrote:
> Hi Scott,
> This sounds interesting. Can you tell me something about the energy calibration of the references and the samples? Are the reference compounds all correctly energy calibrated? I don't really care about the absolute calibration, just that the spectra are correct relative to each other.
> Can you say something about the resolution of the spectra? Were they all recorded at the same beamline during one run or are they from different beamlines and/or run at different times?
> Thanks!
> Sincerely,
> Wayne Lukens
Scott,

Superb! I will be very eager to see the results of this sociology/statistics study. I'll probably participate, and I strongly encourage others to do the same so that Scott can get some decent stats.

B

P.S. You can discourage the wiki from turning "Fe2O3" into a link by prepending an exclamation point: "!Fe2O3".

--
Bruce Ravel
Dear Scott,

thank you for the interesting 'virtual' experiment. It will be very informative to see if linear combinations of two or more compounds can be unambiguously identified by EXAFS analysis. Environmental samples (soils, dirt, wastewaters, sediments, ...) are typical cases where such problems are encountered.

I would just like to mention a similar study of Fe XANES analysis for the particular case of iron gall inks, where the linear combination method (as implemented in Athena) was used to identify the relative amounts of Fe2+ and Fe3+ compounds in historic inks (ref: I. Arcon et al., X-ray Spectrometry 2007, 36, 199-205). The paper addresses the methodological problem of finding proper Fe XANES references for the procedure and avoiding systematic errors.

best regards
Iztok Arcon
http://www.p-ng.si/~arcon/xas
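For readers unfamiliar with the linear combination method mentioned here, the idea can be sketched in a few lines. This is a generic unconstrained least-squares illustration, not Athena's implementation; the function name and the synthetic "standards" (smooth step functions on a made-up energy grid) are assumptions chosen only so the example runs, and they are not data from the Divination set.

```python
import numpy as np


def lcf_weights(sample, standards):
    """Least-squares weights of the standards that best reproduce the sample.

    `standards` is a sequence of spectra on the same energy grid as `sample`.
    """
    A = np.asarray(standards, dtype=float).T  # shape (n_points, n_standards)
    b = np.asarray(sample, dtype=float)
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    return weights


# Synthetic stand-ins for two normalized edge spectra (not measured data).
energy = np.linspace(7100.0, 7200.0, 201)
std_a = 1.0 / (1.0 + np.exp(-(energy - 7120.0) / 2.0))
std_b = 1.0 / (1.0 + np.exp(-(energy - 7125.0) / 3.0))

# A noiseless "mixture" built from known fractions of the two stand-ins.
sample = 0.6 * std_a + 0.4 * std_b

weights = lcf_weights(sample, [std_a, std_b])
print(weights)
```

In this idealized case the known fractions are recovered. With real data, noise, energy misalignment, and differences between standards and samples (such as measurement temperature) all degrade the result, and practical tools usually add constraints such as non-negative weights.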
Thanks, Iztok. I'll check it out.

--Scott Calvin
Sarah Lawrence College

At 05:27 AM 6/25/2007, you wrote:
> Dear Scott,
> I would just like to mention a similar study of Fe XANES analysis for a particular case of iron gall inks, where linear combination method (implemented in athena) was used to identify relative amounts of Fe2+ and Fe3+ compounds in the historic inks (ref: I. Arcon et al. X-ray Spectrometry 2007, 36, 199-205) A methodological problem of finding proper Fe XANES references for the procedure and avoiding systematic errors is addressed.
Our library doesn't carry that journal. Could someone send me a copy or put it on an FTP site for download? Thanks.
mam
Registered users of the SpectroscopyNow website (it's a free
registration) can get this article through them for free until the end
of June. Here is a direct link:
http://tinyurl.com/yqaff7
-Leslie Baker
----- Original Message -----
From: Matthew Marcus
Our library doesn't carry that journal. Could someone send me a copy or put it on an FTP site for download? Thanks.
mam
----- Original Message -----
From: "Scott Calvin"
To: "XAFS Analysis using Ifeffit"
Sent: Monday, June 25, 2007 4:40 AM
Subject: Re: [Ifeffit] EXAFS Divination Dataset
participants (9)
- Bruce Ravel
- Iztok.Arcon@p-ng.si
- Jörg Haug
- Leslie Baker
- Matt Newville
- Matthew Marcus
- Scott Calvin
- Scott Calvin
- Wayne Lukens