Hi Matt,

I think that your reply sounds reasonable. Let's talk some more about this.
> The other side of this is that I think it's more difficult than generally acknowledged to over-interpret data. Most of the cases I've seen are due to blatantly ignoring the confidence limits or to doing completely unfair things like fitting S02 to get 0.9 +/- 0.1, then fixing S02 to 0.90, fitting N, and claiming N to better than 10%. Those are serious, but they are the normal mistakes in analysis that can happen with any data.
If you fix S02 and then fit N and get an error, let's call it dN, then the more generous estimate of the uncertainty for N comes from d(S02*N) = dS02*N + S02*dN. How would you estimate the error for N if you fix S02, R, and sigma2?

I will try to be more specific about my original problem. Let's say that the original data set (s3) gives these results for the Ca shell:

Ncas3 = 3.4 +/- 0.9
Rcas3 = 4.01 +/- 0.01
sigma2cas3 = 0.006 +/- 0.3
s02s3 = 1.05 +/- 0.10

Now I run another fit in which I fix all the parameters except for Nca. This time I fit three data sets (s3, s4, and s5) together, and I get these results:

Ncas3 = 3.2 +/- 0.1
Ncas4 = 2.5 +/- 0.4
Ncas5 = 2.0 +/- 0.4

What would you think would be an "appropriate" uncertainty? Surely 0.1 is too small for Ncas3, but 0.9 seems rather huge. So we have some limits, but can I do any better than that? The difference between these two extremes is quite profound: in one case we see a change in the value of Nca, in the other we see nothing.

Shelly
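To make that propagation concrete, here is a minimal Python sketch using the numbers above. The linear sum is the "generous" worst case described in the message; adding the two terms in quadrature would be the usual independent-error estimate. The function name is just for illustration, not part of Ifeffit:

    def n_uncertainty(n, dn, s02, ds02):
        """Fold the uncertainty of a fixed S02 back into N via the
        linear (worst-case) rule d(S02*N) = dS02*N + S02*dN."""
        return (ds02 * n + s02 * dn) / s02

    # Constrained-fit value Ncas3 = 3.2 +/- 0.1, with S02 fixed at
    # 1.05 +/- 0.10 from the single-data-set fit:
    print(n_uncertainty(n=3.2, dn=0.1, s02=1.05, ds02=0.10))  # ~0.40

By this rule the propagated uncertainty, about 0.4, lands between the two extremes of 0.1 and 0.9 quoted above.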
-----Original Message-----
From: Newville, Matthew G.
Sent: Friday, November 22, 2002 10:17 AM
To: 'ifeffit@millenia.cars.aps.anl.gov'
Subject: Re: [Ifeffit] Double dipping
Hi Shelly,
> I would like to take a bunch of parameters (all of them that do not vary significantly from the known, as determined by individual fits to each data set) from this "known" data set and then fit the series including the "original" to see how a particular parameter (the number of Ca atoms) changes as the ratio of U to Ca changes. I would like to hear a discussion on the fairness of this approach and on how you would calculate the number of independent points.
I think what you propose is fair, but you might need to be a little more specific....
In general, I think it's fair to use parameters from one fit in a fit to another data set. Once one accepts the fact that there is a limited amount of information, getting hung up on trying to count or enforce N_idp becomes pointless -- the data and estimated uncertainties __WILL__ indicate how many parameters can be determined with any confidence.
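As an illustration of the kind of constrained, multi-data-set fit being discussed, here is a toy sketch with scipy. A crude damped sine stands in for a Feff path, the noise and "true" values are made up, and none of this is Ifeffit's actual API -- it only shows the structure: one free N per data set, with S02 and sigma2 held fixed at values from an earlier fit.

    import numpy as np
    from scipy.optimize import least_squares

    def model(k, s02, n, sigma2):
        # Crude stand-in for a single EXAFS shell at R = 4.0.
        return s02 * n * np.exp(-2 * sigma2 * k**2) * np.sin(2 * 4.0 * k)

    rng = np.random.default_rng(1)
    k = np.linspace(3, 12, 200)
    data = [model(k, 1.05, n, 0.006) + 0.05 * rng.standard_normal(k.size)
            for n in (3.2, 2.5, 2.0)]          # three synthetic data sets

    S02_FIXED, SIGMA2_FIXED = 1.05, 0.006      # taken from the earlier fit

    def residuals(params):
        # One free N per data set; the shared parameters stay fixed.
        return np.concatenate(
            [model(k, S02_FIXED, n, SIGMA2_FIXED) - d
             for n, d in zip(params, data)])

    fit = least_squares(residuals, x0=[3.0, 3.0, 3.0])
    print(fit.x)   # per-data-set N with the shared parameters fixed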
N_idp is one statistic; it should not be construed as an exact value nor as a hard upper limit on how much information can be extracted from the data. Any attempt to use N_idp this way is abuse.
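For context, N_idp here is the usual Nyquist-style estimate, roughly 2*dk*dR/pi (conventions differ on small additive terms). A minimal sketch, with made-up fit ranges rather than Shelly's:

    import math

    def n_idp(kmin, kmax, rmin, rmax):
        """Rough Nyquist estimate of independent points in an EXAFS
        fit: N_idp ~ 2 * (kmax - kmin) * (rmax - rmin) / pi."""
        return 2.0 * (kmax - kmin) * (rmax - rmin) / math.pi

    # Hypothetical ranges, for illustration only:
    print(n_idp(kmin=3.0, kmax=12.0, rmin=1.0, rmax=4.0))  # ~17.2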
The classic test is whether 100 very noisy scans (say, because data was collected for 0.01 s per point) have 100x more 'independent points' than 1 scan collected for 1.0 s per point. The 100 noisy scans __ARE__ independent, but when added together they give exactly the same result as the 1 clean scan. So more independent data does not mean you can determine more parameters.
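A quick numerical illustration of that point, with made-up noise levels and assuming the noise scales as 1/sqrt(dwell time):

    import numpy as np

    rng = np.random.default_rng(0)
    signal = 1.0        # stand-in for chi at one data point
    sigma_noisy = 0.10  # noise of one 0.01 s/point scan
    sigma_clean = 0.01  # noise of one 1.0 s/point scan (10x cleaner)

    # Average 100 independent noisy scans:
    scans = signal + sigma_noisy * rng.standard_normal(100)
    print(scans.mean())                              # close to 1.0
    print(sigma_noisy / np.sqrt(100), sigma_clean)   # both 0.01

The mean of the 100 noisy scans carries exactly the uncertainty of the single clean scan, so nothing extra can be determined from them.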
The other side of this is that I think it's more difficult than generally acknowledged to over-interpret data. Most of the cases I've seen are due to blatantly ignoring the confidence limits or to doing completely unfair things like fitting S02 to get 0.9 +/- 0.1, then fixing S02 to 0.90, fitting N, and claiming N to better than 10%. Those are serious, but they are the normal mistakes in analysis that can happen with any data.
--Matt
Hi all,

Interesting discussion. It seems to me that one issue of importance in these cases is the uncertainty in the difference of the parameters between sites. In Shelly's case, it certainly seems reasonable to say that the coordination number decreases from set 3 to 4 to 5, with the difference between 3 and 4 being statistically significant, and the difference between 4 and 5 being borderline.

Fixing S02 introduces a very similar fractional error into every determination of N, and thus does not affect determinations of the differences. The same may be true of sigma2, although it is also possible that changes in coordination number are accompanied by changes in sigma2 that are not being allowed under this scheme.

If you're willing to believe that sigma2 is similar for all samples, it seems reasonably conservative to me to assume the uncertainties found in the constrained fits are independent and to add them in quadrature to get the uncertainties in the differences: Ncas4 - Ncas3 = -0.7 +/- 0.4; Ncas5 - Ncas4 = -0.5 +/- 0.6. It would then be reasonable to assign an uncertainty of +/- 0.9 to the absolute value of Ncas3, although the relative values are known more precisely.

Comments?

--Scott
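For concreteness, a minimal sketch of the quadrature arithmetic above, using the constrained-fit values from Shelly's message (the helper name is just for illustration):

    import math

    def diff_with_err(a, da, b, db):
        """Difference of two fitted values whose (assumed independent)
        uncertainties are added in quadrature."""
        return a - b, math.hypot(da, db)

    print(diff_with_err(2.5, 0.4, 3.2, 0.1))  # (-0.7, ~0.41): significant
    print(diff_with_err(2.0, 0.4, 2.5, 0.4))  # (-0.5, ~0.57): borderline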