Hi Matt,

Thanks again for clearing up my confusion over the origin of the R factor for fits in R-space. I now have a different and unrelated question about the R factor. From looking through the ifeffit code, it appears that the R factor is defined as sum(data-fit)/sum(data). Is this correct?

The reason that I am asking is that I would like to use Hamilton's test (Acta Cryst. 1965, v. 18, p. 502) to determine whether adding additional shells to a fit actually results in a better fit. Hamilton's test uses the "crystallographic R factor", which is sqrt(sum(data-fit)/sum(data)), so I would like to know whether or not to take the square root of the R-factor ratios in Hamilton's test. Thanks again for the help!

Sincerely,
Wayne

---
Wayne Lukens
Scientist
Lawrence Berkeley National Laboratory
email: wwlukens@lbl.gov
phone: (510) 486-4305
FAX: (510) 486-5596
Hi Wayne,
> Thanks again for clearing up my confusion over the origin of the R factor for fits in R-space. I now have a different and unrelated question about the R factor. From looking through the ifeffit code, it appears that the R factor is defined as sum(data-fit)/sum(data). Is this correct?
Yes, that is correct. It is a fractional misfit.
> The reason that I am asking is that I would like to use Hamilton's test (Acta Cryst 1965, V18, p.502) to determine whether adding additional shells to a fit actually results in a better fit. Hamilton's test uses the "crystallographic R factor", which is sqrt(sum(data-fit)/sum(data)), so I would like to know whether or not to take the square root of the R-factor ratios in Hamilton's test. Thanks again for the help!
I'm not familiar with "Hamilton's test"; I just downloaded the paper and glanced at page 2 of it. I am sure I do not know all the subtleties of the R-factor(s) used in crystallography: I thought there were a couple of different R-factors used, and that, except for something called 'R merge' (which seems to be only about data quality?), they were all essentially 'sum(data-fit)/sum(data)', differing in whether they used intensities or F values, and in how they weighted the different reflections (this would seem similar to the different ways of treating XAFS data: weighting, k- or R-space, etc.). It looks to me like Hamilton used F values.

I have no doubt that there are people on this list who know a lot more about this than I do. Can anybody provide any insight on the R-factors and tests used in crystallography, and correct all the mistakes above?

I think you might also be interested in the Joyner tests, from Joyner et al., J. Phys. C 20, p. 4005 (1987). If I recall correctly, these are very close to standard statistical F-tests on the chi-square values, with the aim of testing whether adding data and/or variables improves a fit. I believe the on-line EXCURVE manuals might have some discussion of these. Of course, seeing if reduced chi-square is improved is the simplest way to compare two fits with different numbers of variables or data ranges.

--Matt
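For concreteness, the R-factor as ifeffit reports it (with the squares Wayne notes in his follow-up, R = sum((data-fit)^2)/sum(data^2)) can be sketched in a few lines of plain Python; the function name here is just illustrative, not taken from the ifeffit source:

```python
def r_factor(data, fit):
    """Fractional misfit in the ifeffit sense:
    sum((data - fit)^2) / sum(data^2)."""
    num = sum((d - f) ** 2 for d, f in zip(data, fit))
    den = sum(d ** 2 for d in data)
    return num / den
```

A perfect fit gives 0; the value grows as the fractional misfit grows.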
Hi Matt,

Thanks again for the help; I was not sure if I missed something in the code. Also, I meant to say that R_factor in ifeffit is sum((data-fit)^2)/sum(data^2).

On May 6, 2004, at 7:46 AM, Matt Newville wrote:
> Hi Wayne,
>> Thanks again for clearing up my confusion over the origin of the R factor for fits in R-space. I now have a different and unrelated question about the R factor. From looking through the ifeffit code, it appears that the R factor is defined as sum(data-fit)/sum(data). Is this correct?
> Yes, that is correct. It is a fractional misfit.
>> The reason that I am asking is that I would like to use Hamilton's test (Acta Cryst 1965, V18, p.502) to determine whether adding additional shells to a fit actually results in a better fit. Hamilton's test uses the "crystallographic R factor", which is sqrt(sum(data-fit)/sum(data)), so I would like to know whether or not to take the square root of the R-factor ratios in Hamilton's test. Thanks again for the help!
> I'm not familiar with "Hamilton's test"; I just downloaded the paper and glanced at page 2 of it. I am sure I do not know all the subtleties of the R-factor(s) used in crystallography: I thought there were a couple of different R-factors used, and that, except for something called 'R merge' (which seems to be only about data quality?), they were all essentially 'sum(data-fit)/sum(data)', differing in whether they used intensities or F values, and in how they weighted the different reflections (this would seem similar to the different ways of treating XAFS data: weighting, k- or R-space, etc.). It looks to me like Hamilton used F values.
As far as R factors in crystallography go, the usual definition is R = sum(||data| - |model||)/sum(|data|), but sometimes the RMS value is used instead: R = sqrt(sum((|data| - |model|)^2)/sum(data^2)). R-merge is the residual for averaging equivalent data, R_merge = sum(||data| - |average||)/sum(|data|), if I recall correctly, and reflects the data quality.

From the manual for SHELX, one of the typical single-crystal refinement programs, the R-factors reported for model quality when doing crystal structures are the weighted R-factor, wR2, and the R-factor, R1. In SHELX, wR2 = sqrt(sum(w*(data^2 - model^2)^2)/sum(w*(data^2)^2)), where each reflection has its own weight, w. The R-factor R1 is sum(||data| - |model||)/sum(|data|) and is equivalent to the square root of the R-factor reported by ifeffit, as I currently understand it.
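Those definitions translate directly into a short sketch (plain Python; the function and argument names are mine, not from SHELX):

```python
import math

def r1(f_obs, f_calc):
    # Conventional crystallographic R1 = sum(||Fo| - |Fc||) / sum(|Fo|)
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return num / sum(abs(o) for o in f_obs)

def wr2(f2_obs, f2_calc, weights):
    # SHELX-style weighted R on F^2:
    # sqrt(sum(w*(Fo^2 - Fc^2)^2) / sum(w*(Fo^2)^2))
    num = sum(w * (o - c) ** 2 for w, o, c in zip(weights, f2_obs, f2_calc))
    den = sum(w * o ** 2 for w, o in zip(weights, f2_obs))
    return math.sqrt(num / den)
```

Note that r1 works on structure-factor magnitudes while wr2 works on their squares (intensities), which is why the two are not simply related by a square root.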
> I have no doubt that there are people on this list who know a lot more about this than I do. Can anybody provide any insight on the R-factors and tests used in crystallography, and correct all the mistakes above?
> I think you might also be interested in the Joyner tests, from Joyner et al., J. Phys. C 20, p. 4005 (1987). If I recall correctly, these are very close to standard statistical F-tests on the chi-square values, with the aim of testing whether adding data and/or variables improves a fit. I believe the on-line EXCURVE manuals might have some discussion of these. Of course, seeing if reduced chi-square is improved is the simplest way to compare two fits with different numbers of variables or data ranges.
Hamilton's test is an F-test that compares the ratio of R factors for two different models with different degrees of freedom. The R ratio is converted to an F value, and the likelihood of that F value is examined. Typically, the hypothesis that the two models are equivalent is rejected if the likelihood of the F value is less than 5%, but that number is somewhat arbitrary. Hamilton's test gives somewhat different information from examining the reduced chi-squared: the reduced chi-squared value tells you which fit is better, while Hamilton's test tells you the probability that the fits are actually equivalent.

Thank you for the reference to Joyner et al.'s paper; it looks like this is the same test. Note that Joyner et al. define the number of degrees of freedom as the number of points in the chi curve minus the number of variables, rather than the number of independent data points minus the number of variables, and the F-test is very sensitive to the number of degrees of freedom.

Thanks again for the help!

Sincerely,
Wayne
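As I understand Hamilton's paper, the R ratio is converted to an F value through the relation R_ratio^2 = 1 + (b/dof)*F, where b is the number of extra parameters in the larger model and dof is the number of degrees of freedom of that model. A minimal sketch of that inversion (the function name is illustrative; the resulting F would then be compared against standard F-distribution tables or a CDF routine):

```python
def hamilton_f(r_ratio, b, dof):
    # Invert Hamilton's relation: r_ratio^2 = 1 + (b / dof) * F,
    # where r_ratio = R(restricted model) / R(unrestricted model) >= 1,
    # b = number of extra parameters, dof = degrees of freedom.
    return (r_ratio ** 2 - 1.0) * dof / b
```

An R ratio of exactly 1 (no improvement from the extra parameters) gives F = 0; larger ratios give larger F values and hence smaller likelihoods that the two models are equivalent.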