Hi Wayne,
Thanks again for clearing up my confusion over the origin of the R factor for fits in R-space. I now have a different and unrelated question about the R factor. From looking through the ifeffit code, it appears that the R factor is defined as sum(data-fit)/sum(data). Is this correct?
Yes, that is correct. It is a fractional misfit.
The reason that I am asking is that I would like to use Hamilton's test (Acta Cryst 1965, V18, p.502) to determine whether adding additional shells to a fit actually results in a better fit. Hamilton's test uses the "crystallographic R factor", which is sqrt(sum(data-fit)/sum(data)), so I would like to know whether or not to take the square root of the R-factor ratios in Hamilton's test. Thanks again for the help!
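For what it's worth, the square-root question can be sketched numerically. This is only an illustration under stated assumptions: that the sums in the shorthand above are over squared quantities (the usual least-squares convention, which the shorthand leaves out), and that the crystallographic R factor is the square root of that fractional misfit. The function names are made up for this example and are not taken from ifeffit. Under those assumptions, a ratio of crystallographic-style R factors is the square root of the ratio of the un-rooted ones.

```python
import math

def fractional_misfit(data, fit):
    """Un-rooted R factor: sum((data-fit)^2) / sum(data^2).

    Assumption: the sums are over squared quantities.
    """
    num = sum((d - f) ** 2 for d, f in zip(data, fit))
    den = sum(d ** 2 for d in data)
    return num / den

def crystallographic_r(data, fit):
    """Square root of the fractional misfit (assumed Hamilton-style R)."""
    return math.sqrt(fractional_misfit(data, fit))

def r_ratio_from_unrooted(r_fewer_shells, r_more_shells):
    """Ratio of crystallographic-style R factors, given un-rooted R factors.

    sqrt(a) / sqrt(b) == sqrt(a / b), so the square root is taken of the ratio.
    """
    return math.sqrt(r_fewer_shells / r_more_shells)
```

So if the fit reports un-rooted R factors, the ratio would need a square root before being compared with Hamilton's tabulated values; if the reported values are already rooted, it would not.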
I'm not familiar with "Hamilton's test"; I just downloaded the paper and glanced at page 2 of it. I am sure I do not know all the subtleties of the R-factor(s) used in crystallography. I thought there were a couple of different R-factors in use and that, except for something called 'R merge' (which seems to be only about data quality???), they were all essentially 'sum(data-fit)/sum(data)', differing in whether they used intensities or F values and in how they weighted the different reflections (this would seem similar to the different ways of treating XAFS data: weighting, k-space, R-space, etc). It looks to me like Hamilton used F values.
I have no doubt that there are people on this list who know a lot more about this than I do. Can anybody provide any insight on the R-factors and tests used in crystallography, and correct all the mistakes above?
I think you might also be interested in the Joyner tests, from Joyner et al, J Phys C 20, p 4005 (1987). If I recall correctly, these are very close to standard statistical F-tests on the chi-square values, with the aim of testing whether adding data and/or variables improves a fit. I believe the on-line EXCURVE manuals might have some discussion of these.
Of course, checking whether reduced chi-square improves is the simplest way to compare two fits with different numbers of variables or different data ranges.
--Matt
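As a rough sketch of the comparisons mentioned above: reduced chi-square for each fit, plus the textbook F statistic for two nested fits to the same data. This is the generic statistics version, not necessarily the exact form used by Joyner et al, and the function and argument names are assumptions made for this example. Here n_idp stands for the number of independent data points.

```python
def reduced_chi_square(chi2, n_idp, n_vary):
    """Chi-square divided by the degrees of freedom (n_idp - n_vary)."""
    dof = n_idp - n_vary
    if dof <= 0:
        raise ValueError("need more independent points than variables")
    return chi2 / dof

def nested_f_statistic(chi2_small, n_vary_small, chi2_big, n_vary_big, n_idp):
    """Textbook F statistic for adding variables to a nested fit.

    chi2_small / n_vary_small: the fit with fewer variables (e.g. fewer shells)
    chi2_big / n_vary_big: the fit with the extra variables
    Both fits are assumed to use the same data and the same n_idp.
    """
    added = n_vary_big - n_vary_small
    if added <= 0:
        raise ValueError("the bigger fit must have more variables")
    dof_big = n_idp - n_vary_big
    if dof_big <= 0:
        raise ValueError("need more independent points than variables")
    return ((chi2_small - chi2_big) / added) / (chi2_big / dof_big)
```

The F statistic would then be compared with a tabulated F value for (added, n_idp - n_vary_big) degrees of freedom at the chosen confidence level; a larger value suggests the extra variables are justified.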