[Ifeffit] Different R-factor values

Matt Newville newville at cars.uchicago.edu
Fri Jan 25 11:04:14 CST 2013


Hi Jason, Chris,

On Fri, Jan 25, 2013 at 10:01 AM, Jason Gaudet <jason.r.gaudet at gmail.com> wrote:
> Hi Chris,
>
> Might be helpful also to link to the archived thread you're talking about.
>
> http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2006-June/007048.html
>
> Bruce might have to correct me on this, but if I remember right there were
> individual-data-set R-factor and chi-square calculations at some point,
> which come not from IFEFFIT but from Bruce's own post-fit calculations, and
> these eventually were found to be pretty buggy and were dropped.
>
> I don't understand what "the average over the k weights" R factor is;
> analyzing the same data set with multiple k weights (which is pretty
> typical) still means a single fit result and a single statistical output in
> IFEFFIT, as far back as I can remember, anyhow.  The discussion about
> multiple R-factors is for when you're simultaneously fitting multiple data
> sets (i.e. trying to fit a couple different data sets to some shared or
> partially shared set of guess variables).
>
> I think the overall residuals and chi-square are the more statistically
> meaningful values, as they are actually calculated by the same algorithm
> used to determine the guess variables - they're the quantities IFEFFIT is
> attempting to reduce.  I don't believe I've reported the per-data-set
> residuals in my final results, as I only treated it as an internal check for
> myself.  (It would be nice to have again, though...)
>
> -Jason

I can understand the desire for "per data set" R-factors.  I think
there are a few reasons why this hasn't been done so far.  First, the
main purpose of chi-square and R-factor is to be simple, well-defined
statistics that can be used to compare different fits.  In the case
of R-factor, the actual value can also be readily interpreted and so
mapped to "that's a good fit" and "that's a poor fit" more easily
(even if still imperfectly).  Second, it would be a slight technical
challenge for Ifeffit to compute these different statistics and decide
what to call them.  Third, this is really asking for information
on different portions of the fit, and it's not necessarily obvious how
to break the whole into parts.  OK, for fitting multiple data sets, it
might *seem* obvious how to break the whole.

But, well, fitting with multiple k-weights *is* fitting different
data.  Also, multiple-data-set fits can mix fits in different fit
spaces, with different k-weights, and so on.  Should the chi-squared
and R-factors be broken up for different k-weights too?  Perhaps they
should.  You can apply different weights to different data sets in a
fit, but how best to do this can quickly become a field of study on
its own.  I guess that's not a valid reason to not report these....
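
To make the per-k-weight question concrete, here is a minimal sketch in
Python with numpy.  It assumes the usual R-factor form,
sum(|data - model|^2) / sum(|data|^2), and evaluates it directly on
k-weighted chi(k) arrays.  A real fit would normally evaluate this in the
chosen fit space and range, so treat it as an illustration only.  The
function names are hypothetical and not part of Ifeffit, Artemis, or larch.

```python
import numpy as np

def r_factor(data, model):
    """Assumed form: sum(|data - model|^2) / sum(|data|^2)."""
    data = np.asarray(data)
    model = np.asarray(model)
    return float(np.sum(np.abs(data - model) ** 2) / np.sum(np.abs(data) ** 2))

def r_factor_by_kweight(k, chi_data, chi_model, kweights=(1, 2, 3)):
    """One R-factor per k-weight (hypothetical helper).

    Each k-weight is treated as its own weighted copy of the data,
    k**kw * chi(k), and gets its own statistic.
    """
    k = np.asarray(k, dtype=float)
    chi_data = np.asarray(chi_data)
    chi_model = np.asarray(chi_model)
    return {kw: r_factor(chi_data * k ** kw, chi_model * k ** kw)
            for kw in kweights}
```

Calling r_factor_by_kweight(k, chi_data, chi_model) returns a dict keyed
by k-weight, which can be compared against the single overall value.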

So, again, I think it's reasonable to ask for per-data-set and/or
per-k-weight statistics, but it's not necessarily obvious what to
report here.  For example, you might also want to use other partial
sums-of-squares (based on k- or R-range, for example) to see where a
fit was better and worse.  Of course, you can calculate any of the
partial sums and R-factors yourself.  How to do this isn't so obvious
with Artemis or DArtemis, but it is possible.  It's much easier to do
this yourself and implement it for others with larch than to do it in
Ifeffit or Artemis.  Patches welcome for this and/or any other advanced
statistical analyses.
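
As a rough sketch of what "calculate it yourself" could look like, the
Python/numpy code below computes an R-factor per data set, the overall
value from the pooled sums, and partial R-factors over windows in k or R.
It assumes you have already exported matching data and model arrays from
your fit, and it assumes the R-factor form
sum(|data - model|^2) / sum(|data|^2).  All names are hypothetical; this
is not how Ifeffit, Artemis, or larch do it internally.

```python
import numpy as np

def _sums(data, model):
    """Return (sum of squared residuals, sum of squared data)."""
    data = np.asarray(data)
    model = np.asarray(model)
    return (float(np.sum(np.abs(data - model) ** 2)),
            float(np.sum(np.abs(data) ** 2)))

def per_dataset_r_factors(datasets):
    """datasets: dict mapping a label to a (data, model) pair of arrays.

    Returns (per_set, overall), where overall pools the sums over all
    data sets before dividing.
    """
    per_set, resid_total, norm_total = {}, 0.0, 0.0
    for name, (data, model) in datasets.items():
        resid, norm = _sums(data, model)
        per_set[name] = resid / norm
        resid_total += resid
        norm_total += norm
    return per_set, resid_total / norm_total

def windowed_r_factors(x, data, model, edges):
    """Partial R-factor in each window [edges[i], edges[i+1]) of x (k or R)."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data)
    model = np.asarray(model)
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x < hi)
        resid, norm = _sums(data[mask], model[mask])
        # An empty or all-zero window has no meaningful R-factor.
        out.append(((lo, hi), resid / norm if norm > 0 else float("nan")))
    return out
```

The windowed values show where in k or R a fit is better or worse, at the
cost of each window having fewer points behind it.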

Better visualizations of the fit and/or mis-fit might be useful to
think about too.

--Matt


