New subject: statistical parameters

15 Dec 2005

      Hi Stefano, all,
...
> same old question that emerges again:
> What formula is used by IFEFFIT to calculate the number of
> independent points? Does it use the formula 2DkDR/pi? or 2DkDR/pi +1?
> or 2DkDR/pi +2? And how does it approximate the resulting number to
> integer?
Ifeffit uses an approximation of 2*delta_k * delta_R / pi.  Well, it
could use a better approximation -- see below.  But Ifeffit does not
use 2dKdR/pi + 2.  The Stern paper considers "independent points" to
be discrete, counts the endpoints of this discrete array of
"independent points", and neglects the rather significant problem of
the limit dk->0.   I find each of these to be problematic.   The dk->0
limit problem makes it especially hard to implement in a meaningful
way.

Ifeffit uses n_idp = 2*delta_k * delta_R / pi, but is cautious (and
probably overly so) in determining delta_k and delta_R.  First, it
puts these on the k- and R-grids (0.05 Ang^-1 and ~0.03067 Ang).  It
also works to avoid rounding incorrectly.  Thus, currently, your
ranges:
   kmin=2.0, kmax=12, rmin=1.0, rmax=6.0
get mapped to
   kmin=2.05, kmax=12.00, rmin=1.012427, rmax=5.98252

Thus you should have n_idp = 31.4824.  You can confirm that with
'print n_idp' from the command line.  It sounds like that's what
you're getting.

And yes, I do think that having "kmin=2.00" map to "kmin=2.05" for
this part of the calculation is overly cautious, and I will fix that
in the next release (and n_idp will equal ~31.64).  I could be
persuaded to just use the user-supplied values, but the round-off
errors shouldn't matter much.
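
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation described above. It is not Ifeffit's actual code. The snapping rule (lower bounds pushed to the next grid point, upper bounds pulled down to the grid) is an assumption chosen to be consistent with the mapped ranges quoted in this message, and the R-grid spacing is assumed to be pi/(2048 * 0.05), which matches the ~0.03067 Ang figure. The names snap_min, snap_max, n_idp and cautious_kmin are hypothetical.

```python
import math

K_GRID = 0.05                        # k-grid spacing in Ang^-1 (from the message)
N_FFT = 2048                         # ASSUMPTION: FFT array length
R_GRID = math.pi / (N_FFT * K_GRID)  # ASSUMPTION: gives ~0.03068 Ang

def snap_min(x, grid):
    """Assumed rule for lower bounds: next grid point strictly above x."""
    return (math.floor(x / grid + 1e-9) + 1) * grid

def snap_max(x, grid):
    """Assumed rule for upper bounds: grid point at or below x."""
    return math.floor(x / grid + 1e-9) * grid

def n_idp(kmin, kmax, rmin, rmax, cautious_kmin=True):
    """Estimate 2*delta_k*delta_R/pi after putting the ranges on the grids.

    cautious_kmin=False uses the user-supplied kmin unchanged, as a
    stand-in for the less cautious behaviour discussed above.
    """
    k_lo = snap_min(kmin, K_GRID) if cautious_kmin else kmin
    k_hi = snap_max(kmax, K_GRID)
    r_lo = snap_min(rmin, R_GRID)
    r_hi = snap_max(rmax, R_GRID)
    return 2.0 * (k_hi - k_lo) * (r_hi - r_lo) / math.pi

print(f"{n_idp(2.0, 12.0, 1.0, 6.0):.4f}")                       # about 31.48
print(f"{n_idp(2.0, 12.0, 1.0, 6.0, cautious_kmin=False):.4f}")  # about 31.64
```

Under these assumptions the sketch gives n_idp = 31.4824 for the cautious case and ~31.64 when kmin=2.00 is left alone.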

In general, I agree with most of the answers: don't take n_idp too
seriously.  I would say to report it to 1 decimal place, just to
emphasize that it is understood to be an estimate and not a discrete
number.  Saying  "n_idp=12" implies "I could fit up to 12 parameters
with confidence".  It's easy enough to show this is usually NOT the
case, but some people sure like a hard and fast rule.  And, sadly,
n_idp does influence the error bars some, so it seems wise to get it
as close as we can.

That's sort of rambly, but hope that helps

--Matt

PS: For autobk/spline(), the number of knots is actually chosen as
   n = 2 * int( rbkg * (kmax-kmin)/pi ) + 1

It turns out that for the b-spline algorithm used, the number of knots
used needs to be an odd number greater than 4.  So, the formula is
convenient. Is it right? Well, it seems to work pretty well. As
fitting variables, the spline y-values are somewhat correlated, so you
might argue that you could have a few more variables.  That's easy
enough to try with
   spline(data.chi, rbkg=1, nknots=17)
In my experience, this rarely helps but it might be worth re-visiting.
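
The knot-count formula above is easy to try out by hand. The following is only a sketch of that one line in Python, not the autobk implementation; the function name autobk_nknots is hypothetical, and the example ranges are made up for illustration.

```python
import math

def autobk_nknots(rbkg, kmin, kmax):
    """Knot count from n = 2*int(rbkg*(kmax-kmin)/pi) + 1.

    The result is odd by construction; this sketch does not check
    that it is also greater than 4.
    """
    return 2 * int(rbkg * (kmax - kmin) / math.pi) + 1

# Illustrative ranges only (not taken from any particular data set).
print(autobk_nknots(1.0, 0.0, 12.0))   # int(3.82) = 3, so n = 7
print(autobk_nknots(1.0, 0.0, 25.0))   # int(7.96) = 7, so n = 15
```

Comparing the default count for your own rbkg and k-range against an explicit nknots value shows how many extra spline variables you would be adding.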

Re: [Ifeffit] statistical parameters

Matt Newville

Stefano Ciurli
