# [Ifeffit] Double dipping

Matt Newville newville at cars.uchicago.edu
Fri Nov 22 10:16:44 CST 2002

```
Hi Shelly,

> I would like to take a bunch of parameters (all of them that do
> not vary significantly from the known, as determined by
> individual fits to each data set) from this "known" data set
> and then fit the series including the "original" to see how a
> particular parameter (the number of Ca atoms) changes as the
> ratio of U to Ca changes.  I would like to hear a discussion on
> the fairness of this approach and how would you calculate the
> number of independent points?

I think what you propose is fair, but you might need to be a
little more specific....

In general, I think it's fair to use parameters from one fit in a
fit to another data set.  Once one accepts the fact that there is
a limited amount of information, getting hung up on trying to
count or enforce N_idp becomes pointless -- the data and
estimated uncertainties __WILL__ indicate how many parameters can
be determined with any confidence.

N_idp is one statistic; it should not be construed as an exact
value nor as a hard upper limit on how much information can be
extracted from the data.  Any attempt to use N_idp this way is
abuse.

The classic test is whether 100 very noisy scans (say, because
data was collected for 0.01 s per point) have 100x more
'independent points' than 1 scan collected for 1.0 s per point.
The 100 noisy scans __ARE__ independent, but when added together
they give exactly the same result as the 1 clean scan.  So having
more independent data does not mean you can determine more
parameters.

The other side of this is that I think it's more difficult than
generally acknowledged to over-interpret data.  Most of the cases
I've seen are due to blatantly ignoring the confidence limits, or
doing completely unfair things like fitting So2 to get 0.9+/-0.1,
then fixing So2 to 0.90, fitting N, and claiming N is known to
better than 10%.  Those are serious, but they are the normal
mistakes in analysis that can happen with any data.

--Matt

```
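To Shelly's literal question about counting independent points: a common Nyquist-style estimate (the one ifeffit itself reports) is N_idp = 2*dk*dR/pi, where dk and dR are the fitted k- and R-ranges. A minimal sketch, with the k- and R-range values chosen purely for illustration:

```python
from math import pi

def n_idp(kmin, kmax, rmin, rmax):
    """Nyquist-style estimate of independent points in an EXAFS fit:
    N_idp = 2 * (k-range) * (R-range) / pi.  (Stern's counting adds 2.)"""
    return 2 * (kmax - kmin) * (rmax - rmin) / pi

# e.g. a fit over k = [3, 12] A^-1 and R = [1, 3] A:
print(round(n_idp(3.0, 12.0, 1.0, 3.0), 1))  # -> 11.5
```

Since this is only an estimate of the information content, fractional values like 11.5 are normal and, per the argument above, should not be treated as a hard cutoff.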
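The "classic test" about 100 noisy scans can be checked numerically. This is a quick simulation (not from the original message, and the signal is a stand-in sine rather than real chi(k) data): 100x less counting time per point gives sqrt(100) = 10x the noise per scan, and averaging 100 such scans wins back exactly that factor of 10.

```python
import numpy as np

rng = np.random.default_rng(0)
npts, sigma_clean = 200, 0.01

signal = np.sin(np.linspace(0.0, 10.0, npts))        # stand-in for chi(k)
clean = signal + rng.normal(0.0, sigma_clean, npts)  # one 1.0 s/point scan

# 100 scans at 0.01 s/point: 100x less counting time -> 10x the noise each
noisy = signal + rng.normal(0.0, 10 * sigma_clean, (100, npts))
merged = noisy.mean(axis=0)  # averaging reduces noise by sqrt(100) = 10

print(np.std(clean - signal))   # ~0.01
print(np.std(merged - signal))  # also ~0.01: same information content
```

The merged data set is built from 100 statistically independent scans, yet its noise level -- and hence the number of parameters it can constrain -- matches the single clean scan.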
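The So2/N example can also be made concrete. Because the EXAFS amplitude constrains only the product So2*N, the relative uncertainty in N can never drop below the relative uncertainty in So2, no matter how precisely the amplitude itself is measured. A sketch with hypothetical numbers (the amplitude value and its error bar are made up for illustration):

```python
so2, d_so2 = 0.9, 0.1   # So2 = 0.9 +/- 0.1, as in the text
amp, d_amp = 5.4, 0.2   # hypothetical fitted amplitude A = So2 * N

n = amp / so2           # N = 6.0 if So2 is taken at face value
# propagate both relative uncertainties in quadrature:
d_n = n * ((d_amp / amp) ** 2 + (d_so2 / so2) ** 2) ** 0.5
print(round(n, 1), round(d_n, 2))  # -> 6.0 0.7, i.e. ~11% uncertainty in N
```

Fixing So2 at 0.90 and quoting only the fit's error bar on N silently drops the second term in the quadrature sum, which is exactly the "better than 10%" claim being criticized above.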