[Ifeffit] Question about statistics

Olga Kashurnikova okash at mail.ru
Fri Mar 6 14:46:56 CST 2015

Hello, Dr Matt Newville,
This is the second part, about the Bayesian analysis itself. I hope I am replying within the thread and not only to you. If this message ends up outside the thread, I do not know how to fix that now.
I split the question because I was not sure whether anyone had already implemented this kind of Bayesian statistics in a program, and I wanted to ask whether the approach is right at all before doing it. The same applies to the last paragraph of the message you answered: I tried to say that there are some things I cannot do that simply with IFEFFIT, and that they may be worth the work even if the simple thing is already done. I was not sure whether I should start multiple threads, because all of this is interconnected with the main question. For that part, your answer is very informative and helpful.
I thought of using Larch, but at the moment it does not have many new features for my task, and I did not find some of the old ones, such as co-fitting mu0 with the XAFS function. Ifeffit does this for a fixed number of points; I think I need to vary that number, but I do not know if it is possible. I will certainly try to learn how to use Larch, but I came across it only recently and cannot say much about it yet. Thank you for the information that it will be more extensible. I would like to ask about it once I have learned more, if possible.

About what the Bayesian analysis is for. I have a complicated system with nanodomains of fluorite and pyrochlore phases. The samples were grown under special conditions, and the metastable phases are of practical value in this case, which is why we need to understand how the structure evolves. The task was to determine the structure from XAFS spectra, but an exact characterization requires a complicated model. Diffraction gave evidence of a multicomponent system with an evolving proportion (pyrochlore domains grow inside fluorite). One of the structural features is the splitting of the oxygen coordination sphere, which must be present in pyrochlore and absent in fluorite. At the same time, the fluorite structure is defective in the sense of the dopant structure: two cation elements are mixed in the fcc sublattice and the cation-oxygen distance is different for them, so the oxygen sphere can split into, for instance, 5 spheres. I think, however, that these will be indistinguishable from a single wide Gaussian sphere, except in a cluster model with constraints. The pyrochlore splitting is more evident, but I think it needs statistical proof. Also, if I could build a more complicated multicomponent-with-dopant model (I tried to, but in a different sense, because we did not have the full anomalous diffraction data at that time), it would have to be shown to give better results.
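A minimal numerical sketch of why several closely spaced oxygen spheres can look like one wide Gaussian sphere. Everything here is an assumption made for illustration: the toy signal sin(2Rk) without real scattering amplitudes or phases, equal weights for the five sub-spheres, and the values of r_mean, offsets and sigma2_each. For small, symmetric offsets the sum of sub-spheres is close to one sphere whose sigma^2 is increased by the variance of the distances.

```python
import numpy as np

# Toy k grid in 1/Angstrom (illustrative range, not from any real data set).
k = np.linspace(3.0, 12.0, 181)

# Hypothetical values: mean distance, five equally weighted sub-sphere
# offsets (Angstrom), and a common sigma^2 (Angstrom^2) for each sub-sphere.
r_mean = 2.30
offsets = np.array([-0.04, -0.02, 0.0, 0.02, 0.04])
sigma2_each = 0.003

# Sum of five split spheres, each with the same Debye-Waller damping.
damp = np.exp(-2.0 * sigma2_each * k**2)
split = np.mean([np.sin(2.0 * (r_mean + d) * k) for d in offsets], axis=0) * damp

# One sphere at the mean distance with sigma^2 enlarged by the variance
# of the offsets (second-cumulant approximation).
sigma2_eff = sigma2_each + np.var(offsets)
single = np.sin(2.0 * r_mean * k) * np.exp(-2.0 * sigma2_eff * k**2)

# Largest pointwise difference between the two descriptions.
max_diff = np.max(np.abs(split - single))
print(f"sigma2_eff = {sigma2_eff:.4f}, max difference = {max_diff:.4f}")
```

If max_diff is below the noise level of the measured chi(k), the data alone cannot separate the two descriptions, and any distinction has to come from constraints or prior information. Asymmetric offsets would add a small phase shift that this sketch does not cover.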
Some of this information is in the correlations, but they cannot tell exactly which parameters should be eliminated from the model. Projections can.
The book by G. Larry Bretthorst, ‘Bayesian Spectrum Analysis and Parameter Estimation’, from the series ‘Lecture Notes in Statistics’ (Springer-Verlag, 1988), is not very different in concept from the Krappe and Rossner papers, which you seem to be familiar with. It is not about XAFS but about modeling other kinds of data. The author runs many tests with sums of sine functions of one or many frequencies, with or without broadening, with trend lines and so on, trying to do the calculation more theoretically, but with a small program that can be used only after minimization. He shows that two frequencies can be statistically distinguished, that is, which model is better, the one with one frequency or the one with two. I think this is very close to sphere splitting. There is also a formula for the relative probability of two models, which seems useful (he shows it can distinguish Gaussian and Lorentzian broadening), and there is a method to determine how many polynomials should be in the trend line (as for mu0, and maybe it works for the number of knots too; I tried to check this, but it needs co-fitting of mu0 and chi). I thought something like this could help me make more definite statements about the structure of my samples, which is why I tried it. For XAFS spectra, I have chosen the Krappe and Rossner method of calculation for now, because it is difficult to use tabulated amplitudes and phases in the other method, but the two are interconnected.
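A toy sketch of the one-frequency versus two-frequency comparison. It does not use Bretthorst's formula for the relative probability of two models; as a stand-in it uses the Bayesian information criterion (BIC), which is only a rough large-sample approximation to the log evidence. The synthetic data, the noise level eps, the frequency grid and the helper names (design, chi2_for, best_chi2, bic) are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def design(k, freqs):
    """Sine and cosine column for each frequency; amplitudes enter linearly."""
    cols = []
    for r in freqs:
        cols.append(np.sin(2.0 * r * k))
        cols.append(np.cos(2.0 * r * k))
    return np.column_stack(cols)


def chi2_for(k, data, eps, freqs):
    """Chi-square after solving for the linear amplitudes at fixed frequencies."""
    a = design(k, freqs)
    coef, *_ = np.linalg.lstsq(a, data, rcond=None)
    resid = data - a @ coef
    return float(np.sum(resid**2) / eps**2)


def best_chi2(k, data, eps, grid, n_freq):
    """Brute-force search over all sets of n_freq frequencies from the grid."""
    return min(chi2_for(k, data, eps, c) for c in combinations(grid, n_freq))


def bic(chi2, n_par, n_data):
    """BIC for Gaussian noise of known size eps (up to a constant)."""
    return chi2 + n_par * np.log(n_data)


# Synthetic data: two close "distances" plus white noise (hypothetical values).
rng = np.random.default_rng(0)
k = np.linspace(3.0, 14.0, 221)
eps = 0.05
data = (0.6 * np.sin(2.0 * 2.10 * k) + 0.4 * np.sin(2.0 * 2.36 * k)
        + rng.normal(0.0, eps, k.size))

grid = np.linspace(1.8, 2.6, 41)  # trial frequencies, step 0.02
chi2_1 = best_chi2(k, data, eps, grid, 1)
chi2_2 = best_chi2(k, data, eps, grid, 2)

# Each frequency brings 3 parameters: the frequency and two amplitudes.
bic_1 = bic(chi2_1, 3, k.size)
bic_2 = bic(chi2_2, 6, k.size)

# Approximate log odds in favour of the two-frequency model.
log_odds = 0.5 * (bic_1 - bic_2)
print(f"chi2_1 = {chi2_1:.1f}, chi2_2 = {chi2_2:.1f}, log odds = {log_odds:.1f}")
```

A positive log_odds favours the two-frequency model even after the penalty for the extra parameters. With a smaller separation or a higher noise level the sign can flip, which is the situation where a splitting cannot be claimed from the data.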
As for k-space: the math in the Bayesian method is for minimization of (dat-model)^2/eps^2 itself, not of its FT, and the statistical quantities (covariance matrix, projections, et al.) are calculated for it. Bretthorst says that the FT works only for pure sine functions, where it gives the same result as his analysis, but for complicated cases he insists on the other analysis. Of course, the high-frequency part of the XAFS spectrum needs to be filtered out, but I am not sure how to do that in the Bayesian sense, or how to calculate errors, correlations, projections, and model comparison in that case. I thought I could filter out the high frequencies with a rectangular window (possibly multiplying by k^2 and dividing the result by k^2, because of the XAFS formula), losing no signal except the unnecessary high-frequency part, and then model it as usual. It might be more correct to use part of the FT and transform the math, but I am not sure now how that should be done. I wanted to fit both the unfiltered and the filtered functions and compare them, because the high-frequency part is not much above the noise level and maybe it could be neglected in our case. I think the model takes into account that we have limited the R range, since the model consists only of the first coordination spheres? The k grid for this method need not be uniform, and the step is taken into account by the math of the Bayesian method, or am I mistaken? The N_idp information in this case is, I think, given by the parameter projections and the other definition of “what part of parameter space is defined by spectra and what by a priori information” (© Krappe and Rossner)?
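A minimal sketch of the k-space quantities discussed above, on a non-uniform k grid. The single-sphere toy model, the noise level eps, the prior widths and the names model, chi2 and jacobian are assumptions for illustration, and the Gaussian prior is a generic one, not necessarily the exact formulation of Krappe and Rossner. Because chi-square is a plain sum over points, nothing in it requires an even grid, provided eps is the independent noise at each point.

```python
import numpy as np


def model(k, amp, r, sigma2):
    """Toy single-sphere signal; no real scattering amplitudes or phases."""
    return amp * np.sin(2.0 * r * k) * np.exp(-2.0 * sigma2 * k**2)


def chi2(params, k, data, eps):
    """Sum of (dat - model)^2 / eps^2 over the points, whatever their spacing."""
    return float(np.sum((data - model(k, *params)) ** 2 / eps**2))


def jacobian(params, k, h=1e-6):
    """Central-difference derivatives of the model w.r.t. each parameter."""
    params = np.asarray(params, dtype=float)
    jac = np.empty((k.size, params.size))
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = h
        jac[:, j] = (model(k, *(params + step)) - model(k, *(params - step))) / (2 * h)
    return jac


# Synthetic data on an uneven k grid (hypothetical values throughout).
rng = np.random.default_rng(1)
k = np.sort(rng.uniform(3.0, 14.0, 150))
eps = 0.02
data = model(k, 1.0, 2.2, 0.004) + rng.normal(0.0, eps, k.size)

# A few Gauss-Newton steps from a nearby starting guess.
start = np.array([0.9, 2.19, 0.005])
fit = start.copy()
for _ in range(8):
    jac = jacobian(fit, k)
    delta, *_ = np.linalg.lstsq(jac, data - model(k, *fit), rcond=None)
    fit = fit + delta

chi2_start = chi2(start, k, data, eps)
chi2_fit = chi2(fit, k, data, eps)

# Data-only information matrix (Hessian of chi2/2) and its covariance.
jac = jacobian(fit, k)
q = jac.T @ jac / eps**2
cov = np.linalg.inv(q)

# Generic Gaussian prior with assumed widths for amp, r, sigma2.
prior_sigma = np.array([0.5, 0.05, 0.002])
a = np.diag(1.0 / prior_sigma**2)
post_cov = np.linalg.inv(q + a)

# Effective number of parameters determined by the data rather than the prior.
n_eff = float(np.trace(q @ post_cov))
print(f"chi2: {chi2_start:.1f} -> {chi2_fit:.1f}, n_eff = {n_eff:.4f} of 3")
```

Here n_eff is close to 3 because the synthetic data are far more informative than the assumed prior. When a parameter is poorly constrained by the spectrum, its contribution to n_eff drops towards zero and its posterior width approaches the prior width.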
I said that Bayesian analysis with a Fourier-filtered spectrum ‘is hard to do’ (maybe I left out a word) because I did not find in the literature how it should be done, and I could not solve this problem myself. I am not sure I have the experience for it, though I have thought about it and will keep searching for a solution. I thought the statistical method that IFEFFIT uses and Bayesian statistics are somewhat different in concept, so I cannot simply use it here, but I may be mistaken. Maybe there are books I have not found at this stage; maybe the chi_square for the transformed signal can work as well? I am a novice (who has found a useful algorithm in papers) and am not sure about it, but maybe it cannot all be done at once, which is why I am trying to simplify the task, to get some valuable structural information and apply an existing method first.
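On the question of chi_square for the transformed signal, one fact can be checked numerically: for a unitary discrete Fourier transform on a uniform grid, the sum of squared residuals is the same in k and in R (Parseval's theorem). The sketch below is a toy under stated assumptions (synthetic data, white noise of known size eps, a model equal to the noise-free signal, a hypothetical cut-off r_max). It says nothing about how IFEFFIT itself defines its statistics.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 256
k = np.linspace(2.0, 14.0, n)  # uniform grid, as the FFT requires
eps = 0.05

# Synthetic data and a model that matches the noise-free signal.
signal = np.sin(2.0 * 2.2 * k)
data = signal + rng.normal(0.0, eps, n)
resid = data - signal

# Chi-square evaluated directly in k-space.
chi2_k = np.sum(resid**2) / eps**2

# Chi-square over all components of the unitary DFT of the residual.
resid_r = np.fft.fft(resid, norm="ortho")
chi2_r_full = np.sum(np.abs(resid_r) ** 2) / eps**2

# sin(2 R k) has R / pi cycles per unit k, so R = pi * |frequency|.
dk = k[1] - k[0]
r_axis = np.pi * np.abs(np.fft.fftfreq(n, d=dk))

# Keep only components with R <= r_max (rectangular window in R).
r_max = 4.0
band = r_axis <= r_max
chi2_r_band = np.sum(np.abs(resid_r[band]) ** 2) / eps**2
n_band = int(band.sum())

print(f"chi2 in k: {chi2_k:.2f}, chi2 over all R: {chi2_r_full:.2f}")
print(f"chi2 for R <= {r_max}: {chi2_r_band:.2f} from {n_band} of {n} components")
```

The full-range values agree to rounding error, so the transform by itself changes nothing. Restricting the R range drops terms from the sum, and the band-limited chi-square then has to be judged against the number of retained components rather than against the number of k points.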

Maybe this answers why I need k-space and all these renormalizations and extra work.

Thank you for your assistance,
Olga Kashurnikova, MEPhI, Moscow
