[Ifeffit] database

Matt Newville newville at cars.uchicago.edu
Mon Aug 22 10:42:29 CDT 2005


Hi Michel,

> I may be just a nuisance, 

Not at all....

> but wouldn't it be interesting to contact people currently
> developing mineralogical databases (e.g., from the
> Mineralogical Society of America) to have some insight on how
> their database is organized? That may save a lot of time.

Good idea!  I started looking at the MinSocAm database: 
  http://www.minsocam.org/MSA/Crystal_Database.html
as well as
  http://www.crystallography.net/
and the Protein Data Bank:
  http://www.rcsb.org/pdb/

I'm definitely inclined to use a MySQL database for storing 
data and a web interface for retrieving and adding data.  I 
believe that's what these other databases are doing as well. 
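
A minimal sketch of the store-then-retrieve idea, under stated
assumptions: it uses Python's built-in sqlite3 module as a stand-in
for MySQL so that it runs without a server, and the table name,
column names, and the helpers create_spectra_db, add_spectrum, and
get_spectrum are all hypothetical, not an agreed schema.

```python
import sqlite3


def create_spectra_db(path=":memory:"):
    """Open (or create) a database holding one illustrative table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS spectrum (
               id INTEGER PRIMARY KEY,
               sample TEXT NOT NULL,
               beamline TEXT,
               data TEXT NOT NULL
           )"""
    )
    return conn


def add_spectrum(conn, sample, beamline, energies, mu):
    """Store one spectrum; the two columns are kept as plain text rows."""
    data = "\n".join(f"{e} {m}" for e, m in zip(energies, mu))
    cur = conn.execute(
        "INSERT INTO spectrum (sample, beamline, data) VALUES (?, ?, ?)",
        (sample, beamline, data),
    )
    conn.commit()
    return cur.lastrowid


def get_spectrum(conn, spectrum_id):
    """Return one stored spectrum as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT sample, beamline, data FROM spectrum WHERE id = ?",
        (spectrum_id,),
    ).fetchone()
    if row is None:
        return None
    sample, beamline, data = row
    pairs = [line.split() for line in data.splitlines()]
    return {
        "sample": sample,
        "beamline": beamline,
        "energies": [float(p[0]) for p in pairs],
        "mu": [float(p[1]) for p in pairs],
    }
```

A web interface would sit on top of calls like these; the point here
is only that the stored form need not match any download format.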

> I think the simplest way is to separate the "general
> information" field from the "data field". The general
> information may be entered as a standard fill-in submission
> page, and this information would then be entered as commented
> lines (e.g., starting with "#") at the beginning of the
> download files (pardon me if you think this is obvious!).

I think it may be best to forget about 'obvious' at this point,
so we don't accidentally get stuck in a rut of doing what is easy
but not good enough, or what is merely close to what we're doing
now.  We should think more carefully about what information is
important enough to go into a file, and especially what needs to
be understood by the computer.  I like the idea of having formal
specification(s) for data file formats, but at this point I'm
more concerned about what information should be stored than how
to present it.

Bruce's and Ken's IXSIF seems like a good first draft. I have a
few minor issues with it, but it's a very good start.  I wouldn't
want to _store_ the data that way, but I think it's definitely
desirable to read and write to such a Standard Format (once it's
defined).

> Among the information to be added, I suggest a few more
> technical hints, like:
> - Synchrotron/beamline/station (or "calculated")
> - Crystal (e.g., Si(111))
> - detection mode
> - raw spectrum/average of xxx spectra

Yep, I think so too.
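
To make the "general information as commented lines" idea concrete,
here is a small sketch that writes such fields as "#" header lines
ahead of the data columns and reads them back. The "# key: value"
layout, the key names, and the functions write_spectrum_text and
parse_spectrum_text are assumptions for illustration only; this is
not IXSIF or any defined Standard Format.

```python
def write_spectrum_text(meta, rows):
    """Return file text: '# key: value' header lines, then data rows."""
    lines = [f"# {key}: {value}" for key, value in meta.items()]
    lines += [" ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


def parse_spectrum_text(text):
    """Split file text back into a metadata dict and numeric rows."""
    meta, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line[1:].partition(":")
            if sep:
                meta[key.strip()] = value.strip()
        else:
            rows.append([float(v) for v in line.split()])
    return meta, rows


# Hypothetical field names, taken loosely from the list quoted above.
example_meta = {
    "facility": "calculated",
    "crystal": "Si(111)",
    "detection mode": "transmission",
    "spectra averaged": "1",
}
example_text = write_spectrum_text(example_meta, [[8970.0, 0.1], [8980.0, 0.9]])
```

Because the header is machine-readable as well as human-readable,
the same fields could be filled in from a submission form and then
round-tripped through downloaded files.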
 
> Other questions:
> - Is it advisable to accept only "raw" data (including glitches), or will
>    some kind of preprocessing be allowed?
> - Would it be advisable to rank the spectra according to their quality
>    (as for the powder diffraction files)?

I think that allowing pre-processing _is_ a good idea.  I also
think that having a quality ranking is a good idea.

I'll try to post more on this in the next couple days.

--Matt



