Hi Michel,
I may be just a nuisance,
Not at all....
but wouldn't it be interesting to contact people currently developing mineralogical databases (e.g., from the Mineralogical Society of America) to get some insight into how their database is organized? That may save a lot of time.
Good idea! I started looking at the MinSocAm database ( http://www.minsocam.org/MSA/Crystal_Database.html ), as well as http://www.crystallography.net/ and the Protein Data Bank ( http://www.rcsb.org/pdb/ ). I'm definitely inclined to use a MySQL database for storing data and a web interface for retrieving and adding data. I believe that's what these other databases are doing as well.
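To make the storage idea a bit more concrete, here is a minimal sketch of what a relational layout might look like. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table names (spectrum, point) and column names (sample, submitter, energy, intensity) are assumptions made up for illustration, not a proposed schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per spectrum, many data points per spectrum.
conn.executescript("""
CREATE TABLE spectrum (
    id        INTEGER PRIMARY KEY,
    sample    TEXT NOT NULL,
    submitter TEXT
);
CREATE TABLE point (
    spectrum_id INTEGER NOT NULL REFERENCES spectrum(id),
    energy      REAL NOT NULL,
    intensity   REAL NOT NULL
);
""")

# Adding data: what a web submission page might do behind the scenes.
cur = conn.execute(
    "INSERT INTO spectrum (sample, submitter) VALUES (?, ?)",
    ("example sample", "example submitter"),
)
spectrum_id = cur.lastrowid
conn.executemany(
    "INSERT INTO point (spectrum_id, energy, intensity) VALUES (?, ?, ?)",
    [(spectrum_id, 7100.0, 0.12), (spectrum_id, 7101.0, 0.15)],
)

# Retrieving data: fetch one spectrum's points in energy order.
points = conn.execute(
    "SELECT energy, intensity FROM point WHERE spectrum_id = ? ORDER BY energy",
    (spectrum_id,),
).fetchall()
```

Keeping the descriptive information in one table and the numbers in another means the stored form is independent of whatever file format is used for download.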
I think the simplest way is to separate the "general information" field from the "data field". The general information may be entered through a standard fill-in submission page, and this information would then be written as commented lines (e.g., starting with "#") at the beginning of the download files (pardon me if you think this is obvious!).
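As a rough sketch of that layout, assuming a simple "# key: value" convention for the commented lines and whitespace-separated numeric columns for the data (both conventions, the function names, and the example field names are hypothetical, not part of any agreed format):

```python
import io


def write_with_header(info, rows, out):
    """Write general information as '#' comment lines, then the data rows."""
    for key, value in info.items():
        out.write(f"# {key}: {value}\n")
    for row in rows:
        out.write(" ".join(f"{x:g}" for x in row) + "\n")


def read_with_header(lines):
    """Split lines back into the general information and the numeric data."""
    info, rows = {}, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Everything before the first ':' is the key, the rest the value.
            key, _, value = line[1:].partition(":")
            info[key.strip()] = value.strip()
        else:
            rows.append([float(x) for x in line.split()])
    return info, rows


# Hypothetical general information and two made-up data rows.
info = {"beamline": "hypothetical-BL", "crystal": "Si(111)"}
rows = [(7100.0, 0.12), (7101.0, 0.15)]

buf = io.StringIO()
write_with_header(info, rows, buf)
text = buf.getvalue()

parsed_info, parsed_rows = read_with_header(text.splitlines())
```

The point of the round trip is that the general information survives the download and can be read back by a program, not only by a person.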
I do think that it may be best to forget about 'obvious' at this point, so we don't accidentally get stuck in a rut of doing what is easy but not good enough, or what is close to what we're doing now. I do think we should think more carefully about what information is important to put into a file, and especially what needs to be understood by the computer. I like the idea of having formal specification(s) for data file formats, but at this point I'm more concerned about what information should be stored than about how to present it. Bruce's and Ken's IXSIF seems like a good first draft. There are a few minor things in it that I don't like, but it's a very good start. I wouldn't want to _store_ the data that way, but I think it's definitely desirable to read and write to such a Standard Format (once it's defined).
Among the information to be added, I suggest a few more technical hints, like:
- Synchrotron/beamline/station (or "calculated")
- Crystal (e.g., Si(111))
- detection mode
- raw spectrum/average of xxx spectra
Yep, I think so too.
Other questions:
- Is it advisable to accept only "raw" data (including glitches), or will some kind of preprocessing be allowed?
- Would it be advisable to rank the spectra according to their quality (as for the powder diffraction files)?
I think that allowing pre-processing _is_ a good idea. I also think that having a quality ranking is a good idea. I'll try to post more on this in the next couple days.

--Matt