RE: [Ifeffit] database
At 21:18 on 19/08/2005, you wrote:
Carlo, Thanks for the offer of help!
Dave, thanks for the suggestions, I think those are all good.
In fact, I think none of this is that hard to do, but it will take some time. For what it's worth, I think the data should be stored in a 'proper' relational database, if for no other reason than to make it easier and faster to search. It will also make many of the 'fancier' features (user comments) easier to implement.
I may be just a nuisance, but wouldn't it be interesting to contact people currently developing mineralogical databases (e.g., from the Mineralogical Society of America) to get some insight into how their database is organized? That may save a lot of time.
At this point, I think the most immediate need is (as Dave mentioned yesterday) to identify a reasonable format for the data at both upload time and at download time. For upload, we'd certainly want a web form where fields could be filled in or parsed from a file following a 'Standard Data File Format'.
I think the simplest way is to separate the "general information" field from the "data field". The general information may be entered on a standard fill-in submission page, and this information would then be written as comment lines (e.g., starting with "#") at the beginning of the download files (pardon me if you think this is obvious!). Among the information to be added, I suggest a few more technical hints, like:
- Synchrotron/beamline/station (or "calculated")
- Crystal (e.g., Si(111))
- detection mode
- raw spectrum/average of xxx spectra
Other questions:
- Is it advisable to accept only "raw" data (including glitches), or will some kind of preprocessing be allowed?
- Would it be advisable to rank the spectra according to their quality (as for the powder diffraction files)?
Best regards,
Michel
--
Michel Schlegel
Commissariat à l'énergie atomique
CEN de Saclay, DEN/DPC/SCP/LRSI
Bat 391 - Piece 205B
F91 191 Gif-sur-Yvette Cedex, France
Ph: +33 (0)1 69 08 93 84
Fax: +33 (0)1 69 08 54 11
Hi Michel,
I may be just a nuisance,
Not at all....
but wouldn't it be interesting to contact people currently developing mineralogical databases (e.g., from the Mineralogical Society of America) to get some insight into how their database is organized? That may save a lot of time.
Good idea! I started looking at the MinSocAm database:
http://www.minsocam.org/MSA/Crystal_Database.html
as well as
http://www.crystallography.net/
and the Protein Data Bank:
http://www.rcsb.org/pdb/
I'm definitely inclined to use a MySQL database for storing data and a web interface for retrieving and adding data. I believe that's what these other databases are doing as well.
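As a rough illustration of the relational layout discussed here, the sketch below keeps the searchable metadata in one table and user comments in another. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and every table name, column name, and sample value is hypothetical; nothing in it has been agreed in this thread.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per uploaded spectrum (column names are assumptions).
        CREATE TABLE spectrum (
            id             INTEGER PRIMARY KEY,
            element        TEXT NOT NULL,
            edge           TEXT NOT NULL,
            sample         TEXT NOT NULL,
            beamline       TEXT,
            crystal        TEXT,
            detection_mode TEXT,
            quality        INTEGER
        );
        -- User comments attached to a spectrum.
        CREATE TABLE comment (
            id          INTEGER PRIMARY KEY,
            spectrum_id INTEGER NOT NULL REFERENCES spectrum(id),
            author      TEXT NOT NULL,
            body        TEXT NOT NULL
        );
        """
    )
    return conn


def find_spectra(conn, element, edge):
    """Return (id, sample, beamline) for every spectrum of one element and edge."""
    cur = conn.execute(
        "SELECT id, sample, beamline FROM spectrum "
        "WHERE element = ? AND edge = ? ORDER BY id",
        (element, edge),
    )
    return cur.fetchall()
```

With a layout like this, searching by element and edge is a single indexed query, and comments can be added later without touching the stored spectra.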
I think the simplest way is to separate the "general information" field from the "data field". The general information may be entered on a standard fill-in submission page, and this information would then be written as comment lines (e.g., starting with "#") at the beginning of the download files (pardon me if you think this is obvious!).
I do think that it may be best to forget about 'obvious' at this point, so we don't accidentally get stuck in a rut of doing what is easy but not good enough or what is close to what we're doing now. I do think we should think more carefully about what is the important information to go into a file, and especially what needs to be understood by the computer. I like the idea of having formal specification(s) for data file formats, but at this point I'm more concerned about what information should be stored rather than how to present it. Bruce's and Ken's IXSIF seems like a good first draft. I have some minor issues that I sort of don't like, but it's a very good start. I wouldn't want to _store_ the data that way, but I think it's definitely desirable to read and write to such a Standard Format (once it's defined).
Among the information to be added, I suggest a few more technical hints, like:
- Synchrotron/beamline/station (or "calculated")
- Crystal (e.g., Si(111))
- detection mode
- raw spectrum/average of xxx spectra
Yep, I think so too.
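To make the "commented header plus data" idea concrete, here is a minimal sketch of how a download file could be assembled from the submission-form fields. The function name, the "key: value" header layout, the two-column energy/mu data, and the sample values are all assumptions made for illustration; they are not a proposed standard.

```python
def format_download(metadata, rows):
    """Build the text of a download file.

    metadata: dict of general-information fields from the submission form.
    rows: iterable of (energy, mu) pairs making up the data field.
    """
    # General information goes first, one "#"-prefixed comment line per field.
    lines = [f"# {key}: {value}" for key, value in metadata.items()]
    # Column labels, also written as a comment (hypothetical layout).
    lines.append("# energy  mu")
    # The data field follows as plain whitespace-separated columns.
    for energy, mu in rows:
        lines.append(f"{energy:.2f}  {mu:.6f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = format_download(
        {"Crystal": "Si(111)", "Detection mode": "transmission"},
        [(7112.0, 0.5), (7113.0, 0.75)],
    )
    print(example, end="")
```

Because every header line starts with "#", programs that only want the numbers can skip the header without knowing anything about its fields.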
Other questions:
- Is it advisable to accept only "raw" data (including glitches), or will some kind of preprocessing be allowed?
- Would it be advisable to rank the spectra according to their quality (as for the powder diffraction files)?
I think that allowing pre-processing _is_ a good idea. I also think that having a quality ranking is a good idea. I'll try to post more on this in the next couple days.
--Matt
On Monday 22 August 2005 10:42, Matt Newville wrote:
Bruce's and Ken's IXSIF seems like a good first draft. I have some minor issues that I sort of don't like, but it's a very good start. I wouldn't want to _store_ the data that way, but I think it's definitely desirable to read and write to such a Standard Format (once it's defined).
Matt makes a really important point in regard to my earlier post. The file format that Ken and I are suggesting should not reflect the internal structure of a good database application. The database should be ... well ... a database. The thing that we are suggesting is an interchange format. I.e. it is the file that the database app might write if the user requests a file. It might also be the file that data acquisition software writes when a scan is completed at the beamline. It might also be the file that a program like Athena writes when an output column data file is generated. As such, and with libraries written in several popular languages, it would be easy for data analysis software to import and it would be easy for the database application under discussion to import.
B
--
Bruce Ravel ----------------------------------- bravel@anl.gov -or- ravel@phys.washington.edu
Environmental Research Division, Building 203, Room E-165
Argonne National Laboratory    phone and voice mail: (1) 630 252 5033
Argonne IL 60439, USA          fax: (1) 630 252 9793
My homepage: http://feff.phys.washington.edu/~ravel
EXAFS software: http://feff.phys.washington.edu/~ravel/software/exafs/
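The import side of an interchange format can be sketched in a few lines. The reader below assumes a deliberately simple layout ("#" comment lines holding "key: value" pairs, followed by whitespace-separated numeric columns); that layout and the function name are hypothetical and are not the format Bruce and Ken have drafted.

```python
def parse_interchange(text):
    """Split file text into (metadata dict, list of numeric rows).

    Assumes header lines start with "#" and data lines hold only numbers.
    """
    metadata = {}
    data = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            # Keep only comment lines of the form "# key: value".
            key, sep, value = line[1:].partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
            continue
        # Everything else is treated as one row of numeric columns.
        data.append(tuple(float(field) for field in line.split()))
    return metadata, data
```

A small library like this, written once per language, is what would let acquisition software, analysis programs, and the database application all exchange the same files while the database keeps its own internal storage.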
participants (3)
- Bruce Ravel
- Matt Newville
- Michel Schlegel