XDI: XAFS Data Interchange Format
The XAFS Data Interchange Format, or XDI, was first conceived by Bruce Ravel, discussed at the Q2XAFS 2011 meeting, and described in the [Ravel et al. (2012)] article from a working group formed at that meeting. The format was then further refined and presented at the Q2XAFS 2015 meeting (XDI Q2XAFS2015 presentation), and then published by [Ravel and Newville (2016)] in the proceedings of the 2015 XAFS Conference. Details of the format, including examples, code for reading data, and the XDI metadata dictionary, are given at the XDI GitHub repository.
Overview of the XDI format
The XDI standard represents a single XAS spectrum along with relevant metadata in a text-based format with simple syntax for metadata describing the measurement and a single data table that is meant to be easily read and interpreted by both humans and software. The members of the working groups formed to discuss data formats for exchanging XAS data and the authors of the XDI format and support software encourage using XDI as the “standard format” for communicating XAS data whenever possible.
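As a rough illustration of the "metadata header plus one data table" layout described above, the following sketch reads a small XDI-like text. The sample content, the function name read_xdi_like, and the simplified parsing rules are assumptions made for this example; the reference readers in the XDI repository should be preferred for real files.

```python
# Minimal sketch of reading an XDI-like text file (illustrative only).
# Assumptions: header lines start with "#", metadata lines look like
# "# Namespace.tag: value", "# ///" starts free-form comments, and
# "# ---" ends the header and is followed by one line of column labels.

SAMPLE = """\
# XDI/1.0 Example/0.1
# Column.1: energy eV
# Column.2: i0
# Column.3: itrans
# Element.symbol: Cu
# Element.edge: K
# Mono.d_spacing: 3.13553
# ///
# free-form user comment: not parsed as metadata
# ---
# energy i0 itrans
8970.0 10000.0 5000.0
8980.0 10010.0 3000.0
8990.0 10005.0 2500.0
"""


def read_xdi_like(text):
    """Return (metadata dict, column labels, list of numeric rows)."""
    meta, labels, rows = {}, [], []
    in_comments = False   # True after the "///" line
    header_done = False   # True after the "---" line
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line[1:].strip()
            if body.startswith("///"):
                in_comments = True
            elif body.startswith("---"):
                header_done = True
            elif header_done:
                labels = body.split()          # column label line
            elif not in_comments and ":" in body:
                key, value = body.split(":", 1)
                meta[key.strip()] = value.strip()
            continue
        rows.append([float(v) for v in line.split()])
    return meta, labels, rows


meta, labels, rows = read_xdi_like(SAMPLE)
print(meta["Element.symbol"], meta["Element.edge"], labels, len(rows))
```

The metadata ends up in a flat dictionary keyed by the full tag, and the data table is a list of rows that can be transposed into named arrays using the column labels.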
The XDI format is focused on representing the measured XAS spectrum
\(\mu(E)\), which will typically consist of a few 1-dimensional arrays of
numerical data that may include: energy, the energy of the incident X-ray
beam; i0, the measured intensity of the incident beam; itrans, the measured
intensity of the transmitted beam; ifluor, the measured intensity of the
fluorescent or emission beam; and irefer, the measured intensity of the beam
after a reference sample, often used for energy calibration. The intensity
values have unspecified units, and not all of them may be available for every
spectrum. In addition, the arrays mutrans, mufluor, and murefer may contain
the measurement of \(\mu(E)\) for the various signals. The energy will
typically have units of eV or keV but can also be derived from an angle array
that represents the angle (in degrees) of the monochromator, as long as its
d-spacing is available. With these arrays available, XDI gives ample
flexibility to represent a single XAS spectrum.
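The relations between these arrays can be sketched in a few lines. The example below assumes the conventional definitions \(\mu = \ln(I_0/I_t)\) for transmission and \(\mu \propto I_f/I_0\) for fluorescence, and first-order Bragg diffraction, \(E = hc/(2d\sin\theta)\), for converting a monochromator angle to energy. The function names and sample numbers are hypothetical and are not part of the XDI specification.

```python
import math

# h*c in eV*Angstrom, so a d-spacing in Angstrom gives an energy in eV
HC_EV_ANGSTROM = 12398.419843


def angle_to_energy(theta_deg, d_spacing):
    """First-order Bragg's law: E = hc / (2 d sin(theta)), theta in degrees."""
    return HC_EV_ANGSTROM / (2.0 * d_spacing * math.sin(math.radians(theta_deg)))


def mu_transmission(i0, itrans):
    """mutrans = ln(i0 / itrans), point by point."""
    return [math.log(a / b) for a, b in zip(i0, itrans)]


def mu_fluorescence(i0, ifluor):
    """mufluor = ifluor / i0, point by point (arbitrary scale)."""
    return [f / a for a, f in zip(i0, ifluor)]


# Hypothetical values: Si(111) d-spacing of about 3.13553 Angstrom,
# and an angle near the Cu K edge.
energy = angle_to_energy(12.72, 3.13553)
mutrans = mu_transmission([10000.0, 10010.0], [5000.0, 3000.0])
mufluor = mu_fluorescence([10000.0, 10010.0], [120.0, 450.0])
print(round(energy, 1), mutrans, mufluor)
```

Because the intensities have unspecified units, only the shape of the resulting \(\mu(E)\) is meaningful here, not its absolute scale.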
Some limitations of XDI
Because XDI focuses on a single XAS spectrum of \(\mu(E)\), it is somewhat limited for describing all variations of XAS data. Most obviously, it contains only one spectrum, and cannot represent a series of related spectra. It is also not very specific about how to handle fluorescence or emission XAS data measured with multi-element detectors, especially those that use pulse-counting electronics and may need dead-time corrections. And while it can be used for HERFD XAS data, it is not prescriptive about how to specify the energy window selected by the HERFD analyzers, or the ranges of \(q\) values for X-ray Raman data. XDI is also not really capable of describing multi-dimensional XAS data, such as data measured in full-field mode or by scanning through a RIXS plane.
While some of these limitations are understandable for a text-based format
aimed at representing a single XAS measurement of “good data on a
well-characterized sample”, they also mean that XDI is not easily capable of
representing as-measured (or “Raw”) data. While many beamlines do use XDI-like
text files and even use its conventions for tagging metadata, it seems that
exact adherence to the XDI metadata tags is not universal (even at my own
beamline!). Importantly, for “Raw” as-measured data files, it is not always
known whether the data is best measured in Transmission or Emission mode,
as many (it appears even “most”) beamlines will measure all available channels
even when some of them contain poor or no measurements at all. For fluorescence
data measured with multi-element solid-state detectors, a single ifluor
channel may not be available, as multiple channels would need to be corrected
for dead-time and summed. The “Raw” data may therefore need some light
processing before the proper arrays of data to communicate as “the spectrum”
can be identified. While that processing may not be too complex, it likely
requires specific knowledge of the columns in the data table (for example,
which columns hold InputCountRate and OutputCountRate for each detector
element). That means that while the derived ifluor or mufluor in the XDI
specification is still preferred, the full data table of the “Raw” data is
still interesting and worth preserving and transmitting.
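The "light processing" for a multi-element detector might look like the sketch below. It assumes the simplest common dead-time model, in which each element's counts are scaled by InputCountRate / OutputCountRate at every energy point before summing; actual detectors and beamlines may need a different correction. The function name and the sample numbers are hypothetical.

```python
def sum_deadtime_corrected(counts, icr, ocr):
    """Sum per-element fluorescence counts after a simple dead-time correction.

    counts, icr, ocr: one list per detector element, each holding one value
    per energy point. Each count is scaled by icr/ocr (assumed model); if
    ocr is zero for a point, that count is left uncorrected.
    """
    npts = len(counts[0])
    total = [0.0] * npts
    for c_elem, icr_elem, ocr_elem in zip(counts, icr, ocr):
        for i in range(npts):
            factor = icr_elem[i] / ocr_elem[i] if ocr_elem[i] > 0 else 1.0
            total[i] += c_elem[i] * factor
    return total


# Two detector elements, two energy points (made-up numbers).
counts = [[100.0, 200.0], [110.0, 190.0]]
icr = [[1000.0, 2000.0], [1000.0, 2000.0]]
ocr = [[800.0, 1600.0], [1000.0, 1000.0]]

ifluor = sum_deadtime_corrected(counts, icr, ocr)
print(ifluor)
```

The point of the example is that the correction needs to know which columns belong to which detector element, which is exactly the column-level knowledge that a derived ifluor array no longer carries.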
Even with some beamlines collecting “Raw” XAS data into “XDI-like” files, many of these regularly use multi-element fluorescence detectors and will record many channels (up to 100 is not unusual) for every energy point in a typical XAS scan. While there are certainly advantages to plain-text files (more on this below), a file with 100 columns is not easily described as “human readable”.
That is, while the XDI format gives a simple and very good way to represent a single \(\mu(E)\) spectrum, it is less obvious how best to archive and disseminate multiple raw XAS spectra, particularly those measured with multi-element fluorescence detectors. These two limitations do complicate the presentation of XAS data for on-line databases, supplemental material, and downloadable archives of data.