XDI: XAFS Data Interchange Format

The XAFS Data Interchange Format, or XDI format, was first conceived by Bruce Ravel, discussed at the Q2XAFS 2011 meeting, and described in the [Ravel et al. (2012)] article from a working group formed at that meeting. The format was then further refined and presented at the Q2XAFS 2015 meeting (XDI Q2XAFS2015 presentation), and then published by [Ravel and Newville (2016)] in the proceedings of the 2015 XAFS Conference. Details of the format, including examples, code for reading data, and the XDI metadata dictionary, are given at the XDI github repository.

Overview of the XDI format

The XDI standard represents a single XAS spectrum along with relevant metadata in a text-based format. It uses a simple syntax for the metadata describing the measurement and a single data table, and is meant to be easily read and interpreted by both humans and software. The members of the working groups formed to discuss data formats for exchanging XAS data, together with the authors of the XDI format and support software, encourage using XDI as the “standard format” for communicating XAS data whenever possible.
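To make the layout concrete, here is a minimal sketch of reading such a file in Python. It assumes a simplified view of the syntax: header lines start with `#`, metadata lines look like `Namespace.tag: value`, and the last comment line before the numbers holds the column labels. The sample text and the function name `read_xdi_like` are invented for illustration, and the tag names are only meant to resemble those in the XDI dictionary; the XDI github repository remains the authoritative source for the syntax and for full-featured readers.

```python
# Invented sample text in the spirit of an XDI file (not real data).
SAMPLE = """\
# XDI/1.0
# Column.1: energy eV
# Column.2: i0
# Column.3: itrans
# Element.symbol: Cu
# Element.edge: K
# Mono.d_spacing: 3.13553
#---
# energy i0 itrans
8970.0 1000.0 500.0
8980.0 1000.0 250.0
"""


def read_xdi_like(text):
    """Parse XDI-style text into (metadata dict, column labels, data rows).

    Simplified illustration only: it ignores version checks, user comments,
    and the validation that a real XDI reader performs.
    """
    metadata, labels, rows = {}, [], []
    last_comment = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line.lstrip("#").strip()
            # Treat "Namespace.tag: value" comment lines as metadata.
            name, sep, value = body.partition(":")
            name = name.strip()
            if sep and "." in name and " " not in name:
                metadata[name] = value.strip()
            last_comment = body
        else:
            if not rows:
                # The final comment line before the data holds the labels.
                labels = last_comment.split()
            rows.append([float(v) for v in line.split()])
    return metadata, labels, rows


metadata, labels, rows = read_xdi_like(SAMPLE)
```

After this call, `metadata` maps tag names to their string values, `labels` lists the column names, and `rows` holds one list of floats per energy point.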

The XDI format is focused on representing the measured XAS spectrum \(\mu(E)\), which will typically consist of a few 1-dimensional arrays of numerical data. These may include: energy, the energy of the incident X-ray beam; i0, the measured intensity of the incident beam; itrans, the measured intensity of the transmitted beam; ifluor, the measured intensity of the fluorescent or emission beam; and irefer, the measured intensity of the beam after a reference sample, often used as an energy calibration. All of the intensity values are left in unspecified units, and not all of them will be available for every spectrum. In addition, the arrays mutrans, mufluor, and murefer may contain the measurement of \(\mu(E)\) for the various signals. The energy will typically have units of eV or keV, but can also be derived from an angle array that represents the angle (in degrees) of the monochromator, as long as its d-spacing is available. With these arrays available, XDI gives ample flexibility to represent a single XAS spectrum.
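As a rough illustration of how these arrays relate to one another, the sketch below uses the conventional textbook relations: \(\mu(E) \propto \ln(I_0/I_t)\) for transmission, \(\mu(E) \propto I_f/I_0\) for fluorescence, and Bragg's law \(E = hc/(2d\sin\theta)\) to convert a monochromator angle to energy. These relations are not taken from the XDI specification itself. The function names are hypothetical, and the d-spacing is assumed to be in Ångstroms.

```python
import math

HC_EV_ANGSTROM = 12398.42  # h*c in eV*Angstrom, rounded


def mu_transmission(i0, itrans):
    """mu(E) from transmission: ln(i0 / itrans), point by point."""
    return [math.log(a / t) for a, t in zip(i0, itrans)]


def mu_fluorescence(i0, ifluor):
    """mu(E) from fluorescence: ifluor / i0, point by point."""
    return [f / a for a, f in zip(i0, ifluor)]


def energy_from_angle(angle_deg, d_spacing):
    """Energy in eV from Bragg angles in degrees and a d-spacing in Angstroms."""
    return [
        HC_EV_ANGSTROM / (2.0 * d_spacing * math.sin(math.radians(theta)))
        for theta in angle_deg
    ]


# Made-up numbers, for illustration only.
i0 = [1000.0, 1000.0]
itrans = [500.0, 250.0]
mutrans = mu_transmission(i0, itrans)
energy = energy_from_angle([12.5, 12.4], d_spacing=3.13553)
```

Because the intensities are in unspecified units, the resulting \(\mu(E)\) is only defined up to an overall scale and offset, which is normally handled later during normalization.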

Some limitations of XDI

Because XDI focuses on a single XAS spectrum of \(\mu(E)\), it is somewhat limited for describing all variations of XAS data. Most obviously, it contains only one spectrum, and cannot represent a series of related spectra. It is also not very specific about how to handle fluorescence or emission XAS data measured with multi-element detectors, especially those that use pulse-counting electronics and may need dead-time corrections. And while it can be used for HERFD XAS data, it is not prescriptive about how to specify the energy window selected by the HERFD analyzers, or the ranges of \(q\) values for X-ray Raman data. XDI is also not really capable of describing multi-dimensional XAS data, say measured in full-field mode or by scanning through a RIXS plane.

While some of these limitations are understandable for a text-based format aimed at representing a single XAS measurement of “good data on a well-characterized sample”, they also mean that XDI is not easily capable of representing as-measured (or “Raw”) data. While many beamlines do use XDI-like text files and even use its conventions for tagging metadata, it seems that exact adherence to the XDI metadata tags is not universal (even at my own beamline!). Importantly, for “Raw” as-measured data files, it is not always known whether the data is best measured in Transmission or Emission mode, as many (it appears even “most”) beamlines will measure all available channels even when some of them contain poor or no measurements at all. For fluorescence data measured with multi-element solid-state detectors, a single ifluor channel may not be available, as multiple channels would need to be corrected for dead-time and summed. The “Raw” data may therefore need some light processing before the proper arrays of data to communicate as “the spectrum” can be identified. While that processing may not be too complex, it likely requires specific knowledge of the columns in the data table (for example, which columns hold InputCountRate and OutputCountRate for each detector element). That means that while the derived ifluor or mufluor in the XDI specification is still preferred, the full data table of the “Raw” data is still interesting and worth preserving and transmitting.
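The kind of light processing described above might look like the following sketch. It assumes one simple, first-order dead-time model in which each element's counts are scaled by the ratio InputCountRate / OutputCountRate before summing; actual detectors and beamlines may call for a different correction. The function name `summed_ifluor` and the data layout (one list per detector element, one value per energy point) are hypothetical.

```python
def summed_ifluor(counts, icr, ocr):
    """Sum dead-time-corrected fluorescence counts over detector elements.

    counts, icr, ocr: lists with one entry per detector element, each entry
    being a list with one value per energy point. The correction factor
    icr / ocr is an assumed first-order model, not part of the XDI format.
    """
    npts = len(counts[0])
    total = [0.0] * npts
    for c_el, icr_el, ocr_el in zip(counts, icr, ocr):
        for k in range(npts):
            # Leave the counts uncorrected where no output rate was recorded.
            factor = icr_el[k] / ocr_el[k] if ocr_el[k] > 0 else 1.0
            total[k] += c_el[k] * factor
    return total


# Two detector elements, two energy points; made-up numbers.
counts = [[100.0, 200.0], [50.0, 60.0]]
icr = [[1100.0, 1200.0], [1000.0, 1000.0]]
ocr = [[1000.0, 1000.0], [1000.0, 1000.0]]
ifluor = summed_ifluor(counts, icr, ocr)
```

The single `ifluor` array produced this way is what an XDI file would carry, while the per-element columns needed to reproduce it live only in the “Raw” data table.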

Even where beamlines collect “Raw” XAS data into “XDI-like” files, many of them regularly use multi-element fluorescence detectors and will record many channels (up to 100 is not unusual) for every energy point in a typical XAS scan. While there are certainly advantages to plain-text files (more on this below), a file with 100 columns is not easily described as “human readable”.

That is, while the XDI format gives a simple and very good way to represent a single \(\mu(E)\) spectrum, it is less obvious how best to archive and disseminate multiple raw XAS spectra, particularly those measured with multi-element fluorescence detectors. These two limitations do complicate the presentation of XAS data for on-line databases, supplemental material, and downloadable archives of data.