This data set contains annotated text versions of 2-page abstracts published at the Lunar and Planetary Science Conference in 2015 and 2016.
The original PDF abstracts are available at:
The text files in this archive were extracted using the Apache Tika PDF parsing tool. The text is provided here so that the annotations can be viewed. The text content remains under the copyright of the original abstract authors.
The annotations (entities and relations) are provided in the format used by the brat annotation tool. To view the annotations in a web-based graphical form, install the brat tool (http://brat.nlplab.org/). These annotations were generated using brat v1.3. The annotation files are also human-readable and can be parsed for direct use in code.
Each directory contains a .txt file and an .ann file for each abstract. The .ann file is in brat standoff format (http://brat.nlplab.org/standoff.html).
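For readers who want to use the annotations without installing brat, the following is a minimal sketch of a parser for the two kinds of standoff lines mentioned above: text-bound entities (IDs starting with T) and binary relations (IDs starting with R). The names parse_ann, Entity, and Relation are hypothetical helpers, not part of brat or of this data set, and the sample text, entity types, and relation type are made up for illustration. Other line types in the standoff format (events, attributes, notes) are skipped.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    id: str
    type: str
    spans: list  # (start, end) character offsets into the matching .txt file
    text: str


class Relation(NamedTuple):
    id: str
    type: str
    args: dict  # role name -> entity id, e.g. {"Arg1": "T1", "Arg2": "T2"}


def parse_ann(content):
    """Parse the contents of a brat standoff .ann file (hypothetical helper)."""
    entities, relations = {}, {}
    for line in content.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T") and len(fields) >= 3:
            # Entity line: ID <TAB> Type start end[;start end...] <TAB> text
            type_name, offsets = fields[1].split(" ", 1)
            spans = [tuple(int(n) for n in frag.split())
                     for frag in offsets.split(";")]
            entities[ann_id] = Entity(ann_id, type_name, spans, fields[2])
        elif ann_id.startswith("R") and len(fields) >= 2:
            # Relation line: ID <TAB> Type Role1:T1 Role2:T2
            type_name, *args = fields[1].split()
            roles = dict(arg.split(":", 1) for arg in args)
            relations[ann_id] = Relation(ann_id, type_name, roles)
        # Any other annotation kinds are ignored in this sketch.
    return entities, relations


# Illustrative sample only; not taken from the data set.
SAMPLE_TXT = "Windjana contains Mn"
SAMPLE_ANN = (
    "T1\tTarget 0 8\tWindjana\n"
    "T2\tElement 18 20\tMn\n"
    "R1\tContains Arg1:T1 Arg2:T2\n"
)

entities, relations = parse_ann(SAMPLE_ANN)
for rel in relations.values():
    source = entities[rel.args["Arg1"]]
    target = entities[rel.args["Arg2"]]
    print(source.text, rel.type, target.text)
```

To apply this to the archive, read each .ann file as UTF-8 text and pass its contents to parse_ann. The character offsets index into the .txt file with the same base name, so slicing that text with an entity's span should reproduce the entity's text.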
Additional .conf files are provided for use by the brat tool. They configure color highlighting and keyboard shortcuts.
If you use this data set in your own work, please cite this DOI:
Please also cite this paper, which provides additional details about the data set.
Kiri L. Wagstaff, Raymond Francis, Thamme Gowda, You Lu, Ellen Riloff, Karanjeet Singh, and Nina Lanza. "Mars Target Encyclopedia: Rock and Soil Composition Extracted from the Literature." Proceedings of the Thirtieth Annual Conference on Innovative Applications of Artificial Intelligence, 2018.