Dataset Open Access

Mars Target Encyclopedia - LPSC abstracts labeled data set

Raymond Francis; Kiri Wagstaff

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Raymond Francis</dc:creator>
  <dc:creator>Kiri Wagstaff</dc:creator>
  <dc:description>This data set contains annotated text versions of 2-page abstracts published at the Lunar and Planetary Science Conference in 2015 and 2016.

The original PDF abstracts are available at:

The text files in this archive were extracted using the Apache Tika PDF parsing tool.  The text is provided here so that the annotations can be viewed.  The text content remains copyright of the original abstract authors.

The annotations (entities and relations) are provided in the format used by the brat annotation tool.  To view the annotations in a web-based graphical form, install the brat tool.  These annotations were generated using brat v1.3.  The annotation files are also human-readable and can be parsed for direct use in code.


	lpsc15/: 62 abstracts
	lpsc16/: 55 abstracts

Each directory contains a .txt and .ann file for each abstract.  The .ann file is in brat standoff format.

Additional .conf files are provided to generate color highlighting and keyboard shortcuts.  These are used by the brat tool.


If you use this data set in your own work, please cite this DOI:


Please also cite this paper, which provides additional details about the data set.

Kiri L. Wagstaff, Raymond Francis, Thamme Gowda, You Lu, Ellen Riloff, Karanjeet Singh, and Nina Lanza. "Mars Target Encyclopedia: Rock and Soil Composition Extracted from the Literature."  Proceedings of the Thirtieth Annual Conference on Innovative Applications of Artificial Intelligence, 2018.</dc:description>
  <dc:subject>information extraction</dc:subject>
  <dc:subject>named entity recognition</dc:subject>
  <dc:title>Mars Target Encyclopedia - LPSC abstracts labeled data set</dc:title>
</oai_dc:dc>
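
The description notes that the .ann files are human-readable and can be parsed for use in code. As a minimal sketch of what that might look like, the Python below reads the two brat standoff line types the description mentions: entities (text-bound `T` lines) and relations (`R` lines). The sample text, the labels `Target`, `Element`, and `Contains`, and the names `parse_ann`, `Entity`, and `Relation` are illustrative assumptions, not taken from the archive. Other brat line types (events, attributes, notes) are simply skipped.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entity:
    id: str
    label: str
    spans: List[Tuple[int, int]]  # (start, end) character offsets into the .txt file
    text: str


@dataclass
class Relation:
    id: str
    label: str
    args: Dict[str, str]  # role name -> annotation id, e.g. {"Arg1": "T1"}


def parse_ann(content: str) -> Tuple[List[Entity], List[Relation]]:
    """Parse entity (T) and relation (R) lines from brat standoff text."""
    entities: List[Entity] = []
    relations: List[Relation] = []
    for line in content.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T") and len(fields) >= 3:
            # Example: "T1<TAB>Label 0 6<TAB>covered text"
            label, _, offsets = fields[1].partition(" ")
            spans = []
            # Discontinuous annotations separate fragments with ';'
            for fragment in offsets.split(";"):
                start, end = fragment.split()
                spans.append((int(start), int(end)))
            entities.append(Entity(ann_id, label, spans, fields[2]))
        elif ann_id.startswith("R") and len(fields) >= 2:
            # Example: "R1<TAB>Label Arg1:T1 Arg2:T2"
            label, *pairs = fields[1].split()
            args = dict(pair.split(":", 1) for pair in pairs)
            relations.append(Relation(ann_id, label, args))
    return entities, relations


# Invented sample standing in for one .txt/.ann pair (not real data from the archive).
SAMPLE_TEXT = "Rock_A contains Fe"
SAMPLE_ANN = (
    "T1\tTarget 0 6\tRock_A\n"
    "T2\tElement 16 18\tFe\n"
    "R1\tContains Arg1:T1 Arg2:T2\n"
)

entities, relations = parse_ann(SAMPLE_ANN)
for entity in entities:
    start, end = entity.spans[0]
    # Offsets index into the companion .txt file, so the slice should match the stored text.
    print(entity.id, entity.label, SAMPLE_TEXT[start:end] == entity.text)
for relation in relations:
    print(relation.id, relation.label, relation.args)
```

For a real abstract, the same function could be applied to the contents of an .ann file read with `open(path, encoding="utf-8")`, with the matching .txt file supplying the text that the offsets refer to.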