There is a newer version of this record available.

Dataset Open Access

MSL Curiosity Rover Images with Science and Engineering Classes

Steven Lu; Kiri L. Wagstaff


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Mars</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Mars Science Laboratory (MSL)</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Curiosity Rover</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Machine Learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Labeled Data Set</subfield>
  </datafield>
  <controlfield tag="005">20200917004100.0</controlfield>
  <controlfield tag="001">3892024</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="0">(orcid)0000-0003-4401-5506</subfield>
    <subfield code="a">Kiri L. Wagstaff</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Gary Doran</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Jake Lee</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Dominique Vaca</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Annie Didier</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Raymond Francis</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Brian Bue</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Kevin Shannon</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Doug Ellison</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Jackie Ryan</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Masha Liukis</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CalTech</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Jesse Cai</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Aaron Roth</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Maryland</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Hannah Kerner</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="4">col</subfield>
    <subfield code="a">Mark Wronkiewicz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">54850925</subfield>
    <subfield code="z">md5:cf771f5acb4a212b98d2427dea81bdd5</subfield>
    <subfield code="u">https://zenodo.org/record/3892024/files/msl-labeled-data-set.zip</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-06-12</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-computer-vision</subfield>
    <subfield code="o">oai:zenodo.org:3892024</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Jet Propulsion Laboratory</subfield>
    <subfield code="0">(orcid)0000-0001-6859-7737</subfield>
    <subfield code="a">Steven Lu</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">MSL Curiosity Rover Images with Science and Engineering Classes</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-computer-vision</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;&lt;strong&gt;Data Set Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The data set consists of 6,820 images collected by three instruments on the Mars Science Laboratory (MSL) Curiosity rover: (1) the Mast Camera (Mastcam) Left Eye; (2) the Mastcam Right Eye; and (3)&amp;nbsp;the Mars Hand Lens Imager (MAHLI). With help from Dr. Raymond Francis, a member of the MSL operations team, we identified 19 classes of science and engineering interest (see the&amp;nbsp;&amp;quot;Classes&amp;quot; section for more information), and each image is assigned one class label.&amp;nbsp;We split the data set into training, validation, and test sets in order to train and evaluate machine learning algorithms. The training set contains 5,920 images (including augmented images; see the &amp;quot;Image Augmentation&amp;quot; section for more information); the validation set contains 300 images; the test set contains 600 images. Training set images were randomly sampled from sol (Martian day) range 1 - 948; validation set images from sol range 949 - 1920; and test set images from sol range 1921 - 2224. All images are resized to 227 x 227 pixels without preserving the original height/width aspect ratio.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Directory Contents&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;images - contains all 6,820 images&lt;/li&gt;
	&lt;li&gt;class_map.csv - string-integer class mappings&lt;/li&gt;
	&lt;li&gt;train-set.txt - label file for the training set&lt;/li&gt;
	&lt;li&gt;val-set.txt - label file for the validation set&lt;/li&gt;
	&lt;li&gt;test-set.txt - label file for the test set&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each line of a label file has the following format:&lt;/p&gt;

&lt;p&gt;&amp;quot;Image-file-name class_in_integer_representation&amp;quot;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Labeling Process&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each image was labeled by&amp;nbsp;three different volunteers (see the Contributor list). The final labels were determined using the following process:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;If all three labels agreed, that label was used as the final label.&lt;/li&gt;
	&lt;li&gt;If the three labels did not agree, we manually reviewed them and decided the final label.&lt;/li&gt;
	&lt;li&gt;As a post-processing step, we also performed error analysis to correct noisy or incorrect labels&amp;nbsp;in the data set.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Classes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are 19 classes identified in this data set. To simplify our training and evaluation algorithms, we mapped the class names from string to integer representations. The class names, their counts in each split, and their integer representations are shown below:&lt;/p&gt;

&lt;p&gt;Class name, counts (training set), counts (validation set), counts (test set), integer representation&lt;/p&gt;

&lt;p&gt;Arm cover, 10, 1, 4, 0&lt;/p&gt;

&lt;p&gt;Other rover part, 188, 11, 10, 1&lt;/p&gt;

&lt;p&gt;Artifact, 664, 60, 132, 2&lt;/p&gt;

&lt;p&gt;Nearby surface, 1524, 72, 187, 3&lt;/p&gt;

&lt;p&gt;Close-up rock, 1456, 52, 84, 4&lt;/p&gt;

&lt;p&gt;DRT, 8, 4, 6, 5&lt;/p&gt;

&lt;p&gt;DRT spot, 196, 0, 7, 6&lt;/p&gt;

&lt;p&gt;Distant landscape, 348, 14, 34, 7&lt;/p&gt;

&lt;p&gt;Drill hole, 252, 5, 12, 8&lt;/p&gt;

&lt;p&gt;Night sky, 40, 5, 4, 9&lt;/p&gt;

&lt;p&gt;Float, 154, 5, 1, 10&lt;/p&gt;

&lt;p&gt;Layers, 178, 21, 17, 11&lt;/p&gt;

&lt;p&gt;Light-toned veins, 48, 4, 27, 12&lt;/p&gt;

&lt;p&gt;Mastcam cal target, 124, 12, 29, 13&lt;/p&gt;

&lt;p&gt;Sand, 234, 19, 16, 14&lt;/p&gt;

&lt;p&gt;Sun, 190, 5, 19, 15&lt;/p&gt;

&lt;p&gt;Wheel, 212, 5, 5, 16&lt;/p&gt;

&lt;p&gt;Wheel joint, 62, 1, 5, 17&lt;/p&gt;

&lt;p&gt;Wheel tracks, 32, 4, 1, 18&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image Augmentation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Only the training set contains augmented images: 3,920 of its 5,920 images are augmented versions of the remaining 2,000 original training images. Images taken by different instruments were augmented differently. We employed the&amp;nbsp;5 augmentation methods listed below. Images taken by the Mastcam left and right eye cameras&amp;nbsp;were augmented using horizontal flipping, and images taken by the MAHLI camera were augmented using all 5 methods. Note that one can filter on the file names listed in the train-set.txt file&amp;nbsp;to obtain the set of non-augmented images.&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;90-degree clockwise rotation (file name ends with -r90.jpg)&lt;/li&gt;
	&lt;li&gt;180-degree clockwise rotation (file name ends with -r180.jpg)&lt;/li&gt;
	&lt;li&gt;270-degree clockwise rotation (file name ends with -r270.jpg)&lt;/li&gt;
	&lt;li&gt;Horizontal flip (file name ends with -fh.jpg)&lt;/li&gt;
	&lt;li&gt;Vertical flip (file name ends with -fv.jpg)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Acknowledgment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The authors would like to thank the&amp;nbsp;volunteers (listed in the Contributor list) who provided annotations for this data set. We would also like to thank the PDS Imaging Node for its continuous support of this work.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isIdenticalTo</subfield>
    <subfield code="a">10.5281/zenodo.1049137</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3892023</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3892024</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
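
The label-file format and the augmentation suffixes described in the record above lend themselves to a small loader. The Python sketch below is illustrative only and is not part of the published data set: the function names and the sample file names are hypothetical, and it assumes each label line is an image file name followed by whitespace and an integer class, as the description states.

```python
from collections import Counter
from pathlib import Path

# File-name suffixes listed in the "Image Augmentation" section of the description.
AUGMENTATION_SUFFIXES = ("-r90.jpg", "-r180.jpg", "-r270.jpg", "-fh.jpg", "-fv.jpg")


def parse_label_lines(lines):
    """Parse 'image-file-name class-integer' lines into (name, label) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace run so the integer label is always last.
        name, label = line.rsplit(maxsplit=1)
        pairs.append((name, int(label)))
    return pairs


def is_augmented(file_name):
    """Return True if the file name carries one of the augmentation suffixes."""
    return file_name.endswith(AUGMENTATION_SUFFIXES)


def split_original_and_augmented(pairs):
    """Separate (name, label) pairs into original and augmented images."""
    originals = [pair for pair in pairs if not is_augmented(pair[0])]
    augmented = [pair for pair in pairs if is_augmented(pair[0])]
    return originals, augmented


def load_label_file(path):
    """Read a label file such as train-set.txt from disk."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_label_lines(handle)


if __name__ == "__main__":
    # Hypothetical sample lines; real file names come from train-set.txt.
    sample = [
        "example-mahli-image.jpg 4",
        "example-mahli-image-r90.jpg 4",
        "example-mahli-image-fv.jpg 4",
        "example-mastcam-image.jpg 3",
        "example-mastcam-image-fh.jpg 3",
    ]
    originals, augmented = split_original_and_augmented(parse_label_lines(sample))
    print("original images:", len(originals))
    print("augmented images:", len(augmented))
    print("class counts (originals):", Counter(label for _, label in originals))
```

To work with the real files, pass the path of train-set.txt from the unpacked archive to `load_label_file`. According to the description, filtering out the augmented names should leave the 2,000 original training images.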