Dataset Open Access

CAD 120 affordance dataset

Sawatzky, Johann; Srikantha, Abhilash; Gall, Juergen


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-3" xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.495570</identifier>
  <creators>
    <creator>
      <creatorName>Sawatzky, Johann</creatorName>
      <affiliation>University of Bonn</affiliation>
    </creator>
    <creator>
      <creatorName>Srikantha, Abhilash</creatorName>
      <affiliation>Carl Zeiss AG</affiliation>
    </creator>
    <creator>
      <creatorName>Gall, Juergen</creatorName>
      <affiliation>University of Bonn</affiliation>
    </creator>
  </creators>
  <titles>
    <title>CAD 120 Affordance Dataset</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2017</publicationYear>
  <subjects>
    <subject>computer vision</subject>
    <subject>affordances</subject>
    <subject>attributes</subject>
    <subject>semantic image segmentation</subject>
    <subject>robotics</subject>
    <subject>weakly supervised learning</subject>
    <subject>convolutional neural network</subject>
    <subject>anticipating human behavior</subject>
    <subject>mapping on demand</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2017-04-07</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/495570</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo">https://pages.iai.uni-bonn.de/gall_juergen/download/jgall_affordancedetection_cvpr17.pdf</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo">https://github.com/ykztawas/Weakly-Supervised-Affordance-Detection</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;% ==============================================================================&lt;br&gt;
% CAD 120 Affordance Dataset&lt;br&gt;
% Version 1.0&lt;br&gt;
% ------------------------------------------------------------------------------&lt;br&gt;
% If you use the dataset please cite:&lt;br&gt;
%&lt;br&gt;
% Johann Sawatzky, Abhilash Srikantha, Juergen Gall.&lt;br&gt;
% Weakly Supervised Affordance Detection.&lt;br&gt;
% IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17)&lt;br&gt;
%&lt;br&gt;
% and&lt;br&gt;
%&lt;br&gt;
% H. S. Koppula and A. Saxena.&lt;br&gt;
% Physically grounded spatio-temporal object affordances.&lt;br&gt;
% European Conference on Computer Vision (ECCV'14)&lt;br&gt;
%&lt;br&gt;
% For any bugs or questions, please email sawatzky AT iai DOT uni-bonn DOT de.&lt;br&gt;
% ==============================================================================&lt;/p&gt;

&lt;p&gt;This is the CAD 120 Affordance Segmentation Dataset based on the Cornell Activity&lt;br&gt;
Dataset CAD 120 (see http://pr.cs.cornell.edu/humanactivities/data.php).&lt;/p&gt;

&lt;p&gt;Content&lt;/p&gt;

&lt;p&gt;frames/*.png:&lt;br&gt;
RGB frames selected from Cornell Activity Dataset. To find out the location of the frame&lt;br&gt;
in the original videos, see video_info.txt.&lt;/p&gt;

&lt;p&gt;object_crop_images/*.png&lt;br&gt;
image crops taken from the selected frames and resized to 321*321. Each crop is a padded&lt;br&gt;
bounding box of an object the human interacts with in the video. Due to the padding,&lt;br&gt;
the crops may contain background and other objects.&lt;br&gt;
In each selected frame, each bounding box was processed. The bounding boxes are already&lt;br&gt;
given in the Cornell Activity Dataset.&lt;br&gt;
The 5-digit number gives the frame number; the second number gives the bounding box number&lt;br&gt;
within the frame.&lt;/p&gt;
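
&lt;p&gt;A minimal Python sketch of parsing this naming convention (the underscore separator in the file name is an assumption, not documented here):&lt;/p&gt;

&lt;pre&gt;
import os

# Hypothetical helper: split a crop file name such as "00042_1.png"
# into its frame number and bounding box number.
def parse_crop_name(path):
    stem = os.path.splitext(os.path.basename(path))[0]
    frame_str, box_str = stem.split("_")
    return int(frame_str), int(box_str)

frame_id, box_id = parse_crop_name("object_crop_images/00042_1.png")
&lt;/pre&gt;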

&lt;p&gt;segmentation_mat/*.mat&lt;br&gt;
321*321*6 segmentation masks for the image crops. Each channel corresponds to an&lt;br&gt;
affordance (openable, cuttable, pourable, containable, supportable, holdable, in this order).&lt;br&gt;
All pixels belonging to a particular affordance are labeled 1 in the respective channel,&lt;br&gt;
otherwise 0.  &lt;/p&gt;
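
&lt;p&gt;A minimal SciPy sketch for reading one mask; the variable name inside the .mat files is not documented here, so the first non-metadata key is taken, and the file name is a placeholder:&lt;/p&gt;

&lt;pre&gt;
import scipy.io

AFFORDANCES = ["openable", "cuttable", "pourable",
               "containable", "supportable", "holdable"]

mat = scipy.io.loadmat("segmentation_mat/00042_1.mat")
key = [k for k in mat if not k.startswith("__")][0]
masks = mat[key]  # expected shape: (321, 321, 6)
holdable = masks[:, :, AFFORDANCES.index("holdable")]  # 0/1 mask
&lt;/pre&gt;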

&lt;p&gt;segmentation_png/*.png&lt;br&gt;
321*321 png images, each containing the binary mask for one of the affordances.&lt;/p&gt;
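
&lt;p&gt;The PNG masks can be read analogously, e.g. with Pillow and NumPy (the per-affordance file name below is a placeholder):&lt;/p&gt;

&lt;pre&gt;
import numpy as np
from PIL import Image

mask = np.array(Image.open("segmentation_png/00042_1_holdable.png"))
binary = (mask != 0).astype(np.uint8)  # collapse to a 0/1 mask
&lt;/pre&gt;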

&lt;p&gt;lists/*.txt&lt;br&gt;
Lists containing the train and test sets for two splits. The actor split ensures that&lt;br&gt;
train and test images stem from different videos with different actors while the object split ensures&lt;br&gt;
that train and test data have no (central) object classes in common.&lt;br&gt;
The train sets are additionally subdivided into three subsets A, B, and C. For the actor split,&lt;br&gt;
the subsets stem from different videos. For the object split, each subset contains&lt;br&gt;
every third crop of the train set.&lt;/p&gt;
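
&lt;p&gt;A sketch for loading one of the splits, assuming one crop identifier per line; the file names under lists/ are placeholders:&lt;/p&gt;

&lt;pre&gt;
def read_list(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

train_a = read_list("lists/actor_split_train_A.txt")
test_ids = read_list("lists/actor_split_test.txt")
&lt;/pre&gt;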

&lt;p&gt;crop_coordinate_info.txt&lt;br&gt;
Maps image crops to their coordinates in the frames.&lt;/p&gt;

&lt;p&gt;hpose_info.txt&lt;br&gt;
Maps frames to 2D human pose coordinates, hand-annotated by us.&lt;/p&gt;

&lt;p&gt;object_info.txt&lt;br&gt;
Maps image crops to the (central) object they contain.&lt;/p&gt;

&lt;p&gt;visible_affordance_info.txt&lt;br&gt;
Maps image crops to the affordances visible in the crop.&lt;/p&gt;

&lt;p&gt;%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%&lt;br&gt;
The crops contain the following object classes:&lt;br&gt;
1. table&lt;br&gt;
2. kettle&lt;br&gt;
3. plate&lt;br&gt;
4. bottle&lt;br&gt;
5. thermal cup&lt;br&gt;
6. knife&lt;br&gt;
7. medicine box&lt;br&gt;
8. can&lt;br&gt;
9. microwave&lt;br&gt;
10. paper box&lt;br&gt;
11. bowl&lt;br&gt;
12. mug&lt;/p&gt;

&lt;p&gt;Affordances in our set:&lt;br&gt;
1. openable&lt;br&gt;
2. cuttable&lt;br&gt;
3. pourable&lt;br&gt;
4. containable&lt;br&gt;
5. supportable&lt;br&gt;
6. holdable&lt;/p&gt;

&lt;p&gt;Note that our object affordance labeling differs from the Cornell Activity Dataset:&lt;br&gt;
for example, the lid of a pizza box is considered supportable.&lt;/p&gt;

</description>
    <description descriptionType="Other">Acknowledgments. The work has been financially sup-
ported by the DFG projects GA 1927/5-1 (DFG Research
Unit FOR 2535 Anticipating Human Behavior) and GA
1927/2-2 (DFG Research Unit FOR 1505 Mapping on De-
mand).</description>
    <description descriptionType="Other">{"references": ["Sawatzky, J., Srikantha, A., Gall, J.: Weakly supervised affordance detection.  CVPR (2017)"]}</description>
  </descriptions>
</resource>
