Dataset Open Access

CAD 120 affordance dataset

Sawatzky, Johann; Srikantha, Abhilash; Gall, Juergen

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Sawatzky, J., Srikantha, A., Gall, J.: Weakly supervised affordance detection. CVPR (2017)</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">computer vision</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">affordances</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">attributes</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">semantic image segmentation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">robotics</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">weakly supervised learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">convolutional neural network</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">anticipating human behavior</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">mapping on demand</subfield>
  </datafield>
  <controlfield tag="005">20170908080900.0</controlfield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">Acknowledgments. The work has been financially supported by the DFG projects GA 1927/5-1 (DFG Research Unit FOR 2535 Anticipating Human Behavior) and GA 1927/2-2 (DFG Research Unit FOR 1505 Mapping on Demand).</subfield>
  </datafield>
  <controlfield tag="001">495570</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">21-26 July 2017</subfield>
    <subfield code="g">CVPR</subfield>
    <subfield code="a">Computer Vision and Pattern Recognition</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Carl Zeiss AG</subfield>
    <subfield code="a">Srikantha, Abhilash</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Bonn</subfield>
    <subfield code="a">Gall, Juergen</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2570322318</subfield>
    <subfield code="z">md5:be832f74fa4a3db7644f9e47175d4bc3</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2017-04-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Bonn</subfield>
    <subfield code="a">Sawatzky, Johann</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">CAD 120 affordance dataset</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;% ==============================================================================&lt;br&gt;
% CAD 120 Affordance Dataset&lt;br&gt;
% Version 1.0&lt;br&gt;
% ------------------------------------------------------------------------------&lt;br&gt;
% If you use the dataset please cite:&lt;br&gt;
% Johann Sawatzky, Abhilash Srikantha, Juergen Gall.&lt;br&gt;
% Weakly Supervised Affordance Detection.&lt;br&gt;
% IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17)&lt;br&gt;
% and&lt;br&gt;
% H. S. Koppula and A. Saxena.&lt;br&gt;
% Physically grounded spatio-temporal object affordances.&lt;br&gt;
% European Conference on Computer Vision (ECCV'14)&lt;br&gt;
% Any bugs or questions, please email sawatzky AT iai DOT uni-bonn DOT de.&lt;br&gt;
% ==============================================================================&lt;/p&gt;

&lt;p&gt;This is the CAD 120 Affordance Segmentation Dataset, based on the Cornell Activity&lt;br&gt;
Dataset CAD 120.&lt;/p&gt;


&lt;p&gt;RGB frames selected from the Cornell Activity Dataset. To find the location of each frame&lt;br&gt;
in the original videos, see video_info.txt.&lt;/p&gt;

&lt;p&gt;Image crops taken from the selected frames and resized to 321*321. Each crop is a padded&lt;br&gt;
bounding box of an object the human interacts with in the video. Due to the padding,&lt;br&gt;
the crops may contain background and other objects.&lt;br&gt;
In each selected frame, every bounding box was processed. The bounding boxes are already&lt;br&gt;
given in the Cornell Activity Dataset.&lt;br&gt;
In a crop's name, the first (5-digit) number gives the frame number and the second number&lt;br&gt;
gives the bounding box number within the frame.&lt;/p&gt;
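The naming scheme above can be sketched as follows; the "_" separator and the exact identifier format are assumptions for illustration, not confirmed by the dataset description:

```python
import re

# Hypothetical crop identifier: a 5-digit frame number followed by the
# bounding-box index within that frame (the "_" separator is assumed).
CROP_ID = re.compile(r"^(\d{5})_(\d+)$")

def parse_crop_id(name: str) -> tuple[int, int]:
    """Return (frame_number, bounding_box_number) for a crop identifier."""
    m = CROP_ID.match(name)
    if m is None:
        raise ValueError(f"unrecognized crop id: {name!r}")
    return int(m.group(1)), int(m.group(2))

print(parse_crop_id("00042_3"))  # -> (42, 3)
```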

&lt;p&gt;321*321*6 segmentation masks for the image crops. Each channel corresponds to an&lt;br&gt;
affordance (openable, cuttable, pourable, containable, supportable, holdable, in this order).&lt;br&gt;
All pixels belonging to a particular affordance are labeled 1 in the respective channel,&lt;br&gt;
otherwise 0.&lt;/p&gt;
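A minimal sketch of splitting one 321*321*6 mask into per-affordance binary masks, using the channel order stated above. The array here is synthetic; real masks would be loaded from the dataset files:

```python
import numpy as np

# Affordance channel order as given in the dataset description.
AFFORDANCES = ["openable", "cuttable", "pourable",
               "containable", "supportable", "holdable"]

def split_affordance_masks(mask: np.ndarray) -> dict[str, np.ndarray]:
    """Map each affordance name to its 321x321 binary channel."""
    assert mask.shape == (321, 321, len(AFFORDANCES))
    return {name: mask[:, :, i] for i, name in enumerate(AFFORDANCES)}

# Synthetic stand-in for a real mask file: a 20x20 "holdable" region.
demo = np.zeros((321, 321, 6), dtype=np.uint8)
demo[100:120, 100:120, 5] = 1
masks = split_affordance_masks(demo)
print(int(masks["holdable"].sum()))  # -> 400
```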

&lt;p&gt;321*321 png images, each containing the binary mask for one of the affordances.&lt;/p&gt;

&lt;p&gt;Lists containing the train and test sets for two splits. The actor split ensures that&lt;br&gt;
train and test images stem from different videos with different actors, while the object split ensures&lt;br&gt;
that train and test data have no (central) object classes in common.&lt;br&gt;
The train sets are additionally subdivided into 3 subsets A, B and C. For the actor split,&lt;br&gt;
the subsets stem from different videos. For the object split, each subset contains&lt;br&gt;
every third crop of the train set.&lt;/p&gt;
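The every-third-crop subdivision for the object split can be sketched as below; the crop names are hypothetical and only illustrate the interleaving:

```python
def object_split_subsets(train_crops: list[str]) -> dict[str, list[str]]:
    """Assign every third crop of the train list to subsets A, B and C."""
    return {name: train_crops[i::3] for i, name in enumerate("ABC")}

# Hypothetical crop names, just to show which subset each crop lands in.
crops = [f"crop_{i:05d}" for i in range(7)]
print(object_split_subsets(crops)["A"])  # -> ['crop_00000', 'crop_00003', 'crop_00006']
```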

&lt;p&gt;Maps image crops to their coordinates in the frames.&lt;/p&gt;

&lt;p&gt;Maps frames to 2d human pose coordinates. Hand annotated by us.&lt;/p&gt;

&lt;p&gt;Maps image crops to the (central) object they contain.&lt;/p&gt;

&lt;p&gt;Maps image crops to the affordances visible in the crop.&lt;/p&gt;


&lt;p&gt;The crops contain the following object classes:&lt;br&gt;
5. thermal cup&lt;br&gt;
7. medicine box&lt;br&gt;
10. paper box&lt;/p&gt;

&lt;p&gt;Affordances in our set:&lt;br&gt;
openable, cuttable, pourable, containable, supportable, holdable.&lt;/p&gt;
&lt;p&gt;Note that our object affordance labeling differs from the Cornell Activity Dataset:&lt;br&gt;
E.g. the cap of a pizza box is considered to be supportable.&lt;/p&gt;

</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a"></subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a"></subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.495570</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>

