ITOP Dataset

Dataset | Open Access
Haque, Albert;
Peng, Boya;
Luo, Zelun;
Alahi, Alexandre;
Yeung, Serena;
Fei-Fei, Li
Stanford University

Published October 8, 2016 | Version 1.0 | Publisher: Zenodo
DOI: 10.5281/zenodo.3932973
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Keywords: depth sensor, human pose estimation, computer vision, 3D vision
Related paper: http://arxiv.org/abs/1603.07076
**Summary**

The ITOP dataset (Invariant Top View) contains 100K depth images from side and top views of a person in a scene. For each image, the locations of 15 human body parts are labeled with three-dimensional (x, y, z) coordinates relative to the sensor's position. Read the full paper for more context [pdf](https://arxiv.org/pdf/1603.07076.pdf).

**Getting Started**

Download, then decompress the h5.gz file:

```bash
gunzip ITOP_side_test_depth_map.h5.gz
```

Using Python and [h5py](https://www.h5py.org/) (`pip install h5py` or `conda install h5py`), we can load the contents:

```python
import h5py
import numpy as np

# Open the decompressed HDF5 file and read its two root-level datasets.
f = h5py.File('ITOP_side_test_depth_map.h5', 'r')
data, ids = f.get('data'), f.get('id')
data, ids = np.asarray(data), np.asarray(ids)
print(data.shape, ids.shape)  # (10501, 240, 320) (10501,)
```
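Every file in the record exposes a matching `id` dataset (see the Data Schema below), so depth maps can be checked against their labels frame-for-frame. A minimal sanity-check sketch, assuming ITOP_side_test_labels.h5 has also been downloaded and decompressed, and that the files share frame ordering:

```python
import h5py
import numpy as np

# Assumes both files were downloaded and gunzip'd as shown above.
depth = h5py.File('ITOP_side_test_depth_map.h5', 'r')
labels = h5py.File('ITOP_side_test_labels.h5', 'r')

depth_ids = np.asarray(depth.get('id'))
label_ids = np.asarray(labels.get('id'))

# If frame ordering matches across files, the identifiers line up
# one-to-one and frame i of 'data' pairs with row i of every label key.
print(np.array_equal(depth_ids, label_ids))  # expected: True
```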
**Note:** For any of the *\*_images.h5.gz* files, the underlying file is a tar archive, not an HDF5 file. Please rename the file extension from *h5.gz* to *tar.gz* before opening. The following commands will work:

```bash
mv ITOP_side_test_images.h5.gz ITOP_side_test_images.tar.gz
tar xf ITOP_side_test_images.tar.gz
```

**Metadata**

File sizes for images, depth maps, point clouds, and labels refer to the uncompressed size.

| View | Split | Frames | People | Images  | Depth Map | Point Cloud | Labels  |
|------|-------|--------|--------|---------|-----------|-------------|---------|
| Side | Train | 39,795 | 16     | 1.1 GiB | 5.7 GiB   | 18 GiB      | 2.9 GiB |
| Side | Test  | 10,501 | 4      | 276 MiB | 1.6 GiB   | 4.6 GiB     | 771 MiB |
| Top  | Train | 39,795 | 16     | 974 MiB | 5.7 GiB   | 18 GiB      | 2.9 GiB |
| Top  | Test  | 10,501 | 4      | 261 MiB | 1.6 GiB   | 4.6 GiB     | 771 MiB |

**Data Schema**

Each file contains several HDF5 datasets at the root level. Dimensions, attributes, and data types are listed below. The key refers to the (HDF5) dataset name. Let \(n\) denote the number of images.

**Transformation**

To convert from point clouds to a \(240 \times 320\) image, the following transformations were used. Let \(x_{\textrm{img}}\) and \(y_{\textrm{img}}\) denote the \((x, y)\) coordinate in the image plane. Using the raw point cloud \((x, y, z)\) real-world coordinates, we compute the depth map as follows: \(x_{\textrm{img}} = \frac{x}{Cz} + 160\) and \(y_{\textrm{img}} = -\frac{y}{Cz} + 120\), where \(C \approx 3.50 \times 10^{-3} = 0.0035\) is the intrinsic camera calibration parameter. This results in the depth map \((x_{\textrm{img}}, y_{\textrm{img}}, z)\).
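To make the projection concrete, here is a minimal sketch of the mapping above applied to one frame's point cloud. The rounding, bounds check, and skipping of zero-depth points are our assumptions, not part of the official pipeline:

```python
import numpy as np

C = 0.0035  # intrinsic camera calibration parameter from the text above

def point_cloud_to_depth_map(points, height=240, width=320):
    """Project an (N, 3) array of real-world (x, y, z) points, in meters,
    onto a height x width depth map via x_img = x/(Cz) + 160 and
    y_img = -y/(Cz) + 120."""
    points = points[points[:, 2] > 0]  # skip degenerate z = 0 points
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    x_img = np.round(x / (C * z) + width / 2).astype(int)
    y_img = np.round(-y / (C * z) + height / 2).astype(int)
    # Keep only points that land inside the image bounds.
    inside = (x_img >= 0) & (x_img < width) & (y_img >= 0) & (y_img < height)
    depth_map = np.zeros((height, width), dtype=np.float32)
    depth_map[y_img[inside], x_img[inside]] = z[inside]
    return depth_map
```

Applied to one frame of the point-cloud `data` key (shape \((76800, 3)\)), this should approximately reproduce the corresponding frame in the depth-map file.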
**Joint ID (Index) Mapping**

```python
joint_id_to_name = {
    0: 'Head',        8: 'Torso',
    1: 'Neck',        9: 'R Hip',
    2: 'R Shoulder', 10: 'L Hip',
    3: 'L Shoulder', 11: 'R Knee',
    4: 'R Elbow',    12: 'L Knee',
    5: 'L Elbow',    13: 'R Foot',
    6: 'R Hand',     14: 'L Foot',
    7: 'L Hand',
}
```

**Depth Maps**

- *Key:* id
  - *Dimensions:* \((n,)\)
  - *Data Type:* uint8
  - *Description:* Frame identifier in the form XX_YYYYY, where XX is the person's ID number and YYYYY is the frame number.
- *Key:* data
  - *Dimensions:* \((n, 240, 320)\)
  - *Data Type:* float16
  - *Description:* Depth map (i.e., mesh) corresponding to a single frame. Depth values are in real-world meters (m).

**Point Clouds**

- *Key:* id
  - *Dimensions:* \((n,)\)
  - *Data Type:* uint8
  - *Description:* Frame identifier in the form XX_YYYYY, where XX is the person's ID number and YYYYY is the frame number.
- *Key:* data
  - *Dimensions:* \((n, 76800, 3)\)
  - *Data Type:* float16
  - *Description:* Point cloud containing 76,800 points (240 × 320). Each point is represented by a 3D tuple measured in real-world meters (m).

**Labels**

- *Key:* id
  - *Dimensions:* \((n,)\)
  - *Data Type:* uint8
  - *Description:* Frame identifier in the form XX_YYYYY, where XX is the person's ID number and YYYYY is the frame number.
- *Key:* is_valid
  - *Dimensions:* \((n,)\)
  - *Data Type:* uint8
  - *Description:* Flag corresponding to the result of the human labeling effort. This is a boolean value (represented by an integer) where one (1) denotes clean, human-approved data and zero (0) denotes noisy human body part labels. If is_valid is zero, you should not use any of the provided human joint locations for that frame.
- *Key:* visible_joints
  - *Dimensions:* \((n, 15)\)
  - *Data Type:* int16
  - *Description:* Binary mask indicating whether each human joint is visible or occluded, denoted by \(\alpha\) in the paper. If \(\alpha_j = 1\), the \(j^{th}\) joint is visible (i.e., not occluded); if \(\alpha_j = 0\), the \(j^{th}\) joint is occluded.
- *Key:* image_coordinates
  - *Dimensions:* \((n, 15, 2)\)
  - *Data Type:* int16
  - *Description:* Two-dimensional \((x, y)\) points corresponding to the location of each joint in the depth image or depth map.
- *Key:* real_world_coordinates
  - *Dimensions:* \((n, 15, 3)\)
  - *Data Type:* float16
  - *Description:* Three-dimensional \((x, y, z)\) points corresponding to the location of each joint in real-world meters (m).
- *Key:* segmentation
  - *Dimensions:* \((n, 240, 320)\)
  - *Data Type:* int8
  - *Description:* Pixel-wise assignment of body part labels. The background class (i.e., no body part) is denoted by -1.
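Putting the label keys together, a minimal filtering sketch; the file name and the head-position statistic are illustrative only:

```python
import h5py
import numpy as np

labels = h5py.File('ITOP_side_test_labels.h5', 'r')
is_valid = np.asarray(labels.get('is_valid')).astype(bool)   # (n,)
visible = np.asarray(labels.get('visible_joints'))           # (n, 15)
coords = np.asarray(labels.get('real_world_coordinates'))    # (n, 15, 3)

# Discard frames whose labels were not human-approved.
coords, visible = coords[is_valid], visible[is_valid]

# Example: mean real-world head position (joint 0) over frames
# where the head is visible (alpha_0 = 1).
head_visible = visible[:, 0] == 1
print(coords[head_visible, 0].mean(axis=0))  # (x, y, z) in meters
```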
**Citation**

If you would like to cite our work, please use the following.

**Haque A, Peng B, Luo Z, Alahi A, Yeung S, Fei-Fei L. (2016). Towards Viewpoint Invariant 3D Human Pose Estimation. European Conference on Computer Vision. Amsterdam, Netherlands. Springer.**

```bibtex
@inproceedings{haque2016viewpoint,
  title     = {Towards Viewpoint Invariant 3D Human Pose Estimation},
  author    = {Haque, Albert and Peng, Boya and Luo, Zelun and Alahi, Alexandre and Yeung, Serena and Fei-Fei, Li},
  booktitle = {European Conference on Computer Vision},
  month     = {October},
  year      = {2016}
}
```

**Files**

Compressed download sizes for this record:

| File | Size (bytes) |
|---|---|
| ITOP_side_train_depth_map.h5.gz | 926,228,035 |
| ITOP_side_train_images.h5.gz | 1,010,377,751 |
| ITOP_side_train_labels.h5.gz | 16,833,112 |
| ITOP_side_train_point_cloud.h5.gz | 7,840,345,186 |
| ITOP_side_test_depth_map.h5.gz | 245,104,261 |
| ITOP_side_test_images.h5.gz | 257,980,348 |
| ITOP_side_test_labels.h5.gz | 3,699,135 |
| ITOP_side_test_point_cloud.h5.gz | 2,061,701,631 |
| ITOP_top_train_depth_map.h5.gz | 917,859,800 |
| ITOP_top_train_images.h5.gz | 923,855,225 |
| ITOP_top_train_labels.h5.gz | 32,165,804 |
| ITOP_top_train_point_cloud.h5.gz | 7,620,649,272 |
| ITOP_top_test_depth_map.h5.gz | 245,493,889 |
| ITOP_top_test_images.h5.gz | 246,678,932 |
| ITOP_top_test_labels.h5.gz | 9,280,299 |
| ITOP_top_test_point_cloud.h5.gz | 2,020,245,383 |
| sample_front.jpg | 20,450 |
| sample_front_labeled.jpg | 22,911 |
| sample_top.jpg | 18,689 |
| sample_top_labeled.jpg | 17,461 |

All files are available under https://zenodo.org/record/3932973/files/.
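For scripted downloads, each file in the table resolves under the record's files URL; a minimal sketch using only the Python standard library (the chosen files are just an example):

```python
import urllib.request

BASE = 'https://zenodo.org/record/3932973/files/'
for name in ('ITOP_side_test_depth_map.h5.gz', 'ITOP_side_test_labels.h5.gz'):
    print('downloading', name)
    urllib.request.urlretrieve(BASE + name, name)  # saves to the working dir
```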
**Statistics**

| | All versions | This version |
|---|---|---|
| Views | 2,401 | 2,401 |
| Downloads | 7,728 | 7,728 |
| Data volume | 20.0 TB | 20.0 TB |
| Unique views | 1,988 | 1,988 |
| Unique downloads | 1,401 | 1,401 |