Journal article (Open Access)

HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media

Anargyros Chatzitofis et al.


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Dataset, 4D, multi-view, motion capture, RGBD, volumetric video, pose estimation, 3D compression, 4D capture, visual evaluation, benchmarking, depth sensing, audio, social activities.</subfield>
  </datafield>
  <controlfield tag="005">20201008002655.0</controlfield>
  <controlfield tag="001">4066367</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">5844620</subfield>
    <subfield code="z">md5:43dd863c419a1bf357a04219a2a99060</subfield>
    <subfield code="u">https://zenodo.org/record/4066367/files/35_HUMAN4D_A_Human-Centric Multimodal Dataset.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-09-23</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-vrtogether-h2020</subfield>
    <subfield code="o">oai:zenodo.org:4066367</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="c">176241-176262</subfield>
    <subfield code="v">8</subfield>
    <subfield code="p">IEEE ACCESS</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Anargyros Chatzitofis et al.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-vrtogether-h2020</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">762111</subfield>
    <subfield code="a">An end-to-end system for the production and delivery of photorealistic social immersive virtual reality experiences</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;We introduce HUMAN4D, a large and multimodal 4D dataset that contains a variety of human activities simultaneously captured by a professional marker-based MoCap, a volumetric capture and an audio recording system. By capturing 2 female and 2 male professional actors performing various full-body movements and expressions, HUMAN4D provides a diverse set of motions and poses encountered as part of single- and multi-person daily, physical and social activities (jumping, dancing, etc.), along with multi-RGBD (mRGBD), volumetric and audio data. Despite the existence of multi-view color datasets captured with the use of hardware (HW) synchronization, to the best of our knowledge, HUMAN4D is the first and only public resource that provides volumetric depth maps with high synchronization precision due to the use of intra- and inter-sensor HW-SYNC. Moreover, a spatio-temporally aligned scanned and rigged 3D character complements HUMAN4D to enable joint research on time-varying and high-quality dynamic meshes. We provide evaluation baselines by benchmarking HUMAN4D with state-of-the-art human pose estimation and 3D compression methods. We apply OpenPose and AlphaPose reaching 70.02% and 82.95% mAPPCKh-0.5 on single- and 68.48% and 73.94% mAPPCKh-0.5 on two-person 2D pose estimation, respectively. In 3D pose, a recent multi-view approach named Learnable Triangulation, achieves 80.26% mAPPCK3D-10cm. For 3D compression, we benchmark Draco, Corto and CWIPC open-source 3D codecs, respecting online encoding and steady bit-rates between 7-155 and 2-90 Mbps for mesh- and point-based volumetric video, respectively. Qualitative and quantitative visual comparison between mesh-based volumetric data reconstructed in different qualities and captured RGB, showcases the available options with respect to 4D representations. HUMAN4D is introduced to enable joint research on spatio-temporally aligned pose, volumetric, mRGBD and audio data cues. The dataset and its code are available online.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1109/ACCESS.2020.3026276</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
</record>
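
The export above follows the MARC21 "slim" XML schema, so individual bibliographic fields can be pulled out by datafield tag and subfield code with standard XML tooling. Below is a minimal Python sketch, assuming the record has been saved locally as human4d_marc21.xml; the datafield helper is a hypothetical convenience written for this example, not part of any MARC library.

import xml.etree.ElementTree as ET

# MARC21 "slim" namespace, exactly as declared on the <record> element above.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def datafield(record, tag, code):
    # Return the text of the first matching subfield for a tag/code pair,
    # or None when the record carries no such field.
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        sub = field.find(f"marc:subfield[@code='{code}']", NS)
        if sub is not None:
            return sub.text
    return None

record = ET.parse("human4d_marc21.xml").getroot()

print(datafield(record, "245", "a"))  # title (HUMAN4D: A Human-Centric ...)
print(datafield(record, "024", "a"))  # DOI (10.1109/ACCESS.2020.3026276)
print(datafield(record, "260", "c"))  # publication date (2020-09-23)
print(datafield(record, "540", "a"))  # license (Creative Commons Attribution 4.0 International)

For heavier use, the pymarc library can parse the same file into Record objects via pymarc.parse_xml_to_array, which handles repeated fields and indicators without hand-written XPath.
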
Views 30
Downloads 39
Data volume 227.9 MB
Unique views 30
Unique downloads 38
