Conference paper Open Access

Deep 3D Flow Features for Human Action Recognition

Psaltis, Athanasios; Papadopoulos, Th. Georgios; Daras, Petros

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial 4.0 International</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-11-01</subfield>
  </datafield>
  <controlfield tag="005">20200120151820.0</controlfield>
  <controlfield tag="001">2551020</controlfield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">4-6 September 2018</subfield>
    <subfield code="g">CBMI</subfield>
    <subfield code="a">IEEE Int. Conf. on Content-Based Multimedia Indexing</subfield>
    <subfield code="c">La Rochelle, France</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The present work investigates the use of 3D flow information for Deep Learning (DL)-based human action recognition. Generally, 3D flow fields include rich and fine-grained information regarding the motion dynamics of the observed human actions. However, despite this potential, 3D flow has not been widely used, mainly due to challenges in efficiently modeling the flow information and in handling the associated computational complexity. In this paper, different techniques are investigated for incorporating 3D flow information in DL action recognition schemes. In particular, a novel sequence modeling approach is introduced, which combines the advantageous characteristics for spatial correlation estimation of Convolutional Neural Networks (CNNs) with the increased temporal modeling capabilities of Long Short-Term Memory (LSTM) models. Additionally, an extended CNN-based deep flow model is proposed that extracts features from both the spatial and temporal domains by applying 3D convolutions, thereby modeling the action dynamics within consecutive frames. Moreover, for compact and efficient 3D motion feature extraction, the combined use of CNNs with a 'flow colorization' approach is adopted. The proposed methods significantly outperform similar DL and hand-crafted 3D flow approaches, and compare favorably with most skeleton-based techniques on the currently most challenging public dataset, namely NTU RGB-D.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Centre for Research and Technology, Hellas (ITI-CERTH)</subfield>
    <subfield code="a">Papadopoulos, Th. Georgios</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Centre for Research and Technology, Hellas (ITI-CERTH)</subfield>
    <subfield code="a">Daras, Petros</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">587559</subfield>
    <subfield code="z">md5:b869a188ffd539b17d0f1c800218f23e</subfield>
    <subfield code="u"> 3D Flow Features for Human Action Recognition.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Centre for Research and Technology, Hellas (ITI-CERTH)</subfield>
    <subfield code="0">(orcid)0000-0002-6896-3124</subfield>
    <subfield code="a">Psaltis, Athanasios</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Action recognition</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">3D flow</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Deep Learning</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1109/CBMI.2018.8516470</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Deep 3D Flow Features for Human Action Recognition</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">700367</subfield>
    <subfield code="a">Detecting and ANalysing TErrorist-related online contents and financing activities</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
</record>