Conference paper Open Access

Attention-enhanced Sensorimotor Object Recognition

Thermos, S; Papadopoulos, GT; Daras, P; Potamianos, G

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Sensorimotor object recognition, attention mechanism, stream fusion, deep neural networks</subfield>
  </datafield>
  <controlfield tag="005">20200417122735.0</controlfield>
  <controlfield tag="001">3727849</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">2018 October 7-10</subfield>
    <subfield code="g">IEEE ICIP 2018</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Papadopoulos, GT</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Daras, P</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Potamianos, G</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">777718</subfield>
    <subfield code="z">md5:1dacb0ee42fe6a8eb8fdc71f9b4bc26f</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-10-10</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-vrtogether-h2020</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Thermos, S</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Attention-enhanced Sensorimotor Object Recognition</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-vrtogether-h2020</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Sensorimotor learning, namely the process of understanding the physical world by combining visual and motor information, has been recently investigated, achieving promising results for the task of 2D/3D object recognition. Following the recent trend in computer vision, powerful deep neural networks (NNs) have been used to model the &amp;ldquo;sensory&amp;rdquo; and &amp;ldquo;motor&amp;rdquo; information, namely the object appearance and affordance. However, the existing implementations cannot efficiently address the spatio-temporal nature of the human-object interaction. Inspired by recent work on attention-based learning, this paper introduces an attention-enhanced NN-based model that learns to selectively focus on parts of the physical interaction where the object appearance is corrupted by occlusions and deformations. The model&amp;rsquo;s attention mechanism relies on the confidence of classifying an object based solely on its appearance. Three metrics are used to measure the latter, namely the prediction entropy, the average N-best likelihood difference, and the N-best likelihood dispersion. Evaluation of the attention-enhanced model on the SOR3D dataset reports 33% and 26% relative improvement over the appearance-only and the spatio-temporal fusion baseline models, respectively.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1109/ICIP.2018.8451158</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
</record>