Journal article Open Access

Continual learning for recurrent neural networks: An empirical evaluation

Andrea Cossu; Antonio Carta; Vincenzo Lomonaco; Davide Bacciu


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">continual learning; recurrent neural networks</subfield>
  </datafield>
  <controlfield tag="005">20210810084728.0</controlfield>
  <controlfield tag="001">5164245</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Pisa</subfield>
    <subfield code="a">Antonio Carta</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Pisa</subfield>
    <subfield code="a">Vincenzo Lomonaco</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Pisa</subfield>
    <subfield code="a">Davide Bacciu</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1584864</subfield>
    <subfield code="z">md5:e168aef7dad0e2f8f6d5d8711223b168</subfield>
    <subfield code="u">https://zenodo.org/record/5164245/files/2103.07492.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-08-05</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-teaching-h2020</subfield>
    <subfield code="o">oai:zenodo.org:5164245</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="v">143</subfield>
    <subfield code="p">Neural Networks</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Pisa</subfield>
    <subfield code="a">Andrea Cossu</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Continual learning for recurrent neural networks: An empirical evaluation</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-teaching-h2020</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">871385</subfield>
    <subfield code="a">A computing toolkit for building efficient autonomous applications leveraging humanistic intelligence</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Learning continuously during all model lifetime is fundamental to deploy &lt;a href="https://www.sciencedirect.com/topics/computer-science/machine-learning"&gt;machine learning&lt;/a&gt; solutions robust to drifts in the data distribution. Advances in Continual Learning (CL) with &lt;a href="https://www.sciencedirect.com/topics/engineering/recurrent-neural-network"&gt;recurrent neural networks&lt;/a&gt; could pave the way to a large number of applications where incoming data is non stationary, like &lt;a href="https://www.sciencedirect.com/topics/engineering/natural-language-processing"&gt;natural language processing&lt;/a&gt; and robotics. However, the existing body of work on the topic is still fragmented, with approaches which are application-specific and whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications.&lt;/p&gt;

&lt;p&gt;We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in class-incremental scenario, by testing their ability to mitigate forgetting with a number of different strategies which are not specific to sequential data processing. Our results highlight the key role played by the sequence length and the importance of a clear specification of the CL scenario.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1016/j.neunet.2021.07.021</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
</record>
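
The MARC21/slim export above is machine-readable. As a quick illustration, here is a minimal Python sketch (standard library only) that pulls the title, authors, DOI, and license out of a record like this one. The file name record.xml and the helper name parse_marc_record are hypothetical, not part of the Zenodo export.

import xml.etree.ElementTree as ET

# MARC21/slim namespace, as declared on the <record> element above.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def parse_marc_record(path):
    """Extract a few common fields from a MARC21/slim record."""
    record = ET.parse(path).getroot()

    def subfields(tag, code):
        # Collect subfield `code` from every datafield with the given tag.
        return [
            sf.text
            for df in record.findall(f"marc:datafield[@tag='{tag}']", NS)
            for sf in df.findall(f"marc:subfield[@code='{code}']", NS)
        ]

    return {
        # 245$a: title; 100$a: first author; 700$a: additional authors.
        "title": subfields("245", "a"),
        "authors": subfields("100", "a") + subfields("700", "a"),
        # 024$a carries the identifier; 024$2 marks it as a DOI.
        "doi": subfields("024", "a"),
        # 540$a: license name.
        "license": subfields("540", "a"),
    }

if __name__ == "__main__":
    # Hypothetical file holding the record shown above.
    print(parse_marc_record("record.xml"))

Run against this record, the sketch would return the article title, the four authors (the 100$a entry plus the three 700$a entries), the DOI 10.1016/j.neunet.2021.07.021, and the Creative Commons Attribution 4.0 International license string.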
Views 24
Downloads 22
Data volume 34.9 MB
Unique views 22
Unique downloads 21
