There is a newer version of this record available.

Report Open Access

Testing the Plasticity of Reinforcement Learning Based Systems

Tonella, Paolo; Biagiola, Matteo

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <controlfield tag="005">20220315070804.0</controlfield>
  <controlfield tag="001">6026649</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Università della Svizzera italiana</subfield>
    <subfield code="a">Biagiola, Matteo</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">3031655</subfield>
    <subfield code="z">md5:86d268817978c0b9d6b03490d02164a5</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2022-02-09</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Università della Svizzera italiana</subfield>
    <subfield code="a">Tonella, Paolo</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Testing the Plasticity of Reinforcement Learning Based Systems</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">787703</subfield>
    <subfield code="a">Self-assessment Oracles for Anticipatory Testing</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The data set available for pre-release training of a machine learning based system is often not representative of all possible execution contexts that the system will encounter in the field. Reinforcement Learning (RL) is a prominent approach among those that support continual learning, i.e., learning continually in the field, in the post-release phase. No study has so far investigated any method to test the plasticity of RL based systems, i.e., their capability to adapt to an execution context that may deviate from the training one.&lt;/p&gt;

&lt;p&gt;We propose an approach to test the plasticity of RL based systems. The output of our approach is a quantification of the adaptation and anti-regression capabilities of the system, obtained by computing the adaptation frontier of the system in a changed environment. We visualize such frontier as an adaptation/anti-regression heatmap in two dimensions, or as a clustered projection when more than two dimensions are involved. In this way, we provide developers with information on the amount of changes that can be accommodated by the continual learning component of the system, which is key to decide if online, in-the-field learning can be safely enabled or not.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.6026648</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.6026649</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">report</subfield>
  </datafield>
</record>
                   All versions   This version
Views              108            35
Downloads          179            110
Data volume        543.1 MB       333.5 MB
Unique views       94             31
Unique downloads   165            105

