Published January 1, 2021 | Version v1
Report Open

A Review and Refinement of Surprise Adequacy

  • 1. Università della Svizzera italiana

Description

Surprise Adequacy (SA) is one of the emerging and most promising adequacy criteria for Deep Learning (DL) testing. As an adequacy criterion, it has been used to assess the strength of DL test suites. It has also been used to find inputs to a Deep Neural Network (DNN) that were not sufficiently represented in the training data, and to select samples for DNN retraining. However, computing the SA metric for a test suite can be prohibitively expensive, as it involves a quadratic number of distance calculations. We therefore developed and released a performance-optimized but functionally equivalent implementation of SA, reducing the evaluation time by up to 97%. We also propose refined variants of the SA computation algorithm, aiming to further increase the evaluation speed. We then performed an empirical study on MNIST, focused on the out-of-distribution detection capabilities of SA, which allowed us to reproduce parts of the results presented when SA was first released. The experiments show that our refined variants are substantially faster than plain SA while producing comparable outcomes. Our experimental results also exposed an overlooked issue of SA: it can be highly sensitive to the non-determinism associated with the DNN training procedure.
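To make the cost argument concrete, the following is a minimal sketch of the distance-based variant of SA (DSA), assuming the commonly used definition: the distance from a test input's activation trace to the nearest training trace of its predicted class, divided by the distance from that training trace to the nearest training trace of any other class. It is not the optimized implementation described above. The function name `dsa` and all parameter names are hypothetical, and the sketch assumes every predicted class occurs in the training data and that traces of different classes never coincide.

```python
import numpy as np

def dsa(train_ats, train_labels, test_ats, test_preds):
    """Distance-based surprise adequacy per test activation trace (sketch).

    train_ats:    (n_train, d) activation traces of the training inputs
    train_labels: (n_train,)   class of each training input
    test_ats:     (n_test, d)  activation traces of the test inputs
    test_preds:   (n_test,)    class predicted for each test input
    """
    scores = np.empty(len(test_ats))
    for i, (at, pred) in enumerate(zip(test_ats, test_preds)):
        same = train_ats[train_labels == pred]
        other = train_ats[train_labels != pred]
        # Distance to the nearest training trace of the predicted class.
        d_same = np.linalg.norm(same - at, axis=1)
        nearest = same[np.argmin(d_same)]
        dist_a = d_same.min()
        # Distance from that reference trace to the nearest other-class trace.
        dist_b = np.linalg.norm(other - nearest, axis=1).min()
        scores[i] = dist_a / dist_b
    return scores
```

Each test input is compared against training traces twice (once within its class, once against the other classes), so the number of distance calculations grows with the product of the test-set and training-set sizes. This is the cost that the abstract describes as prohibitive for large test suites.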

Files

TR-Precrime-2021-01.pdf (747.9 kB)
md5:3f106e3e365bf3d179f497a3bea754b0

Additional details

Funding

European Commission: PRECRIME – Self-assessment Oracles for Anticipatory Testing (grant 787703)