Published January 30, 2023 | Version v1
Conference paper Open

Data Augmentation On-the-fly and Active Learning in Data Stream Classification

Description

Predictive models increasingly need to be trained on-the-fly, since in numerous machine learning applications data arrive in an online fashion. One critical challenge is the limited availability of ground truth information (e.g., labels in classification tasks) as new data are observed one-by-one online; another is class imbalance. This work introduces the novel Augmented Queues method, which addresses both problems by synergistically combining online active learning, data augmentation, and a multi-queue memory that maintains separate and balanced queues for each class. We perform an extensive experimental study using image and time-series augmentations, in which we examine the roles of the active learning budget, memory size, imbalance level, and neural network type. We demonstrate two major advantages of Augmented Queues. First, it does not reserve additional memory space, as the generation of synthetic data occurs only at training times. Second, learning models have access to more labelled data without the need to increase the active learning budget and/or the original memory size. Learning on-the-fly poses major challenges which typically hinder the deployment of learning models. Augmented Queues significantly improves performance in terms of learning quality and speed. Our code is made publicly available.
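
The three ingredients named in the description (a multi-queue memory, augmentation at training time, and a labelling budget) can be illustrated with a minimal Python sketch. This is not the authors' released code: the names `MultiQueueMemory`, `jitter`, `training_batch`, and `should_query`, the Gaussian-noise augmentation, and the confidence-threshold query rule are all assumptions chosen for illustration, and the paper may use different strategies.

```python
import random
from collections import deque


class MultiQueueMemory:
    """One bounded FIFO queue per class, so no class can crowd out another."""

    def __init__(self, num_classes, queue_size):
        self.queues = [deque(maxlen=queue_size) for _ in range(num_classes)]

    def add(self, x, y):
        # A full queue silently drops its oldest example of that class.
        self.queues[y].append(list(x))

    def stored(self):
        return [(x, y) for y, q in enumerate(self.queues) for x in q]


def jitter(x, rng, sigma=0.05):
    """Hypothetical augmentation: add small Gaussian noise to each feature."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def training_batch(memory, rng, copies=2):
    """Build the batch at training time; synthetic copies are never stored."""
    batch = []
    for x, y in memory.stored():
        batch.append((x, y))
        batch.extend((jitter(x, rng), y) for _ in range(copies))
    return batch


def should_query(confidence, labels_spent, seen, budget, threshold=0.7):
    """Assumed rule: ask for a label only if the model is unsure and the
    fraction of labels spent so far stays below the budget."""
    within_budget = labels_spent < budget * seen
    return within_budget and confidence < threshold


# Usage: an imbalanced stream with ten majority examples and one minority one.
rng = random.Random(0)
memory = MultiQueueMemory(num_classes=2, queue_size=3)
for i in range(10):
    memory.add([float(i), 0.0], 0)
memory.add([100.0, 1.0], 1)
batch = training_batch(memory, rng, copies=2)
print(len(memory.stored()), len(batch))
```

In this sketch the memory holds four real examples (three majority, one minority) while the training batch holds twelve, which mirrors the claim that synthetic data add training examples without occupying memory between training steps.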

Notes

This work has been supported by the European Research Council (ERC) under grant agreement No 951424 (Water-Futures), by the European Union's Horizon 2020 research and innovation programme under grant agreements No 883484 (PathoCERT) and No 739551 (TEAMING KIOS CoE), and by the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy. © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Files

IEEE_SSCI_2022_AugmentedQueues_camera-ready.pdf

Files (1.5 MB)

Additional details

Funding

Water-Futures – Smart Water Futures: designing the next generation of urban drinking water systems 951424
European Commission
KIOS CoE – KIOS Research and Innovation Centre of Excellence 739551
European Commission
PathoCERT – Pathogen Contamination Emergency Response Technologies 883484
European Commission