Published June 17, 2022 | Version v1
Journal article Open

ICS: Total Freedom in Manual Text Classification Supported by Unobtrusive Machine Learning

  • 1. ISTI-CNR

Description

We present the Interactive Classification System (ICS), a web-based application that supports the activity of manual text classification. The application uses machine learning to continuously fit automatic classification models, which are in turn used to actively support its users with classification suggestions. The key requirement we have established for the development of ICS is to give its users total freedom of action: they can at any time modify any classification schema and any label assignment, possibly reusing any relevant information from previous activities. We investigate how this requirement challenges the typical scenarios faced in machine learning research, which instead give no active role to humans or place them in very constrained roles, e.g., on-demand labeling in active learning processes, and always assume some degree of batch processing of data. We satisfy the “total freedom” requirement by designing an unobtrusive machine learning model: the machine learning component of ICS acts as an unobtrusive observer of the users that never interrupts them, continuously adapts and updates its models in response to their actions, and is always available to perform automatic classifications. Our efficient implementation of the unobtrusive machine learning model combines various machine learning methods and technologies, such as hash-based feature mapping, random indexing, online learning, active learning, and asynchronous processing.
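
To make two of the techniques named above more concrete, the following is a minimal sketch of how hash-based feature mapping can be combined with online learning so that a model is updated after every single user action and new labels can appear at any time. It is not the ICS implementation: the perceptron is used here only as a simple stand-in for an online learner, and all names (`hash_features`, `OnlineBinaryPerceptron`, `OnlineLabeler`, `N_BUCKETS`) are hypothetical.

```python
import zlib

# Assumed size of the hashed feature space (hypothetical value).
N_BUCKETS = 2 ** 18


def hash_features(text, n_buckets=N_BUCKETS):
    """Map a text to a sparse {bucket_index: count} vector (hashing trick).

    No vocabulary is stored, so previously unseen words need no refitting.
    crc32 is used because Python's built-in hash() is salted per process.
    """
    vec = {}
    for token in text.lower().split():
        idx = zlib.crc32(token.encode("utf-8")) % n_buckets
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


class OnlineBinaryPerceptron:
    """A binary linear model updated one example at a time."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, vec):
        return self.bias + sum(self.weights.get(i, 0.0) * v for i, v in vec.items())

    def update(self, vec, is_positive):
        target = 1.0 if is_positive else -1.0
        # Classic perceptron rule: change the weights only on a mistake.
        if target * self.score(vec) <= 0.0:
            for i, v in vec.items():
                self.weights[i] = self.weights.get(i, 0.0) + target * v
            self.bias += target


class OnlineLabeler:
    """One binary model per label; labels can be introduced at any time."""

    def __init__(self):
        self.models = {}

    def observe(self, text, label, assigned=True):
        """Record that a user assigned (or rejected) `label` for `text`."""
        model = self.models.setdefault(label, OnlineBinaryPerceptron())
        model.update(hash_features(text), assigned)

    def suggest(self, text):
        """Return the labels whose model currently scores the text positively."""
        vec = hash_features(text)
        return sorted(name for name, m in self.models.items() if m.score(vec) > 0.0)


if __name__ == "__main__":
    labeler = OnlineLabeler()
    labeler.observe("the match ended in a draw", "sports", True)
    labeler.observe("parliament passed the budget", "sports", False)
    labeler.observe("parliament passed the budget", "politics", True)
    labeler.observe("the match ended in a draw", "politics", False)
    print(labeler.suggest("a late draw in the match"))
```

Because each call to `observe` touches only the weights of the features present in one document, the cost of an update does not grow with the number of documents already seen, which is what allows a model of this kind to stay in step with user actions without batch retraining.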

Files

ICS_Total_Freedom_in_Manual_Text_Classification_Supported_by_Unobtrusive_Machine_Learning.pdf

Additional details

Funding

AI4Media – A European Excellence Centre for Media, Society and Democracy 951911
European Commission