There is a newer version of the record available.

Published June 14, 2022 | Version 1.0.0
Software Open

Small-Text: Active Learning for Text Classification in Python

  • 1. Leipzig University
  • 2. Leipzig University; Institute for Applied Informatics (InfAI), Leipzig

Description

We present small-text, an easy-to-use active learning library that offers pool-based active learning for single- and multi-label text classification in Python. It features many pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow a variety of classifiers, query strategies, and stopping criteria to be combined, which makes it quick to mix and match components and enables rapid development of both active learning experiments and applications. To make various classifiers and query strategies accessible in a unified way, small-text integrates the well-known machine learning libraries scikit-learn, PyTorch, and huggingface transformers. The latter two integrations are available as optionally installable extensions, making the availability of a GPU completely optional. The library is publicly available under the MIT License at https://github.com/webis-de/small-text.
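The idea of combining interchangeable classifiers, query strategies, and stopping criteria in a pool-based loop can be illustrated with a minimal, standard-library-only sketch. This is not small-text's API: every name below (NaiveBayesClassifier, least_confidence, active_learning_loop, should_stop, the toy pool and labels) is a hypothetical stand-in chosen for illustration, and the classifier is a deliberately tiny naive Bayes model rather than one of the integrations named above.

```python
import math
from collections import Counter


class NaiveBayesClassifier:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, text):
        total = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            # Laplace smoothing; the extra +1 reserves mass for unseen words.
            denom = sum(self.word_counts[c].values()) + len(self.vocab) + 1
            score = math.log(self.class_counts[c] / total)
            for word in text.lower().split():
                score += math.log((self.word_counts[c][word] + 1) / denom)
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def least_confidence(clf, pool, unlabeled_idx, n):
    """Query strategy: pick the n texts whose top class probability is lowest."""
    ranked = sorted(unlabeled_idx,
                    key=lambda i: max(clf.predict_proba(pool[i]).values()))
    return ranked[:n]


def active_learning_loop(pool, oracle, clf, query_strategy, should_stop,
                         initial_idx, batch_size=2):
    """Pool-based loop: train, check the stopping criterion, query, label, repeat."""
    labeled = {i: oracle(i) for i in initial_idx}
    while True:
        clf.fit([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled or should_stop(len(labeled)):
            break
        n = min(batch_size, len(unlabeled))
        for i in query_strategy(clf, pool, unlabeled, n):
            labeled[i] = oracle(i)  # a human annotator in a real application
    return clf, labeled


# Toy pool; the oracle simulates an annotator by looking up the true label.
pool = ["great movie loved it", "terrible plot boring", "loved the acting great",
        "boring and terrible", "great fun", "awful boring mess",
        "loved it", "terrible acting"]
true_labels = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]

clf, labeled = active_learning_loop(
    pool,
    oracle=lambda i: true_labels[i],
    clf=NaiveBayesClassifier(),
    query_strategy=least_confidence,
    should_stop=lambda n_labeled: n_labeled >= 6,  # simple labeling budget
    initial_idx=[0, 1],
)
```

Because the loop only relies on fit, predict_proba, a query function, and a stopping predicate, any of the three components can be swapped without touching the others, which is the kind of mix and match the description refers to.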

Files

small-text-1.0.0.zip (286.1 kB)
md5:101c0335793f2e55ae5735b9a44e5810

Additional details

Related works

Is documented by
Preprint: arXiv:2107.10314 (arXiv)
Is supplemented by
Software: https://github.com/webis-de/small-text/tree/v1.0.0 (URL)

References

  • Christopher Schröder, Lydia Müller, Andreas Niekler, and Martin Potthast. 2021. Small-Text: Active Learning for Text Classification in Python. arXiv preprint arXiv:2107.10314.