Small-Text: Active Learning for Text Classification in Python
- 1. Leipzig University
- 2. Leipzig University; Institute for Applied Informatics (InfAI), Leipzig
Description
We present small-text, an easy-to-use active learning library that offers pool-based active learning for single- and multi-label text classification in Python. It features many pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow a variety of classifiers, query strategies, and stopping criteria to be combined, which facilitates quick mixing and matching and enables rapid development of both active learning experiments and applications. To make various classifiers and query strategies accessible in a unified way, small-text integrates the well-known machine learning libraries scikit-learn, PyTorch, and huggingface transformers. The latter integrations are available as optionally installable extensions, making the availability of a GPU completely optional. The library is publicly available under the MIT License at https://github.com/webis-de/small-text.
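To illustrate the idea of pool-based active learning with interchangeable components, the following is a minimal, standard-library-only sketch. It does not use the small-text API: the class `NaiveBayesTextClassifier`, the functions `least_confidence`, `random_sampling`, and `active_learning_loop`, and the toy pool are all hypothetical names and data chosen for this example. The classifier, query strategy, and stopping criterion are passed in separately, which mirrors the mix-and-match design described above only in spirit.

```python
import math
import random
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        """Return class probabilities in the order of self.classes."""
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        log_scores = []
        for c in self.classes:
            n_words = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / total_docs)
            for t in tokens:
                # Laplace smoothing so unseen words do not zero out a class.
                score += math.log((self.word_counts[c][t] + 1) / (n_words + len(self.vocab) + 1))
            log_scores.append(score)
        top = max(log_scores)
        exps = [math.exp(s - top) for s in log_scores]
        norm = sum(exps)
        return [e / norm for e in exps]


def least_confidence(clf, pool, unlabeled, n):
    """Query strategy: pick the n instances whose top class probability is lowest."""
    return sorted(unlabeled, key=lambda i: max(clf.predict_proba(pool[i])))[:n]


def random_sampling(clf, pool, unlabeled, n, rng=random.Random(0)):
    """Baseline query strategy: pick n unlabeled instances uniformly at random."""
    return rng.sample(unlabeled, min(n, len(unlabeled)))


def active_learning_loop(pool, oracle, initial, query_strategy, should_stop, batch_size=2):
    """Train, check the stopping criterion, query, label, and repeat."""
    labeled = {i: oracle(i) for i in initial}
    clf = NaiveBayesTextClassifier()
    iteration = 0
    while True:
        indices = sorted(labeled)
        clf.fit([pool[i] for i in indices], [labeled[i] for i in indices])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled or should_stop(iteration, len(labeled)):
            break
        for i in query_strategy(clf, pool, unlabeled, batch_size):
            labeled[i] = oracle(i)  # the oracle stands in for a human annotator
        iteration += 1
    return clf, labeled


# Toy pool with hidden ground truth (1 = positive, 0 = negative); made up for this sketch.
POOL = [
    "great movie loved it", "wonderful acting great plot", "loved the wonderful cast",
    "terrible movie hated it", "awful plot terrible acting", "hated the awful ending",
    "great cast wonderful ending", "awful movie terrible cast",
]
TRUE_LABELS = [1, 1, 1, 0, 0, 0, 1, 0]

clf, labeled = active_learning_loop(
    POOL,
    oracle=lambda i: TRUE_LABELS[i],
    initial=[0, 3],                      # one seed example per class
    query_strategy=least_confidence,     # swap in random_sampling to compare
    should_stop=lambda iteration, n_labeled: n_labeled >= 6,  # fixed labeling budget
)
print(f"labeled {len(labeled)} of {len(POOL)} pool instances")
```

Because the query strategy and the stopping criterion are plain callables here, replacing `least_confidence` with `random_sampling` or changing the budget requires no change to the loop itself.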
Files
Name | Size |
---|---|
small-text-1.0.0.zip (md5:101c0335793f2e55ae5735b9a44e5810) | 286.1 kB |
Additional details
Related works
- Is documented by
- Preprint: arXiv:2107.10314 (arXiv)
- Is supplemented by
- Software: https://github.com/webis-de/small-text/tree/v1.0.0 (URL)
References
- Christopher Schröder, Lydia Müller, Andreas Niekler, and Martin Potthast. 2021. Small-Text: Active Learning for Text Classification in Python. arXiv preprint arXiv:2107.10314.