Small-Text: Active Learning for Text Classification in Python
- 1. Leipzig University
- 2. Leipzig University; Institute for Applied Informatics (InfAI), Leipzig
Description
We present small-text, an easy-to-use active learning library that offers pool-based active learning for single- and multi-label text classification in Python. It features many pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow a variety of classifiers, query strategies, and stopping criteria to be combined, facilitating quick mixing and matching and enabling rapid development of both active learning experiments and applications. To make various classifiers and query strategies accessible in a unified way, small-text integrates the well-known machine learning libraries scikit-learn, PyTorch, and Hugging Face transformers. The latter two integrations are available as optionally installable extensions, making the availability of a GPU completely optional. The library is publicly available under the MIT License at https://github.com/webis-de/small-text.
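To illustrate the pool-based setting described above, the following is a minimal, standard-library-only sketch of an active learning loop in which the classifier, the query strategy, and the stopping criterion are interchangeable parts. It deliberately does not use the small-text API: `TinyNaiveBayes`, `least_confidence`, `active_learning_loop`, the toy pool, and the oracle are all hypothetical names and data introduced here for illustration only. The library's actual interfaces are described in its documentation.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Hypothetical toy classifier: multinomial naive Bayes over whitespace tokens."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.priors = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for counter in self.counts.values() for w in counter}
        return self

    def predict_proba(self, text):
        n_docs = sum(self.priors.values())
        log_scores = []
        for c in self.classes:
            # Laplace smoothing; the extra +1 reserves mass for unseen words.
            total = sum(self.counts[c].values()) + len(self.vocab) + 1
            score = math.log(self.priors[c] / n_docs)
            for word in text.lower().split():
                score += math.log((self.counts[c][word] + 1) / total)
            log_scores.append(score)
        top = max(log_scores)
        exps = [math.exp(s - top) for s in log_scores]
        norm = sum(exps)
        return [e / norm for e in exps]


def least_confidence(clf, pool, unlabeled, n):
    """Query strategy: pick the n instances whose top class probability is lowest."""
    ranked = sorted(unlabeled, key=lambda i: max(clf.predict_proba(pool[i])))
    return ranked[:n]


def active_learning_loop(pool, oracle, initial, query_strategy, should_stop, batch_size=1):
    """Pool-based loop: train, check the stopping criterion, query, label, repeat."""
    labeled = {i: oracle(i) for i in initial}
    clf = TinyNaiveBayes()
    iteration = 0
    while True:
        indices = sorted(labeled)
        clf.fit([pool[i] for i in indices], [labeled[i] for i in indices])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled or should_stop(iteration, len(labeled)):
            break
        for i in query_strategy(clf, pool, unlabeled, batch_size):
            labeled[i] = oracle(i)  # in practice, a human annotator supplies this label
        iteration += 1
    return clf, labeled


# Toy pool with made-up texts; the oracle simulates an annotator.
pool = [
    "great movie loved it",
    "terrible plot awful acting",
    "loved the acting great fun",
    "awful boring terrible",
    "great fun",
    "boring plot",
]
true_labels = [1, 0, 1, 0, 1, 0]

clf, labeled = active_learning_loop(
    pool,
    oracle=lambda i: true_labels[i],
    initial=[0, 1],
    query_strategy=least_confidence,
    should_stop=lambda iteration, n_labeled: iteration >= 2,
)
```

Because the query strategy and stopping criterion are passed in as plain callables, either can be swapped without touching the loop, which is the kind of mix-and-match design the description refers to.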
Files
Name | Size | Checksum |
---|---|---|
small-text-2.0.0.dev3.zip | 704.1 kB | md5:92efce8f3c3fcb122f4c1f8d98755abb |
Additional details
Related works
- Is documented by
- Preprint: arXiv:2107.10314 (arXiv)
- Publication: 10.18653/v1/2023.eacl-demo.11 (DOI)
- Is supplemented by
- Software: https://github.com/webis-de/small-text/tree/v1.4.1 (URL)
Software
- Repository URL
- https://github.com/webis-de/small-text
- Programming language
- Python
- Development Status
- Active
References
- Christopher Schröder, Lydia Müller, Andreas Niekler, and Martin Potthast. 2021. Small-Text: Active Learning for Text Classification in Python. arXiv preprint arXiv:2107.10314.
- Christopher Schröder, Lydia Müller, Andreas Niekler, and Martin Potthast. 2023. Small-Text: Active Learning for Text Classification in Python. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 84–95, Dubrovnik, Croatia. Association for Computational Linguistics.