Published March 10, 2022 | Version v2
Dataset Open

Multi-label Datasets used in "Adapting Transformers for Multi-Label Text Classification"

  • 1. Fallah
  • 2. Bellot
  • 3. Bruno
  • 4. Murisasco

Description

This record contains the three multi-label datasets used in the article "Adapting Transformers for Multi-Label Text Classification":

- AAPD Dataset (ArXiv Academic Paper Dataset) [Yang et al. 2018] [1]

- Reuters-21578 Dataset: https://archive.ics.uci.edu/ml/datasets/reuters-21578+text+categorization+collection

- MFHAD (Multilabel French HAL Abstracts Dataset)

[1] Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu, and Houfeng Wang. 2018.
SGM: Sequence Generation Model for Multi-label Classification. In Proceedings
of the 27th International Conference on Computational Linguistics. Association for
Computational Linguistics, Santa Fe, New Mexico, USA, 3915–3926.

Files

AAPD.zip

Files (27.3 MB)

Checksum                                Size
md5:11b9a8782c3b017f31b932afcb2a1eeb    18.1 MB
md5:bd3e4f97480144f35fc3c0ae7bb58ea2    6.0 MB
md5:47e9dc181f1446e742c91d519987531e    3.3 MB
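
A downloaded archive can be checked against the MD5 checksums listed above. The following is a minimal sketch in Python using only the standard library; the helper names `md5_of_file` and `matches_checksum` and the path `downloaded.zip` are illustrative assumptions, not part of this record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a checksum such as 'md5:11b9...'."""
    expected = expected.strip().lower()
    # Accept the checksum with or without the 'md5:' prefix used in the listing.
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local path; replace with the archive that was actually downloaded.
    local_path = "downloaded.zip"
    listed = "md5:11b9a8782c3b017f31b932afcb2a1eeb"
    try:
        print("checksum OK" if matches_checksum(local_path, listed) else "checksum mismatch")
    except FileNotFoundError:
        print("file not found:", local_path)
```

Because the listing pairs each checksum with a file size, the size of the downloaded archive can help choose which checksum to compare against.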