There is a newer version of the record available.

Published March 10, 2022 | Version v1
Dataset Open

Multi-label Datasets used in "Adapting Transformers for Multi-Label Text Classification"

  • 1. Fallah
  • 2. Bellot
  • 3. Bruno
  • 4. Murisasco

Description

This record contains the three multi-label datasets used in the article "Adapting Transformers for Multi-Label Text Classification":

- AAPD Dataset (ArXiv Academic Paper Dataset) [Yang et al. 2018]¹

- Reuters-21578 Dataset: https://archive.ics.uci.edu/ml/datasets/reuters-21578+text+categorization+collection

- MFHAD (Multilabel French HAL Abstracts Dataset)

¹ Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu, and Houfeng Wang. 2018. SGM: Sequence Generation Model for Multi-label Classification. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, Santa Fe, New Mexico, USA, 3915–3926.

Files

AAPD.zip

Files (27.3 MB)

Checksum and size of each file:

- md5:e1419b5d03be572115287fcb80577c48 (18.1 MB)
- md5:bd3e4f97480144f35fc3c0ae7bb58ea2 (6.0 MB)
- md5:47e9dc181f1446e742c91d519987531e (3.3 MB)
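
The md5 values listed for the files can be used to check that a download is intact. The following is a minimal sketch in Python using only the standard library. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of the record, and the record as shown here does not state which checksum belongs to which file, so the pairing in the usage note is an assumption.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:e1419b...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("AAPD.zip", "md5:e1419b5d03be572115287fcb80577c48")` would return `True` only if the local file hashes to that value. Pairing `AAPD.zip` with this particular checksum is a guess; if it returns `False`, try the other listed values before assuming the download is corrupt.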