Published March 8, 2022 | Version v1
Dataset Open

FiNER-139: A Financial Numeric Entity Recognition Dataset

  • 1. AI Centre of Excellence in Document Intelligence, NCSR "Demokritos" and Department of Informatics, Athens University of Economics and Business
  • 2. Department of Computer Science, University of Copenhagen and Department of Informatics, Athens University of Economics and Business
  • 3. AI Centre of Excellence in Document Intelligence, NCSR "Demokritos"

Description

FiNER-139 is published with the article "FiNER: Financial Numeric Entity Recognition for XBRL Tagging", in the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) (Long Papers), Dublin, Republic of Ireland, May 22-27, 2022.

We release FiNER-139, a dataset of 1.1M sentences with gold XBRL tags. Unlike typical entity extraction datasets, FiNER-139 uses a much larger label set of 139 entity types. Most annotated tokens are numeric, so the correct tag for a token depends mostly on its context rather than on the token itself.
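To illustrate why context matters more than the token, here is a minimal sketch built on two toy sentences. The sentences, the tag names, and the IOB2 labelling scheme are assumptions made for illustration; they are not taken from the dataset files, and the record itself does not state which tagging scheme is used.

```python
# Hypothetical toy sentences (NOT from FiNER-139); tag names are illustrative only.
sentences = [
    (["The", "notes", "bear", "interest", "at", "5.0", "%", "."],
     ["O", "O", "O", "O", "O", "B-InterestRate", "O", "O"]),
    (["Revenue", "increased", "by", "5.0", "million", "."],
     ["O", "O", "O", "B-Revenues", "O", "O"]),
]


def tags_by_token(data):
    """Map each token to the set of non-O tags it receives across all sentences."""
    seen = {}
    for tokens, tags in data:
        # Token-level annotation: exactly one tag per token.
        assert len(tokens) == len(tags)
        for token, tag in zip(tokens, tags):
            if tag != "O":
                seen.setdefault(token, set()).add(tag)
    return seen


def iob2_label_count(num_entity_types):
    """Label count under an assumed IOB2 scheme: B- and I- per type, plus O."""
    return 2 * num_entity_types + 1


if __name__ == "__main__":
    # The same surface token "5.0" carries different tags in different contexts.
    print(tags_by_token(sentences)["5.0"])
    print(iob2_label_count(139))
```

In this toy setup the token "5.0" maps to two different tags, so a model cannot tag it from the token alone.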

Files

finer-139.zip (103.2 MB)
md5:372c3618db499a47967e1306591218b3
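The listed MD5 checksum can be used to confirm that the archive downloaded intact. The sketch below assumes the file has already been saved locally as `finer-139.zip` in the working directory; the function names and the local path are illustrative.

```python
import hashlib

# Checksum as listed on this record for finer-139.zip.
EXPECTED_MD5 = "372c3618db499a47967e1306591218b3"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    ok = verify("finer-139.zip")
    print("checksum OK" if ok else "checksum MISMATCH")
```

Reading in chunks keeps memory use low for an archive of this size. MD5 serves here only as an integrity check against transfer errors, not as a security guarantee.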