Published August 15, 2024 | Version v1
Dataset Open

RealKIE: Five Novel Datasets for Enterprise Key Information Extraction

Description

These datasets accompany the paper "RealKIE: Five Novel Datasets for Enterprise Key Information Extraction".

We recommend following the instructions in our GitHub repo https://github.com/IndicoDataSolutions/realkie to download the data from Wasabi. This copy on Zenodo is provided to increase accessibility and to ensure that the data remains available indefinitely.

Resource Contracts is in 7z format due to its size; the other datasets are in Zip format. The dataset formats are described in depth in our paper and the GitHub repo.
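
Because the archives come in two formats, unpacking them takes two code paths. The following is a minimal sketch, not part of the official tooling: the function name `extract_archive` is hypothetical, and the 7z branch assumes a `7z` command-line executable is installed and on the PATH. The Zip branch uses only the Python standard library.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path


def extract_archive(archive, destination):
    """Unpack a .zip or .7z archive into destination and return that directory.

    Hypothetical helper for illustration only.
    """
    archive = Path(archive)
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)

    suffix = archive.suffix.lower()
    if suffix == ".zip":
        # Zip archives are handled by the standard library.
        with zipfile.ZipFile(archive) as zipped:
            zipped.extractall(destination)
    elif suffix == ".7z":
        # Assumption: an external `7z` executable is available.
        executable = shutil.which("7z")
        if executable is None:
            raise RuntimeError("7z executable not found; install 7-Zip or p7zip")
        subprocess.run(
            [executable, "x", str(archive), f"-o{destination}", "-y"],
            check=True,
        )
    else:
        raise ValueError(f"Unsupported archive format: {archive.name}")
    return destination
```

For example, `extract_archive("charities.zip", "data/charities")` would unpack that archive into a local `data/charities` directory (the target path is only an illustration).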

Files

charities.zip

Files (49.4 GB)

Checksum Size
md5:d28218c584f91989826727e455e5a4fb 5.6 GB
md5:3fde2cac39b550d3eb4aaf6434dc587a 559.4 MB
md5:15a833ee4785f6f16a14b89a76d4cab3 888.3 MB
md5:2f1f5f1fefbc4f14153717b2b3612599 38.4 GB
md5:34131e6b409b66402013b641196390d0 3.9 GB
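
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a checksum written as 'md5:<hex>'."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listing("some_archive.zip", "md5:...")` returns `True` only when the local file's digest equals the listed one. Which checksum belongs to which archive should be taken from the record's file listing.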

Additional details

Software

Repository URL
https://github.com/IndicoDataSolutions/realkie
Programming language
Python
Development Status
Active