Published November 4, 2022 | Version 1.1
Dataset Open

ArXiV-Entity/Relation annotated dataset

  • 1. LIPN, Université Sorbonne Paris Nord
  • 2. Université Sorbonne Paris Nord
  • 3. INAOE

Description

This dataset is a collection of abstracts from the CS section of arXiv, each annotated with entities and relations by DyGIE++ (SciERC model).

The dataset can be used to train triple extractors or to cluster triples (in the Computer Science and AI domains).

It supersedes the ArXiV-AIKG dataset because these triples are unconstrained: they do not necessarily appear in AIKG.
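
As a starting point for the triple extraction and clustering use cases mentioned above, the sketch below turns one annotated record into (head, relation, tail) text triples. The file layout of this dataset is not described on this page, so the record structure here is an assumption: it follows the usual DyGIE++ prediction output, with a "sentences" field holding tokenized sentences and a "predicted_relations" field whose entries start with document-level, end-inclusive token indices followed by a label. The function name extract_triples and the example record are hypothetical.

```python
import json


def extract_triples(record):
    """Return (head, relation, tail) text triples from one record.

    Assumption: the record follows DyGIE++-style prediction output, where
    span indices are token offsets into the whole document and the end
    index is inclusive.
    """
    # Flatten the sentences so document-level offsets can be used directly.
    tokens = [tok for sent in record["sentences"] for tok in sent]
    triples = []
    for sent_relations in record.get("predicted_relations", []):
        for rel in sent_relations:
            # Any trailing score fields after the label are ignored.
            h_start, h_end, t_start, t_end, label = rel[:5]
            head = " ".join(tokens[h_start:h_end + 1])
            tail = " ".join(tokens[t_start:t_end + 1])
            triples.append((head, label, tail))
    return triples


# Hypothetical record, written by hand for illustration only.
example = json.loads("""
{
  "doc_key": "example-0",
  "sentences": [["We", "train", "a", "neural", "network", "for",
                 "image", "classification", "."]],
  "predicted_relations": [[[3, 4, 6, 7, "USED-FOR"]]]
}
""")

triples = extract_triples(example)
print(triples)
```

If the files turn out to use different field names or sentence-level offsets, only the indexing inside extract_triples would need to change.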

Files (3.4 GB)

MD5 checksum                              Size
md5:77052ca3cdfa5150313f9bfeb42e2a23      1.4 GB
md5:5a96fccc7692062e78455372efdcacb8      449.4 MB
md5:e29762edb161c8946a86bc883e26a6a9      289.4 MB
md5:df5a3076033af8c3d22671022440bc6e      207.1 MB
md5:7ae540202a5c7f11a1940bce9780d682      182.4 MB
md5:448d093caeedec47f2d85c556fb55df6      156.4 MB
md5:e95e682d5196412f1fd24105530beb8e      196.4 MB
md5:ed84e61aaa0a29e47e9db5e21ba04986      210.1 MB
md5:e4a9afdd3dd7e4af885d7bd7a9112b86      230.1 MB
md5:15f8217d0cc9bae0c9cae5bbea794dcb      157.6 MB
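
The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names md5_of_file and matches_listing are hypothetical, and the path passed in is whatever local file name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a checksum as listed, e.g. 'md5:77052c...'."""
    # Accept the value with or without the 'md5:' prefix.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listing(local_path, "md5:77052ca3cdfa5150313f9bfeb42e2a23") would return True only if the file saved at local_path is byte-identical to the 1.4 GB file in the listing.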