There is a newer version of this record available.

Dataset Open Access

Bibliometric-Enhanced arXiv: A Data Set for Paper-Based and Citation-Based Tasks

Saier, Tarek; Färber, Michael


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Saier, Tarek</dc:creator>
  <dc:creator>Färber, Michael</dc:creator>
  <dc:date>2019-02-01</dc:date>
  <dc:description>We propose a new data set based on all publications from all scientific fields available on arXiv.org. In addition to providing the papers' plain text, we annotated in-text citations with global identifiers. As far as possible, cited publications were linked to the Microsoft Academic Graph. Our data set consists of over one million documents and 29.2 million citation contexts. The data set, which is made freely available for research purposes, can not only enhance the future evaluation of research paper-based and citation context-based approaches but also serve as a basis for novel ideas to analyze papers.

More information can be found in our paper Bibliometric-Enhanced arXiv: A Data Set for Paper-Based and Citation-Based Tasks.

See https://github.com/IllDepence/unarXive for the source code which has been used for creating the data set.</dc:description>
  <dc:identifier>https://zenodo.org/record/2609187</dc:identifier>
  <dc:identifier>10.5281/zenodo.2609187</dc:identifier>
  <dc:identifier>oai:zenodo.org:2609187</dc:identifier>
  <dc:relation>url:http://ceur-ws.org/Vol-2345/paper2.pdf</dc:relation>
  <dc:relation>doi:10.5281/zenodo.2553522</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/bibliometrics</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/natural-language-processing</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/scholarly-data</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:subject>scholarly data</dc:subject>
  <dc:subject>citations</dc:subject>
  <dc:subject>papers</dc:subject>
  <dc:subject>arXiv.org</dc:subject>
  <dc:subject>digital libraries</dc:subject>
  <dc:subject>dataset</dc:subject>
  <dc:title>Bibliometric-Enhanced arXiv: A Data Set for Paper-Based and Citation-Based Tasks</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
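
The export above can be read programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and an abbreviated copy of the record without its XML declaration; the names `parse_dc`, `find_doi`, and `RECORD` are hypothetical helpers introduced for illustration and are not part of Zenodo or of the data set's tooling.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core elements, taken from the export above.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Abbreviated copy of the record (assumption: only a few fields are kept).
RECORD = """<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
  <dc:creator>Saier, Tarek</dc:creator>
  <dc:creator>Färber, Michael</dc:creator>
  <dc:date>2019-02-01</dc:date>
  <dc:identifier>https://zenodo.org/record/2609187</dc:identifier>
  <dc:identifier>10.5281/zenodo.2609187</dc:identifier>
  <dc:identifier>oai:zenodo.org:2609187</dc:identifier>
  <dc:type>dataset</dc:type>
</oai_dc:dc>"""


def parse_dc(xml_text):
    """Collect Dublin Core elements into a dict mapping name -> list of values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root:
        # ElementTree expands tags to the form "{namespace-uri}localname".
        namespace, _, name = element.tag[1:].partition("}")
        if namespace == DC_NS:
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


def find_doi(fields):
    """Return the first identifier that looks like a bare DOI, or None."""
    for value in fields.get("identifier", []):
        if value.startswith("10."):
            return value
    return None


fields = parse_dc(RECORD)
doi = find_doi(fields)
```

Repeated elements such as `dc:creator` and `dc:identifier` are kept as lists because Dublin Core allows each element to occur more than once.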
                  All versions   This version
Views             2,550          450
Downloads         23,983         11,628
Data volume       490.3 TB       266.2 TB
Unique views      2,044          397
Unique downloads  2,893          889
