Published October 22, 2023 | Version 1.0.0
Dataset
Open
Webis-Context-SciSumm-2023
Description
Webis-Context-SciSumm-2023 is a large-scale dataset for studying contextualized summarization of scientific papers. The corpus contains approximately 540K computer science papers encompassing 4.6M citation texts, together with the information from the cited papers that is relevant to each citation. The subset provided here (approximately 25K papers) includes abstractive summaries of that relevant content generated by LLaMA (V1) and Vicuna (13B) models.
Because of computational constraints, summaries for the complete dataset are not yet included; they will be added once generation is complete.
Files
context-scisumm-subset.zip
Total size: 10.1 GB
MD5 checksum | Size |
---|---|
md5:aecb4f141dd50698d587ad70dea3c873 | 240.6 MB |
md5:54a52ca41b56fcc91914f6498849f92d | 9.9 GB |
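
The MD5 values listed above can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_checksum` are illustrative and not part of the dataset. The listing does not state which checksum belongs to which file, so the pairing of file and checksum is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value.

    `expected` may be given with or without the 'md5:' prefix used
    in the file listing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:54a52ca41b56fcc91914f6498849f92d")` returns `True` only if the file at `local_path` (a hypothetical path to one of the downloaded files) has exactly that digest.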