Published February 7, 2020 | Version v1
Dataset Open

Webis Abstractive Snippet Corpus 2020

  • 1. Paderborn University
  • 2. Leipzig University
  • 3. Martin-Luther-Universität Halle-Wittenberg
  • 4. Bauhaus-Universität Weimar

Description

The Webis Abstractive Snippet Corpus 2020 (Webis-Snippet-20) comprises four abstractive snippet datasets built from ClueWeb09, ClueWeb12, and DMOZ descriptions. In total, more than 10 million <webpage, abstractive snippet> pairs and 3.5 million <query, webpage, abstractive snippet> pairs were collected.

Files

released-snippet-ac-qb.zip

Files (11.2 GB)

Checksum Size
md5:41696d93df837a53f871c0e402eb0a22 4.3 GB
md5:f36a9c50a117d5bee91831b4a23c7bb0 6.6 GB
md5:ed30087f82080000ac5e67b23a8d8c98 23.2 MB
md5:511694a7c9e4794364eaff89838e0039 244.4 MB
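
Because the files are large, it can be worth checking a finished download against the MD5 checksums listed above. The following is a minimal sketch, not part of the official release: the helper names `md5_of_file` and `verify_downloads` are hypothetical, and since this page does not show which file name belongs to which checksum, the check only tests whether a file's digest appears anywhere in the listed set.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this page.
# The page does not pair them with file names, so matching is by digest only.
EXPECTED_MD5 = {
    "41696d93df837a53f871c0e402eb0a22",
    "f36a9c50a117d5bee91831b4a23c7bb0",
    "ed30087f82080000ac5e67b23a8d8c98",
    "511694a7c9e4794364eaff89838e0039",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks.

    Streaming keeps memory use flat even for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each regular file in `directory` to True if its MD5 is listed."""
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            results[path.name] = md5_of_file(path) in EXPECTED_MD5
    return results
```

Calling `verify_downloads` on the directory that holds the downloaded archives returns a dictionary from file name to a boolean; any `False` entry points to a file that is either incomplete, corrupted, or not part of this record.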