Published May 26, 2021 | Version v1
Dataset Open

Long document similarity dataset, Wikipedia excerptions for video games collections

Description

Video games-related articles extracted from Wikipedia.

For all articles, the figures and tables have been filtered out, as well as the categories and "see also" sections.

The article structure, and particularly the sub-titles and paragraphs are kept in these datasets

Video games

The Wikipedia video games dataset consists of 21,935 articles reviewing video games from all genres and consoles. Each article may consist of a different combination of sections, including summary, gameplay, plot, production, etc. Examples for ground-truth expert-based recommendations are:

  • Grand Theft Auto - Mafia
  • Burnout Paradise - Forza Horizon 3

Files

video_games.txt

Files (178.6 MB)

Name Size Download all
md5:a1db0c906d2d31fc359c234ec57c6929
178.6 MB Preview Download