Published January 4, 2017 | Version v1

English Wikipedia

Authors/Creators

  • 1. Language Technology Group, TU Darmstadt, Germany

Description

This text corpus consists of English Wikipedia text extracted from the Wikipedia dump of 26 September 2015 using the WikiExtractor tool (https://github.com/attardi/wikiextractor).

Files

Total size: 4.5 GB

Size: 4.5 GB
md5: c1fefd53b73798459e8623e2b965f75f
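
Because the download is large, it can be worth checking the file against the MD5 digest listed above before using it. The following is a minimal sketch of such a check using only the Python standard library. The helper names (md5_of_file, matches_published_md5) are illustrative assumptions, not part of this record, and the listing does not state the file name, so the path must be supplied by the reader.

```python
import hashlib

# MD5 digest copied from the file listing of this record.
EXPECTED_MD5 = "c1fefd53b73798459e8623e2b965f75f"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that a multi-gigabyte download
    does not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected digest."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_published_md5 with the local path of the downloaded file returns True only when the digest matches; a False result suggests an incomplete or corrupted download.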