Dataset Open Access

Text of Wikisource pages of German magazine 'Die Gartenlaube'

Erlinger, Christian

Text of all Gartenlaube pages transcribed in German Wikisource. The text was parsed on 2021-12-13; the output is combined into separate JSON files, one file per volume, starting with 1853 and ending with 1899. All 47 JSON files are compressed into one *.tar.xz file.

The syntax of the JSON looks like:

  [{"pageid"    : {PAGEID},
    "title"     : {PAGETITLE},
    "lastrevid" : {REVISIONID},
    "proofread" : {{JSON_OBJECT_Proofread_Status}},
    "html"      : {HTML_OUTPUT},
    "wikitext"  : {WIKI_MARKUP},
    "plaintxt"  : {mwparserfromhell.parse(WIKI_MARKUP).strip_code()}
  }]
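
As a rough illustration of how the record layout described above might be consumed, the following Python sketch streams the page records straight out of the archive without unpacking it to disk. It assumes that every JSON member of the archive holds an array of page objects with the keys listed above, and that the member names end in ".json"; the names of the files inside the archive are not stated in this description. The function name iter_pages is hypothetical.

```python
import json
import tarfile
from typing import Iterator


def iter_pages(archive_path: str) -> Iterator[dict]:
    """Yield one dict per page from every JSON member of the archive."""
    with tarfile.open(archive_path, mode="r:xz") as archive:
        for member in archive:
            # Skip directories and anything that is not a JSON file
            # (assumption: the volume files carry a ".json" suffix).
            if not member.isfile() or not member.name.endswith(".json"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                # Assumption: each file is a JSON array of page objects.
                pages = json.load(handle)
            for page in pages:
                yield page


# Possible usage, once the archive has been downloaded:
#   path = "GartenlaubeSeitenText_Kategorie:Die Gartenlaube.tar.xz"
#   for page in iter_pages(path):
#       print(page["pageid"], page["title"], len(page["plaintxt"]))
```

Reading through tarfile keeps only one volume in memory at a time, which may matter because the uncompressed JSON is considerably larger than the 105.1 MB archive.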

Files (105.1 MB)

Name: GartenlaubeSeitenText_Kategorie:Die Gartenlaube.tar.xz
Size: 105.1 MB
md5: 05bff99558d17f29e7c569e2c78bcb0d
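
The md5 checksum listed for the archive can be used to check a download for corruption. A minimal Python sketch follows, assuming the archive was saved under its listed name in the current directory; the helper name md5_of_file is hypothetical.

```python
import hashlib

# Checksum copied from the file listing of this record.
EXPECTED_MD5 = "05bff99558d17f29e7c569e2c78bcb0d"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read chunk by chunk so the whole archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Possible usage (the local path is an assumption):
#   path = "GartenlaubeSeitenText_Kategorie:Die Gartenlaube.tar.xz"
#   print(md5_of_file(path) == EXPECTED_MD5)
```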
