Published December 15, 2021 | Version v4
Dataset
Open
Text of Wikisource pages of German magazine 'Die Gartenlaube'
Creators
Description
Text of all Gartenlaube pages transcribed in German Wikisource. The text was parsed on 2021-12-13, and the output is combined into separate JSON files, one file per volume, starting with 1853 and ending with 1899. All 47 JSON files are compressed into one *.tar.xz file.
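Unpacking the archive could look like the following minimal sketch. It uses only the Python standard library; the function name `extract_volumes` and the archive and directory names in the usage comment are hypothetical, since the record does not state the actual archive file name.

```python
import tarfile
from pathlib import Path


def extract_volumes(archive_path, target_dir):
    """Unpack a .tar.xz archive and return the paths of the extracted JSON files.

    Hypothetical helper: it assumes the archive contains the per-volume
    JSON files as regular files whose names end in ".json".
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:xz") as archive:
        # Restrict extraction to regular files that look like JSON volumes.
        members = [
            member
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".json")
        ]
        archive.extractall(path=target, members=members)
    return sorted(target.rglob("*.json"))


# Usage (archive name is a placeholder, not taken from the record):
# volumes = extract_volumes("gartenlaube.tar.xz", "gartenlaube_json")
# print(len(volumes))
```

Note that a file name such as the one listed under Files contains a colon, which some file systems (for example NTFS on Windows) do not accept in file names.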
The structure of each JSON file looks like this:
[{"pageid" : {PAGEID},
"title" : {PAGETITLE},
"lastrevid" : {REVISIONID},
"proofread" : {{JSON_OBJECT_Proofread_Status}},
"html" : {HTML_OUTPUT},
"wikitext": {WIKI_MARKUP},
"plaintxt": {mwparserfromhell.parse(WIKI_MARKUP).strip_code()}
}]
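Reading one volume file could be sketched as follows. This assumes, based on the template above, that each file holds a JSON array of page objects with the keys `title` and `plaintxt`; the helper names `load_pages` and `page_texts` are hypothetical, and the value types of the individual fields are not specified in this record.

```python
import json


def load_pages(json_path):
    """Load one volume file and return its list of page records.

    Assumption: the top-level JSON value is an array of objects,
    as the template in the description suggests.
    """
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


def page_texts(pages):
    """Yield (title, plain text) pairs for every page record."""
    for page in pages:
        yield page["title"], page["plaintxt"]


# Usage (path is a placeholder for one extracted volume file):
# pages = load_pages("volume_1895.json")
# for title, text in page_texts(pages):
#     print(title, len(text))
```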
Files
GartenlaubeSeitenText_Kategorie:Die Gartenlaube (1895)_1639447433.json
Additional details
Related works
- Is derived from
- Software: https://github.com/DieDatenlaube/DieDatenlaube/blob/master/Get_Gartenlaube_SeitenText.ipynb (URL)
- Is documented by
- Other: https://diedatenlaube.github.io/Get_Gartenlaube_SeitenText (URL)