Published June 25, 2019 | Version v1
Other Open

German Word Embeddings for ShiCo based on historic newspapers

Creators

  • 1. Universität Stuttgart

Description

We provide word embedding models computed on historic German newspapers. Each model covers a time span of 10 years and can be used with ShiCo, a visualization tool for word embeddings. We provide models for three different corpora, each with a link to its ShiCo demo:

  • SBB (State Library of Berlin): Newspaper collection from Germany from 1872 to 1912 (demo).
  • Chronicling America: German-language newspaper pages published in the United States from 1840 to 1908 (demo).
  • Europeana: German-language newspapers published in Europe from 1840 to 1912 (demo).

Each model requires a configuration for both the frontend and the backend (see config.shico.tar.gz). To set up a ShiCo instance, either follow the description on the ShiCo GitHub page or follow the instructions for running ShiCo with Docker in the README.docker.txt file.

Files

README.docker.txt

Files (42.7 GB)

Checksum                              Size
md5:c3954dab3d0dc85261dba9001d9ee1bd  636 Bytes
md5:2fc14eedc99952fafe6421ffcb2dcfd4  2.7 kB
md5:2ceb2f6a9611d70aeaf0f44f70ca530f  5.5 GB
md5:905a13a06f87e5c58e008d9860b057d8  24.2 GB
md5:0677b7cb214fbc5a90e58d8c5f7f59fe  13.0 GB
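
Because the larger archives are several gigabytes each, it may be worth checking a download against its listed md5 value before unpacking it. The following is a minimal sketch, not part of the record itself: the function names `md5_of_file` and `matches_checksum` are hypothetical, and the file path and checksum are supplied by the user on the command line, since this listing does not state which checksum belongs to which file name.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    # Drop an optional 'md5:' prefix and normalise case before comparing.
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    # Usage (both arguments are supplied by the user):
    #   python verify_md5.py <downloaded-file> md5:<checksum-from-the-list>
    if len(sys.argv) != 3:
        sys.exit("usage: verify_md5.py <file> <md5:checksum>")
    ok = matches_checksum(sys.argv[1], sys.argv[2])
    print("OK" if ok else "MISMATCH")
    sys.exit(0 if ok else 1)
```

Reading in chunks rather than loading the whole file matters here, as the largest archive is listed at 24.2 GB.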