Published May 11, 2024 | Version v6
Dataset Open

YidTakNL corpus: 18th-19th centuries regulations of the High German Jewish community in Holland

  • 1. Department of Jewish Studies, Yiddish Culture, Language and Literature, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  • 2. Department of Business-Society Management, Rotterdam School of Management, Erasmus University, Rotterdam, the Netherlands

Description

The YidTakNL corpus comprises Yiddish regulations and announcements issued by Amsterdam's Ashkenazi Jewish community between 1708 and 1846. All items relate to social, political and administrative aspects of community life. The corpus was produced by Ronny Reshef as part of an ongoing PhD dissertation at Erasmus University Rotterdam (2021-2026). The bibliography in the Excel sheet "Zenodo statutes for YidTakNL" is based on Mirjam Gutschow's extensive bibliography, Inventory of Yiddish publications from the Netherlands: c. 1650 - c. 1950 (Leiden: Brill, 2007). Unlike Gutschow's inventory, however, it covers only regulations written and published by community leaders or charitable organisations for the benefit of the community. Gutschow's entries were rechecked and corrected, and four items were added, resulting in 64 sources. Most of the listed items are fully available online (references are provided).

Texts from the corpus were used to train a Transkribus PyLaia HTR model, YidTakNL.1, dedicated to Yiddish texts printed in the Vaybertaytsh typeface as used in Amsterdam in the 18th century. The model enables researchers to decipher and read this well-known typeface. Vaybertaytsh, an Ashkenazi semi-cursive typeface (also known as Tkhine-ksav, Tsene-(u)rene-ksav, mashket/mesheyt, taytsh, vayberksav, ivre-taytsh and kleyn-taytsh), was widely used for printing Yiddish documents across Europe and the Yiddish-speaking world between the 16th and the beginning of the 19th century. The Dybbuk for Yiddish Handwriting served as the base model. The latest version of YidTakNL.1 is based on a training set size of 26331, and the character error rate (CER) is 0.10%. The model was based on ground-truth only [10% of the validation data], and will be made public during 2023.
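For readers unfamiliar with the metric, a CER of 0.10% corresponds to roughly one character error per 1,000 characters of reference text. The sketch below shows the conventional definition of CER (character-level edit distance divided by the length of the ground-truth text). It is an illustration only: the function names and the sample strings are made up here, and it is an assumption that Transkribus computes its reported figure in exactly this way.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the hypothesis
                current[j - 1] + 1,      # extra character in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance relative to the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example strings, not taken from the corpus.
ground_truth = "takones fun der kehile"
recognised = "takonez fun der kehile"
print(f"CER: {cer(ground_truth, recognised):.2%}")
```

The comparison is made per Unicode code point, so texts should be normalised in the same way on both sides before they are compared.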

Two preliminary editions of Yiddish takones (takanot) are already published here: Takanot 1711 - YidTakNL and Takanot 1737 - YidTakNL. These will be turned into full editions, and other texts will follow. The texts for these editions were rendered using Transkribus, applying the Yiddish-Amsterdam-Baseline model for layout and the TakYidNL model for text recognition.

For additional information, see:

Reshef, R., & Gutschow, M. (2024). Text Recognition Model for Yiddish in Vaybertaytsh Typeface, Based on Community Regulations. Journal of Open Humanities Data, 10(1), 35. DOI: https://doi.org/10.5334/johd.194

Reshef, R., & Gutschow, M. (2023). YidTakNL Corpus: 18th–19th Centuries Regulations of the High German Jewish Community in Holland. Journal of Open Humanities Data, 9: 29, pp. 1–6. DOI: https://doi.org/10.5334/johd.161

For the complete dataset, with transcriptions and images of the texts used for training the Vaybertaytsh model on Transkribus, see: https://doi.org/10.6084/m9.figshare.25422844.

Files

Takanot 1737 - YidNed 291 & YidNed 291.[1] - YidTakNL.pdf

Files (7.2 MB)

Checksum Size
md5:073d3bfb4117496657ece3dfccc6d87e 60.9 kB
md5:b3d268d8330b1205516a62e57d2cc007 202.2 kB
md5:877db05fdea3af2f8ec6f693d44cb7be 64.5 kB
md5:d0cc4d2b205928fd541af2fdfbc59881 209.8 kB
md5:cc0f1de7ca799af80181343f7758bfb2 57.5 kB
md5:75530024148085680e31bb3ebc0679af 310.8 kB
md5:20cda2566e3a93784b2a7a8fa5215b34 69.3 kB
md5:21e0adc363482c7f196b617be3a8635a 229.6 kB
md5:9682ea5507d593233032e000ca7db11e 64.8 kB
md5:a4a0502973469bbd46c8dc6827f466b1 225.6 kB
md5:4ac2523c7c5c20ec1469cce8ec7f53af 600.5 kB
md5:546ffeb86e094d9b62e08b6e76681139 309.3 kB
md5:6f26c88f36fed13b579c41d43449a7e6 60.5 kB
md5:36261ab35e0b3b6f17354e18b26d5873 239.7 kB
md5:286518784e3deeccb7ca8909e0198403 290.7 kB
md5:074d90a6d57f5607e4b11e6c759a513d 399.6 kB
md5:81d9a0b12fcf41fecb30f007aac4cbf9 437.9 kB
md5:74614bd75088d27df769f49782d252cc 312.0 kB
md5:bcabd2b8202ac37eced909560b0aafb3 122.3 kB
md5:09f8f92b80481d532906a90070a3f37c 501.1 kB
md5:f6db3777ebb3a66fed573e848b1ee812 188.5 kB
md5:04e1f5f6ad04bb2d687e1a955c50eb93 712.1 kB
md5:378f792adb6b293b492abf2544d2bd24 664.9 kB
md5:c6e0c3c9d017927b4f7c68b9d102bb98 526.0 kB
md5:3d168dd893767aca9d731c1629be4c88 60.0 kB
md5:127c5f929346c9ece1aeeef540071f1f 196.6 kB
md5:13282d1e90ef5639fb70359faa491a97 94.7 kB
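
The md5 values in the file list can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard library; the helper names and the local file name in the usage comment are hypothetical, and the checksum has to be paired with the correct file by hand.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a checksum written as 'md5:<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (hypothetical local file name; checksum copied from the list above):
# ok = matches_listed_checksum("downloaded_file.pdf",
#                              "md5:b3d268d8330b1205516a62e57d2cc007")
# print("intact" if ok else "checksum mismatch")
```

MD5 is adequate here for detecting accidental corruption during download; it is not meant as protection against deliberate tampering.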