Published January 21, 2026 | Version v3
Dataset Open

The Digital Stoic Library

  • 1. Iowa State University

Description

 Project Description: The Digital Stoic Library

This dataset contains a computationally enriched edition of three foundational texts of Roman Stoicism: Epictetus’s Enchiridion, Epictetus’s Discourses, and Marcus Aurelius’s Meditations. The data was produced with a novel "Deep Matching" framework that links the raw Ancient Greek source material to a technical English philosophical translation.

The dataset is provided in a cleaned, machine-readable JSON format, suitable for immediate use in thematic lexical analysis, natural language processing (NLP), or educational application development.

Methodology

The data was generated through a four-stage pipeline:

  1. Source Acquisition: Greek text (edition perseus-grc2) was harvested from the Scaife Viewer CTS API. Passage text was extracted with .itertext() so that internal XML markers (such as line and page breaks) did not truncate the primary source text.

  2. Contextual Translation: English translations and commentary were generated using the Gemini 3 Pro reasoning model, employing a sliding context window to maintain narrative continuity across chapter boundaries.

  3. Lexical Lemmatization: Passages were enriched with a thematic_lexicon to map core Stoic technical terms (e.g., prohairesis, logos, eph’ hēmin) back to their original Greek lemmas, regardless of grammatical inflection.

  4. Refinement & Audit: Automated post-processing removed model artifacts, followed by a bit-for-bit audit against the Scaife Viewer API to verify that the final Greek strings match the authoritative primary source exactly.
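The truncation problem mentioned in stage 1 and the exact-match check in stage 4 can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree rather than whatever parser the pipeline actually used, and the XML fragment, the placement of the line-break marker, and the function names are all assumptions made for illustration. It is not the project's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment: the <lb/> marker position is invented
# for illustration and is not taken from the Scaife Viewer response.
SAMPLE = '<p>τῶν ὄντων τὰ μέν ἐστιν<lb n="2"/> ἐφ᾽ ἡμῖν, τὰ δὲ οὐκ ἐφ᾽ ἡμῖν.</p>'


def naive_text(xml_fragment: str) -> str:
    """Read only .text, which stops at the first child element."""
    root = ET.fromstring(xml_fragment)
    return root.text or ""


def passage_text(xml_fragment: str) -> str:
    """Join every text node, including text that follows inner markers."""
    root = ET.fromstring(xml_fragment)
    return "".join(root.itertext())


def matches_source(final_greek: str, source_greek: str) -> bool:
    """Exact comparison of the encoded bytes, with no normalization applied."""
    return final_greek.encode("utf-8") == source_greek.encode("utf-8")


if __name__ == "__main__":
    print(naive_text(SAMPLE))    # stops before the <lb/> marker
    print(passage_text(SAMPLE))  # full passage
```

Comparing encoded bytes is deliberately strict: two strings that look identical but differ in Unicode normalization form would fail the check, which is what an exact-match audit would be expected to flag.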

Contents

  • discourses_final_clean.json: The complete Discourses of Epictetus (Books 1–4, plus Arrian's Preface) with verified Greek text, English translation, scholarly notes, and thematic lexical tags.

  • enchiridion_final_clean.json: The complete Enchiridion (53 chapters) with verified Greek text, English translation, Stoic notes, and thematic lexical tags.

  • meditations_final_clean.json: The complete Meditations (12 books) with verified Greek text, reflective English translation, commentary, and thematic lexical tags.

Use Cases

  • Thematic Research: Quantitative analysis of technical Stoic vocabulary distribution across different authors and works.

  • Educational Tools: Development of interactive readers that highlight the relationship between original Greek and modern translation through the integrated thematic lexicon.
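As a starting point for the thematic research use case, the following sketch tallies lexicon terms in one file. The description names a thematic_lexicon field but does not document the JSON schema, so the code assumes nothing about the top-level layout and simply searches for that key at any depth. It further assumes the value is either a list of term strings or a mapping whose keys are the terms. Both shapes and all function names are hypothetical and should be adjusted after inspecting the actual files.

```python
import json
from collections import Counter
from pathlib import Path

# Field name taken from the dataset description; its value shape is assumed.
LEXICON_KEY = "thematic_lexicon"


def iter_lexicon_values(node):
    """Yield every value stored under LEXICON_KEY, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == LEXICON_KEY:
                yield value
            else:
                yield from iter_lexicon_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_lexicon_values(item)


def count_terms(data) -> Counter:
    """Count terms across all lexicon entries found in the parsed JSON."""
    counts = Counter()
    for value in iter_lexicon_values(data):
        if isinstance(value, dict):
            # Assumed shape: {"term": ...}; the keys are treated as terms.
            counts.update(value.keys())
        elif isinstance(value, list):
            # Assumed shape: ["term", ...]; non-string items are skipped.
            counts.update(t for t in value if isinstance(t, str))
    return counts


def count_terms_in_file(path) -> Counter:
    """Load one of the JSON files and count its lexicon terms."""
    with Path(path).open(encoding="utf-8") as fh:
        return count_terms(json.load(fh))
```

Calling count_terms_in_file on each of the three files and comparing the resulting counters would give a first view of how vocabulary is distributed across works.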

Files

Files (2.7 MB)

Checksum Size
md5:eb5058ded184f5b2cc6e8fcdeaee7648 1.5 MB
md5:eb370cfc8c1ff0cd6063a4880a3d6454 151.5 kB
md5:1c1d1c73fde113550b49244af631d496 1.0 MB
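
After downloading, a file can be checked against its listed MD5 checksum. The sketch below uses Python's standard hashlib; the function names are illustrative, and because the listing above does not pair checksums with file names, the caller has to supply the pairing.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed: str) -> bool:
    """Compare a file's digest with a listed value such as 'md5:<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

MD5 is adequate here for detecting a corrupted or incomplete download; it is not a guarantee against deliberate tampering.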

Additional details

Dates

Created
2026-01-21