The Digital Stoic Library
Description
Project Description: The Digital Stoic Library
This dataset contains a computationally enriched edition of three foundational texts of Roman Stoicism: Epictetus’s Enchiridion, Epictetus’s Discourses, and Marcus Aurelius’s Meditations. The data was produced using a novel "Deep Matching" framework that bridges the gap between raw Ancient Greek source material and technical English philosophical translation.
The dataset is provided in a cleaned, machine-readable JSON format, suitable for immediate use in thematic lexical analysis, natural language processing (NLP), or educational application development.
Methodology
The data was generated through a four-stage pipeline:
- Source Acquisition: Greek text (edition perseus-grc2) was harvested from the Scaife Viewer CTS API. A robust parsing methodology using .itertext() was employed to ensure that internal XML markers (such as line and page breaks) did not truncate the primary source text.
- Contextual Translation: English translations and commentary were generated using the Gemini 3 Pro reasoning model, employing a sliding context window to maintain narrative continuity across chapter boundaries.
- Lexical Lemmatization: Passages were enriched with a thematic_lexicon to map core Stoic technical terms (e.g., prohairesis, logos, eph’ hēmin) back to their original Greek lemmas, regardless of grammatical inflection.
- Refinement & Audit: Automated post-processing removed model artifacts, followed by a bit-for-bit audit against the Scaife Viewer API to verify that the final Greek strings match the authoritative primary source exactly.
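The first and last stages can be illustrated with a minimal sketch. It is not the project's actual pipeline code: the XML fragment, the function names, and the choice of Python's standard xml.etree.ElementTree are assumptions made for illustration only. It shows why reading an element's .text alone stops at an inline page-break marker, how .itertext() recovers the full passage, and how an exact string comparison can report the first point of divergence.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only (not copied from the Scaife Viewer API):
# a page-break marker sits in the middle of the sentence.
SAMPLE = "<p>τῶν ὄντων τὰ μέν ἐστιν<pb n='2'/> ἐφ’ ἡμῖν, τὰ δὲ οὐκ ἐφ’ ἡμῖν.</p>"


def naive_text(xml_fragment: str) -> str:
    """Read only .text, which ends at the first child element."""
    return ET.fromstring(xml_fragment).text or ""


def full_text(xml_fragment: str) -> str:
    """Concatenate every text and tail node so inline markers do not cut the passage."""
    return "".join(ET.fromstring(xml_fragment).itertext())


def first_mismatch(candidate: str, reference: str):
    """Return the index of the first differing character, or None if the strings are identical."""
    if candidate == reference:
        return None
    for index, (a, b) in enumerate(zip(candidate, reference)):
        if a != b:
            return index
    # One string is a strict prefix of the other.
    return min(len(candidate), len(reference))


if __name__ == "__main__":
    print(naive_text(SAMPLE))   # text before the <pb/> marker only
    print(full_text(SAMPLE))    # the whole sentence
    print(first_mismatch(full_text(SAMPLE), full_text(SAMPLE)))
```

The comparison here is deliberately strict: no whitespace or Unicode normalization is applied, in keeping with the description of the audit as an exact match. Whether the real audit normalizes anything beforehand is not stated in this record.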
Contents
- discourses_final_clean.json: The complete Discourses of Epictetus (Books 1–4, plus Arrian's Preface) with verified Greek text, English translation, scholarly notes, and thematic lexical tags.
- enchiridion_final_clean.json: The complete Enchiridion (53 chapters) with verified Greek text, English translation, Stoic notes, and thematic lexical tags.
- meditations_final_clean.json: The complete Meditations (12 books) with verified Greek text, reflective English translation, commentary, and thematic lexical tags.
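Because this description does not document the internal JSON schema, a reasonable first step is to inspect a file's top-level structure before writing any analysis code. The sketch below uses only the Python standard library; the helper name describe_json is hypothetical, and it makes no assumption about field names.

```python
import json
from pathlib import Path


def describe_json(path):
    """Report the top-level shape of a JSON file without assuming its schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Object at the top level: list up to ten of its keys.
        return {"type": "object", "size": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        # Array at the top level: show the keys of the first record, if it is an object.
        first = data[0] if data else None
        keys = sorted(first) if isinstance(first, dict) else []
        return {"type": "array", "size": len(data), "keys": keys}
    return {"type": type(data).__name__, "size": 1, "keys": []}


if __name__ == "__main__":
    target = Path("enchiridion_final_clean.json")
    if target.exists():
        print(describe_json(target))
```

Reading with an explicit encoding="utf-8" matters here, since the files contain polytonic Greek and the platform default encoding may differ.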
Use Cases
- Thematic Research: Quantitative analysis of technical Stoic vocabulary distribution across different authors and works.
- Educational Tools: Development of interactive readers that highlight the relationship between original Greek and modern translation through the integrated thematic lexicon.
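As a rough illustration of the thematic research use case, the following sketch counts how often each lemma appears across a set of passages. The record structure is an assumption: it supposes each passage is an object with a thematic_lexicon list whose entries are either plain lemma strings or objects with a "lemma" key. Only the thematic_lexicon name comes from the description above; the "lemma" key, the "id" field, and the sample passages are hypothetical and should be adjusted to the actual files.

```python
from collections import Counter


def lemma_counts(passages, field="thematic_lexicon", lemma_key="lemma"):
    """Count lemma occurrences across passages (field layout is assumed, not documented)."""
    counts = Counter()
    for passage in passages:
        for entry in passage.get(field, []):
            # Accept either {"lemma": "..."} objects or bare lemma strings.
            lemma = entry.get(lemma_key) if isinstance(entry, dict) else entry
            if lemma:
                counts[lemma] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical in-memory records standing in for passages loaded from the JSON files.
    sample = [
        {"id": "a", "thematic_lexicon": [{"lemma": "προαίρεσις"}, {"lemma": "λόγος"}]},
        {"id": "b", "thematic_lexicon": [{"lemma": "λόγος"}]},
        {"id": "c"},  # a passage with no lexicon tags
    ]
    for lemma, n in lemma_counts(sample).most_common():
        print(lemma, n)
```

Running the same function separately on passages from each work would give per-author distributions that can then be compared.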
Files
discourses_final_clean.json
Additional details
Dates
- Created: 2026-01-21