There is a newer version of the record available.

Published January 17, 2026 | Version v2
Dataset Open

The Digital Stoic Library

  • 1. Iowa State University

Description

This dataset contains a computationally enriched edition of two foundational texts of Roman Stoicism: Epictetus’s Enchiridion and Marcus Aurelius’s Meditations. The data was produced using a novel "Deep Matching" framework that bridges the gap between raw Ancient Greek source material and technical English philosophical translation.

The dataset is provided in a cleaned, machine-readable JSON format, suitable for immediate use in thematic lexical analysis, natural language processing (NLP), or educational application development.

Methodology

The data was generated through a four-stage pipeline:

Source Acquisition: Greek text (edition perseus-grc2) was harvested from the Scaife Viewer CTS API.

Contextual Translation: English translations and commentary were generated using the Gemini 3 Pro reasoning model, employing a 150-character sliding context window to maintain narrative continuity across chapter boundaries.

Lexical Lemmatization: Passages were enriched using the Classical Language Toolkit (CLTK) to map 45 core Stoic technical terms (e.g., prohairesis, logos, eph’ hēmin) back to their original Greek lemmas, regardless of grammatical inflection.

Refinement: Automated post-processing was applied to remove model artifacts and ensure structural integrity for production environments.
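
As an illustration of stages two and three, the following is a minimal sketch and not the published processing scripts. It assumes that the "sliding context window" means carrying the last 150 characters of preceding text into the next translation request, and that lexical tagging compares lemmatized tokens against a fixed list of term lemmas. The names `with_context`, `tag_terms`, and `TOY_LEMMAS` are hypothetical, and the dictionary lookup merely stands in for a real lemmatizer such as the one CLTK provides.

```python
from typing import Callable, Dict, Iterable, List, Set

CONTEXT_CHARS = 150  # window size stated in the methodology


def with_context(passages: Iterable[str], width: int = CONTEXT_CHARS) -> List[Dict[str, str]]:
    """Pair each passage with up to `width` trailing characters of the text before it."""
    rows = []
    tail = ""  # rolling tail of everything seen so far
    for text in passages:
        rows.append({"context": tail, "text": text})
        # Keep only the last `width` characters so short passages still carry older text.
        tail = " ".join(part for part in (tail, text) if part)[-width:]
    return rows


def tag_terms(tokens: Iterable[str],
              lemmatize: Callable[[str], str],
              term_lemmas: Set[str]) -> List[str]:
    """Return the term lemmas that occur in `tokens`, whatever their inflection."""
    found = {lemmatize(token) for token in tokens}
    return sorted(found & term_lemmas)


# Assumption: a toy lookup table standing in for a real Greek lemmatizer.
TOY_LEMMAS = {
    "λόγου": "λόγος",
    "λόγον": "λόγος",
    "προαιρέσεως": "προαίρεσις",
}


def toy_lemmatize(token: str) -> str:
    """Fall back to the surface form when the token is not in the toy table."""
    return TOY_LEMMAS.get(token, token)


rows = with_context(["a" * 200, "bbb", "c"])
tags = tag_terms(["τοῦ", "λόγου", "προαιρέσεως"], toy_lemmatize, {"λόγος", "προαίρεσις"})
```

Multi-word expressions such as eph’ hēmin would need phrase-level matching, which this token-level sketch does not attempt.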

Contents

enchiridion_final_clean.json: The complete Enchiridion (53 chapters) with Greek text, English translation, Stoic notes, and thematic lexical tags.

meditations_final_clean.json: The complete Meditations (12 books) with Greek text, reflective English translation, commentary, and thematic lexical tags.

Processing Scripts: The suite of Python scripts used for fetching, translating, tagging, and cleaning the data.

Use Cases

Thematic Research: Quantitative analysis of technical Stoic vocabulary distribution.
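
A vocabulary distribution of this kind could be computed along the following lines. This is a sketch under assumptions: the record does not document the JSON schema, so the `tags` field name, the `chapter` key, and the sample records are hypothetical placeholders that would need to be adjusted after inspecting the actual files.

```python
import json
from collections import Counter
from typing import Iterable


def tag_distribution(records: Iterable[dict], tag_field: str = "tags") -> Counter:
    """Count how often each thematic tag appears across all records."""
    counts: Counter = Counter()
    for record in records:
        # Records without the (assumed) tag field contribute nothing.
        counts.update(record.get(tag_field, []))
    return counts


# Assumption: an inline sample shaped like a list of passage records.
sample = json.loads(
    '[{"chapter": 1, "tags": ["prohairesis", "eph_hemin"]},'
    ' {"chapter": 2, "tags": ["prohairesis"]},'
    ' {"chapter": 3}]'
)
dist = tag_distribution(sample)
```

With the real data, `sample` would be replaced by the result of `json.load` on one of the dataset files, provided its top level is a list of records.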

Educational Tools: Development of interactive readers that highlight the relationship between original Greek and modern translation.

Citation

Moss, W. (2026). Data Paper: The Digital Stoic Library. Knowledge Commons. https://doi.org/10.5281/zenodo.18273320

Files

discourses_final_clean.json

Files (1.7 MB)

Name Size
md5:a710931daa8abf5aa3e851da4d25ef79 1.4 MB
md5:ed262e76a8aec91c2539d59ceb9e0199 141.6 kB
md5:cb11d76439143409e1b79233958e8d51 148.0 kB

Additional details

Dates

Created
2026-01-15