ArmEpiC – Armenian Epigraphic Corpus (ArtsakhEpiC Sub-Corpus, v1.0)

Tamrazyan, Hamest; Hovhannisyan, Gayane; Harutyunyan, Arsen; Boros, Emanuela

doi:10.5281/zenodo.18206503

Published January 10, 2026 | Version v2

Dataset Open

1. Formation Continue UNIL-EPFL
2. Dragomanov Ukrainian State University
3. Yerevan Brusov State University of Languages and Social Sciences
4. Institute of Archeology and Ethnography of the NAS RA
5. Mesrop Mashtots Institute of Ancient Manuscripts
6. École Polytechnique Fédérale de Lausanne

Contributors

Other:

Cornamusaz, Emile¹

1. École Polytechnique Fédérale de Lausanne

ArmEpiC: Methodology and Data Description

Abstract

ArmEpiC (Armenian Epigraphic Corpus) is a digital scholarly dataset comprising diplomatically transcribed Armenian lapidary inscriptions encoded in TEI/EpiDoc (v9.7), together with a system of authority files designed to preserve epigraphic evidence while enabling analytical interoperability. The dataset is intended for reuse by epigraphers, historians, linguists, and digital heritage researchers requiring transparent, machine-readable epigraphic data.

Scope of the Dataset

The Zenodo deposit includes ten TEI/EpiDoc inscription files, authority files (ListPlace, ListMonument, ListSubMonument, ListMaterial, ListPreservation, ListScript, ListAbbreviationType, ListChronology, ListBibl), this methodology document, a README, and a licensing statement.

Conceptual Separation of Evidence and Interpretation

ArmEpiC enforces a strict separation between epigraphic evidence, editorial observation, and interpretive layers. The diplomatic transcription constitutes the primary evidentiary layer; all analytical and interpretive interventions are explicitly encoded and remain reversible.

Diplomatic Transcription Policy

Original orthography is preserved, lineation follows the stone, and no silent normalization is introduced. Editorial intervention is restricted to explicit expansion of abbreviations, explicit supply of omitted letters, and explicit marking of damage or loss.
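
Because every intervention is marked explicitly, a diplomatic reading and an editorially expanded reading can in principle be derived from the same file. The following Python sketch illustrates the idea. It assumes the standard EpiDoc elements `<expan>`, `<abbr>`, `<ex>`, `<supplied>`, `<gap>`, and `<lb>`; the sample fragment is invented for illustration and is not taken from an ArmEpiC file, and the function name `render` is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented fragment for illustration only; not taken from the corpus.
SAMPLE = (
    '<ab xmlns="http://www.tei-c.org/ns/1.0">'
    '<lb n="1"/><expan><abbr>Ա</abbr><ex>ՍՏՈՒԱ</ex><abbr>Ծ</abbr></expan>'
    ' ՈՂՈՐՄ<supplied reason="omitted">Ե</supplied>Ա'
    '<lb n="2"/>ՅԻՇ<gap reason="lost" quantity="2" unit="character"/>'
    '</ab>'
)


def render(node, expanded):
    """Return the text of `node`, with or without editorial additions."""
    tag = node.tag.replace(TEI, "")
    if tag == "gap":
        # Lost text: one dot per missing character, in both readings.
        out = "[" + "." * int(node.get("quantity", "1")) + "]"
    elif tag == "lb":
        # Lineation follows the stone.
        out = "\n"
    elif tag in ("ex", "supplied") and not expanded:
        # Editorial additions are dropped from the diplomatic reading.
        out = ""
    else:
        out = (node.text or "") + "".join(render(c, expanded) for c in node)
    return out + (node.tail or "")


root = ET.fromstring(SAMPLE)
diplomatic = render(root, expanded=False).strip()
expanded = render(root, expanded=True).strip()
```

Here `diplomatic` keeps only what is on the stone, while `expanded` adds the marked expansions and supplied letters; removing the editorial layer is a matter of ignoring two element types, which is what makes the interventions reversible.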

Graphic Phenomena and Linguistic Structure

Ligatures are treated as graphic phenomena and do not determine linguistic segmentation. Ligatures across word boundaries are encoded graphically while preserving separate lexical units.

Abbreviations and Omitted Letters

A strict distinction is maintained between abbreviations (intentional and conventional) and omitted letters (context-driven loss). Ambiguous cases are flagged rather than silently resolved. Honorific and graphic abbreviations are distinguished analytically via a controlled vocabulary.

Word Segmentation and Lemmatization

Each lexical unit is encoded as an independent word. Lemmatization is an analytical layer supplied in normalized Classical Armenian and does not imply correction of the original spelling.
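
As a rough illustration of how the lemma layer sits beside the attested form without replacing it, the Python sketch below pairs each word's surface text with its lemma. It assumes that words are encoded as TEI `<w>` elements carrying a `lemma` attribute; the sample words, the lemma values, and the function name `word_lemma_pairs` are illustrative assumptions rather than corpus content.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only; lemma values are assumptions.
SAMPLE = (
    '<ab xmlns="http://www.tei-c.org/ns/1.0">'
    '<w lemma="աստուած"><expan><abbr>Ա</abbr><ex>ՍՏՈՒԱ</ex><abbr>Ծ</abbr></expan></w> '
    '<w lemma="խաչ">ԽԱՉՍ</w>'
    '</ab>'
)


def word_lemma_pairs(xml_text):
    """Return (surface form, lemma) for every <w> element in the fragment."""
    root = ET.fromstring(xml_text)
    pairs = []
    for w in root.iterfind(".//tei:w", TEI_NS):
        # itertext() concatenates all text inside <w>, including expansions,
        # and leaves the attested spelling untouched.
        surface = "".join(w.itertext())
        pairs.append((surface, w.get("lemma")))
    return pairs


pairs = word_lemma_pairs(SAMPLE)
```

The surface form is read from the element content and the lemma from the attribute, so an analysis can group by lemma while still reporting the spelling found on the stone.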

Names, Prosopography, and Places

Personal names are encoded structurally without imposing prosopographic identification. Place names are preserved as attested and linked to external authorities via ListPlace.

Dating and Chronology

Dates are recorded as transmitted in the inscription, with Gregorian equivalents supplied as scholarly interpretation. The evidentiary basis of each date is made explicit.
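
To make the split between the transmitted date and its Gregorian interpretation concrete, here is a minimal Python sketch. It assumes, which the description above does not state, that a date is written in Armenian alphabetic numerals and reckoned in the Armenian era. The offset of 551 is the conventional approximation; an Armenian-era year straddles two Gregorian years, so the result is an estimate, not a precise equivalent. All names in the code are hypothetical.

```python
from dataclasses import dataclass

# The 36 letters of the classical alphabet in order: units, tens,
# hundreds, thousands (nine letters each).
LETTERS = "ԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔ"
NUMERAL_VALUES = {ch: (i % 9 + 1) * 10 ** (i // 9) for i, ch in enumerate(LETTERS)}

# Conventional approximation: Gregorian year = Armenian era year + 551.
ARMENIAN_ERA_OFFSET = 551


@dataclass(frozen=True)
class DateReading:
    transmitted: str         # the numeral as it stands in the inscription
    era_year: int            # its numeric value
    gregorian_estimate: int  # scholarly interpretation, kept separate


def parse_armenian_numeral(text):
    """Sum the values of the letters of an additive Armenian numeral."""
    try:
        return sum(NUMERAL_VALUES[ch] for ch in text)
    except KeyError as exc:
        raise ValueError(f"not an Armenian numeral letter: {exc.args[0]!r}") from None


def read_date(transmitted):
    era_year = parse_armenian_numeral(transmitted)
    return DateReading(transmitted, era_year, era_year + ARMENIAN_ERA_OFFSET)


reading = read_date("ՉԺ")  # 700 + 10
```

Keeping `transmitted` and `gregorian_estimate` as separate fields mirrors the policy above: the interpretation can be revised without touching the evidence.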

Functional Classification

Each inscription is assigned a single dominant functional category as a heuristic analytical label.

Translation Strategy

Translations into Modern Armenian and English are provided as interpretive aids, prioritizing semantic accuracy. They do not replace the original text.

Authority Files

Each authority entity is assigned a persistent URN that is immutable once published. Authorities are aligned conceptually with international vocabularies to support interoperability.
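
One way a reuser might check the immutability claim between two releases is sketched below in Python. The URN values use the reserved `urn:example` namespace because the actual ArmEpiC URN scheme is not described here; the regular expression is a simplified reading of the general URN syntax, not a full validator, and the function names are hypothetical.

```python
import re

# Simplified URN shape: "urn:", a namespace identifier of 2 to 32
# alphanumeric or hyphen characters, then a non-empty namespace-specific part.
URN_RE = re.compile(r"urn:[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]:[^\s#?]+")


def is_wellformed_urn(value):
    """Return True if `value` matches the simplified URN shape."""
    return URN_RE.fullmatch(value) is not None


def missing_urns(previous_release, current_release):
    """URNs published earlier that no longer appear in the current release."""
    return sorted(set(previous_release) - set(current_release))


# Hypothetical identifiers, for illustration only.
v1 = {"urn:example:place:0001", "urn:example:place:0002"}
v2 = {"urn:example:place:0001", "urn:example:place:0003"}
dropped = missing_urns(v1, v2)
```

An empty result from `missing_urns` would indicate that no published identifier was withdrawn or renamed between the two releases.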

XML Structure and Validation

Before deposition on Zenodo, all XML files were validated against the official TEI/EpiDoc 9.7 RELAX NG and Schematron schemas using standard XML validation tools. All xml:id values conform to NCName constraints.
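
As a lightweight complement to schema validation, and not a substitute for it, the Python sketch below lists `xml:id` values that break a simplified NCName rule or occur more than once. The pattern is an ASCII-only approximation (the full NCName production also admits many non-ASCII characters), and the sample document and function name are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# ASCII-only approximation of NCName: starts with a letter or underscore,
# continues with letters, digits, '.', '-' or '_', and contains no colon.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# Invented document for illustration only.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="ART0001">'
    '<div xml:id="edition-1"/><div xml:id="1abc"/><div xml:id="edition-1"/>'
    '</TEI>'
)


def check_xml_ids(xml_text):
    """Return (ids failing the simplified NCName rule, ids used more than once)."""
    root = ET.fromstring(xml_text)
    ids = [el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None]
    invalid = [i for i in ids if NCNAME_ASCII.fullmatch(i) is None]
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)
    return invalid, duplicates


invalid, duplicates = check_xml_ids(SAMPLE)
```

In the sample, `1abc` is reported because an NCName cannot begin with a digit, and `edition-1` is reported because identifiers must be unique within a document.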

Licensing and Versioning

The dataset is released under a Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0) license. This Zenodo deposit represents a fixed release; future revisions will receive new DOIs.

Conclusion

ArmEpiC provides a transparent, reversible, and interoperable digital epigraphic dataset grounded in Armenian scholarly tradition and international standards, enabling analytical reuse across disciplines.

The project has been funded by the National Association for Armenian Studies and Research (NAASR) and the Knights of Vartan Fund for Armenian Studies.

*ArmEpiC (Armenian Epigraphic Corpus) is a scholarly research project initiated and curated under the chief editorship of Hamest Tamrazyan, with Gayane Hovhannisyan and Arsen Harutyunyan as editors.

*ArmEpiC is an evolving corpus. Authority files, identifiers, and encoding practices may be refined between versions.

Files

Files (24.0 MB)

Name	MD5 checksum	Size
ArmEpiC_ListAbbreviationType_crm.xml	aeca85170b30fae5d9b9f14ff9edf9d2	5.6 kB
ArmEpiC_ListBibl_crm.xml	8095cce2565ccfb6994f70d481ae3a4a	10.1 kB
ArmEpiC_ListChronology_crm.xml	f4095cfa16b3cde9741c58e049bd1ce0	14.2 kB
ArmEpiC_ListEditors.xml	ba302438178fc4580e0a8ff72e26e2c5	6.1 kB
ArmEpiC_ListInscriptionType_crm.xml	d58f9b9130a0cff178a9d0a88f520358	4.9 kB
ArmEpiC_ListMaterial_crm.xml	bd4d654252e43cf73cc415c448e8a7f2	9.5 kB
ArmEpiC_ListMonument_crm.xml	ff442b61c14b1272a6f47b45c1f34fc9	15.3 kB
ArmEpiC_ListObjectType_crm.xml	46b8702532253321327b7eab3b444788	13.4 kB
ArmEpiC_ListPlace._cmr.xml	5dd117b2777e215bf025acd837848834	19.8 kB
ArmEpiC_ListScript_crm.xml	49f8fe9e2c68bba5d344958652d975ab	18.8 kB
ArmEpiC_ListTechniques_crm.xml	f545bf88dab1de9b2d9764c6ef110117	5.4 kB
ART0001.xml	6b8e6541b09aa8a80a5ffae3428f997f	19.8 kB
ART0002.xml	971f3ec0fe2b80d5bde696fc624efaa3	20.7 kB
ART0003.xml	fec436add50454cfbee2ccd3f237af71	15.5 kB
ART0004.xml	dfa5852bce73e3961cb204f289dfe6cf	16.5 kB
ART0005.xml	de9ecfb1e0b9e509e0f0c9e59cbc636a	15.6 kB
ART0006.xml	95352fe03c9c1f3aaa5de9554249c635	21.9 kB
ART0007.xml	9c19f5c3c5bafb73fed16496a3f8ebae	17.7 kB
ART0008.xml	88276b39970a6bc325521cb4606102aa	12.9 kB
ART0009.xml	9996027258d93f54884136ab574aaf62	12.9 kB
ART0010.xml	bc5edcb009640b57e0707d3e25ee413b	13.6 kB
Fig. 1.JPG	4caefa5afe12f947e56152c6e92df8e2	5.6 MB
Fig. 10.JPG	2385e1bcc6e376e71072c565b860a7f5	3.5 MB
Fig. 2.JPG	7af68ec3d586cc96a8278e23c76b004f	3.5 MB
Fig. 3.JPG	60d021ce624a9d4a1ad7ddf7cdd89a8c	1.3 MB
Fig. 4.JPG	cafd8e25193bd1004ad53e483fa8d0ab	3.4 MB
Fig. 5.JPG	d702b82743b50b6585f2afad2bef2d4d	1.6 MB
Fig. 6.JPG	7e07928704049237d177347fec461332	3.4 MB
Fig. 7.jpg	c7033fc1d2f70eca41595d016180d06d	244.4 kB
Fig. 8.JPG	32337929c036b560a129713fa7e66408	1.1 MB
Fig. 9.jpg	3a7ea6c1e84a855327b6b2371ae405e2	65.0 kB
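
The MD5 checksums listed above can be used to confirm that downloaded files are intact. The Python sketch below shows one way to do so with the standard library. The two entries in `EXPECTED` are copied from the list above; the download directory passed to `verify` and the function names are assumptions made for the example.

```python
import hashlib
from pathlib import Path

# Two entries copied from the file list above; extend as needed.
EXPECTED = {
    "ART0001.xml": "6b8e6541b09aa8a80a5ffae3428f997f",
    "ART0002.xml": "971f3ec0fe2b80d5bde696fc624efaa3",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == checksum
    return results
```

Calling `verify("downloads", EXPECTED)`, where `downloads` is a hypothetical local folder, returns a dictionary in which any `False` value marks a missing or altered file.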

Additional details

Available: 2026-01-09

Repository URL: https://github.com/dhlab-epfl/autoepidoc

Tamrazyan, Hamest; Hovhannisyan, Gayane; Harutyunyan, Arsen; Boros, Emanuela. (2025). ArmEpiC – Armenian Epigraphic Corpus (ArtsakhEpiC Sub-Corpus, v1.0) [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.18198118

	All versions	This version
Views	359	228
Downloads	245	155
Data volume	83.0 MB	81.8 MB
