Buddhist Classics AI Translation Series Vol.2: The Collected Tantras of the Nyingma (Nyingma Gyubum) - Tibetan-Chinese-English Parallel Corpus (Version 4.1) | 佛典AI译丛第二卷:宁玛十万续(4.1版)
Authors/Creators
- Independent Research Collective
Description
This is **Volume 2** of the comprehensive *Buddhist Classics AI Translation Series*, featuring the **Collected Tantras of the Nyingma** (*Nyingma Gyubum* / *rNying ma rgyud 'bum*, 宁玛十万续), one of the most important and mysterious collections in Tibetan Buddhist literature.
---
## About the Nyingma Collected Tantras
The **Nyingma Gyubum** (literally "100,000 Tantras of the Ancient Ones") is the definitive canonical collection of tantric texts transmitted by the Nyingma school, the oldest school of Tibetan Buddhism. This corpus comprises:
- **Sanskrit-origin tantras**: Translated from Sanskrit during the 8th-9th centuries by masters including:
  - **Vairocana** (Tib. *Bai ro tsa na*): Chief Tibetan translator
  - **Nubchen Sangye Yeshe** (*gNubs chen Sangs rgyas ye shes*): Early Nyingma master
  - Guided by Indian masters **Padmasambhava**, **Vimalamitra**, and Chinese master **Wujigong** (Song Huishou, 宋惠寿)
- **Treasure texts** (*terma*, 伏藏): Revealed teachings rediscovered by later masters
- **Indigenous Tibetan tantras**: Compositions reflecting Himalayan cultural integration
**Historical Significance**:
- **First complete Chinese translation** of this 950+ text corpus (~30 million characters)
- **75% previously untranslated** into any modern language
- **First scholarly access** to comprehensive English translations
- **Represents lost civilizations**: Many texts preserve Indus Valley, Dravidian (*lu yig*, "spiral script"), and ancient Urdu (*mkha' 'gro yig*, "ḍākiṇī script") traditions now extinct in their regions of origin
---
## Dataset Scope and Structure
### **Source Editions**
This volume integrates multiple textual traditions:
**1. Thimphu Edition (廷布版, 46 volumes)**
- Modern Bhutanese compilation (mid-20th century)
- Standard reference for contemporary scholarship
- Organized by tantra class (Mahāyoga, Anuyoga, Atiyoga)
**2. Peltsek Edition (Gpb版, 58 volumes)**
- Based on older manuscript traditions
- Includes variant readings and colophons
- Preserves editorial notes indicating complex transmission history
- Source: http://www.rkts.org/ and https://texts.thdl.org/
**3. Degé Edition (德格版)**
- 18th-century woodblock edition
- Compiled under Jigme Lingpa's guidance (1730-1798)
- Derived from the 25-volume Mindrolling catalog
**4. Kangyur Three Volumes (甘珠尔三函)**
- Earliest canonical Nyingma tantras included in Tibetan Buddhist Canon
- Integrated for comparison
**Version 4.0 Enhancements**:
- **47-volume Peltsek edition**: Complete Gemini 2.0 translations (Tibetan-Chinese-English)
- **Degé edition**: Parallel version for textual criticism
- **Cross-edition concordance**: Linked references across all versions
- **Expanded supplements**: Longchenpa's complete works (Vol. 49), Tengyur Mahāmudrā texts (Vol. 48)
---
## Translation Methodology
### **AI Models and Quality Tiers**
**Primary Translation (Highest Quality)**:
- **Claude 3.5-3.7 Sonnet** (Anthropic): Used for major treatises and poetic sections
- **Continuous narrative style**: Superior handling of verse and liturgical texts
- **Terminology consistency**: Cross-validated with Longchenpa's commentaries
**Comparative Translations**:
- **Gemini 2.0** (Google DeepMind): Complete corpus coverage
- **GPT-4o** (OpenAI): Supplementary translations for ambiguous passages
**Technical Features**:
- **Automatic segmentation** with validation overlaps (1-2 sentences at boundaries)
- **Manual annotation**: Critical passages reviewed by project team
- **Cross-model validation**: Ambiguous terms checked across 2-3 AI systems
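The overlapping segmentation described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the function name `segment_with_overlap`, the default segment size, and the assumption that the text has already been split into sentences are all hypothetical.

```python
from typing import List


def segment_with_overlap(
    sentences: List[str], size: int = 20, overlap: int = 2
) -> List[List[str]]:
    """Group sentences into segments that share `overlap` sentences at each boundary.

    `size` and `overlap` are illustrative defaults, not values taken from the project.
    """
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap  # how far each new segment advances
    segments: List[List[str]] = []
    for start in range(0, len(sentences), step):
        segments.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # the last segment already reaches the end of the text
    return segments
```

Because consecutive segments repeat their boundary sentences, the two translations of the shared sentences can be compared to check that nothing was dropped or distorted at a cut point.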
**Quality Disclaimer**:
- Gemini 2.0 translations have **higher error rates** than Claude versions
- **For scholarly citation**: Always verify against Tibetan originals
- **For practice guidance**: Consult qualified teachers
- **Best approach**: Compare multiple AI versions + original text
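One way to act on the advice to compare multiple AI versions is to flag passages where the versions diverge strongly. The sketch below uses Python's standard `difflib`. The function names and the 0.6 threshold are assumptions for illustration, and character-level similarity is only a rough proxy for agreement in meaning.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict


def min_pairwise_similarity(versions: Dict[str, str]) -> float:
    """Return the lowest similarity ratio between any two translations of one passage."""
    ratios = [
        SequenceMatcher(None, a, b).ratio()
        for a, b in combinations(versions.values(), 2)
    ]
    return min(ratios) if ratios else 1.0


def needs_review(versions: Dict[str, str], threshold: float = 0.6) -> bool:
    """Flag a passage for checking against the Tibetan when versions disagree.

    The threshold is a hypothetical starting point and should be tuned on real data.
    """
    return min_pairwise_similarity(versions) < threshold
```

A flagged passage is not necessarily mistranslated. It only marks a place where reading the Tibetan original is most likely to be worthwhile.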
---
## Historical and Cultural Context
### **Transmission History**
**Tang Dynasty (8th-9th centuries)**:
- Initial translation phase under royal Tibetan patronage
- Collaborative work of Tibetan, Indian, and Chinese masters
- Some texts translated from Chinese sources (e.g., *Bodhicitta Lamp of Three Realms*: "汉地三卷已毕" / "Three Chinese volumes complete")
**Yuan Dynasty (13th-14th centuries)**:
- Longchenpa's systematization and commentary
- Compilation of *Seven Treasuries* providing interpretive framework
- Recognition of Nyingma corpus as distinct tradition
**Qing Dynasty (18th century)**:
- Jigme Lingpa's catalog and Degé printing
- Wider dissemination through woodblock technology
**Modern Era (20th-21st centuries)**:
- Thimphu edition (mid-20th century, Bhutan)
- First public catalogs in mainland China (1990s-2000s)
- Digital editions (2010s-2020s)
- Complete Chinese translation (2024-2025, AI-assisted)
---
### **Cultural and Philosophical Significance**
**Lost Civilizations Preserved**:
- **Indus Valley tantric traditions**: Texts in ancient scripts (*lu yig*, *mkha' 'gro yig*)
- **Dravidian Buddhist practices**: South Indian ritual systems
- **Himalayan indigenous beliefs**: Pre-Buddhist shamanic elements integrated into Buddhist framework
**Philosophical Richness**:
- **Dzogchen (Great Perfection)**: Atiyoga teachings on primordial awareness, the highest of the three inner tantra classes
- **Anuyoga**: Perfection-stage practices emphasizing subtle body yogas
- **Mahāyoga**: Generation-stage deity yoga and maṇḍala systems
**Phenomenological Insights**:
- **Embodied cognition**: Practices linking physical postures, breath, and mental states
- **Non-dual time-space**: Descriptions of consciousness beyond linear temporality
- **Linguistic ontology**: Mantras as constitutive of reality (not merely descriptive)
**Anthropological Value**:
- **Ritual technology**: Detailed instructions for ceremonies, dances, amulets, divination
- **Medical knowledge**: Plant-based medicines encoded in symbolic language
- **Social history**: Reflects tribal conflicts, shamanic practices, gender dynamics in ancient Tibet
---
## Critical Content Notice
**Important Reader Advisories**:
This corpus contains materials that:
1. **Reflect historical worldviews** (8th-14th centuries) incompatible with modern ethics
2. **Include tribal warfare elements**: Curses, protective rituals, substitution techniques
3. **Contain gender-biased language**: Patriarchal assumptions of ancient societies
4. **Describe non-vegetarian offerings**: Symbolic or literal use of animal products
5. **Use shocking rhetoric**: Designed to challenge conventional thinking (upāya / skillful means)
**Editorial Position**:
- **No censorship**: Texts preserved in full for scholarly integrity
- **Historical context required**: Readers must situate content in original cultural milieu
- **Not prescriptive**: Descriptions ≠ endorsements for contemporary practice
- **Symbolic interpretation**: Many "literal" descriptions are metaphors (e.g., "human offerings" = internal subtle-body channels, per Longchenpa's commentaries)
**Recommended Approach**:
- **Critical reading**: Apply modern ethical frameworks while studying
- **Consult teachers**: Qualified guides can clarify symbolic meanings
- **Cross-reference**: Compare with mainstream Buddhist ethics in other texts
- **Academic distance**: Treat as historical documents, not lifestyle manuals
**Parallel to Traditional Medicine**:
- Just as ancient medical texts contain harmful prescriptions (mercury, lead), these tantras contain outdated social prescriptions
- **Value lies in understanding historical thought**, not literal application
---
## Longchenpa's Interpretive Framework
### **Essential Companion Text**
**Longchenpa (龙钦巴, 1308-1364)** wrote the definitive commentarial framework for the Nyingma Gyubum, particularly in:
- ***Seven Treasuries*** (*mDzod bdun*): Systematic exposition of Dzogchen philosophy
- ***Dispelling Darkness of the Ten Directions***: Comprehensive guide to tantra classifications
- ***Trilogy of Natural Freedom***: Practice instructions synthesizing tantric systems
**Why Read Longchenpa First**:
- **Philosophical grounding**: Explains Madhyamaka and Yogācāra foundations
- **Symbolic decoding**: Clarifies metaphorical meanings (e.g., "五毒" / five poisons = five wisdoms)
- **Practice hierarchy**: Guides gradual approach vs. sudden realization
- **Cultural translation**: Interprets Indian-origin concepts for Tibetan (and now global) audiences
**Included in This Volume**:
- **Volume 49**: Longchenpa Complete Works (26-volume edition)
- **Volume 48**: Tengyur Mahāmudrā texts (frequently cited references)
- **Volume 47 (Supplement)**: Additional tantras not in Thimphu edition
---
## Textual Challenges and Research Value
### **For Scholars**
**Manuscript Studies**:
- **Version comparison**: Thimphu vs. Degé vs. Peltsek reveals editorial choices
- **Colophon analysis**: Transmission lineages and translation teams documented
- **Linguistic archaeology**: Preserves 8th-century Tibetan, Sanskrit loans, indigenous terms
**Historical Research**:
- **Tang-Tibet relations**: Chinese Buddhist master Wujigong's role
- **Gender studies**: Mother tantras reflect matriarchal societies (pre-10th century Tibet)
- **Religious evolution**: Integration of Bon, Mahāyāna, and Vajrayāna elements
**Comparative Religion**:
- **Shamanic continuities**: Parallels with Central Asian, Siberian traditions
- **Apocalyptic literature**: Eschatological themes (cf. Zoroastrian, Abrahamic texts)
- **Esotericism**: Secret language, initiation rites, graded disclosure
---
### **For Practitioners**
**Meditation Instructions**:
- **Subtle body yogas**: Detailed *tsa-lung-tigle* (channel-wind-essence) practices
- **Maṇḍala visualization**: Step-by-step generation-stage protocols
- **Guru yoga**: Devotional practices for lineage connection
**Liturgical Resources**:
- **Daily practice texts**: Short sādhanas for householder practitioners
- **Feast offerings** (*gaṇacakra*): Community ritual manuals
- **Empowerment ceremonies**: Templates for tantric initiations
**Philosophical Study**:
- **Madhyamaka-Dzogchen synthesis**: Non-dual view grounded in emptiness
- **Appearance-emptiness indivisibility**: Phenomenology of luminous clarity
- **Self-liberation**: Instant recognition of primordial awareness
---
## Phenomenological and Philosophical Analysis
### **Embodied Cognition Perspectives**
**Soma-Psychic Integration**:
- Practices directly link body postures (*āsana*), breath control (*prāṇāyāma*), and mental states
- **Empirical testability**: Modern neuroscience can investigate claims about altered states
**Environmental Cognition**:
- Sacred geography (pilgrimage sites, cave hermitages) as cognitive extensions
- **Place-based practice**: Location affects consciousness (cf. environmental psychology)
---
### **Husserlian Phenomenology**
**Intentionality Structures**:
- Mantra-deities as *noematic* objects (intentional contents of consciousness)
- **Transcendental reduction**: Meditation as bracketing (*epoché*) of natural attitude
**Time-Consciousness**:
- **Eternal now**: Descriptions of awareness beyond past-future duality
- **Protention-retention collapse**: In Dzogchen, temporal flow dissolves into timeless presence
---
### **Heideggerian Existentialism**
**Being-toward-Death**:
- **Chöd practice** (cutting ego): Ritual enactment of death-anxiety confrontation
- **Authentic existence**: Liberation through facing impermanence
**Language and Being**:
- Mantras as **constitutive of reality** (not representational)
- **Primordial language**: Seed-syllables (*bīja*) as ontological foundations
---
### **Possible Worlds Semantics**
**Modal Logic Applications**:
- **Necessity statements**: "One must realize primordial awareness" (deontic modality)
- **Counterfactuals**: "If one practices X, then Y arises" (subjunctive conditionals)
- **Epistemic modalities**: "It is knowable that..." vs. "It is hidden that..."
**Alternative Cosmologies**:
- Tantric universe as *possible world* with distinct causal laws (mantras affect reality)
- **Accessibility relations**: Initiation creates access to otherwise-inaccessible states
---
## Technical Specifications
- **Total size**: ~30 million characters (Tibetan + Chinese + English)
- **Text count**: 950+ individual tantras
- **Volume structure**:
  - Vols. 1-46: Thimphu edition (primary)
  - Vol. 47: Peltsek-exclusive texts
  - Vol. 48: Tengyur Mahāmudrā supplements
  - Vol. 49: Longchenpa Complete Works
- **File formats**: Plain text (.txt), Markdown (.md), compressed archives (.7z)
- **Encoding**: UTF-8
- **Metadata**: Source edition, page numbers, translation model, editorial notes
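A minimal sketch of loading the extracted corpus under these specifications follows. The directory layout, the helper `iter_texts`, and the `SegmentMetadata` field names are assumptions. The record does not document how metadata is stored inside the files, so the dataclass only mirrors the four metadata categories listed above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Tuple


@dataclass
class SegmentMetadata:
    """Hypothetical container for the metadata categories named in the record."""
    source_edition: str
    pages: str
    translation_model: str
    editorial_notes: str = ""


def iter_texts(root: Path) -> Iterator[Tuple[Path, str]]:
    """Yield (path, content) for every .txt or .md file under root, decoded as UTF-8."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".txt", ".md"}:
            yield path, path.read_text(encoding="utf-8")
```

Passing `encoding="utf-8"` explicitly matters on systems whose default encoding differs, since Tibetan and Chinese text would otherwise be decoded incorrectly or raise an error.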
---
## Intended Use Cases
### **Academic Research**
- Tantric Buddhism studies
- Tibetan literature and linguistics
- History of religions
- Manuscript studies and textual criticism
- Gender studies in ancient Tibet
- Comparative mythology
### **Religious Practice**
- Meditation and yoga instruction
- Ritual liturgy
- Guru yoga and devotion
- Philosophical contemplation
### **AI and NLP Applications**
- Training specialized language models on Tibetan Buddhist domain
- Cross-lingual translation research (Tibetan-Chinese-English)
- Semantic analysis of religious symbolism
- Computational liturgy (generating practice texts)
### **Digital Humanities**
- Corpus linguistics (mantra analysis, vocabulary distribution)
- Topic modeling (philosophical themes across tantras)
- Network analysis (citation patterns, lineage maps)
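As an illustration of the vocabulary-distribution use case, the sketch below counts Tibetan syllables. It relies on the fact that Tibetan script separates syllables with the tsheg (U+0F0B) and marks clause ends with the shad (U+0F0D). The function name is hypothetical, and syllables are not the same as words, so this is only a first approximation to a word-frequency analysis.

```python
import re
from collections import Counter

# Tsheg (syllable delimiter), shad (clause delimiter), and whitespace.
DELIMITERS = re.compile(r"[\u0F0B\u0F0D\s]+")


def syllable_counts(text: str) -> Counter:
    """Count how often each Tibetan syllable occurs in the given text."""
    return Counter(s for s in DELIMITERS.split(text) if s)
```

Calling `syllable_counts(text).most_common(20)` on a single tantra gives a quick view of its most frequent syllables. True word-level counts would need a Tibetan word segmenter.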
---
## Copyright and Licensing
### **Public Domain Status**
**Historical Texts** (8th-14th centuries):
- Authors deceased >650 years ago
- The texts predate modern copyright law, so no copyright term applies to the originals
- Treated as shared cultural heritage
**Modern Editions**:
- Thimphu, Degé, Peltsek compilations: Editorial work in public domain (50+ years post-publication)
- Digital transcriptions: Derived from openly accessible sources
### **Translation Copyright**
- **AI-generated translations**: Creative Commons Attribution 4.0 International (CC BY 4.0)
- **Permitted uses**:
- Academic research and publication
- AI training and model development
- Commercial applications (with attribution)
- Modification and redistribution
- **Attribution format**: "Buddhist Classics AI Translation Project (2025)"
---
## Version History
- **Version 1.0** (early 2024): Microsoft Translator + Chinese tool (deprecated)
- **Version 2.0** (August 2024): Claude 3.0 Opus, manual processing
- **Version 2.1** (August 2024): Claude 3.5 Sonnet upgrades
- **Version 3.0** (August 2025): Gemini 2.0 Peltsek edition
- **Version 4.0** (September 2025): Complete 47-volume Peltsek + Degé parallel edition
---
In the Gemini 2.0 (G2.0) translation edition (volumes 1, 2, 3, 5, 6, 7, 8, 11, 12, etc.), produced between July and November 2025, entire paragraphs were accidentally omitted in translation in approximately 1% of the text. To address this, we have written a dedicated program to perform electronic collation and supplementary translation. As of November 26, 2025, this remedial work is not yet complete.

Under normal circumstances, the upgraded complete volumes will first be released at https://huggingface.co/datasets/ospx1u/buddhist-classics-vol1-12/tree/main and subsequently published on zenodo.org. Other data repositories will be updated on a case-by-case basis or may not be updated at all.
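For readers who want to screen their own copy for omitted paragraphs, a simple length-ratio check is sketched below. This is not the project's collation program, whose design is not described here. The function name, the assumption that source and translation paragraphs are already aligned one-to-one, and the 0.2 ratio are all hypothetical.

```python
from typing import List, Tuple


def find_omissions(pairs: List[Tuple[str, str]], min_ratio: float = 0.2) -> List[int]:
    """Return indices of aligned (source, translation) pairs whose translation
    is empty or much shorter than the source paragraph.

    min_ratio is an illustrative threshold and should be tuned per language pair.
    """
    flagged: List[int] = []
    for index, (source, translation) in enumerate(pairs):
        source_len = len(source.strip())
        target_len = len(translation.strip())
        if source_len and target_len / source_len < min_ratio:
            flagged.append(index)
    return flagged
```

Character counts differ systematically between Tibetan, Chinese, and English, so a single threshold will produce false positives. Flagged paragraphs should be checked by hand.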
Files (132.0 MB)

| Name | Checksum | Size |
|---|---|---|
| | md5:8168fd0b5dfb24f74159c9497a69786b | 132.0 MB |
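The MD5 checksum listed for the download can be verified locally after fetching the file. The sketch below uses Python's standard `hashlib`. The archive's filename is not given in this record, so any path passed to the function is the reader's own, and the helper name `md5_of` is illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "8168fd0b5dfb24f74159c9497a69786b"


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After downloading, compare `md5_of(Path(...))` for the local file against `EXPECTED_MD5`. A mismatch suggests an incomplete or corrupted download. MD5 is adequate for detecting transfer errors but is not a safeguard against deliberate tampering.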
Additional details
Dates
- Collected: 2024/2025-11