There is a newer version of the record available.

Published February 24, 2023 | Version 2022.2.7
Dataset Open

OpenITI: a Machine-Readable Corpus of Islamicate Texts

  • 1. Aga Khan University
  • 2. Universität Hamburg
  • 3. Leipzig University

Description

Co-PIs: Matthew Thomas Miller (University of Maryland, College Park), Maxim G. Romanov (University of Hamburg), Sarah Bowen Savant (Aga Khan University—ISMC, London).

The Open Islamicate Texts Initiative (OpenITI, see https://openiti.org/) is a multi-institutional effort to construct the first machine-actionable scholarly corpus of premodern Islamicate texts. Led by researchers at the Aga Khan University, Institute for the Study of Muslim Civilisations (AKU-ISMC), the University of Hamburg (UH), and the Roshan Institute for Persian Studies at the University of Maryland (College Park), together with an interdisciplinary advisory board of leading digital humanists and Islamic, Persian, and Arabic studies scholars, OpenITI aims to provide the essential textual infrastructure in Arabic, Persian, and other Islamicate languages for new forms of textual analysis and digital scholarship. In the process, OpenITI will enable new synergies between Digital Humanities and the inter-related Islamicate fields of Islamic, Persian, and Arabic Studies. In addition to support from the researchers' home institutions, OpenITI is supported by funding from the European Research Council under the European Union's Horizon 2020 research and innovation programme, awarded to the KITAB project (Grant Agreement No. 772989, PI Sarah Bowen Savant), and by the Qatar National Library.

Currently, OpenITI contains almost exclusively Arabic texts, which were first assembled into a corpus within the OpenArabic project, developed first at Tufts University (at The Perseus Project, 2013–2015) and then at Leipzig University (at the Alexander von Humboldt Chair for Digital Humanities, 2015–2017), in both cases with the support and under the patronage of Prof. Gregory Crane. The much smaller number of Persian texts was compiled during 2015–2016 in the Persian Digital Library (PDL) pilot (see Persian Digital Library by PersDigUMD) at the Roshan Institute for Persian Studies at the University of Maryland. These Persian texts are not yet fully compatible with OpenITI mARkdown and will be made fully available in upcoming releases.

Note on Release Numbering: Versions take the form 2019.1.1, where 2019 is the year of the release, the first dotted number (.1) is the ordinal release number within 2019, and the second dotted number (.1) is the overall release number. The first dotted number resets every year, while the second keeps increasing across years.
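The numbering scheme can be illustrated with a short sketch that splits a version string into its three parts. This is not part of the OpenITI tooling: the names `ReleaseVersion` and `parse_release_version` are hypothetical, and the sketch assumes a four-digit year followed by two dotted integers, as in the version of this record (2022.2.7).

```python
import re
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """Hypothetical container for the three parts of a release number."""
    year: int             # year of the release
    release_in_year: int  # ordinal release number within that year (resets yearly)
    overall_release: int  # overall release number (keeps increasing)


# Assumed shape: four-digit year, then two dotted integers.
_VERSION_PATTERN = re.compile(r"(\d{4})\.(\d+)\.(\d+)", re.ASCII)


def parse_release_version(version: str) -> ReleaseVersion:
    """Split a string such as '2022.2.7' into year, in-year and overall numbers."""
    match = _VERSION_PATTERN.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"not a YEAR.N.M release number: {version!r}")
    year, in_year, overall = (int(group) for group in match.groups())
    return ReleaseVersion(year, in_year, overall)


# The version of this record: the 2nd release of 2022 and the 7th overall.
this_record = parse_release_version("2022.2.7")
```

Because only the last number increases monotonically, sorting releases by `overall_release` gives their chronological order regardless of the yearly reset.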

For more details: https://github.com/OpenITI/RELEASE

Note: If the built-in Windows utilities fail to unzip the files, please use free software such as WinRAR or 7-Zip.

Files

data.zip

Files (5.7 GB)

Checksum                                Size
md5:9ff72190f7a94878834c131326d4f0b4    5.7 GB
md5:39376415fea35e52cabd9292a202fcee    2.7 MB
md5:02df73ee30a631da57c6d3d08fa9c9c7    92.7 kB
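A downloaded file can be checked against its listed MD5 checksum. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing does not say which checksum belongs to which file name, the pairing of a local file with a checksum is left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("data.zip", "md5:9ff72190f7a94878834c131326d4f0b4")` would return `True` only if the local file (here assumed to be saved as `data.zip` in the working directory) is byte-for-byte identical to the file that checksum was computed from.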

Additional details

Funding

European Commission: KITAB – Exploring Cultural Memory in the Pre-Modern Islamic World (700–1500): Knowledge, Information Technology, and the Arabic Book (grant 772989)