Published March 1, 2025 | Version 0.0
Dataset Open

Enki: AI for Archaeology – Datasets for Automatic Site Recognition

  • 1. Sapienza University of Rome

Description

Artificial intelligence meets archaeology with Enki, a supervised deep learning system designed to support the discovery of hidden archaeological sites. This first version of the Enki dataset, released under the CC-BY-SA 4.0 license, supports the analysis of satellite and aerial images to identify potential traces of ancient settlements.

Available Datasets

We have released three high-quality datasets for training, validating, and testing automatic recognition models:

Normalized_Tell_combined_350 (Training Set)

• 21,654 images

• Resolution: 350×350 px

• Size: 1.70 GB

 

Val_norm (Validation Set)

• 1,290 images

• Resolution: 350×350 px

• Size: 287 MB

 

Test_2025_02_05 (Test Set)

• 1,459 images

• Resolution: 3023×3023 px

• Size: 12.05 GB

 

These datasets have been specifically designed to optimize the training of computer vision models specializing in the identification of tells, anthropogenic settlements, and other archaeological structures that are invisible to the naked eye.
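After downloading, the image counts listed above can be checked against the contents of each archive. The following is a minimal sketch, not part of the Enki code base. It assumes each dataset is distributed as a single zip archive and that the images use one of the common extensions in IMAGE_SUFFIXES. The internal folder layout and the file formats are not documented on this record, so both the helper name count_images and the suffix set are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumed image extensions; the record does not state the actual file format.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}

# Image counts as listed on this record, keyed by dataset name.
LISTED_COUNTS = {
    "Normalized_Tell_combined_350": 21654,
    "Val_norm": 1290,
    "Test_2025_02_05": 1459,
}


def count_images(zip_path):
    """Count archive members per image extension without extracting them."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            suffix = PurePosixPath(info.filename).suffix.lower()
            if suffix in IMAGE_SUFFIXES:
                counts[suffix] += 1
    return counts
```

If the assumptions hold, sum(count_images("Normalized_Tell_combined_350.zip").values()) should equal LISTED_COUNTS["Normalized_Tell_combined_350"]. A mismatch may simply mean the archive uses an extension that is not in IMAGE_SUFFIXES.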

Usage and Contributions

The datasets can be used to develop and test machine learning models for computational archaeology, remote sensing, and landscape analysis. The source code and pre-trained models of Enki are available on GitHub:

🔗 GitHub: archeorosati/enki

We encourage researchers to contribute, improve, and apply these tools to expand the potential of artificial intelligence in uncovering the past.

📌 License: CC-BY-SA 4.0

Main Scientific Credits: The images of many of the sites used to train the model are based on mappings from the Ancient Near East (ANE) Project, a comprehensive spatial database of archaeological sites in the Near East. The ANE Project, licensed under CC-BY-SA 4.0, provides an essential foundation for research in computational archaeology and remote sensing. We extend our gratitude to the creators of ANE for their valuable contribution to archaeological research. More details can be found here: ANE on Zenodo.

Fair Use Statement for Satellite Imagery

The datasets provided in this repository include image data sourced from Google Satellite and Bing Satellite for research purposes. The use of these images is justified under the principles of fair use based on the following considerations:

1. Educational and Research Purpose

• The images are utilized exclusively for scientific research, archaeological analysis, and non-commercial academic study. The primary goal is to advance machine learning applications in computational archaeology and automated site detection.

2. Transformative Use

• The satellite imagery has been processed, modified, and analyzed to create training datasets that significantly differ from the original raw images. These transformations include normalization, segmentation, labeling, and AI-driven feature extraction, making them part of a novel and distinct dataset for archaeological research.

3. Limited and Non-Commercial Use

• The datasets do not redistribute raw or unaltered satellite images. Instead, they consist of processed image datasets specifically tailored for AI-based archaeological site detection.

• No commercial benefit is derived from the use of these images. The datasets are shared under a Creative Commons CC-BY-SA 4.0 license to promote open research.

4. Attribution and Compliance with Terms of Service

• Google and Bing Satellite imagery remain the property of their respective providers.

• Users of these datasets are encouraged to adhere to the terms of service of Google Maps, Google Earth, Bing Maps, and other satellite data providers when using or referencing the original sources.

• Any publication or derivative work based on these datasets should acknowledge the original satellite data providers where applicable.

5. Public Interest and Scientific Advancement

• The research aims to enhance archaeological discovery methods using AI, contributing to the scientific community and cultural heritage preservation.

• By sharing these processed datasets openly, we enable greater transparency, reproducibility, and collaborative research in the fields of remote sensing, machine learning, and digital archaeology.

📌 For further information on satellite imagery terms of use, refer to:

Google Maps & Google Earth Terms of Service

Microsoft Bing Maps Terms of Use

Files

Normalized_Tell_combined_350.zip

Files (14.0 GB)

Checksum Size
md5:1d4ea398ea5669f9bca4539416847206 1.7 GB
md5:1649154e68e8a05b8b90c1d07cb666d9 12.0 GB
md5:fbd56ed8039b946eadb6481badf80c95 287.1 MB
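
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below is one possible way to do this with the Python standard library. The helper names md5_of_file and check_download are hypothetical. Because this record does not pair each checksum with a file name, the script only reports whether a file's digest matches any of the published values.

```python
import hashlib
import sys
from pathlib import Path

# Checksums and sizes exactly as listed on this record.
PUBLISHED_MD5 = {
    "1d4ea398ea5669f9bca4539416847206": "1.7 GB",
    "1649154e68e8a05b8b90c1d07cb666d9": "12.0 GB",
    "fbd56ed8039b946eadb6481badf80c95": "287.1 MB",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so large archives never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download(path):
    """Return the listed size of the matching published file, or None if no match."""
    return PUBLISHED_MD5.get(md5_of_file(path))


if __name__ == "__main__":
    # Usage: python verify_md5.py <downloaded archive> [<more archives> ...]
    for name in sys.argv[1:]:
        if not Path(name).is_file():
            print(f"{name}: not found")
            continue
        match = check_download(name)
        print(f"{name}: " + (f"OK (listed as {match})" if match else "NO MATCH"))
```

Reading in chunks matters here because the largest archive is about 12 GB. A "NO MATCH" result usually indicates an incomplete or corrupted download.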

Additional details

Dates

Available: 2025-03-01 (Dataset published)
Created: 2024-08-04 (Dataset creation)