Archaeoscape: LiDAR archaeology ML dataset
Description
Archaeoscape
We present Archaeoscape, a novel open-access dataset for archaeological research, spanning 888 km² in Cambodia with 31,141 expert-annotated archaeological features from the Angkorian period. Archaeoscape is over four times larger than comparable datasets and is the first airborne laser scanning (ALS) archaeology resource with open-access data, annotations, and models.
This work was presented as a Spotlight Poster in the NeurIPS 2024 Datasets and Benchmarks Track.
- arXiv: https://arxiv.org/abs/2412.05203
- openreview: https://openreview.net/forum?id=QpF3DFP3Td
- website: https://archaeoscape.ai/data/2024/
Description
The 888 km² dataset is split into 23 non-overlapping parcels assigned to:
- Training set: 623 km², 16 parcels.
- Validation set: 97 km², 3 parcels.
- Test set: 168 km², 4 parcels.
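As a quick sanity check on the split above, the following sketch sums the stated areas and parcel counts and reports each split's share of the total area. The dictionary layout and the names `SPLITS` and `area_share` are illustrative assumptions, not part of the dataset's tooling.

```python
# Split sizes copied from the list above; the structure itself is hypothetical.
SPLITS = {
    "train": {"area_km2": 623, "parcels": 16},
    "val": {"area_km2": 97, "parcels": 3},
    "test": {"area_km2": 168, "parcels": 4},
}

total_area = sum(s["area_km2"] for s in SPLITS.values())
total_parcels = sum(s["parcels"] for s in SPLITS.values())


def area_share(name):
    """Return the percentage of the total area covered by one split."""
    return 100.0 * SPLITS[name]["area_km2"] / total_area


for name in SPLITS:
    print(f"{name}: {area_share(name):.1f}% of {total_area} km2")
```

The totals come out to 888 km² and 23 parcels, which works out to roughly a 70/11/19 train/validation/test split by area.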
It includes high-resolution (0.5 m) orthophotos and LiDAR-derived normalized Digital Terrain Models (nDTM), encompassing over 3.5 billion pixels with RGB values, elevation data, and polygonal annotations.
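The pixel count follows directly from the area and the resolution. A minimal sketch of that arithmetic, assuming square pixels and full coverage of the 888 km² (the function name `pixel_count` is illustrative):

```python
AREA_KM2 = 888       # total area, from the description
RESOLUTION_M = 0.5   # pixel side length in metres


def pixel_count(area_km2, resolution_m):
    """Number of square pixels of side `resolution_m` needed to tile an area."""
    area_m2 = area_km2 * 1_000_000  # 1 km2 = 1,000,000 m2
    return round(area_m2 / (resolution_m ** 2))


n_pixels = pixel_count(AREA_KM2, RESOLUTION_M)
print(f"{n_pixels:,} pixels")
```

At 0.5 m resolution each square metre holds four pixels, so 888 km² corresponds to about 3.55 billion pixels, consistent with the "over 3.5 billion" figure.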
The annotations cover 5 classes:
- Temple (827 instances, 0.2% of pixels). From monumental complexes to small shrines.
- Mound (14,400 instances, 8.6% of pixels). Earthen features indicating habitation, embankments, crafting sites.
- Hydrology (16,184 instances, 10.4% of pixels). Hydro-engineering features like rivers, ponds, canals and reservoirs.
- Void (3,145 instances, 2.5% of pixels). Ambiguous areas, excluded from evaluation.
- Background (78.3% of pixels). Regions lacking distinguishable archaeological features.
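Because Void areas are excluded from evaluation, any metric computed on this dataset needs to skip the pixels annotated as Void. The sketch below shows one way to do this with per-class intersection over union (IoU), a common segmentation metric chosen here as an assumption. The integer label encoding, the ignore value `255`, and the function name `per_class_iou` are all hypothetical; the released files may encode classes differently.

```python
# Hypothetical integer label encoding; the released files may use another one.
CLASSES = {0: "background", 1: "temple", 2: "mound", 3: "hydrology"}
VOID = 255  # assumed ignore label for ambiguous areas


def per_class_iou(pred, target, classes=CLASSES, ignore=VOID):
    """Per-class IoU over flat label sequences, skipping pixels annotated as void."""
    inter = {c: 0 for c in classes}
    union = {c: 0 for c in classes}
    for p, t in zip(pred, target):
        if t == ignore:
            continue  # void ground truth: the pixel contributes to no class
        for c in classes:
            hit_p, hit_t = p == c, t == c
            if hit_p and hit_t:
                inter[c] += 1
            if hit_p or hit_t:
                union[c] += 1
    # A class absent from both prediction and ground truth gets None.
    return {classes[c]: (inter[c] / union[c] if union[c] else None) for c in classes}


# Toy example: the sixth pixel is void, so the prediction there is not scored.
target = [0, 0, 2, 2, 3, 255, 1, 0]
pred = [0, 2, 2, 2, 3, 3, 1, 0]
scores = per_class_iou(pred, target)
print(scores)
```

In the toy example, the hydrology prediction on the void pixel does not lower the hydrology score, which is the intended effect of excluding Void from evaluation.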
To protect sensitive archaeological sites, the data is distributed without georeferencing and released through credentialized open access: users must provide their credentials and explicitly agree to license terms that prohibit re-georeferencing, commercial use, and redistribution.