Published April 17, 2026 | Version 1.0

SUPPLEMENTARY MATERIAL: Pairwise Similarity Learning for Chronological Attribution of Archaeological Assemblages: A Siamese Neural Network Approach

  • 1. Universitat Politècnica de València
  • 2. Universidad de La Laguna

Description

Abstract

We present a similarity-based framework for the chronological attribution of undated archaeological assemblages, grounded in Siamese Neural Networks (SNNs) and specifically designed for the small sample sizes, compositional uncertainty, and diffuse phase boundaries that characterize most prehistoric datasets. Unlike conventional supervised classifiers, which require sufficient per-class examples to define stable decision boundaries, SNNs reformulate chronological inference as a pairwise relational problem: the model estimates the probability that two assemblages belong to the same chronological phase, mimicking the comparative reasoning implicit in typological analysis while rendering it reproducible and transferable. The quadratic growth of training pairs with sample size substantially amplifies the effective training set without additional radiocarbon data, a critical advantage in data-scarce contexts.
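The pairwise reformulation and its quadratic growth in training pairs can be sketched as follows. This is an illustrative example, not the released code: the function name and toy data are hypothetical, and the real pipeline feeds such pairs to a Siamese encoder rather than returning them directly.

```python
# Hypothetical sketch: build same-phase / different-phase training pairs
# from n labeled assemblages. n items yield n*(n-1)/2 pairs, so the
# effective training set grows quadratically without new radiocarbon dates.
from itertools import combinations

def make_pairs(features, phases):
    """Return ((x_i, x_j), label) pairs; label is 1 iff same phase."""
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        pairs.append(((features[i], features[j]), int(phases[i] == phases[j])))
    return pairs

# 6 assemblages (type-frequency vectors) across 3 phases -> 15 pairs.
X = [[4, 1], [3, 2], [0, 5], [1, 4], [2, 2], [5, 0]]
y = [1, 1, 2, 2, 3, 3]
pairs = make_pairs(X, y)
print(len(pairs))  # 15
```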

The framework is evaluated on 185 radiocarbon-dated bifacial flint arrowhead assemblages from eastern Iberia (ca. 3500–1900 cal. BC), organized into six chronological phases spanning the Late Neolithic to the Early Bronze Age. Multiple Siamese configurations — logistic regression, MLP, random forest, and deep learning encoders — are compared against standard MLP and SVM baselines. The best-performing configuration (DL with bootstrap augmentation and Dirichlet-multinomial compositional variables) achieves a macro F1 of 21.4% and balanced accuracy of 21.8%, representing a consistent improvement over both baselines and over a stratified random classifier (expected macro F1: ~17%) across all class-balanced metrics. Per-type analysis reveals that predictive accuracy correlates with morphological distinctiveness: foliaceous types reach 96.5% agreement with expert assignments, while pedunculate forms — whose typological boundaries are inherently contested — present the greatest classification challenge, a pattern that mirrors the gradient of archaeological interpretive confidence.

The input structure of the framework — assemblage-level frequency vectors of artifact types — is standard across the main material categories used for chronological inference in prehistoric archaeology, including ceramics, lithic industries, and faunal assemblages. The approach is therefore directly transferable to other material traditions and regional sequences, and its outputs are formally compatible with Bayesian chronological modelling pipelines. The framework requires only typological frequency data routinely collected during excavation and post-excavation analysis, involves negligible computational cost relative to radiometric methods, and can be deployed at early stages of site investigation to generate provisional chronological attributions — providing an operational basis for decisions about sampling strategy, resource allocation, and the prioritization of contexts for absolute dating. As radiocarbon coverage and regional reference databases expand, the framework scales accordingly, functioning as a cost-effective first-pass instrument within a broader chronological workflow rather than as a replacement for higher-resolution analytical methods.

Notes (English)

Computational Reproducibility Guide

Article: A Machine Learning Framework for Chronological Classification of Archaeological Samples Based on Lithic Typology Distributions
Journal: Journal of Archaeological Science
Complies with: JAS Data & Code Availability / Transparency & Replicability Policy

Overview

This repository contains all code and data required to fully reproduce the quantitative results reported in the article. The pipeline is containerised with Docker so that any researcher can obtain identical outputs on any operating system (Windows, macOS, Linux) without manual dependency installation.

Repository structure

scripts/
├── Dockerfile                        ← Container definition (pinned base image + dependencies)
├── requirements.txt                  ← Exact Python package versions
├── run_pipeline.sh                   ← Entrypoint: runs training → validation → predictions
├── .dockerignore                     ← Files excluded from the Docker build context
│
├── main.py                           ← Pipeline orchestrator (CLI entry point)
├── train_validate.py                 ← Model training & validation (all classifiers)
├── benchmark_runner.py               ← Runs all model × augmentation combinations
├── dirichlet_predictive_model.py     ← Bayesian Dirichlet-multinomial classifier
├── dirichlet_pipeline.py             ← Dirichlet feature extraction pipeline
├── data_loader.py                    ← ODS / CSV data loading & cleaning
├── data_pipeline.py                  ← Unified train / valid / predict data bundle
├── augmentation.py                   ← Bootstrap & Poisson jitter augmentation
├── ensemble.py                       ← Ensemble consensus & King-model selection
├── model_loader.py                   ← Deserialise saved joblib models
├── collect_predictions_by_id.py      ← Aggregate per-model CSVs into wide table
├── report_html.py                    ← HTML report generation
├── generate_figures.py               ← Generates publication figures (Sankey, confusion matrix, type portrait, stacked bar)
├── generate_global_training_report.py ← Cross-experiment global summary
├── io_paths.py                       ← Output directory layout helpers
│
└── data/
    └── Puntas_TODOS.ods              ← Input dataset (APRIORI / MODEL_CHECKING / APOSTERIORI sheets)
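The two augmentation strategies named above for augmentation.py (bootstrap resampling and Poisson jitter) can be sketched for a single assemblage count vector. Function names are illustrative and the real implementation lives in augmentation.py; this is only a minimal sketch of the idea.

```python
# Hedged sketch of count-vector augmentation. Bootstrap: resample artefacts
# with replacement, preserving the assemblage total. Poisson jitter: replace
# each type count with a Poisson draw centred on the observed count.
import numpy as np

def bootstrap_counts(counts, rng):
    """Multinomial resample of the assemblage, same total size."""
    counts = np.asarray(counts)
    n = int(counts.sum())
    return rng.multinomial(n, counts / n)

def poisson_jitter(counts, rng):
    """Independent Poisson noise around each type count."""
    return rng.poisson(np.asarray(counts))

rng = np.random.default_rng(42)
counts = [12, 5, 0, 3]               # toy type-frequency vector
print(bootstrap_counts(counts, rng))  # always sums to 20
print(poisson_jitter(counts, rng))
```

Note the design difference: bootstrap keeps the assemblage size fixed (types absent from the original stay absent), while Poisson jitter also perturbs the total.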

Prerequisites

| Software | Minimum version | Download |
| --- | --- | --- |
| Docker Desktop (Windows / macOS) | 24.x | https://www.docker.com/products/docker-desktop |
| Docker Engine (Linux) | 24.x | https://docs.docker.com/engine/install/ |

No Python installation is required on the host machine.

Step-by-step reproduction

1 — Clone / download the code

# If distributed via a repository:
git clone <repository-url>
cd <repository>/scripts

# Or simply unzip the supplementary archive and enter the scripts/ folder.

2 — Build the Docker image

Run this command from inside the scripts/ directory (where the Dockerfile lives):

docker build -t puntas-ml .

This will:

  • Pull the pinned base image (python:3.11.9-slim-bookworm)
  • Install all Python packages at their exact pinned versions (see requirements.txt)
  • Copy all source files and the input dataset into the image

Expected build time: 2–5 minutes (depends on internet speed; packages are ~500 MB).

3 — Create an output directory on your machine

The command is identical on Windows (PowerShell) and macOS / Linux:

mkdir output

4 — Run the full pipeline

Windows (PowerShell):

docker run --rm -v "${PWD}\output:/app/output" puntas-ml

macOS / Linux:

docker run --rm -v "$(pwd)/output:/app/output" puntas-ml

The container will execute three sequential steps:

  1. Training & validation — trains all classifiers (Logistic Regression, SVM, Random Forest, MLP, Prototypical, four Siamese variants) under all augmentation strategies (none, bootstrap, bootstrap+Poisson) and both cross-validation modes (holdout, LOOCV), with and without Dirichlet features.
  2. Consensus prediction — aggregates per-model predictions into a wide table with majority-vote consensus and King-model selection (by F1-macro).
  3. Output collection — copies all artefacts to /app/output (bound to your local output/ folder).

Expected runtime: 20–60 minutes on a modern laptop (CPU only; no GPU required).
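The consensus logic of step 2 can be sketched in a few lines. The data below are hypothetical; the real aggregation is implemented in collect_predictions_by_id.py and ensemble.py.

```python
# Illustrative sketch of step 2: majority-vote consensus across per-model
# predictions, plus "King" selection (the single best model by F1-macro).
from collections import Counter

per_model_preds = {            # model -> predicted phase per undated assemblage
    "rf":      [3, 4, 2],
    "svm":     [3, 5, 2],
    "siamese": [3, 4, 1],
}
f1_macro = {"rf": 0.21, "svm": 0.18, "siamese": 0.20}  # validation scores

# Majority vote per assemblage (ties fall to the first-seen label here).
consensus = [Counter(col).most_common(1)[0][0]
             for col in zip(*per_model_preds.values())]

# King model: highest validation F1-macro.
king = max(f1_macro, key=f1_macro.get)
print(consensus, king)  # [3, 4, 2] rf
```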

5 — Inspect the results

After the run completes, the output/ directory will contain:

output/
├── ml_train/
│   ├── index.html                            ← Interactive benchmark overview
│   ├── benchmark_report_*.html               ← Per-strategy HTML reports
│   ├── summary_*.csv                         ← Metrics table (accuracy, F1, etc.)
│   ├── <run_id>/
│   │   ├── models/model.joblib               ← Serialised trained model
│   │   ├── metrics/metrics.json              ← Validation metrics (JSON)
│   │   ├── confusion_matrix.png              ← Confusion matrix plot
│   │   └── roc_curve.png                     ← ROC curve plot
│   └── ...
├── predictions/
│   ├── <run_id>.csv                          ← Per-model APOSTERIORI predictions
│   ├── aposteriori_predictions_wide.csv      ← Wide consensus table
│   ├── aposteriori_predictions_wide.html     ← Human-readable consensus report
│   └── predictions_summary.csv              ← King + Matched + per-model columns
└── figures/
    ├── fig_sankey_600dpi.png                 ← Sankey: expert type → model prediction
    ├── fig_confusion_norm_600dpi.png         ← Normalised confusion matrix
    ├── fig_breakdown_stacked_600dpi.png      ← Stacked bar prediction breakdown
    ├── fig_type_portrait_600dpi.png          ← Per-type prediction portrait
    ├── predictions/
    │   ├── dirichlet_posteriors_train.csv
    │   ├── dirichlet_posteriors_valid.csv
    │   └── dirichlet_posteriors_aposteriori.csv
    └── plots/
        └── Dirichlet_*.svg                   ← Per-site Dirichlet probability plots

Reproducibility guarantees

| Mechanism | Implementation |
| --- | --- |
| Fixed random seed | --random-state 42 passed to all stochastic estimators |
| Pinned Python version | python:3.11.9-slim-bookworm (Dockerfile FROM) |
| Pinned library versions | All packages fixed in requirements.txt |
| Disabled thread-level parallelism | OMP_NUM_THREADS=1, OPENBLAS_NUM_THREADS=1, MKL_NUM_THREADS=1 |
| Disabled Python hash randomisation | PYTHONHASHSEED=0 |
| Headless matplotlib backend | MPLBACKEND=Agg — no display required |
| Non-root container execution | Runs as unprivileged archaeo user |

Note on hardware variation: Floating-point arithmetic is deterministic given the same CPU instruction set. Minor numerical differences (< 1e-10) may appear if comparing results between x86-64 and ARM64 (Apple Silicon) hosts, but these do not affect any classification outcome or reported metric.
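The environment-level measures in the table above are baked into the Docker image, but they can also be mirrored from Python for local runs. A minimal sketch (assuming the variables are set before the numerical libraries are imported; PYTHONHASHSEED in particular only takes effect if set before the interpreter starts, which the Dockerfile handles via ENV):

```python
# Sketch of the determinism measures from the table above, applied in-process.
import os

os.environ["OMP_NUM_THREADS"] = "1"        # single-threaded OpenMP
os.environ["OPENBLAS_NUM_THREADS"] = "1"   # single-threaded OpenBLAS
os.environ["MKL_NUM_THREADS"] = "1"        # single-threaded MKL
os.environ["MPLBACKEND"] = "Agg"           # headless matplotlib backend

RANDOM_STATE = 42  # passed to every stochastic estimator, e.g.
# RandomForestClassifier(random_state=RANDOM_STATE)
```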

Running individual scripts (advanced)

You can also enter the container interactively and run scripts manually:

docker run --rm -it -v "$(pwd)/output:/app/output" puntas-ml bash

Inside the container:

# Train only specific models
python main.py data/Puntas_TODOS.ods \
    --models rf svm \
    --augmentations none \
    --outdir ml_train \
    --preddir predictions \
    --random-state 42

# Run Dirichlet model only
python dirichlet_predictive_model.py data/Puntas_TODOS.ods \
    --plot --verbose

# Collect and summarise predictions
python collect_predictions_by_id.py \
    --pred-dir predictions \
    --outdir ml_train \
    --metric f1_macro

Software dependencies

All packages are installed from PyPI at the exact versions listed below:

| Package | Version | Purpose |
| --- | --- | --- |
| numpy | 1.26.4 | Array operations |
| pandas | 2.2.2 | Tabular data / ODS reading |
| scipy | 1.13.1 | Statistical functions (Dirichlet log-likelihood) |
| scikit-learn | 1.5.1 | Classical ML classifiers, CV, metrics |
| joblib | 1.4.2 | Model serialisation & parallelism control |
| torch | 2.3.1 | Deep Siamese network (CPU build) |
| odfpy | 1.4.1 | ODS spreadsheet engine |
| openpyxl | 3.1.5 | Excel/ODS compatibility layer |
| matplotlib | 3.9.1 | Figure generation |
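The Dirichlet log-likelihood that scipy is listed for has a closed form via the log-gamma function. The sketch below shows the standard Dirichlet-multinomial log-pmf, not the project's dirichlet_predictive_model.py; the function name is illustrative.

```python
# Standard Dirichlet-multinomial log-likelihood via scipy.special.gammaln:
# log P(x | alpha) = log n! - sum log x_i!
#                  + log Gamma(A) - log Gamma(A + n)
#                  + sum [log Gamma(a_i + x_i) - log Gamma(a_i)],  A = sum a_i
import numpy as np
from scipy.special import gammaln

def dirmult_logpmf(counts, alpha):
    counts = np.asarray(counts, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    n = counts.sum()
    return (gammaln(n + 1) - gammaln(counts + 1).sum()
            + gammaln(alpha.sum()) - gammaln(alpha.sum() + n)
            + gammaln(alpha + counts).sum() - gammaln(alpha).sum())

# With alpha = (1, 1, 1), all C(n+2, 2) count vectors are equally likely:
# for n = 4 over 3 types, each has probability 1/15.
print(dirmult_logpmf([3, 1, 0], [1.0, 1.0, 1.0]))
```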

Input data

| File | Format | Description |
| --- | --- | --- |
| data/Puntas_TODOS.ods | OpenDocument Spreadsheet | Lithic assemblage dataset with three sheets: APRIORI (training set of dated assemblages), MODEL_CHECKING (validation set), APOSTERIORI (undated assemblages for chronological prediction) |

Feature columns used: Type.1, Type.2, Type.3, Type.4, Type.5, Type.7, Metal, Campanif
Target column: Phase (integer, chronological period)
Identifier column: Yacimiento (site name, used as observation ID)

Type.6 is intentionally excluded from all analyses — see Methods section of the article.
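The feature matrix described above can be assembled as in the sketch below. The actual loader is data_loader.py and would read the spreadsheet itself, e.g. `pd.read_excel("data/Puntas_TODOS.ods", sheet_name="APRIORI", engine="odf")`; here a small in-memory frame with hypothetical values stands in for the APRIORI sheet so the example is self-contained.

```python
# Hedged sketch: select the feature columns listed above (Type.6 excluded)
# and convert raw type counts to assemblage-level relative frequencies.
import pandas as pd

FEATURES = ["Type.1", "Type.2", "Type.3", "Type.4", "Type.5", "Type.7",
            "Metal", "Campanif"]  # Type.6 intentionally absent

sheet = pd.DataFrame({  # stand-in for the APRIORI sheet; values are made up
    "Yacimiento": ["Site A", "Site B"],
    "Type.1": [4, 0], "Type.2": [1, 2], "Type.3": [0, 3], "Type.4": [2, 1],
    "Type.5": [0, 0], "Type.6": [9, 9], "Type.7": [1, 0],
    "Metal": [0, 1], "Campanif": [0, 1], "Phase": [2, 5],
})

X = sheet.set_index("Yacimiento")[FEATURES]  # drops Type.6 by construction
y = sheet.set_index("Yacimiento")["Phase"]
X_rel = X.div(X.sum(axis=1), axis=0)         # row-wise relative frequencies
print(X_rel.loc["Site A", "Type.1"])  # 0.5
```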

Licence

The source code is released under the MIT Licence (see LICENSE file if included). The dataset (Puntas_TODOS.ods) is released under CC BY 4.0.

Citation

If you use this code or data, please cite the original article:

[Author et al. (in press). A Machine Learning Framework for Chronological 
Classification of Archaeological Samples Based on Lithic Typology Distributions. 
Journal of Archaeological Science. DOI: XXXX]

Contact

For questions about the code or data, please open an issue in the repository or contact the corresponding authors.

Additional details

Programming language: Python