Published April 17, 2026 | Version 1.0

SUPPLEMENTARY MATERIAL: Pairwise Similarity Learning for Chronological Attribution of Archaeological Assemblages: A Siamese Neural Network Approach

  • 1. Universitat Politècnica de València
  • 2. Universidad de La Laguna

Description

Abstract

We present a similarity-based framework for the chronological attribution of undated archaeological assemblages, grounded in Siamese Neural Networks (SNNs) and specifically designed for the small sample sizes, compositional uncertainty, and diffuse phase boundaries that characterize most prehistoric datasets. Unlike conventional supervised classifiers, which require sufficient per-class examples to define stable decision boundaries, SNNs reformulate chronological inference as a pairwise relational problem: the model estimates the probability that two assemblages belong to the same chronological phase, mimicking the comparative reasoning implicit in typological analysis while rendering it reproducible and transferable. The quadratic growth of training pairs with sample size substantially amplifies the effective training set without additional radiocarbon data, a critical advantage in data-scarce contexts.
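The pairwise reformulation and its quadratic growth in training pairs can be sketched as follows. This is an illustrative example, not the released code: the function name and toy data are hypothetical, and the real pipeline feeds such pairs to a Siamese encoder rather than returning them directly.

```python
# Hypothetical sketch: build same-phase / different-phase training pairs
# from n labeled assemblages. n items yield n*(n-1)/2 pairs, so the
# effective training set grows quadratically without new radiocarbon dates.
from itertools import combinations

def make_pairs(features, phases):
    """Return ((x_i, x_j), label) pairs; label is 1 iff same phase."""
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        pairs.append(((features[i], features[j]), int(phases[i] == phases[j])))
    return pairs

# 6 assemblages (type-frequency vectors) across 3 phases -> 15 pairs.
X = [[4, 1], [3, 2], [0, 5], [1, 4], [2, 2], [5, 0]]
y = [1, 1, 2, 2, 3, 3]
pairs = make_pairs(X, y)
print(len(pairs))  # 15
```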

The framework is evaluated on 185 radiocarbon-dated bifacial flint arrowhead assemblages from eastern Iberia (ca. 3500–1900 cal. BC), organized into six chronological phases spanning the Late Neolithic to the Early Bronze Age. Multiple Siamese configurations — logistic regression, MLP, random forest, and deep learning encoders — are compared against standard MLP and SVM baselines. The best-performing configuration (DL with bootstrap augmentation and Dirichlet-multinomial compositional variables) achieves a macro F1 of 21.4% and balanced accuracy of 21.8%, representing a consistent improvement over both baselines and over a stratified random classifier (expected macro F1: ~17%) across all class-balanced metrics. Per-type analysis reveals that predictive accuracy correlates with morphological distinctiveness: foliaceous types reach 96.5% agreement with expert assignments, while pedunculate forms — whose typological boundaries are inherently contested — present the greatest classification challenge, a pattern that mirrors the gradient of archaeological interpretive confidence.

The input structure of the framework — assemblage-level frequency vectors of artifact types — is standard across the main material categories used for chronological inference in prehistoric archaeology, including ceramics, lithic industries, and faunal assemblages. The approach is therefore directly transferable to other material traditions and regional sequences, and its outputs are formally compatible with Bayesian chronological modelling pipelines. The framework requires only typological frequency data routinely collected during excavation and post-excavation analysis, involves negligible computational cost relative to radiometric methods, and can be deployed at early stages of site investigation to generate provisional chronological attributions — providing an operational basis for decisions about sampling strategy, resource allocation, and the prioritization of contexts for absolute dating. As radiocarbon coverage and regional reference databases expand, the framework scales accordingly, functioning as a cost-effective first-pass instrument within a broader chronological workflow rather than as a replacement for higher-resolution analytical methods.

Notes (English)

Computational Reproducibility Guide

Article: A Machine Learning Framework for Chronological Classification of Archaeological Samples Based on Lithic Typology Distributions
Journal: Journal of Archaeological Science
Complies with: JAS Data & Code Availability / Transparency & Replicability Policy

Overview

This repository contains all code and data required to fully reproduce the quantitative results reported in the article. The pipeline is containerised with Docker so that any researcher can obtain identical outputs on any operating system (Windows, macOS, Linux) without manual dependency installation.

Repository structure

scripts/
├── Dockerfile                        ← Container definition (pinned base image + dependencies)
├── requirements.txt                  ← Exact Python package versions
├── run_pipeline.sh                   ← Entrypoint: runs training → validation → predictions
├── .dockerignore                     ← Files excluded from the Docker build context
│
├── main.py                           ← Pipeline orchestrator (CLI entry point)
├── train_validate.py                 ← Model training & validation (all classifiers)
├── benchmark_runner.py               ← Runs all model × augmentation combinations
├── dirichlet_predictive_model.py     ← Bayesian Dirichlet-multinomial classifier
├── dirichlet_pipeline.py             ← Dirichlet feature extraction pipeline
├── data_loader.py                    ← ODS / CSV data loading & cleaning
├── data_pipeline.py                  ← Unified train / valid / predict data bundle
├── augmentation.py                   ← Bootstrap & Poisson jitter augmentation
├── ensemble.py                       ← Ensemble consensus & King-model selection
├── model_loader.py                   ← Deserialise saved joblib models
├── collect_predictions_by_id.py      ← Aggregate per-model CSVs into wide table
├── report_html.py                    ← HTML report generation
├── generate_figures.py               ← Generates publication figures (Sankey, confusion matrix, type portrait, stacked bar)
├── generate_global_training_report.py ← Cross-experiment global summary
├── io_paths.py                       ← Output directory layout helpers
│
└── data/
    └── Puntas_TODOS.ods              ← Input dataset (APRIORI / MODEL_CHECKING / APOSTERIORI sheets)
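The two augmentation strategies named above for augmentation.py (bootstrap resampling and Poisson jitter) can be sketched for a single assemblage count vector. Function names are illustrative and the real implementation lives in augmentation.py; this is only a minimal sketch of the idea.

```python
# Hedged sketch of count-vector augmentation. Bootstrap: resample artefacts
# with replacement, preserving the assemblage total. Poisson jitter: replace
# each type count with a Poisson draw centred on the observed count.
import numpy as np

def bootstrap_counts(counts, rng):
    """Multinomial resample of the assemblage, same total size."""
    counts = np.asarray(counts)
    n = int(counts.sum())
    return rng.multinomial(n, counts / n)

def poisson_jitter(counts, rng):
    """Independent Poisson noise around each type count."""
    return rng.poisson(np.asarray(counts))

rng = np.random.default_rng(42)
counts = [12, 5, 0, 3]               # toy type-frequency vector
print(bootstrap_counts(counts, rng))  # always sums to 20
print(poisson_jitter(counts, rng))
```

Note the design difference: bootstrap keeps the assemblage size fixed (types absent from the original stay absent), while Poisson jitter also perturbs the total.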

Prerequisites

| Software | Minimum version | Download |
| --- | --- | --- |
| Docker Desktop (Windows / macOS) | 24.x | https://www.docker.com/products/docker-desktop |
| Docker Engine (Linux) | 24.x | https://docs.docker.com/engine/install/ |

No Python installation is required on the host machine.

Step-by-step reproduction

1 — Clone / download the code

# If distributed via a repository:
git clone <repository-url>
cd <repository>/scripts

# Or simply unzip the supplementary archive and enter the scripts/ folder.

2 — Build the Docker image

Run this command from inside the scripts/ directory (where the Dockerfile lives):

docker build -t puntas-ml .

This will:

  • Pull the pinned base image (python:3.11.9-slim-bookworm)
  • Install all Python packages at their exact pinned versions (see requirements.txt)
  • Copy all source files and the input dataset into the image

Expected build time: 2–5 minutes (depends on internet speed; packages are ~500 MB).

3 — Create an output directory on your machine

The command is identical on Windows (PowerShell) and macOS / Linux:

mkdir output

4 — Run the full pipeline

Windows (PowerShell):

docker run --rm -v "${PWD}\output:/app/output" puntas-ml

macOS / Linux:

docker run --rm -v "$(pwd)/output:/app/output" puntas-ml

The container will execute three sequential steps:

  1. Training & validation — trains all classifiers (Logistic Regression, SVM, Random Forest, MLP, Prototypical, four Siamese variants) under all augmentation strategies (none, bootstrap, bootstrap+Poisson) and both cross-validation modes (holdout, LOOCV), with and without Dirichlet features.
  2. Consensus prediction — aggregates per-model predictions into a wide table with majority-vote consensus and King-model selection (by F1-macro).
  3. Output collection — copies all artefacts to /app/output (bound to your local output/ folder).

Expected runtime: 20–60 minutes on a modern laptop (CPU only; no GPU required).
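The consensus logic of step 2 can be sketched in a few lines. The data below are hypothetical; the real aggregation is implemented in collect_predictions_by_id.py and ensemble.py.

```python
# Illustrative sketch of step 2: majority-vote consensus across per-model
# predictions, plus "King" selection (the single best model by F1-macro).
from collections import Counter

per_model_preds = {            # model -> predicted phase per undated assemblage
    "rf":      [3, 4, 2],
    "svm":     [3, 5, 2],
    "siamese": [3, 4, 1],
}
f1_macro = {"rf": 0.21, "svm": 0.18, "siamese": 0.20}  # validation scores

# Majority vote per assemblage (ties fall to the first-seen label here).
consensus = [Counter(col).most_common(1)[0][0]
             for col in zip(*per_model_preds.values())]

# King model: highest validation F1-macro.
king = max(f1_macro, key=f1_macro.get)
print(consensus, king)  # [3, 4, 2] rf
```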

5 — Inspect the results

After the run completes, the output/ directory will contain:

output/
├── ml_train/
│   ├── index.html                            ← Interactive benchmark overview
│   ├── benchmark_report_*.html               ← Per-strategy HTML reports
│   ├── summary_*.csv                         ← Metrics table (accuracy, F1, etc.)
│   ├── <run_id>/
│   │   ├── models/model.joblib               ← Serialised trained model
│   │   ├── metrics/metrics.json              ← Validation metrics (JSON)
│   │   ├── confusion_matrix.png              ← Confusion matrix plot
│   │   └── roc_curve.png                     ← ROC curve plot
│   └── ...
├── predictions/
│   ├── <run_id>.csv                          ← Per-model APOSTERIORI predictions
│   ├── aposteriori_predictions_wide.csv      ← Wide consensus table
│   ├── aposteriori_predictions_wide.html     ← Human-readable consensus report
│   └── predictions_summary.csv              ← King + Matched + per-model columns
└── figures/
    ├── fig_sankey_600dpi.png                 ← Sankey: expert type → model prediction
    ├── fig_confusion_norm_600dpi.png         ← Normalised confusion matrix
    ├── fig_breakdown_stacked_600dpi.png      ← Stacked bar prediction breakdown
    ├── fig_type_portrait_600dpi.png          ← Per-type prediction portrait
    ├── predictions/
    │   ├── dirichlet_posteriors_train.csv
    │   ├── dirichlet_posteriors_valid.csv
    │   └── dirichlet_posteriors_aposteriori.csv
    └── plots/
        └── Dirichlet_*.svg                   ← Per-site Dirichlet probability plots

Reproducibility guarantees

| Mechanism | Implementation |
| --- | --- |
| Fixed random seed | --random-state 42 passed to all stochastic estimators |
| Pinned Python version | python:3.11.9-slim-bookworm (Dockerfile FROM) |
| Pinned library versions | All packages fixed in requirements.txt |
| Disabled thread-level parallelism | OMP_NUM_THREADS=1, OPENBLAS_NUM_THREADS=1, MKL_NUM_THREADS=1 |
| Disabled Python hash randomisation | PYTHONHASHSEED=0 |
| Headless matplotlib backend | MPLBACKEND=Agg — no display required |
| Non-root container execution | Runs as unprivileged archaeo user |

Note on hardware variation: Floating-point arithmetic is deterministic given the same CPU instruction set. Minor numerical differences (< 1e-10) may appear if comparing results between x86-64 and ARM64 (Apple Silicon) hosts, but these do not affect any classification outcome or reported metric.
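The environment-level measures in the table above are baked into the Docker image, but they can also be mirrored from Python for local runs. A minimal sketch (assuming the variables are set before the numerical libraries are imported; PYTHONHASHSEED in particular only takes effect if set before the interpreter starts, which the Dockerfile handles via ENV):

```python
# Sketch of the determinism measures from the table above, applied in-process.
import os

os.environ["OMP_NUM_THREADS"] = "1"        # single-threaded OpenMP
os.environ["OPENBLAS_NUM_THREADS"] = "1"   # single-threaded OpenBLAS
os.environ["MKL_NUM_THREADS"] = "1"        # single-threaded MKL
os.environ["MPLBACKEND"] = "Agg"           # headless matplotlib backend

RANDOM_STATE = 42  # passed to every stochastic estimator, e.g.
# RandomForestClassifier(random_state=RANDOM_STATE)
```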

Running individual scripts (advanced)

You can also enter the container interactively and run scripts manually:

docker run --rm -it -v "$(pwd)/output:/app/output" puntas-ml bash

Inside the container:

# Train only specific models
python main.py data/Puntas_TODOS.ods \
    --models rf svm \
    --augmentations none \
    --outdir ml_train \
    --preddir predictions \
    --random-state 42

# Run Dirichlet model only
python dirichlet_predictive_model.py data/Puntas_TODOS.ods \
    --plot --verbose

# Collect and summarise predictions
python collect_predictions_by_id.py \
    --pred-dir predictions \
    --outdir ml_train \
    --metric f1_macro

Software dependencies

All packages are installed from PyPI at the exact versions listed below:

| Package | Version | Purpose |
| --- | --- | --- |
| numpy | 1.26.4 | Array operations |
| pandas | 2.2.2 | Tabular data / ODS reading |
| scipy | 1.13.1 | Statistical functions (Dirichlet log-likelihood) |
| scikit-learn | 1.5.1 | Classical ML classifiers, CV, metrics |
| joblib | 1.4.2 | Model serialisation & parallelism control |
| torch | 2.3.1 | Deep Siamese network (CPU build) |
| odfpy | 1.4.1 | ODS spreadsheet engine |
| openpyxl | 3.1.5 | Excel/ODS compatibility layer |
| matplotlib | 3.9.1 | Figure generation |
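The Dirichlet log-likelihood that scipy is listed for has a closed form via the log-gamma function. The sketch below shows the standard Dirichlet-multinomial log-pmf, not the project's dirichlet_predictive_model.py; the function name is illustrative.

```python
# Standard Dirichlet-multinomial log-likelihood via scipy.special.gammaln:
# log P(x | alpha) = log n! - sum log x_i!
#                  + log Gamma(A) - log Gamma(A + n)
#                  + sum [log Gamma(a_i + x_i) - log Gamma(a_i)],  A = sum a_i
import numpy as np
from scipy.special import gammaln

def dirmult_logpmf(counts, alpha):
    counts = np.asarray(counts, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    n = counts.sum()
    return (gammaln(n + 1) - gammaln(counts + 1).sum()
            + gammaln(alpha.sum()) - gammaln(alpha.sum() + n)
            + gammaln(alpha + counts).sum() - gammaln(alpha).sum())

# With alpha = (1, 1, 1), all C(n+2, 2) count vectors are equally likely:
# for n = 4 over 3 types, each has probability 1/15.
print(dirmult_logpmf([3, 1, 0], [1.0, 1.0, 1.0]))
```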

Input data

| File | Format | Description |
| --- | --- | --- |
| data/Puntas_TODOS.ods | OpenDocument Spreadsheet | Lithic assemblage dataset with three sheets: APRIORI (training set of dated assemblages), MODEL_CHECKING (validation set), APOSTERIORI (undated assemblages for chronological prediction) |

Feature columns used: Type.1, Type.2, Type.3, Type.4, Type.5, Type.7, Metal, Campanif
Target column: Phase (integer, chronological period)
Identifier column: Yacimiento (site name, used as observation ID)

Type.6 is intentionally excluded from all analyses — see Methods section of the article.
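The feature matrix described above can be assembled as in the sketch below. The actual loader is data_loader.py and would read the spreadsheet itself, e.g. `pd.read_excel("data/Puntas_TODOS.ods", sheet_name="APRIORI", engine="odf")`; here a small in-memory frame with hypothetical values stands in for the APRIORI sheet so the example is self-contained.

```python
# Hedged sketch: select the feature columns listed above (Type.6 excluded)
# and convert raw type counts to assemblage-level relative frequencies.
import pandas as pd

FEATURES = ["Type.1", "Type.2", "Type.3", "Type.4", "Type.5", "Type.7",
            "Metal", "Campanif"]  # Type.6 intentionally absent

sheet = pd.DataFrame({  # stand-in for the APRIORI sheet; values are made up
    "Yacimiento": ["Site A", "Site B"],
    "Type.1": [4, 0], "Type.2": [1, 2], "Type.3": [0, 3], "Type.4": [2, 1],
    "Type.5": [0, 0], "Type.6": [9, 9], "Type.7": [1, 0],
    "Metal": [0, 1], "Campanif": [0, 1], "Phase": [2, 5],
})

X = sheet.set_index("Yacimiento")[FEATURES]  # drops Type.6 by construction
y = sheet.set_index("Yacimiento")["Phase"]
X_rel = X.div(X.sum(axis=1), axis=0)         # row-wise relative frequencies
print(X_rel.loc["Site A", "Type.1"])  # 0.5
```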

Licence

The source code is released under the MIT Licence (see LICENSE file if included). The dataset (Puntas_TODOS.ods) is released under CC BY 4.0.

Citation

If you use this code or data, please cite the original article:

[Author et al. (in press). A Machine Learning Framework for Chronological 
Classification of Archaeological Samples Based on Lithic Typology Distributions. 
Journal of Archaeological Science. DOI: XXXX]

Contact

For questions about the code or data, please open an issue in the repository or contact the corresponding authors.

Additional details

Programming language: Python