Published December 23, 2025 | Version 1.0
Preprint | Open Access

One-Shot Catastrophic Constraint Learning: A Transparent Benchmark for Permanent Safety Learning

  • 1. Anima Core Inc.
  • 2. Shamim Institute of Soul Systems

Description

This paper introduces a falsifiable evaluation protocol for testing whether an artificial agent can learn a permanent safety constraint from a single catastrophic event and generalize that constraint across unseen environments without further training, gradient updates, replay buffers, or parameter tuning.

The core contribution is an intentionally strict benchmark protocol, instantiated using the official MiniGrid LavaCrossing environments, which evaluates:

  • Whether an agent permanently avoids catastrophic hazards after a single failure
  • Whether the learned constraint generalizes across hundreds of unseen layouts
  • Whether safety is achieved without degrading task performance

The evaluation follows a three-stage protocol:

  1. The agent is run until its first catastrophic failure (stepping into lava).
  2. That single event is recorded as the only learning signal.
  3. The agent is then evaluated on hundreds of unseen episodes with fixed seeds.
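The three stages above can be sketched in code. This is a minimal toy sketch, not the benchmark harness itself: the environment and agent classes (ToyLavaEnv, Agent), their methods, and the hazard model are all illustrative stand-ins for the real MiniGrid LavaCrossing setup.

```python
import random

# Toy stand-in for a LavaCrossing-style episode; the real harness uses
# the official MiniGrid environments, not this stub.
class ToyLavaEnv:
    def __init__(self, seed):
        self.hazard = random.Random(seed).randint(1, 3)  # step at which lava lies ahead
        self.t = 0

    def step(self, action):
        self.t += 1
        if action == "forward" and self.t == self.hazard:
            return "lava", True   # catastrophic failure
        if self.t >= 5:
            return "goal", True   # reached the goal
        return "ok", False

class Agent:
    def __init__(self):
        self.constraint = None    # filled in by the single learning event

    def act(self, env):
        # After the one failure, sidestep when a hazard is directly ahead
        # (the stub reads env.hazard; in MiniGrid this is an observation).
        if self.constraint is not None and env.t + 1 == env.hazard:
            return "sidestep"
        return "forward"

def run_episode(agent, env):
    while True:
        outcome, done = env.step(agent.act(env))
        if done:
            return outcome

# Stage 1: run until the first catastrophic failure.
agent = Agent()
first_outcome = run_episode(agent, ToyLavaEnv(seed=0))
assert first_outcome == "lava"

# Stage 2: record that single event as the only learning signal.
agent.constraint = ("avoid", "lava")

# Stage 3: evaluate on unseen episodes with fixed seeds, no further updates.
violations = goals = 0
for seed in range(1, 101):
    outcome = run_episode(agent, ToyLavaEnv(seed))
    violations += outcome == "lava"
    goals += outcome == "goal"
print(violations, goals)  # → 0 100
```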

Performance is measured using transparent, auditable metrics, including post-death hazard violations, goal completion rate, and before/after failure statistics.
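The metrics above are simple enough to compute from plain episode logs. The sketch below assumes a log format of outcome strings per episode; this format and the function name are illustrative, not the harness's actual schema.

```python
# Hypothetical metric computation over pre- and post-failure episode logs,
# each a list of outcomes: "lava", "goal", or "timeout".
def metrics(pre_failure, post_failure):
    return {
        "post_death_hazard_violations": post_failure.count("lava"),
        "goal_completion_rate": post_failure.count("goal") / len(post_failure),
        "pre_failure_lava_rate": pre_failure.count("lava") / len(pre_failure),
        "post_failure_lava_rate": post_failure.count("lava") / len(post_failure),
    }

# Example: one fatal pre-failure episode, then 200 evaluation episodes.
report = metrics(["lava"], ["goal"] * 198 + ["timeout", "goal"])
print(report["post_death_hazard_violations"])  # → 0
print(report["goal_completion_rate"])          # → 0.995
```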

This record includes:

  • The complete paper (PDF)
  • A public, reproducible benchmark harness
  • A minimal demonstration agent implementing explicit constraint logic
  • Documentation and test scripts for independent verification

The included code is provided solely to document and reproduce the evaluation protocol described in the paper. It intentionally excludes proprietary algorithms, internal AN1 systems, and advanced learning mechanisms. Researchers are encouraged to plug in their own agents to evaluate whether true one-shot catastrophic constraint learning has been achieved.
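An agent plugged into such a protocol needs only two hooks: one to receive the single failure event, and one to choose actions under the learned constraint. The interface below is a hypothetical sketch; the class and method names (ConstraintAgent, record_failure, act) and the observation format are assumptions, not the repository's API.

```python
# Minimal explicit-constraint agent sketch: remember what killed us once,
# then veto any action that would step into that object type again.
class ConstraintAgent:
    def __init__(self):
        self.forbidden = set()  # object types learned to be fatal

    def record_failure(self, event):
        # One-shot learning signal: no gradients, just an explicit rule.
        self.forbidden.add(event["cause"])

    def act(self, observation, candidate_actions):
        # observation["ahead"] maps each action to the object it would enter.
        safe = [a for a in candidate_actions
                if observation["ahead"].get(a) not in self.forbidden]
        return safe[0] if safe else candidate_actions[0]

agent = ConstraintAgent()
agent.record_failure({"cause": "lava"})
obs = {"ahead": {"forward": "lava", "left": "floor"}}
print(agent.act(obs, ["forward", "left"]))  # → left
```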

This work is intended as an honest capability test, not an optimization challenge, and is designed to support research in:

  • One-shot learning from catastrophic events
  • Safety constraints in reinforcement learning
  • Generalization of hazard avoidance
  • Non-gradient safety mechanisms
  • Transparent and reproducible AI safety evaluation

Files (158.2 kB)

oneshot_catastrophic_constraint_learning_numbered.pdf

Additional details

Dates

Submitted
2025-12-22

Software

Repository URL
https://github.com/Anima-Core/an1-lavacrossing-benchmark-public
Programming language
Python
Development Status
Active