Published May 10, 2026 | Version v1
Preprint Open

HOPE Architecture Proof-of-Concept: Dual-Speed Continual Learning for Aviation Video Classification Using V-JEPA2 and Gemini as Slow/Fast Memory Systems

  • 1. Sovereign Machine Lab (SOMALA)

Description

This paper presents a proof-of-concept (PoC) for the HOPE (Hierarchical Orchestration of Perceptual Experience) architecture, a framework for continual learning inspired by Google DeepMind's "Nested Learning" research. The goal of the project is to address the "catastrophic forgetting" problem in AI by creating a system that learns at two different speeds, much like the human brain.

FULL CODE

Core Architecture and Components

The system utilizes a dual-speed memory hypothesis, orchestrating two distinct production-grade AI models to handle different types of information:

  • Slow Memory (Long-Term): This is powered by Meta's V-JEPA2. It acts as the visual perception system and long-term storage. When the system encounters a novel or high-error situation, it backpropagates through specific learnable "heads" (Classifier, Latent Projector, and Dynamics Predictor) and updates the weights of those heads.

  • Fast Memory (Short-Term): This is powered by Google’s Gemini API. It provides language-based reasoning and context-level adaptation. In familiar, low-error situations, the system only updates Gemini’s context (a recurrent state dictionary) rather than modifying any underlying model weights.

  • HOPE Controller: This is the "brain" of the architecture. It computes a feedback novelty score (the error score) and uses it to decide which memory system to engage. An error threshold of 0.6 serves as the gating boundary: scores above it engage slow memory, and scores below it engage fast memory.
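The gating rule described for the HOPE Controller can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the names `route`, `GateDecision`, and `error_from_confidence` are hypothetical, the error score is assumed to be one minus the probability assigned to the true label, and a score exactly equal to 0.6 is assumed to take the fast path, since the description only covers scores above and below the threshold.

```python
from dataclasses import dataclass

ERROR_THRESHOLD = 0.6  # gating boundary stated in the description


@dataclass
class GateDecision:
    error_score: float
    path: str  # "fast" (context update) or "slow" (weight update)


def error_from_confidence(prob_of_true_label: float) -> float:
    """Assumed error score: 1 minus the probability given to the true label."""
    if not 0.0 <= prob_of_true_label <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 1.0 - prob_of_true_label


def route(error_score: float, threshold: float = ERROR_THRESHOLD) -> GateDecision:
    """Choose the memory path for one observation.

    Scores strictly above the threshold go to slow memory. Treating a score
    equal to the threshold as familiar is an assumption of this sketch.
    """
    path = "slow" if error_score > threshold else "fast"
    return GateDecision(error_score=error_score, path=path)
```

For example, `route(error_from_confidence(0.9))` selects the fast path, while `route(error_from_confidence(0.1))` selects the slow path.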

Application: Aviation Video Classification

The PoC is applied to the TartanAviation dataset, where the system must classify videos into 10 specific flight phases, such as:

  • Airplane takeoff and landing.

  • Ground operations (taxiing, pushback).

  • In-flight cruise and emergency landings.

  • Holding patterns and maintenance checks.

Demonstration Scenarios

To demonstrate the architecture's coherence, the paper details two scenarios:

  1. Familiar Situation: When the prediction is correct and the error score is low (below 0.6), the system performs Fast Adaptation. Only the Gemini context is updated, leaving the V-JEPA2 weights untouched.

  2. Novel Situation: When the prediction is incorrect (e.g., an emergency landing misclassified as in-flight cruise), the high error score (above 0.6) triggers Slow Adaptation. This initiates a weight update in the V-JEPA2 heads so that the system learns the new case.
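The two scenarios can be reproduced on a toy model to show the intended control flow. The sketch below is written under several assumptions and uses neither V-JEPA2 nor Gemini: `LinearHead` is a hypothetical stand-in for a learnable classifier head on frozen features, `context` is a plain dictionary standing in for the Gemini context, `hope_step` is an illustrative name, and the error score is assumed to be one minus the probability of the true label.

```python
import math

ERROR_THRESHOLD = 0.6  # gating boundary from the description


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearHead:
    """Hypothetical stand-in for a learnable classifier head."""

    def __init__(self, n_features, n_classes):
        self.weights = [[0.0] * n_features for _ in range(n_classes)]

    def predict_proba(self, features):
        logits = [sum(w * x for w, x in zip(row, features)) for row in self.weights]
        return softmax(logits)

    def sgd_step(self, features, label, lr=0.5):
        # Cross-entropy gradient of a linear softmax head: (p_k - onehot_k) * x
        probs = self.predict_proba(features)
        for k, row in enumerate(self.weights):
            grad_scale = probs[k] - (1.0 if k == label else 0.0)
            for j, x in enumerate(features):
                row[j] -= lr * grad_scale * x


def hope_step(head, context, features, label):
    """Process one labelled observation and return the path taken."""
    probs = head.predict_proba(features)
    error = 1.0 - probs[label]  # assumed definition of the error score
    if error > ERROR_THRESHOLD:
        head.sgd_step(features, label)  # slow adaptation: head weights change
        return "slow"
    # Fast adaptation: only the context dictionary is touched.
    context["recent"] = context.get("recent", []) + [label]
    return "fast"
```

With a freshly initialised three-class head, the first observation is treated as novel because the uniform prediction gives an error of about 0.67, so the head takes a gradient step. Repeating the same observation then falls below the threshold and only the context is updated.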

Limitations and The "Gemini 4" Hypothesis

The author acknowledges several simplifications in this PoC, including a binary adaptation switch rather than a true multi-level continuum and the use of simulated context updates for Gemini.

A central hypothesis of the paper is that the future release of Gemini 4—expected to feature native, stateful persistent memory—will allow this architecture to move from a simulation to a fully validated experimental state. This would enable the fast memory path to become a genuine learning component that persists across sessions without manual intervention.
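Until such native persistence exists, a fast-memory context has to be saved and reloaded by hand between sessions. The following sketch shows one way that manual step could look, assuming the context is a JSON-serialisable dictionary. The function names, the file layout, and the `max_items` cap are all hypothetical and are not taken from the paper's code.

```python
import json
from pathlib import Path


def load_context(path):
    """Return the saved context, or an empty one on the first run."""
    p = Path(path)
    if not p.exists():
        return {"observations": []}
    return json.loads(p.read_text(encoding="utf-8"))


def record_observation(context, label, error_score, max_items=20):
    """Append one observation and keep only the most recent max_items."""
    context["observations"].append({"label": label, "error": error_score})
    context["observations"] = context["observations"][-max_items:]
    return context


def save_context(context, path):
    """Write the context to disk so a later session can reload it."""
    Path(path).write_text(json.dumps(context, indent=2), encoding="utf-8")
```

A session would call `load_context` at start-up, `record_observation` after each fast-path update, and `save_context` before exiting. That explicit load and save cycle is the manual intervention that stateful model memory would remove.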

Technical Summary Table

Feature | PoC Implementation | Theoretical Goal
Primary Models | V-JEPA2 & Gemini 3 Pro | N-level Continuum Memory
Gating Mechanism | Error threshold > 0.6 | Surprise-driven gating
Learning Type | Backprop (Slow) / Context Update (Fast) | Multi-level optimization
Domain | Aviation (TartanAviation) | General Multimodal Learning
Code Access | Available on GitHub | Open Research

Files

HOPE_PoC_Paper (1).pdf (26.0 kB)
md5:fe65f705ee5bf8c690fe4bb2c0974c54