H2E: A Perception-Reasoning Pipeline for Aviation World Models
Authors/Creators
Description
The H2E aviation world model research pipeline, detailed in WM.pdf, analyzes aviation scenarios from recorded video by pairing the V-JEPA 2 perception model with Claude Opus 4.8 for reasoning. Its modular, offline architecture processes video in three steps: it extracts features, generates structured, expert-like action descriptions, and calculates the Semantic Reasoning Overlap Indicator (SROI), which measures the alignment between model outputs and expert intent.
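As a rough illustration of the modular flow described above (features, then action descriptions, then a score), the following sketch wires three stages together as plain callables. Every name in it (`Pipeline`, `extract_features`, `describe_actions`, `score_overlap`, and the `toy_*` functions) is hypothetical and not taken from WM.pdf. The toy stages are placeholders that run without any model or network access, and the word-overlap score is only a stand-in: this description does not define how SROI is computed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Pipeline:
    """Hypothetical three-stage offline pipeline; all stage names are illustrative."""
    extract_features: Callable[[str], Sequence[float]]   # video path -> features
    describe_actions: Callable[[Sequence[float]], str]   # features -> action description
    score_overlap: Callable[[str, str], float]           # (model text, expert text) -> score

    def run(self, video_path: str, expert_text: str) -> float:
        features = self.extract_features(video_path)
        description = self.describe_actions(features)
        return self.score_overlap(description, expert_text)


# Placeholder stages (assumptions): they stand in for the perception model,
# the reasoning model, and the overlap metric respectively.
def toy_features(video_path: str) -> List[float]:
    return [float(len(video_path))]


def toy_describe(features: Sequence[float]) -> str:
    return "maintain heading and monitor traffic"


def toy_score(model_text: str, expert_text: str) -> float:
    # Word-level Jaccard overlap: a stand-in, NOT the SROI definition.
    a = set(model_text.lower().split())
    b = set(expert_text.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


pipeline = Pipeline(toy_features, toy_describe, toy_score)
result = pipeline.run("segment.mp4", "maintain heading and reduce speed")
```

Passing the stages in as callables keeps each one swappable, which is one plausible reading of "modular" here; the actual stage boundaries are whatever WM.pdf specifies.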
The proof-of-concept evaluation, performed on a held-out video segment, yielded a primary Text SROI score of 0.4528. The Visual SROI score reached 0.5177, but it was identified as a memorization artifact of the limited training data used in this initial stage rather than a sign of true generalization. To maintain result integrity, the pipeline incorporates an error-path guard that flags failed API calls as invalid.
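One way such an error-path guard could look is sketched below, under stated assumptions: the names `ScoredResult`, `guarded_score`, and `mean_valid` are hypothetical, and the failure is simulated with a plain `TimeoutError` rather than a real API client exception. The idea is that a failed call yields a result marked invalid instead of a default score, so it cannot silently enter an aggregate.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ScoredResult:
    score: Optional[float]        # None when the underlying call failed
    valid: bool
    error: Optional[str] = None   # error message kept for later inspection


def guarded_score(call: Callable[[], float]) -> ScoredResult:
    """Run a scoring call; on any exception, flag the result as invalid."""
    try:
        return ScoredResult(score=call(), valid=True)
    except Exception as exc:  # a real pipeline would likely catch narrower API errors
        return ScoredResult(score=None, valid=False, error=str(exc))


def mean_valid(results: List[ScoredResult]) -> Optional[float]:
    """Average only the valid scores; return None if there are none."""
    scores = [r.score for r in results if r.valid and r.score is not None]
    return sum(scores) / len(scores) if scores else None


def failing_call() -> float:
    # Simulated failure standing in for a failed API request.
    raise TimeoutError("simulated API timeout")


results = [guarded_score(lambda: 0.5), guarded_score(failing_call)]
```

Returning `None` rather than `0.0` for a failed call is a deliberate choice in this sketch: a zero would be indistinguishable from a genuinely poor score.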
The evaluation also revealed a context gap: because no aircraft were visible in the input keyframes, the model produced generic, safety-oriented reasoning. Future development of the framework will prioritize acquiring more diverse training data across varied conditions, validating the SROI metric against human expertise, and extending the model's capabilities to include temporal and causal reasoning.
Files
| Name | Size |
|---|---|
| WM.pdf (md5:f8935212f5291f267963d3c5ab77e8ef) | 166.4 kB |