When Agents Get Lost: Dissecting Failure Modes in Graph-Based Navigation Instruction Evaluation

Shami, Farzad; Abedini, Kimia; Hosseini, Seyed Hossein; Van de Weghe, Nico; Tenkanen, Henrikki

doi:10.5281/zenodo.20232766

Published May 16, 2026 | Version v1

Conference paper (open access)

1. Aalto University
2. Ghent University

Vision-and-Language Navigation (VLN) requires agents to interpret natural language instructions for spatial reasoning, yet evaluating instruction quality remains challenging when agents fail, because a failure may reflect the instruction, the agent, or both. A principled understanding of why navigation instructions fail therefore requires a systematic analysis of failure patterns in spatial reasoning tasks. We first present a taxonomy of navigation instruction failures that groups failure cases into four categories: (i) linguistic properties, (ii) topological constraints, (iii) agent limitations, and (iv) execution barriers. We then introduce a dataset of 492 annotated failed navigation traces collected from GROKE, a vision-free evaluation framework that uses OpenStreetMap (OSM) data. The dataset characterizes failure dynamics in spatial grounding to guide the development of better instruction generation, evaluation systems, and navigation agents. Our analysis of the GROKE failure traces shows that agent limitations (74.2%) are the dominant error category, with stop-location errors and planning failures as the most frequent subcategories.

The dataset and taxonomy together provide actionable insights that enable instruction generation systems to identify and avoid under-specification patterns while allowing evaluation frameworks to systematically distinguish between instruction quality issues and agent-specific artifacts.
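
As a minimal sketch of how a per-category share such as the 74.2% figure could be computed from annotated traces, the snippet below encodes the four taxonomy categories and tallies a list of labels. The names `FailureCategory` and `category_shares` and the toy labels are assumptions made for illustration; they are not taken from the released dataset or the GROKE code, whose actual schema may differ.

```python
from collections import Counter
from enum import Enum


class FailureCategory(Enum):
    """The four top-level categories named in the abstract (hypothetical encoding)."""
    LINGUISTIC = "linguistic properties"
    TOPOLOGICAL = "topological constraints"
    AGENT = "agent limitations"
    EXECUTION = "execution barriers"


def category_shares(labels):
    """Return the percentage of traces per category, rounded to one decimal place."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in FailureCategory}
    return {
        category: round(100.0 * counts[category] / total, 1)
        for category in FailureCategory
    }


# Toy labels for illustration only; these are NOT values from the dataset.
toy_labels = [
    FailureCategory.AGENT,
    FailureCategory.AGENT,
    FailureCategory.AGENT,
    FailureCategory.LINGUISTIC,
    FailureCategory.TOPOLOGICAL,
]

shares = category_shares(toy_labels)
dominant = max(shares, key=shares.get)

for category, share in shares.items():
    print(f"{category.value}: {share}%")
print(f"dominant category: {dominant.value}")
```

Applied to the real annotations, the same tally would presumably be run over one label per trace; whether traces can carry more than one category is not stated here, so the single-label assumption should be checked against the dataset documentation.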

Code: https://fuzsh.github.io/lost/

Files: GeoAI-Paper-3845.pdf (193.1 kB, md5:2b2237c17eb5c0d1545661b5733dc9a8)

Additional details

Research Council of Finland
Knowledgeable and Multimodal Geographic Large Language Models Grounded with Reasoning and Retrieval 368679

Repository URL: https://fuzsh.github.io/lost/
Development Status: Active
