Representation Before Action: How Dynamics-Aware Perception, Tactile Grounding, and Instruction Granularity Define the Upstream Bottleneck in Robot Generalization

Saluca Agentic AI Research Team

doi:10.5281/zenodo.20525999

Published June 3, 2026 | Version v2

Working paper | Open

Saluca Agentic AI Research Team¹

1. Saluca LLC

Version 2 — revised in response to an external structural review and an automated critique pass. See "Response to Review" appendix in the PDF for the change log.

A persistent structural pattern across recent robotics preprints is that generalization failures in robot learning are predominantly *upstream* failures — they originate in how the robot represents the world before any action is computed, not in the action-selection mechanism itself. This paper synthesizes five to seven findings from recent cs.RO and cs.HC preprints to argue that three complementary upstream bottlenecks — dynamics-aware visual representation, physics-grounded tactile encoding, and fine-grained language supervision — each independently constrain downstream policy generalization, and that addressing any one in isolation yields bounded gains. This is offered as a **heuristic reading**, not a formal derivation: the three bottlenecks share a structural pattern (richer upstream signal → more decodable downstream behavior) but are not unified by a single formalism, and the analogies drawn across modalities are structural rather than mechanistic. The corpus spans cs.RO and cs.HC preprints from May–June 2026, with supporting evidence from eess.SY on sample complexity. 
Key falsifiable claims include:

(1) dynamics-aware visual encoders trained on image-language-3D flow triplets outperform static encoders by up to +22.5% in out-of-distribution manipulation scenarios, but only under the simulation and limited real-world conditions reported in that preprint's abstract [corpus:arxiv:2605.30350];

(2) Center-of-Pressure tactile representations achieve zero-shot sim-to-real transfer on contact-rich tasks where coarse binary-contact baselines fail, evaluated on two tasks with a single multi-fingered hand platform [corpus:arxiv:2605.28812];

(3) fine-grained instruction supervision follows an inverted-U mixing curve, peaking at a fine-grained to raw (FG:Raw) mixing ratio between 1:2 and 1:1 and reaching 86.8% success, in simulation only [corpus:arxiv:2605.27284];

(4) embodied VR feedback reshapes neural representations to yield r = 0.762 motor-imagery decoding correlation versus r = 0.672 for screen feedback, with improvements of 8.9–13.0% across movement dimensions, in a ten-participant human BCI study [corpus:arxiv:2605.29677].

The primary falsification path is to train matched policies on identical downstream tasks with and without each upstream enrichment, controlling for policy architecture and data volume, and to test whether gains persist under held-out embodiment transfer.
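The falsification path above amounts to a factorial ablation over the three upstream enrichments, with the policy architecture and data volume held fixed in every arm. The following is a minimal sketch of that design, not the paper's protocol: the enrichment labels, the `ablation_grid` and `isolated_gain` helpers, and the shape of the `success` mapping are all hypothetical names introduced here for illustration.

```python
from itertools import product

# Hypothetical labels for the three upstream enrichments (assumed names).
ENRICHMENTS = ("dynamics_visual", "cop_tactile", "fine_grained_language")


def ablation_grid(policy_arch, num_demos):
    """Enumerate every on/off combination of the enrichments.

    The policy architecture and data volume are copied unchanged into
    each arm, so they are controlled across the whole grid.
    """
    grid = []
    for flags in product((False, True), repeat=len(ENRICHMENTS)):
        config = dict(zip(ENRICHMENTS, flags))
        config["policy_arch"] = policy_arch  # same in every arm
        config["num_demos"] = num_demos      # same in every arm
        grid.append(config)
    return grid


def isolated_gain(success, enrichment):
    """Gain from enabling one enrichment alone versus the all-off baseline.

    `success` is assumed to map a frozenset of enabled enrichments to a
    success rate measured under held-out embodiment transfer.
    """
    return success[frozenset({enrichment})] - success[frozenset()]
```

Calling `ablation_grid` with any fixed architecture label and demonstration count yields eight arms (2^3 combinations). Comparing `isolated_gain` for each enrichment against the gain of the all-enabled arm is one way to check the "bounded gains in isolation" reading; a full factorial is chosen here so that interactions between enrichments remain visible rather than being assumed away.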

Authorship: Saluca Agentic AI Research Team (Saluca LLC). AI-drafted from arXiv preprint corpus on the date in the filename.

Cited arXiv preprints: 2605.01597, 2605.26640, 2605.27284, 2605.28726, 2605.28812, 2605.29091, 2605.29677, 2605.30280, 2605.30326, 2605.30350, 2605.30864, 2606.01478, 2606.01970, 2606.02027, 2606.02562

Notes

This paper was AI-drafted by an internal multi-persona research agent over a curated arXiv corpus. It is not peer-reviewed. All cited works are listed by arXiv ID; readers should follow those links to verify claims against the primary preprints.

Files

20260603_cyborg_upstream-representation-bottleneck-robot-generalization_v2.pdf (69.7 kB, md5:7398e5bdc3267f78d332a8e83aa3bda6)

