Published January 16, 2026 | Version starVLA-1.2
Software documentation (Open Access)

StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing

  • 1. HKUST
  • 2. Fudan University
  • 3. SUSTech
  • 4. Xi'an Jiaotong University
  • 5. The Chinese University of Hong Kong
  • 6. Tsinghua University/Microsoft
  • 7. Microsoft
  • 8. Harbin Institute of Technology & Zhongguancun Academy
  • 9. SII

Description

StarVLA is a modular and flexible codebase for developing Vision-Language-Action (VLA) models from Vision-Language Models (VLMs). In StarVLA (also a pun on “start VLA”), each functional component (model, data, trainer, config, evaluation, etc.) follows a top-down, intuitive separation with high cohesion and low coupling, enabling plug-and-play design, rapid prototyping, and independent debugging.
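The plug-and-play idea described above can be sketched with a small component registry: each part (backbone, action head, etc.) is registered under a name and assembled at build time, so swapping one piece does not touch the others. All names here (`ComponentRegistry`, `register`, `build`, and the example components) are illustrative assumptions, not StarVLA's actual API.

```python
# Minimal sketch of a "Lego-like" plug-and-play component registry.
# Hypothetical names for illustration only; see the repository for the real API.

from typing import Any, Callable, Dict


class ComponentRegistry:
    """Maps component names to factory functions so parts can be swapped."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable:
        """Decorator that records a factory under a given component name."""
        def decorator(factory: Callable) -> Callable:
            self._factories[name] = factory
            return factory
        return decorator

    def build(self, name: str, **kwargs: Any) -> Any:
        """Instantiate a registered component by name."""
        if name not in self._factories:
            raise KeyError(f"unknown component: {name}")
        return self._factories[name](**kwargs)


MODELS = ComponentRegistry()


@MODELS.register("vlm_backbone")
def build_vlm_backbone(hidden_dim: int = 512) -> dict:
    return {"type": "vlm_backbone", "hidden_dim": hidden_dim}


@MODELS.register("action_head")
def build_action_head(action_dim: int = 7) -> dict:
    return {"type": "action_head", "action_dim": action_dim}


# Assemble a VLA "policy" from independently registered parts; replacing a
# component only means registering a different factory under the same name.
policy = {
    "backbone": MODELS.build("vlm_backbone", hidden_dim=256),
    "head": MODELS.build("action_head"),
}
```

Because each factory is self-contained, a new backbone or head can be prototyped and debugged in isolation before being wired into a full training run.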

Files (33.2 MB)

starVLA/starVLA-starVLA-1.2.zip (33.2 MB)
md5:c4a8039bbba001a16ff1d19f281a6128

Additional details

Related works

Is supplement to
Software documentation: https://github.com/starVLA/starVLA/tree/starVLA-1.2 (URL)

Software

Repository URL
https://github.com/starVLA/starVLA
Programming language
Python
