VLM Action Parser Library

Kuster, Boris; Simonič, Mihael; Mavsar, Matija; Nemec, Bojan

doi:10.5281/zenodo.14929983

Published February 26, 2025 | Version v1

Software Open

Affiliation: Jožef Stefan Institute

A module for predicting and executing robotic skills using vision-language models (VLMs).

Initial textual instructions (e.g. task board completion steps), along with an optional auxiliary image (e.g. one depicting the task board components), are processed into a robot-executable task list. The module relies on a skill library consisting of motion primitives for executing tasks (e.g. the steps in a task board benchmark).
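The step from free-text instructions to a robot-executable task list can be sketched as follows. This is a minimal illustration, not the library's actual API: the skill names, the JSON reply format, and the functions build_prompt, parse_task_list, plan and fake_model are all assumptions made for this example, and the VLM is replaced by a stub so the sketch runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical skill library: skill name -> required argument names.
# The real library's skills and their parameters are not documented here.
SKILL_LIBRARY = {
    "press_button": ("target",),
    "open_door": ("target",),
    "move_slider": ("target", "position"),
}


@dataclass(frozen=True)
class Step:
    """One entry of the robot-executable task list."""
    skill: str
    args: dict


def build_prompt(instructions: str) -> str:
    """Ask the model to answer only with skills that exist in the library."""
    skills = ", ".join(sorted(SKILL_LIBRARY))
    return (
        "Convert the instructions into a JSON list of steps of the form "
        '{"skill": <name>, "args": {...}}. '
        f"Allowed skills: {skills}.\n"
        f"Instructions: {instructions}"
    )


def parse_task_list(raw: str) -> list[Step]:
    """Parse the model reply and reject steps the skill library cannot run."""
    steps = []
    for item in json.loads(raw):
        skill = item["skill"]
        if skill not in SKILL_LIBRARY:
            raise ValueError(f"unknown skill: {skill}")
        args = item.get("args", {})
        missing = [name for name in SKILL_LIBRARY[skill] if name not in args]
        if missing:
            raise ValueError(f"{skill} is missing arguments: {missing}")
        steps.append(Step(skill, args))
    return steps


def plan(instructions: str, model: Callable[[str], str]) -> list[Step]:
    """Run the prompt through any text-in, text-out model and validate the reply."""
    return parse_task_list(model(build_prompt(instructions)))


def fake_model(prompt: str) -> str:
    """Stand-in for a VLM call; always returns the same canned reply."""
    return json.dumps([
        {"skill": "press_button", "args": {"target": "blue_button"}},
        {"skill": "open_door", "args": {"target": "door"}},
    ])
```

Validating each step against the skill library before execution means a reply that names a nonexistent skill fails at planning time rather than on the robot.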

The module can also be queried to determine whether an action succeeded (e.g. whether a door has been opened).
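A success query of this kind might look like the sketch below. It assumes the VLM is asked a yes/no question about the current camera image and that its free-text reply is reduced to a three-valued verdict. The names parse_verdict, action_succeeded and ask_vlm are hypothetical and do not come from the library.

```python
from typing import Callable, Optional


def parse_verdict(reply: str) -> Optional[bool]:
    """Map a free-text reply to True, False, or None when it is inconclusive."""
    words = reply.strip().lower().replace(".", " ").replace(",", " ").split()
    if not words:
        return None
    if words[0] == "yes":
        return True
    if words[0] == "no":
        return False
    return None


def action_succeeded(
    question: str,
    ask_vlm: Callable[[str, Optional[str]], str],
    image_path: Optional[str] = None,
) -> Optional[bool]:
    """Ask a yes/no question about the scene and interpret the answer.

    ask_vlm is any callable taking (prompt, image_path) and returning text;
    in practice it would wrap the configured VLM.
    """
    prompt = f"{question} Answer with 'yes' or 'no'."
    return parse_verdict(ask_vlm(prompt, image_path))
```

Returning None for an unclear reply lets the caller decide whether to repeat the query or the action instead of silently treating uncertainty as failure.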

Internally, the module uses LangChain, so it can connect to different VLMs (local models or the OpenAI API).
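One way such backend switching could be arranged with LangChain is sketched below. The registry and the make_chat_model function are hypothetical, and the choice of Ollama as the local backend is an assumption, since the description does not say how local models are served. ChatOpenAI and ChatOllama are the chat model classes from the langchain-openai and langchain-ollama packages; they are imported lazily so that only the selected backend needs to be installed.

```python
from typing import Any, Callable, Dict


def _openai(model_name: str) -> Any:
    # Needs the langchain-openai package and an OPENAI_API_KEY in the environment.
    from langchain_openai import ChatOpenAI
    return ChatOpenAI(model=model_name)


def _ollama(model_name: str) -> Any:
    # Needs the langchain-ollama package and a running local Ollama server.
    from langchain_ollama import ChatOllama
    return ChatOllama(model=model_name)


# Hypothetical registry mapping a provider name to a model factory.
BACKENDS: Dict[str, Callable[[str], Any]] = {
    "openai": _openai,
    "ollama": _ollama,
}


def make_chat_model(provider: str, model_name: str) -> Any:
    """Return a chat model for the given provider, or raise on an unknown one."""
    try:
        factory = BACKENDS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None
    return factory(model_name)
```

Because LangChain chat models share the same invoke interface, the rest of a pipeline can stay unchanged when the provider string is switched.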

Files

Name	Size	Checksum
vlm_action_parser-main.zip	642.1 kB	md5:fc7ca107e73955c41eb5793d58c52f23

Additional details

Funding: European Commission, euROBIN - European ROBotics and AI Network (grant 101070596)

Repository URL: https://repo.ijs.si/hcr/deep_learning/vlm_action_parser
Programming language: Python
Development Status: Active

	All versions	This version
Views	154	154
Downloads	23	23
Data volume	14.8 MB	14.8 MB

Resource type: Software
Publisher: Jožef Stefan Institute

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: February 26, 2025
Modified: February 26, 2025
