Published February 26, 2025 | Version v1
Software
Open
VLM Action Parser Library
Authors/Creators
Description
Module for prediction and execution of robotic skills using vision-language models (VLMs).
Initial textual instructions (e.g. taskboard completion steps), along with an optional auxiliary image (e.g. one depicting the taskboard components), are processed into a robot-executable task list. The module relies on a skill library consisting of motion primitives for executing tasks (e.g. the steps of a taskboard benchmark).
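The parsing step described above can be sketched as follows. This is only an illustration under stated assumptions, not the module's actual API: it assumes the model is prompted to answer with a JSON list of steps, and the skill names, `SKILL_LIBRARY`, and `parse_task_list` are hypothetical. The model reply is hard-coded so the example runs without a VLM.

```python
import json

# Hypothetical skill library: skill name -> names of required parameters.
# The real skill library in the repository may look quite different.
SKILL_LIBRARY = {
    "press_button": ["color"],
    "open_door": [],
    "move_slider": ["position"],
}


def parse_task_list(model_output: str) -> list:
    """Turn a JSON answer from the model into a validated task list."""
    steps = json.loads(model_output)
    if not isinstance(steps, list):
        raise ValueError("expected a JSON list of steps")
    tasks = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {index}: expected a JSON object")
        skill = step.get("skill")
        # Reject anything the robot has no motion primitive for.
        if skill not in SKILL_LIBRARY:
            raise ValueError(f"step {index}: unknown skill {skill!r}")
        params = step.get("params") or {}
        missing = [name for name in SKILL_LIBRARY[skill] if name not in params]
        if missing:
            raise ValueError(f"step {index}: missing parameters {missing}")
        tasks.append({"skill": skill, "params": params})
    return tasks


# Stand-in for a model reply (assumed format, not real model output).
example_output = (
    '[{"skill": "press_button", "params": {"color": "blue"}},'
    ' {"skill": "open_door"}]'
)
tasks = parse_task_list(example_output)
print(tasks)
```

Validating each step against the skill library before execution means an unsupported or incomplete step is rejected early instead of being sent to the robot.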
It can also be queried to determine action success (e.g. whether or not the door has been opened).
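A success query of this kind could look roughly like the sketch below. It assumes the model is asked a yes/no question about the scene and answers in free text; `build_success_question` and `parse_success` are hypothetical helpers, not functions from the repository.

```python
def build_success_question(action: str) -> str:
    """Build a yes/no question about the outcome of an action (assumed prompt wording)."""
    return (
        f"The robot attempted to: {action}. "
        "Based on the image, did it succeed? Answer yes or no."
    )


def parse_success(answer: str) -> bool:
    """Map a free-text yes/no answer to a boolean; raise if it is ambiguous."""
    words = answer.strip().lower().replace(".", " ").replace(",", " ").split()
    first = words[0] if words else ""
    if first in {"yes", "true"}:
        return True
    if first in {"no", "false"}:
        return False
    raise ValueError(f"cannot interpret answer: {answer!r}")


question = build_success_question("open the door")
# Stand-in for a model reply; a real reply would come from the VLM.
succeeded = parse_success("Yes, the door is open.")
print(question)
print(succeeded)
```

Raising on an ambiguous answer, rather than defaulting to success or failure, leaves the decision to retry or stop with the caller.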
Internally, it uses LangChain, so the module can connect to different VLMs (local models or the OpenAI API).
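The benefit of a common model interface can be illustrated without LangChain itself. In the sketch below, a backend is any callable that takes a prompt and an optional image and returns text; the two stub backends, the `BACKENDS` registry, and `query` are hypothetical stand-ins, not part of the module or of LangChain.

```python
from typing import Callable, Dict, Optional

# A backend takes a prompt plus optional image bytes and returns the reply text.
Backend = Callable[[str, Optional[bytes]], str]


def local_stub(prompt: str, image: Optional[bytes] = None) -> str:
    """Stand-in for a locally hosted model."""
    return "[local] " + prompt


def hosted_stub(prompt: str, image: Optional[bytes] = None) -> str:
    """Stand-in for a hosted API model."""
    return "[hosted] " + prompt


# Hypothetical registry: the rest of the code only sees the Backend signature.
BACKENDS: Dict[str, Backend] = {"local": local_stub, "hosted": hosted_stub}


def query(backend_name: str, prompt: str, image: Optional[bytes] = None) -> str:
    """Send the same prompt to whichever backend is selected by name."""
    try:
        backend = BACKENDS[backend_name]
    except KeyError:
        raise ValueError(f"unknown backend: {backend_name!r}") from None
    return backend(prompt, image)


reply = query("local", "List the steps.")
print(reply)
```

Because the calling code depends only on the shared signature, switching between a local and a hosted model is a configuration change rather than a code change.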
Files
| Name | Size |
|---|---|
| vlm_action_parser-main.zip (md5:fc7ca107e73955c41eb5793d58c52f23) | 642.1 kB |
Additional details
Software
- Repository URL: https://repo.ijs.si/hcr/deep_learning/vlm_action_parser
- Programming language: Python
- Development Status: Active