Path Mixing VLN dataset

Anonymous

doi:10.5281/zenodo.10396782

Published January 15, 2024 | Version v0.3

Dataset Open

The Room-to-Room (R2R) dataset consists of human-annotated instructions corresponding to paths in navigation graphs. Each path is a sequence of viewpoints encountered by the agent during navigation. A derived dataset, Fine-Grained R2R (FGR2R), annotates parts of each instruction with the corresponding graph edges. Existing work in VLN has shown that more instruction examples can improve an agent's performance in previously unseen environments. We generate 162k instruction-trajectory pairs with path lengths between 5 m and 20 m. The final dataset averages 7.27 views per path, a trajectory length of 14.4 m, and 82 words per instruction.

Files (7.4 GB)

Name	MD5	Size
augment_train.json	26b5b4eac16c6dc7c0e72a8d026ad8f7	7.0 GB
augment_val_seen.json	74a28718bcfb5bc3fc85515d5d7bc84f	2.8 MB
augment_val_unseen.json	a4755df592bb2c6d095c2b321618e142	389.3 MB
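
Because the training file is about 7 GB, it may be worth checking a download against the MD5 checksums in the file listing above before parsing it. The following is a minimal sketch, not part of the dataset's tooling: the helper names `md5_of_file` and `verify_downloads` are hypothetical, and the sketch assumes the three files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "augment_train.json": "26b5b4eac16c6dc7c0e72a8d026ad8f7",
    "augment_val_seen.json": "74a28718bcfb5bc3fc85515d5d7bc84f",
    "augment_val_unseen.json": "a4755df592bb2c6d095c2b321618e142",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True/False (checksum match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.is_file() else None
    return results


if __name__ == "__main__":
    # Hypothetical usage: run from the directory holding the downloaded files.
    for name, ok in verify_downloads().items():
        status = {True: "OK", False: "MISMATCH", None: "missing"}[ok]
        print(f"{name}: {status}")
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, which matters for the 7.0 GB training split.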

	All versions	This version
Views	375	338
Downloads	212	198
Data volume	633.2 GB	633.1 GB

Resource type

Dataset

Publisher

Zenodo

Conference

Spatially-Aware Speaker for Vision Language Navigation Instruction Generation (SAS VLN)

Languages

English

License: MIT License

A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

Technical metadata

Created: December 17, 2023
Modified: February 12, 2024
