Published July 11, 2021 | Version v1
Conference paper Open

Vision and Language Navigation Using Minimal Voice Instructions

  • 1. Research Scholar
  • 2. Professor

Description

Abstract: The proposed system is an algorithm for navigating any 3D-mapped environment in the Matterport 3D Simulator using only minimal voice instructions. During the training phase, the nodes of a selected environment are traversed sequentially in the Simulator and an object recognition algorithm is applied to the panorama at each node, which identifies and tags the objects in the vicinity of each viewpoint. During the testing phase, a natural language instruction specifying the goal location is taken as input. The goal location is identified from among the viewpoints in the 3D environment by matching the instruction against the tags generated in the training phase. A shortest-path algorithm is then used to navigate from the starting location to the goal location. The work focuses on implementing this algorithm, which combines natural language processing and computer vision and can be employed by agents for indoor navigation.
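
The testing-phase steps described in the abstract (match the instruction to viewpoint tags, then plan a shortest path) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the viewpoint ids, tags, and edge distances are invented toy data, the voice instruction is assumed to be already transcribed to text, goal matching is a plain substring count, and Dijkstra's algorithm stands in for the unspecified shortest-path method. It does not call the Matterport 3D Simulator and is not the authors' implementation.

```python
import heapq

# Hypothetical output of the training phase: viewpoint id -> object tags
# detected in that viewpoint's panorama (toy data, not from the paper).
TAGS = {
    "v0": {"door", "shoe rack"},
    "v1": {"sofa", "television"},
    "v2": {"dining table", "chair"},
    "v3": {"bed", "lamp"},
}

# Hypothetical navigation graph: viewpoint -> {neighbor: distance in metres}.
GRAPH = {
    "v0": {"v1": 2.0, "v2": 5.0},
    "v1": {"v0": 2.0, "v2": 1.5, "v3": 4.0},
    "v2": {"v0": 5.0, "v1": 1.5, "v3": 1.0},
    "v3": {"v1": 4.0, "v2": 1.0},
}


def find_goal(instruction, tags):
    """Return the viewpoint with the most tags mentioned in the instruction,
    or None if no tag is mentioned at all."""
    text = instruction.lower()
    scores = {vp: sum(1 for t in objs if t in text) for vp, objs in tags.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the viewpoints from start to goal,
    or None if the goal is unreachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None
    # Walk the predecessor links back from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


goal = find_goal("Take me to the bed", TAGS)
route = shortest_path(GRAPH, "v0", goal)
```

With this toy data, the instruction "Take me to the bed" selects viewpoint `v3`, and the planned route from `v0` passes through `v1` and `v2`. A real system would likely need a more robust matcher than substring counting, for example one that handles synonyms or ties between viewpoints.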

Files

Vision and Language Navigation using Minimal Voice Instructions.pdf

Files (392.7 kB)