NVIDIA Corporation
Scene embedding for visual navigation
Abstract:
Navigation instructions are determined using visual data or other sensory information. Individual frames can be extracted from video data captured during passes through an environment to generate a sequence of image frames. The frames are processed using a feature extractor to generate frame-specific feature vectors. Image triplets are then generated, each comprising a representative (anchor) image frame or its corresponding feature vector, a similar image frame adjacent to the anchor in the sequence, and a disparate image frame separated from the anchor by a number of frames in the sequence. An embedding network is trained using these triplets. Image data for a current position and a target destination can then be provided as input to the trained embedding network, which outputs a navigation vector indicating a direction and distance over which a vehicle is to be navigated through the physical environment.
Type: Utility
Filed: 11 Dec 2018
Granted: 26 Jan 2021
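
The following is a minimal sketch of the triplet-based embedding training the abstract describes, written in PyTorch. It is an illustrative reading, not the patented implementation: the network architecture, the names (make_triplets, EmbeddingNet, navigation_vector), and the hyperparameters (near, far_min, margin, feature and embedding dimensions) are all assumptions, and the patent does not specify that the navigation vector is the difference of two embeddings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_triplets(num_frames, near=1, far_min=30):
    """Build (anchor, positive, negative) index triplets over a frame sequence.

    near:    the positive is an adjacent frame, `near` steps ahead.
    far_min: the negative is separated from the anchor by at least
             `far_min` frames (a hypothetical threshold).
    """
    triplets = []
    for a in range(num_frames - near):
        p = a + near  # similar frame: adjacent in the sequence
        # Disparate frame: any index at least far_min frames from the anchor.
        candidates = [i for i in range(num_frames) if abs(i - a) >= far_min]
        n = candidates[torch.randint(len(candidates), (1,)).item()]
        triplets.append((a, p, n))
    return triplets


class EmbeddingNet(nn.Module):
    """Maps per-frame feature vectors into a scene-embedding space."""

    def __init__(self, feat_dim=512, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x):
        # Unit-normalize so distances are comparable across frames.
        return F.normalize(self.net(x), dim=-1)


def train_step(model, optimizer, features, triplets, margin=0.2):
    """One optimization step of a triplet margin loss over the embeddings.

    features: (T, D) tensor of frame-specific feature vectors from the
    feature extractor mentioned in the abstract.
    """
    loss_fn = nn.TripletMarginLoss(margin=margin)
    a, p, n = (list(idx) for idx in zip(*triplets))
    loss = loss_fn(model(features[a]), model(features[p]), model(features[n]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def navigation_vector(model, current_feat, target_feat):
    """Embed the current and target views; their difference is one plausible
    form of the abstract's navigation vector (a direction and a magnitude
    toward the goal). This exact formulation is an assumption."""
    with torch.no_grad():
        return model(target_feat) - model(current_feat)
```

The design intent the abstract implies is that frames close in the traversal should embed near each other while frames many steps apart embed far apart, so that distances in the embedding space act as a proxy for traversal distance in the physical environment; the triplet margin loss above is a standard way to induce that structure.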