NVIDIA Corporation
MULTI-MODAL IMAGE TRANSLATION USING NEURAL NETWORKS
Last updated:
Abstract:
A source image is processed using an encoder network to determine a content code representative of a visual aspect of the source object represented in the source image. A target class is determined, which can correspond to an entire population of objects of a particular type. The user may specify specific objects within the target class, or a sampling can be done to select objects within the target class to use for the translation. Style codes for the selected target objects are determined that are representative of the appearance of those target objects. The target style codes are provided with the source content code as input to a translation network, which can use the codes to infer a set of images including representations of the selected target objects having the visual aspect determined from the source image.
Utility
19 Feb 2019
12 Sep 2019