Adobe Inc.
SLOT FILLING WITH CONTEXTUAL INFORMATION
Abstract:
A system, method, and non-transitory computer-readable medium for editing images with verbal commands are described. Embodiments may include an artificial neural network (ANN) comprising: a word embedding component configured to convert text input into a set of word vectors; a feature encoder configured to create a combined feature vector for the text input based on the word vectors; a scoring layer configured to compute labeling scores based on the combined feature vector; and a command component configured to identify a set of image editing word labels based on the labeling scores. The feature encoder, the scoring layer, or both are trained using multi-task learning with a loss function that includes a first loss value and an additional loss value based on mutual information, context-based prediction, or sentence-based prediction.
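The pipeline named in the abstract (word embedding, feature encoder, scoring layer, multi-task loss, command component) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the filing: the label set, the pseudo-random embeddings, the mean-concatenation encoder, the linear scoring layer, the particular sentence-based auxiliary loss, and the weighting factor `alpha`. No training step is shown; the weights are random, so the predicted labels are arbitrary.

```python
import math
import random

LABELS = ["O", "B-action", "B-attribute", "B-value"]  # hypothetical label set
DIM = 8  # hypothetical embedding size


def embed(tokens, dim=DIM):
    """Word embedding component: one fixed pseudo-random vector per token."""
    vectors = []
    for tok in tokens:
        rng = random.Random(tok)  # seeded by the token, so the mapping is repeatable
        vectors.append([rng.uniform(-1, 1) for _ in range(dim)])
    return vectors


def encode(word_vectors):
    """Feature encoder (assumed form): word vector concatenated with the sentence mean."""
    n = len(word_vectors)
    mean = [sum(col) / n for col in zip(*word_vectors)]
    return [vec + mean for vec in word_vectors]


def make_weights(rows, cols, seed):
    """Random, untrained weight matrix with small entries."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def score(features, weights):
    """Scoring layer (assumed linear): one score per label for each token."""
    return [[sum(w * x for w, x in zip(row, feat)) for row in weights]
            for feat in features]


def log_softmax(scores):
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def labeling_loss(all_scores, gold):
    """First loss value: mean token-level cross-entropy against gold labels."""
    total = -sum(log_softmax(s)[LABELS.index(g)] for s, g in zip(all_scores, gold))
    return total / len(gold)


def sentence_loss(features, weights, gold):
    """Additional loss (one assumed sentence-based variant): predict which
    labels occur anywhere in the sentence from max-pooled features."""
    pooled = [max(col) for col in zip(*features)]
    loss = 0.0
    for label, row in zip(LABELS, weights):
        logit = sum(w * x for w, x in zip(row, pooled))
        p = 1.0 / (1.0 + math.exp(-logit))
        target = 1.0 if label in gold else 0.0
        loss -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return loss / len(LABELS)


def multitask_loss(tokens, gold, w_label, w_sent, alpha=0.5):
    """Multi-task objective: labeling loss plus a weighted auxiliary loss."""
    feats = encode(embed(tokens))
    return (labeling_loss(score(feats, w_label), gold)
            + alpha * sentence_loss(feats, w_sent, gold))


def predict_labels(tokens, w_label):
    """Command component: pick the highest-scoring label for each token."""
    scores = score(encode(embed(tokens)), w_label)
    return [LABELS[max(range(len(LABELS)), key=s.__getitem__)] for s in scores]


tokens = "increase the brightness by ten".split()
gold = ["B-action", "O", "B-attribute", "O", "B-value"]
w_label = make_weights(len(LABELS), 2 * DIM, seed=0)
w_sent = make_weights(len(LABELS), 2 * DIM, seed=1)
loss = multitask_loss(tokens, gold, w_label, w_sent)
labels = predict_labels(tokens, w_label)
```

In a trained system the two loss terms would be minimized jointly, so that the auxiliary task shapes the same features the labeling task uses; the mutual-information and context-based variants mentioned in the abstract would replace `sentence_loss` in this sketch.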
Utility
6 Dec 2019
10 Jun 2021