Meta Platforms, Inc.
Language-agnostic understanding
Last updated:
Abstract:
Exemplary embodiments relate to techniques to classify or detect the intent of content written in a language for which a classifier does not exist. These techniques involve building a code-switching corpus via machine translation, generating a universal embedding for words in the code-switching corpus, training a classifier on the universal embeddings to generate an embedding mapping/table; accessing new content written in a language for which a specific classifier may not exist, and mapping entries in the embedding mapping/table to the universal embeddings. Using these techniques, a classifier can be applied to the universal embedding without needing to be trained on a particular language. Exemplary embodiments may be applied to recognize similarities in two content items, make recommendations, find similar documents, perform deduplication, and perform topic tagging for stories in foreign languages.
Utility
21 Dec 2017
19 May 2020