Microsoft Corporation
NEURAL TEXT-TO-SPEECH SYNTHESIS WITH MULTI-LEVEL TEXT INFORMATION
Last updated:
Abstract:
A method and apparatus for generating speech through neural text-to-speech (TTS) synthesis. A text input may be obtained (1310). Phoneme or character level text information may be generated based on the text input (1320). Context-sensitive text information may be generated based on the text input (1330). A text feature may be generated based on the phoneme or character level text information and the context-sensitive text information (1340). A speech waveform corresponding to the text input may be generated based at least on the text feature (1350).
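The steps in the abstract (1310 to 1350) can be pictured with a minimal toy sketch. Nothing below is taken from the application itself: the function names, the hand-made feature values, the choice to combine the two levels by concatenation, and the sine-tone "waveform" are all illustrative assumptions standing in for the neural components the abstract refers to.

```python
import math
from typing import List

Vector = List[float]


def char_level_info(text: str) -> List[Vector]:
    """Step 1320 (toy): one 1-dim vector per character."""
    return [[(ord(ch) % 128) / 127.0] for ch in text]


def context_sensitive_info(text: str) -> List[Vector]:
    """Step 1330 (toy): one 2-dim vector per word.

    Hypothetical stand-in for a contextual encoder: relative position
    of the word in the input and a scaled word length.
    """
    words = text.split()
    n = len(words)
    return [[(i + 1) / n, len(w) / 10.0] for i, w in enumerate(words)]


def text_feature(text: str) -> List[Vector]:
    """Step 1340 (toy): attach each word's context vector to its characters.

    Concatenation is an assumption; the abstract only says the text
    feature is based on both kinds of information.
    """
    chars = char_level_info(text)
    words = context_sensitive_info(text)
    features: List[Vector] = []
    word_idx = -1
    in_word = False
    for ch, ch_vec in zip(text, chars):
        if ch.isspace():
            in_word = False
            ctx = [0.0, 0.0]  # whitespace carries no word context
        else:
            if not in_word:
                word_idx += 1
                in_word = True
            ctx = words[word_idx]
        features.append(ch_vec + ctx)
    return features


def synthesize(text: str, samples_per_char: int = 80,
               sample_rate: int = 16000) -> List[float]:
    """Step 1350 (toy): map each feature to a short sine tone.

    A placeholder for the acoustic model and vocoder, not a real TTS.
    """
    waveform: List[float] = []
    for feat in text_feature(text):
        freq = 100.0 + 300.0 * feat[0]
        for _ in range(samples_per_char):
            t = len(waveform) / sample_rate
            waveform.append(0.5 * math.sin(2 * math.pi * freq * t))
    return waveform


if __name__ == "__main__":
    text = "hi there"                  # step 1310: obtain text input
    print(len(text_feature(text)))     # one feature vector per character
    print(len(synthesize(text)))       # number of waveform samples
```

In this sketch the character-level and word-level sequences have different lengths, so the word-level vectors are repeated across the characters of each word before the two are joined. Some alignment of this kind is needed whenever information at two granularities is merged into a single text feature.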
Status:
Application
Type:
Utility
Filing date:
13 Dec 2018
Issue date:
20 Jan 2022