Microsoft Corporation
NEURAL TEXT-TO-SPEECH SYNTHESIS WITH MULTI-LEVEL TEXT INFORMATION

Abstract:

A method and apparatus for generating speech through neural text-to-speech (TTS) synthesis. A text input may be obtained (1310). Phoneme or character level text information may be generated based on the text input (1320). Context-sensitive text information may be generated based on the text input (1330). A text feature may be generated based on the phoneme or character level text information and the context-sensitive text information (1340). A speech waveform corresponding to the text input may be generated based at least on the text feature (1350).
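The five steps in the abstract (1310 to 1350) can be read as a simple data flow. The sketch below is only an illustration of that flow under loud assumptions: the abstract does not say how any step is computed, so every function name, feature definition, and the sine-based "synthesizer" here is a hypothetical stand-in, not the claimed neural method. Character-level vectors stand in for step 1320, neighbour-aware word vectors for step 1330, concatenation for step 1340, and a toy tone generator for step 1350.

```python
import math
from typing import List

Vector = List[float]


def char_level_info(text: str) -> List[Vector]:
    """Step 1320 stand-in: one small hand-made vector per character."""
    return [[(ord(c) % 32) / 32.0, 1.0 if c.isalpha() else 0.0] for c in text]


def context_sensitive_info(text: str) -> List[Vector]:
    """Step 1330 stand-in: one vector per word that depends on its neighbours.

    A real system would likely use a learned model here; this is an assumption
    made only to keep the example self-contained.
    """
    words = text.split()
    feats = []
    for i, word in enumerate(words):
        prev_len = len(words[i - 1]) if i > 0 else 0
        next_len = len(words[i + 1]) if i + 1 < len(words) else 0
        feats.append([len(word) / 10.0, prev_len / 10.0, next_len / 10.0])
    return feats


def text_feature(text: str) -> List[Vector]:
    """Step 1340 stand-in: align word-level context to characters, then concatenate."""
    chars = char_level_info(text)
    ctx = context_sensitive_info(text)
    zero = [0.0, 0.0, 0.0]  # whitespace characters carry no word context
    feature = []
    word_idx = -1
    in_word = False
    for ch, ch_vec in zip(text, chars):
        if ch.isspace():
            in_word = False
            feature.append(ch_vec + zero)
        else:
            if not in_word:  # first character of a new word
                word_idx += 1
                in_word = True
            feature.append(ch_vec + ctx[word_idx])
    return feature


def synthesize(feature: List[Vector], sample_rate: int = 8000,
               samples_per_frame: int = 80) -> List[float]:
    """Step 1350 stand-in: map each feature frame to a short sine tone.

    This is NOT a neural vocoder; it only shows that the waveform is
    produced from the combined text feature.
    """
    wave = []
    for frame in feature:
        freq = 100.0 + 200.0 * sum(frame) / len(frame)
        for n in range(samples_per_frame):
            wave.append(math.sin(2.0 * math.pi * freq * n / sample_rate))
    return wave
```

Calling `synthesize(text_feature("hello world"))` walks a text input (1310) through all four later steps. The point of the design is that the waveform stage sees a single feature sequence that already merges fine-grained (character) and context-dependent (word-in-context) information.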

Status:

Application

Type:

Utility

Filing date:

13 Dec 2018

Issue date:

20 Jan 2022