Senior multimedia engineer
Leo Ma is a senior multimedia engineer at Ximalaya Inc. He has worked on embedded development, mobile development, and deep learning for multimedia. He has a personal open source project, Yasea, a streaming media client for Android. He is now working on TTS technology based on the Tacotron model and on vocoder development.
Tracks he participates in
From FFmpeg and GStreamer to FreeSWITCH, the multimedia ecosystem has a large number of open source projects. This topic introduces optimization practices for some of the mainstream projects.
A text-to-speech (TTS) synthesis system is one of the most important infrastructure services on an open platform for AI. Currently, an end-to-end synthesis system based on Google's open Tacotron model has achieved state-of-the-art results. This talk aims to introduce people working in speech technology to the architecture and application of the Tacotron model, how it connects to a text analysis frontend and a vocoder, its performance advantages, and some evaluations of audio samples.