Ximalaya Inc
Senior Multimedia Engineer
Leo Ma is a senior multimedia engineer at Ximalaya Inc. He has worked on embedded development, mobile development, and deep learning for multimedia. He maintains a personal open source project, Yasea, a streaming media client for Android. He is currently working on TTS technology based on the Tacotron model and on vocoder development.
Topic: End-to-end Speech Synthesis Open Source Practice Based on Tacotron Model
2019-04-20 14:00 - 14:45
A text-to-speech (TTS) synthesis system is one of the most important infrastructure services on an open AI platform. End-to-end synthesis systems based on Google's open Tacotron model currently achieve state-of-the-art results. This talk introduces people working in speech technology to the architecture and application of the Tacotron model, its integration with a text analysis frontend and a vocoder, its performance advantages, and evaluations of audio samples.
in Track
Open Source Technology Stack
From FFmpeg and GStreamer to FreeSWITCH, the multimedia ecosystem has a large number of open source projects. This track introduces optimization practices for some of the mainstream projects.
Topics in the same Track
Daniel Sun
Topic outline: 1. Why WebRTC; 2. How WebRTC works; 3. TMS architecture in WebRTC; 4. What you need to pay attention to in WebRTC applications.
Keyon Jie
In this session, I will share the background of Sound Open Firmware (SOF), which aims to be the Linux of the audio DSP firmware domain, and introduce how we implement the open source solution on open platforms such as Minnow Board and UP^2, all based on open resources (the open HAL interface from Cadence, the GCC compiler, the Linux kernel, …). With new firmware debugging and configuration features added, the solution is replacing the traditional closed-source Intel audio DSP firmware on customer platforms. We will discuss the pros and cons of SOF compared with the closed-source solution, and share the latest developments and the future outlook of the SOF community. Hopefully, the session will give the audience interested in audio development something to think about, and encourage them to contribute and help each other to make our lives better.
Seven Du
FreeSWITCH is a popular open source soft-switch platform that supports telephony, video conferencing, and WebRTC. It is widely used in telecoms, enterprise communications, call centers, video conferencing, and online education. Drawing on years of experience in open source development and community operation, this talk covers the history and future of telecom, use cases, key technologies, and applications in the ASR/TTS/AI fields.