Youdao Unveils Industry-First Accent-Free, Text-Free Voice Cloning Model Supporting 14 Languages

Deep News, 06-23

Youdao has introduced Confucius4-TTS, its new "Ziyue 4.0" speech synthesis engine, which the company describes as the industry's first open-source model capable of accent-free, cross-lingual voice cloning across 14 languages without requiring reference text.

According to Youdao, the model reaches state-of-the-art levels internationally in cross-lingual voice cloning, modeling without reference text, emotional prosody transfer, and local deployment. It is now fully open-sourced for users worldwide.

Confucius4-TTS currently supports natural, fluent speech in 14 languages, including Chinese, English, and Spanish.

Youdao highlights three capabilities. First, users need to provide only a 3-second audio sample for the model to clone a voice. The cloned voice's similarity to the original exceeds 85%, and cloning task accuracy reaches as high as 97%. Second, the model switches seamlessly among the 14 languages, eliminating cross-lingual accent barriers. Third, it transfers emotional prosody across languages without loss: Confucius4-TTS automatically extracts and analyzes emotional features from the reference audio.
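The report does not say how the 85% similarity figure is measured. One common way to score voice similarity is the cosine similarity between speaker embeddings of the reference clip and the cloned clip. The sketch below illustrates that idea only; the embedding values, the function names, and the use of 0.85 as a threshold are assumptions for illustration, not Youdao's evaluation method.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)

def is_similar(reference_emb, cloned_emb, threshold=0.85):
    """Hypothetical check: treat 0.85 as a stand-in for the reported 85% figure."""
    return cosine_similarity(reference_emb, cloned_emb) >= threshold

# Made-up embeddings; a real speaker encoder would produce these from audio.
reference_emb = [0.20, 0.70, -0.10, 0.40]
cloned_emb = [0.25, 0.65, -0.05, 0.45]
score = cosine_similarity(reference_emb, cloned_emb)
```

In practice the embeddings would come from a speaker-verification model, and the score scale depends on that model, so a percentage quoted in a press release cannot be mapped onto a cosine threshold without knowing the evaluation setup.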

Confucius4-TTS reportedly uses a GPT-style large semantic model as its backbone, paired with a learnable speaker encoder built on SSL pre-trained features and ECAPA-TDNN, and adopts a Flow Matching generative framework.
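For readers unfamiliar with Flow Matching, the toy sketch below shows the general idea in its simplest, straight-path form: a network is trained to predict a velocity that moves noise toward data, and generation integrates that velocity from t=0 to t=1. This is a generic illustration, not Confucius4-TTS code; every function name is hypothetical, and the "oracle" velocity stands in for a trained network.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Training target for the velocity network on that path: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the oracle never divides by zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def make_oracle_velocity(x1):
    """Stand-in for a trained network: the exact field for one known target."""
    def velocity_fn(x, t):
        return [(b - xi) / (1.0 - t) for xi, b in zip(x, x1)]
    return velocity_fn

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]   # starting point at t=0
target = [0.5, -1.0, 2.0, 0.0]                    # toy "data" point at t=1
sample = euler_sample(make_oracle_velocity(target), noise, num_steps=10)
```

In a real TTS system the vectors would be acoustic features such as mel-spectrogram frames, and the velocity function would be a neural network conditioned on the text and speaker information; how Confucius4-TTS wires these together is not described in the report.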

Confucius4-TTS is released under the Apache open-source license, making the complete model weights and supporting toolchain available to developers worldwide with no commercial restrictions. Developers can download the complete 54 GB resource package and run the model locally and offline.

