ElevenLabs has released 'Dubbing v2,' an AI-powered voice dubbing model that stays synchronized with the original content while preserving the speaker's emotion, tone, and pace in over 90 languages.



ElevenLabs, an AI company founded by former Google engineers, has released 'Dubbing v2,' a voice dubbing model.

Introducing Dubbing v2: An innovative new dubbing model

https://elevenlabs.io/ja/blog/introducing-dubbing-v2



Dubbing v2 is a model that faithfully reproduces the emotions and expressions of the original speaker, so the dubbed audio sounds natural to listeners even after the language changes.

In the following video, the words spoken by YouTuber MrBeast are translated into various languages. Not only is the tone of his voice accurately reproduced, but the tempo and speaking style are also very close to the original. The frog animation even includes a Japanese version, dubbed with completely natural pronunciation.




ElevenLabs stated, 'The emotions and expressions of the original speaker are conveyed exactly as they are in any language. Instead of generating flat, disconnected audio as in conventional methods, it is generated directly based on the original performance, so the tone, tempo, speaking style, and even emotions are faithfully reproduced. It solves a problem that has remained unresolved in AI dubbing for many years, making the translated audio sound as if the original speaker is truly speaking.'

According to ElevenLabs, traditional AI dubbing systems relied heavily on text-based 'scripts.' While this allowed for accurate translation, it often resulted in the loss of natural, human-like speech patterns and the nuances of the original speaker's personality.

Dubbing v2 captures intonation and tone by interpreting the speaker's performance directly and reproduces them across languages. As a result, the dubbing is more natural and expressive while remaining faithful to the original speaker's way of speaking.



Rather than translating literally, Dubbing v2 takes into account the phrasing, rhythm, and sentence structure that are idiomatic in each language, adjusting the translation so that it works as spoken language while staying synchronized with the original content. This synchronization-aware translation system automatically adjusts the start time, end time, and tempo of the dubbed speech, reducing the need for manual adjustments and producing a result close to professional dubbing.
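ElevenLabs has not published how this timing adjustment works, so the following is only a minimal sketch of the general idea, not the company's method. It assumes a hypothetical `fit_segment` helper that takes the time slot of the original line and the length of the translated audio, then picks a tempo within an assumed comfortable range (the `min_tempo` and `max_tempo` defaults are illustrative guesses) so that the dubbed line lands as close as possible to the original start and end.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    start: float  # where the dubbed line begins, in seconds
    end: float    # where the dubbed line ends, in seconds
    tempo: float  # playback-rate multiplier (>1 speeds up, <1 slows down)


def fit_segment(orig_start: float, orig_end: float, dubbed_duration: float,
                min_tempo: float = 0.9, max_tempo: float = 1.15) -> Placement:
    """Fit translated audio into the original line's time slot.

    Hypothetical illustration: the tempo limits are assumptions, chosen so
    that the speech is never stretched or squeezed too aggressively.
    """
    slot = orig_end - orig_start
    if slot <= 0 or dubbed_duration <= 0:
        raise ValueError("slot and dubbed_duration must be positive")

    # Tempo that would make the dubbed audio fill the slot exactly.
    ideal_tempo = dubbed_duration / slot
    # Clamp to the allowed range; outside it the line will end early or late.
    tempo = min(max(ideal_tempo, min_tempo), max_tempo)

    # Keep the original start and let the end follow from the adjusted length.
    return Placement(start=orig_start,
                     end=orig_start + dubbed_duration / tempo,
                     tempo=tempo)


if __name__ == "__main__":
    # A 2.2-second translation squeezed into a 2.0-second original line.
    print(fit_segment(10.0, 12.0, 2.2))
    # A 3.0-second translation that cannot fit without overrunning the slot.
    print(fit_segment(0.0, 2.0, 3.0))
```

In the second call the clamped tempo is not enough to fit the slot, which is the kind of case where a system like the one described would presumably shorten the translation itself rather than speed up the voice further.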

ElevenLabs stated, 'Professional dubbing can cost tens of thousands of yen per minute and requires many steps, including translators, voice actors, editors, and sound engineers. With Dubbing v2, this process can be automated. Creators and companies can create high-quality multilingual content without complex workflows or coordination with multiple vendors.'

Dubbing v2 is available on 'ElevenCreative' and 'ElevenProductions'. It will also be available as an API at a later date.

One user who tried it posted, 'It looks good, but I was shocked when I saw the price. It's $22 a month (approximately 3,500 yen) for 9 minutes of dubbed content.'
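To set the user's figures beside ElevenLabs' estimate of 'tens of thousands of yen per minute' for professional dubbing, the quoted plan can be reduced to a per-minute cost. The snippet below uses only the numbers in the post above and assumes that the 9 minutes is the entire monthly allowance at that price, which the post does not spell out.

```python
# Figures quoted by the user; treating 9 minutes as the full monthly
# allowance is an assumption.
monthly_price_usd = 22.0
monthly_price_jpy = 3500.0
dubbed_minutes = 9.0

usd_per_minute = monthly_price_usd / dubbed_minutes
jpy_per_minute = monthly_price_jpy / dubbed_minutes

print(f"{usd_per_minute:.2f} USD/min, about {jpy_per_minute:.0f} yen/min")
# prints "2.44 USD/min, about 389 yen/min"
```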




in AI, Posted by log1p_kr