Meta releases 'SeamlessM4T v2', an improved version of its AI 'SeamlessM4T' that can transcribe speech just by inputting audio such as Japanese



The AI 'SeamlessM4T', released by Meta in August 2023, can transcribe speech, translate it into another language, and dub it into another language from voice input alone. On November 30, 2023, Meta announced 'SeamlessM4T v2', an improved version of SeamlessM4T. SeamlessM4T v2 not only improves on the performance of SeamlessM4T but also makes it possible to dub speech into other languages while preserving the speaker's emotion and style.

Introducing a suite of AI language translation models that preserve expression and improve streaming
https://ai.meta.com/blog/seamless-communication/



Seamless Communication Translation Demo

https://seamless.metademolab.com/expressive




According to Meta, the newly announced SeamlessM4T v2 underpins two models: 'SeamlessExpressive', which translates from one language to another while preserving the speaker's emotions and style of expression, and 'SeamlessStreaming', which reduces delay enough to allow real-time conversation with speakers of another language by translating while the speaker is still talking. Meta describes the system as 'the first system that can realize expressive cross-lingual communication in real time.'

The demo video of SeamlessM4T v2 released by Meta is below. A female voice speaking happily in English is dubbed into Spanish with almost the same tone.


A man speaking quickly in French is also successfully dubbed into English.


A woman speaking English in a sad tone is also translated into Spanish.


Even a whisper in English can be translated into French.


SeamlessM4T v2 also has a demo that allows you to try out its functions. To try the demo, first visit the link below.

Seamless Communication Translation Demo
https://seamless.metademolab.com/expressive

Click 'Try the demo' at the top or bottom of the page.



A notice regarding the use of the demo will be displayed, so read the terms of use carefully and check the box to agree.



Select the language you speak and the language you want to translate to, then click 'NEXT' at the bottom right. Note that at the time of writing, only English, Spanish, French, and German are supported.



If you are asked to allow access to your camera and microphone, click 'Allow'.



When the camera image is displayed, press the 'Record' button and then read the text you want to process into the microphone.



Up to 10 seconds of speech can be processed, and processing starts automatically once 10 seconds have passed. When processing is complete, three audio clips are presented so that you can compare them: the dubbed voice with enhanced expressiveness, the dubbed voice before the expressiveness enhancement, and the original voice.



SeamlessM4T v2 can also be installed and run locally. Detailed installation instructions can be found at the link below.

GitHub - facebookresearch/seamless_communication: Foundational Models for State-of-the-Art Speech and Text Translation

https://github.com/facebookresearch/seamless_communication
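
As a rough illustration of what running the model locally can look like, the sketch below turns an audio file into text in a chosen language. It is not taken from the repository's instructions: it assumes the Hugging Face `transformers` port of the model (`SeamlessM4Tv2Model` with the `facebook/seamless-m4t-v2-large` checkpoint) plus `torch` and `torchaudio`, and the `audios=` keyword follows the model card as published at release, so it may differ in later library versions. The `LANG_CODES` table and both helper functions are hypothetical names introduced for this example, and the table is deliberately limited to the four languages the article mentions.

```python
# Assumption: three-letter codes for the four languages listed in the article.
LANG_CODES = {"English": "eng", "Spanish": "spa", "French": "fra", "German": "deu"}

# Assumption: Hugging Face checkpoint name; it does not appear in the article.
MODEL_ID = "facebook/seamless-m4t-v2-large"


def to_lang_code(language: str) -> str:
    """Map a language name to its code, rejecting anything not in LANG_CODES."""
    try:
        return LANG_CODES[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None


def speech_to_text(audio_path: str, target_language: str) -> str:
    """Return the speech in audio_path as text in target_language.

    Choosing the language that is spoken in the file gives a transcription;
    choosing a different one gives a translation.
    """
    tgt_lang = to_lang_code(target_language)  # fail early, before loading the model

    # Heavy third-party imports stay local so the helpers above work without them.
    import torchaudio
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = SeamlessM4Tv2Model.from_pretrained(MODEL_ID)

    # Load the file, resample to the 16 kHz the model expects, and mix to mono.
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = torchaudio.functional.resample(
        waveform, orig_freq=sample_rate, new_freq=16_000
    )
    mono = waveform.mean(dim=0).numpy()

    inputs = processor(audios=mono, sampling_rate=16_000, return_tensors="pt")
    # generate_speech=False asks for text tokens instead of a synthesized waveform.
    tokens = model.generate(**inputs, tgt_lang=tgt_lang, generate_speech=False)
    return processor.decode(tokens[0].tolist()[0], skip_special_tokens=True)


if __name__ == "__main__":
    # "sample.wav" is a placeholder path; the first run downloads several GB of weights.
    print(speech_to_text("sample.wav", "English"))
```

Validating the language before the model is loaded keeps a typo from costing a multi-gigabyte download. Note that this sketch covers only speech-to-text; the expressive dubbing shown in the demo is a separate model and is not exercised here.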



You can also see how the demo of Meta's AI 'SeamlessM4T', released on August 22, 2023, was used for transcription and translation in the article below.

Meta has released an AI 'SeamlessM4T' that can perform 'transcription', 'translation', and 'dubbing' just by inputting Japanese audio, and a free demo that can be translated into English and Chinese has also been released, so I tried using it - GIGAZINE

in Review, Software, Video, Posted by log1r_ut