Meta releases an AI-powered speech translation system that also supports minority languages for which text data is hard to collect



Meta AI, the artificial intelligence (AI) research division of Facebook's parent company Meta, has developed an AI-based speech translation system that can translate spoken conversations in near real time, and has released it as open source.

A new AI-powered speech translation system for Hokkien pioneers a new approach for a primarily oral language
https://ai.facebook.com/blog/ai-translation-hokkien/

Using AI to Translate Speech For a Primarily Oral Language | Meta
https://about.fb.com/news/2022/10/hokkien-ai-speech-translation/




AI-powered translation systems have mainly focused on written text. However, nearly 3,500 of the world's languages are primarily oral and lack a standard writing system. Meta therefore points out that many languages cannot supply the 'massive amount of text data' required to train a translation AI model.

To address this issue, Meta built the first AI-powered speech translation system for Hokkien, a language spoken in Fujian, China. Hokkien is primarily an oral language: although it is widely spoken, it has no standard writing system, which makes text data difficult to prepare.

Hokkien is used not only in Fujian, China, but also in Singapore, the Philippines, Taiwan, and Malaysia, and is spoken by nearly 46 million people worldwide.



Regarding this AI translation system for Hokkien, Meta said that it is part of the Universal Speech Translator project, which is developing new AI methods with the aim of eventually translating hundreds of spoken languages into one another in real time. Meta added that it believes spoken communication can connect people wherever they are, even in the metaverse.

The following video shows what real-time translation with Meta's AI translation system for Hokkien looks like.



Many speech translation systems rely on transcription, but some spoken languages have no writing system. For this reason, Meta aimed for speech-based translation rather than transcription-based translation in its AI translation system for Hokkien.

To achieve this, Meta combined several methods in its approach to speech-based translation: a speech-to-unit conversion that turns the input speech into a sequence of acoustic units, from which a waveform is then generated, and reliance on the text of a related language (Mandarin in the case of Hokkien), among others.
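The idea of converting speech into a sequence of discrete acoustic units can be illustrated with a toy sketch. The article does not say how Meta derives its units, so everything here is an assumption made for illustration: the two-dimensional "frames", the three-entry `CODEBOOK`, and the helper names `frames_to_units` and `collapse_repeats` are all hypothetical, and nearest-centroid lookup merely stands in for whatever learned model a real system would use.

```python
from math import dist

# Hypothetical codebook: each centroid stands for one discrete acoustic unit.
# A real system would learn these from audio; the values here are made up.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]


def frames_to_units(frames, codebook=CODEBOOK):
    """Map each feature frame to the index of its nearest centroid."""
    return [
        min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))
        for frame in frames
    ]


def collapse_repeats(units):
    """Merge runs of identical consecutive units into a single unit."""
    collapsed = []
    for unit in units:
        if not collapsed or collapsed[-1] != unit:
            collapsed.append(unit)
    return collapsed


# Made-up feature frames standing in for a short stretch of input speech.
frames = [(0.1, 0.0), (0.2, 0.1), (0.9, 0.1), (1.1, -0.1), (0.1, 0.8)]
units = frames_to_units(frames)
print(units)                    # [0, 0, 1, 1, 2]
print(collapse_repeats(units))  # [0, 1, 2]
```

In a full pipeline, a translation model would map the source-language unit sequence to target-language units, and a separate component would generate the waveform from them; neither step is shown here.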



Meta's Hokkien AI translation system is still in development and can only translate one sentence at a time. However, Meta calls it a first step toward a future where simultaneous translation between languages is possible, and says the technology it has developed can be extended to many other languages, both written and unwritten. Expectations are high for future development.

Furthermore, alongside the announcement of the Hokkien AI translation system, Meta also released 'SpeechMatrix', a large dataset for building speech-to-speech translation systems that was created using an innovative natural language processing toolkit called 'LASER'. SpeechMatrix is intended to let other researchers develop systems that translate speech in one language directly into another, and consists of audio data covering 136 language pairs and a total of 418,000 hours.

SpeechMatrix is based on unsupervised learning and makes it possible to build AI models capable of high-quality 'speech-to-speech translation' without the human annotation that training such systems typically requires. As a result, a translation system can be built even for languages that have no 'labeled training data'.

In addition, Meta has released its Hokkien translation models, evaluation datasets, and research papers as open source.

in Software, Video, Posted by logu_ii