Oct 16, 2024 06:00:00

'Play 3.0 mini' can read text aloud in male and female voices in over 30 languages, including Japanese

PlayHT has released 'Play 3.0 mini,' a text-to-speech model that supports over 30 languages and can read text aloud in multiple voices and accents. It also supports Japanese, and its selling point is the 'naturalness of the voice.'

Introducing Play 3.0 Mini - A Lightweight, Reliable And Cost-efficient Multilingual Text-to-Speech Model

https://play.ht/news/introducing-play-3-0-mini/

You can try out 'Play 3.0 mini' for free by registering your email address or Google account on the following page.

PlayAI Text-to-Speech API Playground
https://play.ht/playground/

I wanted to try it in Japanese, so after opening the site I changed 'LANGUAGE' to 'Japanese.'

Select your favorite voice from 'VOICE'.

Enter text in the input box and click 'Generate Speech' to generate audio.

I had several voices read some sentences. The accents were indeed natural, but the audio was choppy in places and some words were not pronounced properly. The average latency was 189 milliseconds.
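The article does not say how the latency figure was obtained. As a rough illustration only, an average like this could be computed by timing several requests and taking the mean. In the sketch below, `average_latency_ms` and the `fn` argument are hypothetical names, and the timer covers the whole call rather than the time to the first audio byte, which may be what a playground actually reports.

```python
import time
from statistics import mean


def average_latency_ms(fn, runs=5):
    """Call fn() `runs` times and return the mean wall-clock time in milliseconds.

    `fn` is a hypothetical stand-in for whatever triggers one speech request.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


if __name__ == "__main__":
    # Stand-in workload: sleep for roughly 10 ms instead of calling a real service.
    print(f"{average_latency_ms(lambda: time.sleep(0.01)):.1f} ms")
```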


'Play 3.0 mini' supports more than 30 languages, including Japanese, Hindi, Arabic, Spanish, Italian, German, and French, and can be used through the API in addition to the trial site mentioned above. By building an application using the API, it is also possible to create a conversational model that responds to the user's voice.
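As a minimal sketch of what calling the model through the API might look like, the Python below builds an HTTP request without sending it. The endpoint URL, the header names, the `voice_engine` value and the other JSON fields are all assumptions that are not taken from this article, so check them against PlayHT's API reference before use. `build_tts_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed values: verify against PlayHT's API reference before relying on them.
API_URL = "https://api.play.ht/api/v2/tts/stream"  # assumed endpoint
ENGINE = "Play3.0-mini"  # assumed engine identifier


def build_tts_request(text, voice, user_id, api_key, language="japanese"):
    """Build (but do not send) a POST request asking for synthesized speech."""
    payload = {
        "text": text,
        "voice": voice,          # voice identifier chosen by the caller
        "voice_engine": ENGINE,  # assumed field name
        "language": language,    # assumed field name and value format
        "output_format": "mp3",  # assumed field name
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "audio/mpeg",
        "AUTHORIZATION": api_key,  # assumed header name
        "X-USER-ID": user_id,      # assumed header name
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` and writing the response bytes to a file would be the natural next step, but that part depends on valid credentials and on the assumptions above being correct.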


Part of the API can be used for free, but in general using it incurs a fee.

According to the developer, PlayHT, large speech language models exhibit 'hallucinations' just as large conversational language models do, and most text-to-speech models are prone to misreading alphanumeric strings and similar content. Play 3.0 mini has been tuned to address this, making it suitable for use cases where misreading important information such as phone numbers, passport numbers, and dates would cause problems.

'Play 3.0 mini is our fastest and most conversational voice model, and it significantly improves reliability and sound quality. It is the first of many efficient multilingual AI voice synthesis models we plan to release in the coming months, and our goal is to make the models smaller and more cost-effective so they can run on users' devices,' PlayHT said.


Oct 16, 2024 06:00:00 in Software, Posted by log1p_kr