Former Mozilla engineers found 'Coqui' to develop open source automatic transcription and text-to-speech models

'Coqui' is a company that develops an open source Speech-to-Text (STT) model, which automatically transcribes recorded speech into text, and a Text-to-Speech (TTS) model, which reads input text aloud in a synthesized voice.


Coqui's founders originally worked at Mozilla, known for developing Firefox and Thunderbird. Concerned that voice technology could be monopolized by large companies and never released as open source, they started an independent company and built an open source speech recognition engine.

The repository for Coqui's STT model, 'Coqui STT', is hosted on GitHub. A distinguishing feature of Coqui STT is that it does not rely on a voice library prepared in advance; instead, it works by reading in pre-recorded audio.

GitHub - coqui-ai/STT: The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Coqui STT supports more than 80 languages, ranging from widely spoken languages such as English, German, and French to minority languages such as Totonac, Hakha Chin, and Frisian. Models can be downloaded from the following page. At the time of writing, no Japanese model is available.

Models / Coqui

The repository for the TTS model, 'Coqui TTS', is as follows. Coqui TTS is expected to become available for download from the official website in the future.

GitHub - coqui-ai/TTS: a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Incidentally, 'Coqui' takes its name from the coquí, a frog native to Puerto Rico that can also be observed in the southern United States, the Hawaiian Islands, and various parts of South America. The frog is only 3-4 cm long but, as its name suggests, is known for its loud call. You can hear the coquí's call in the following video.

El Coquí de Puerto Rico. - YouTube

