OpenAI announces high-performance transcription AI 'Whisper', supports Japanese and can transcribe tongue twisters and lyrics with high accuracy
OpenAI, an AI development organization that has developed high-performance AI such as the image generation AI '
Introducing Whisper
https://openai.com/blog/whisper/
GitHub - openai/whisper
https://github.com/openai/whisper
Whisper is a transcription AI trained on a total of 680,000 hours of speech data collected from the Internet. OpenAI's blog post includes audio samples such as 'fast-talking sales talk', 'K-POP songs', 'French', and 'conversation with a distinctive accent', and clicking 'REVEAL TRANSCRIPT' shows the result of Whisper's transcription for each one.
About one-third of the speech data used to train Whisper is non-English, so it also supports transcription of languages other than English, such as Japanese, French, and Korean. Transcription accuracy differs by language, but according to the graph showing the word error rate for each language, the word error rate for Japanese is 6.4%, which means Japanese can be transcribed with relatively high accuracy among the languages Whisper supports.
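For readers unfamiliar with the metric, word error rate is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system output, divided by the number of words in the reference. The following is a minimal sketch of that general definition, not OpenAI's evaluation code. The function name `word_error_rate` is hypothetical, and splitting on whitespace is a simplifying assumption that does not suit unsegmented text such as Japanese, which would need tokenization or a character-level variant first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a word error rate of 6.4% corresponds to roughly 6 word-level errors per 100 reference words.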
OpenAI has published Whisper's model data and source code in its official GitHub repository. A demo using Google's Python execution environment 'Colaboratory' is also available, and anyone with a Google account can run it easily.
LibriSpeech.ipynb - Colaboratory
https://colab.research.google.com/github/openai/whisper/blob/master/notebooks/LibriSpeech.ipynb
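Outside the notebook, the published code can also be called from Python. The sketch below follows the usage shown in the repository's README (`whisper.load_model` and `model.transcribe`) and assumes that the `whisper` package from the repository and ffmpeg are installed. The helper names `transcribe_file` and `format_segments` are hypothetical, introduced here only for illustration.

```python
def transcribe_file(path: str, model_name: str = "base") -> dict:
    """Transcribe an audio file with Whisper and return its result dict.

    Assumes the `whisper` package from github.com/openai/whisper is
    installed; the import is kept local so that the formatting helper
    below can be used without it.
    """
    import whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    return model.transcribe(path)


def format_segments(result: dict) -> list:
    """Render each transcribed segment as '[start -> end] text'."""
    lines = []
    for segment in result.get("segments", []):
        start, end = segment["start"], segment["end"]
        lines.append(f"[{start:.2f} -> {end:.2f}] {segment['text'].strip()}")
    return lines
```

For example, `format_segments(transcribe_file("audio.mp3"))` would list the timestamped segments of a local file, where `audio.mp3` is a placeholder path. The full transcript is available under the `"text"` key of the returned dictionary.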
in Software, Posted by log1o_hf