ElevenLabs, founded by an ex-Google engineer, launches Scribe, which it calls the world's most accurate automatic speech recognition model

ElevenLabs — Meet Scribe the world's most accurate ASR model | ElevenLabs
https://elevenlabs.io/blog/meet-scribe

ElevenLabs is launching its own speech-to-text model | TechCrunch
https://techcrunch.com/2025/02/26/elevenlabs-is-launching-its-own-speech-to-text-model/
'Scribe' is ElevenLabs' first automatic speech recognition (ASR) model, and the company describes it as the world's most accurate transcription model. Built to handle the unpredictability of real-world speech, Scribe provides word-level timestamps, speaker diarization, and audio event tagging, and can transcribe audio in 99 languages.
Scribe is designed with an emphasis on accuracy. In benchmark tests on FLEURS, a dataset commonly used for objective evaluation of speech recognition models, and Common Voice, a free database for developing speech recognition software, Scribe consistently outperformed competing models such as Google's Gemini 2.0 Flash, OpenAI's Whisper Large V3, and Deepgram's Nova-3.
Scribe has the highest automatic transcription accuracy, and therefore the lowest word error rate, in Italian (98.7% accuracy), English (96.7% accuracy), and 97 other languages. Scribe achieves 'excellent automatic speech recognition (word error rate less than 5%)' in more than 25 of the 99 languages. These include English, Italian, French, German, Hindi, Indonesian, Japanese, Malayalam, Polish, Portuguese, Spanish, and Vietnamese.
Scribe also significantly reduces word error rates in languages that traditional speech recognition models handle poorly, such as Serbian, Cantonese, and Malayalam, where competing models can have word error rates of over 40%.
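For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not ElevenLabs' benchmark code. The function name and the plain whitespace tokenization are assumptions made for illustration, and real benchmarks usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Illustrative helper only (hypothetical name); tokenizes on whitespace
    and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}")
```

If accuracy is taken to mean 1 minus WER (an assumption about how the figures are reported), an accuracy of 98.7% corresponds to a word error rate of about 1.3%, which is consistent with the 'less than 5%' threshold quoted above.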
Developers can integrate Scribe via ElevenLabs' Speech to Text API to get structured JSON-formatted transcripts with speaker diarization, word-level timestamps, and non-speech event markers (e.g. laughter). Additionally, a low-latency version of Scribe for real-time applications will be released soon.
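As a rough illustration of what working with such a structured transcript might look like, the sketch below groups word-level entries into speaker turns and collects non-speech events. The JSON shape used here (a `words` list whose items carry `text`, `type`, `start`, `end`, and `speaker` fields) is an assumption invented for this example, not the documented response schema, and the helper names are hypothetical. Consult the Speech to Text API reference for the actual field names.

```python
import json
from itertools import groupby

# Hypothetical transcript payload; field names are assumptions for illustration.
SAMPLE = """
{
  "words": [
    {"text": "Hello", "type": "word", "start": 0.0, "end": 0.4, "speaker": "speaker_0"},
    {"text": "there", "type": "word", "start": 0.5, "end": 0.9, "speaker": "speaker_0"},
    {"text": "(laughter)", "type": "audio_event", "start": 1.0, "end": 1.6, "speaker": "speaker_1"},
    {"text": "Hi", "type": "word", "start": 1.7, "end": 1.9, "speaker": "speaker_1"}
  ]
}
"""


def speaker_turns(transcript: dict) -> list:
    """Merge consecutive words from the same speaker into (speaker, start, end, text)."""
    words = [w for w in transcript["words"] if w["type"] == "word"]
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        items = list(group)
        text = " ".join(w["text"] for w in items)
        turns.append((speaker, items[0]["start"], items[-1]["end"], text))
    return turns


def audio_events(transcript: dict) -> list:
    """Return (label, start_time) for every non-speech event marker."""
    return [
        (w["text"], w["start"])
        for w in transcript["words"]
        if w["type"] == "audio_event"
    ]


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    for speaker, start, end, text in speaker_turns(data):
        print(f"[{start:.1f}-{end:.1f}] {speaker}: {text}")
    for label, start in audio_events(data):
        print(f"event at {start:.1f}s: {label}")
```

Keeping timestamps on each turn makes it straightforward to produce subtitles or to jump to a point in the source audio.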
Creators and businesses can also use Scribe directly from the ElevenLabs dashboard to upload audio and video files and generate formatted transcripts.
According to ElevenLabs, Scribe's transcription service is available for $0.40 per hour of audio. Regarding this pricing, the technology news site TechCrunch pointed out that 'This is a competitive price, but some competitors are now offering transcription services at lower prices while differentiating their features.'
in Software, Posted by logu_ii