Introducing 'Mistral OCR', a fast and highly accurate character recognition AI model that can convert PDFs containing LaTeX formulas, figures, and charts to Markdown without breaking the layout, and makes it easy to extract data as JSON



AI development company Mistral AI has announced the launch of Mistral OCR, an AI model that recognizes text contained in images and converts it into text data. In addition to recognizing complex mathematical formulas written in LaTeX, Mistral OCR can output documents in Markdown format without losing the relative positions of the figures and tables they contain.

Mistral OCR | Mistral AI

https://mistral.ai/news/mistral-ocr

Mistral AI has published several examples showing the capabilities of Mistral OCR. First, here is the original document before processing. It includes not only text but also figures and tables.



The result of processing it with Mistral OCR looks like this. The document was converted without breaking the positional relationship between the figures and the text, and the table was converted with the relationship between its rows and columns intact. The OCR results are output in Markdown format and can also be compiled into structured data formats such as JSON. By incorporating Mistral OCR into AI services such as chat AI, scanned or photographed documents can be converted into a format that is easy for AI to process.
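To illustrate how Markdown output might be compiled into JSON, here is a minimal sketch in Python. It is not Mistral AI's implementation: the function name markdown_table_to_records and the sample document are hypothetical, and the sketch assumes the table is a simple pipe-delimited Markdown table with a header row and a separator row.

```python
import json


def markdown_table_to_records(markdown: str) -> list:
    """Return the rows of the first pipe-delimited Markdown table as dicts.

    Hypothetical helper for illustration; it handles only simple tables
    (header row, separator row, then data rows) and returns [] otherwise.
    """

    def split_row(line: str) -> list:
        # Drop the outer pipes, then split the cells and trim whitespace.
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    def is_separator(line: str) -> bool:
        # A separator row looks like | --- | :---: | ---: |
        cells = split_row(line)
        return all(cell and "-" in cell and set(cell) <= set("-:") for cell in cells)

    lines = markdown.splitlines()
    for i in range(len(lines) - 1):
        if "|" in lines[i] and "|" in lines[i + 1] and is_separator(lines[i + 1]):
            header = split_row(lines[i])
            records = []
            for line in lines[i + 2:]:
                if "|" not in line:
                    break  # The table ends at the first line without a pipe.
                records.append(dict(zip(header, split_row(line))))
            return records
    return []


# Made-up sample standing in for OCR output; not taken from Mistral AI's examples.
sample = """# Quarterly report

Some paragraph text.

| Region | Units |
| --- | ---: |
| North | 120 |
| South | 95 |

Closing text.
"""

records = markdown_table_to_records(sample)
print(json.dumps(records, indent=2))
```

Each table row becomes one JSON object keyed by the column headers, so the row and column relationships preserved in the Markdown carry over to the structured output. Cell values stay as strings; converting them to numbers would be a separate step.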



Documents containing complex mathematical formulas can also be OCR processed. The original data before processing looks like this.



The processing result is as follows. The formula was converted without any changes to its appearance.



Below is a table comparing the performance of Mistral OCR with 'Google Document AI', 'Azure OCR', 'Gemini 1.5 Flash', 'Gemini 1.5 Pro', 'Gemini 2.0 Flash' and 'GPT-4o'. Mistral OCR achieves the highest accuracy in every category, including mathematical expressions and scanned data.



Mistral OCR was also developed with multilingual support in mind, and can recognize languages other than English, such as Russian and French, with high accuracy.



Mistral OCR is also characterized by its high operating speed, and can process up to 2,000 pages per minute on a single node. You can see how fast the processing is by playing the following 'Demo of converting a PDF file containing a chart to Markdown format'.

Mistral OCR on Alphafold paper - YouTube


Mistral OCR is free to use with Le Chat, and the API is available through Mistral AI's developer platform, la Plateforme. Mistral OCR will also soon be available on various cloud platforms, as well as on-premises for organizations with strict data privacy requirements.
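As a rough sketch of what calling the API could look like, the Python snippet below builds an HTTP request using only the standard library. The endpoint URL, the model name 'mistral-ocr-latest', and the request body layout are assumptions based on Mistral AI's public API documentation rather than anything stated in this article, and build_ocr_request is a hypothetical helper. Check the official documentation before relying on any of these values.

```python
import json
import urllib.request

# Assumed values, not stated in the article; verify against Mistral AI's docs.
OCR_ENDPOINT = "https://api.mistral.ai/v1/ocr"
OCR_MODEL = "mistral-ocr-latest"


def build_ocr_request(api_key: str, document_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for OCR of a hosted document."""
    payload = {
        "model": OCR_MODEL,
        # Assumed body layout: the document is referenced by a public URL.
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key and document URL for illustration only.
request = build_ocr_request("YOUR_API_KEY", "https://example.com/sample.pdf")
print(request.full_url, request.get_method())

# Sending is left out on purpose. With a real API key it would look like:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect without a network connection or an API key.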

in Software, Video, Posted by log1o_hf