Jun 21, 2024 11:00:00

Anthropic suddenly releases Claude 3.5 Sonnet, benchmark results rival GPT-4o

Anthropic, the developer of the AI chatbot 'Claude,' announced a new model, 'Claude 3.5 Sonnet,' on June 21, 2024. It is the first model in the upcoming Claude 3.5 family, and its benchmark results are said to be comparable to those of OpenAI's GPT-4o.

Introducing Claude 3.5 Sonnet \ Anthropic
https://www.anthropic.com/news/claude-3-5-sonnet

According to Anthropic, Claude 3.5 Sonnet has graduate-level reasoning skills, undergraduate-level knowledge, and strong coding skills, and is significantly better at understanding nuance, humor, and complex instructions than previous Claude models such as Claude 3 Opus. Anthropic also claims that it is better at writing high-quality content in a natural, friendly tone.

Claude 3.5 Sonnet offers better performance and cost efficiency than Claude 3 Opus and runs at twice its speed. In Anthropic's internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of the problems, beating Claude 3 Opus, which solved 38%.

According to the benchmark results published by Anthropic, Claude 3.5 Sonnet performed as well as or better than GPT-4o in five of the eight categories: graduate-level reasoning (GPQA), knowledge (MMLU), coding (HumanEval), multilingual math (MGSM), and reasoning over text (DROP).

Anthropic describes Claude 3.5 Sonnet as its strongest vision model yet, surpassing Claude 3 Opus on standard vision benchmarks. The company says the improvement is most noticeable in tasks that require visual reasoning, such as interpreting charts and graphs, and that the model can accurately transcribe text from imperfect images. Anthropic has released a video in which Claude 3.5 Sonnet performs visual tasks.

Claude 3.5 Sonnet for vision - YouTube

Below is a table comparing its results on visual reasoning benchmarks with those of Claude 3 Opus, GPT-4o, and Gemini 1.5 Pro.

In addition, Anthropic announced a new feature for Claude.ai called 'Artifacts.' When you ask Claude to generate content such as code, text documents, or website designs, Artifacts displays that content in a dedicated window alongside the conversation instead of inside the answer. For more information on Artifacts, see the following video.

Claude 3.5 Sonnet for sparking creativity - YouTube

Regarding safety and privacy, Anthropic asked the UK Artificial Intelligence Safety Institute (UK AISI) to evaluate the safety of Claude 3.5 Sonnet and made improvements before releasing it. Anthropic also says that, by integrating policy feedback from external experts, it has made Claude 3.5 Sonnet able to deal with various types of abuse.

Claude 3.5 Sonnet is available for free on Claude.ai and the Claude iOS app, with higher rate limits available to paid Claude Pro and Team subscribers. It is also available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI, with pricing of $3 per million input tokens and $15 per million output tokens.
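To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a single request at the rates quoted above. The function name, the constant names, and the sample token counts are hypothetical and chosen only for illustration; actual billing may differ by platform.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated charge for one request at the quoted rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    print(f"${estimate_cost_usd(2_000, 500):.4f}")  # prints $0.0135
```

Because output tokens cost five times as much as input tokens at these rates, a long generated answer tends to dominate the bill even when the prompt is larger.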

Anthropic plans to keep improving the balance of intelligence, speed, and cost across the Claude 3.5 model family, and to release Claude 3.5 Haiku and Claude 3.5 Opus in the second half of 2024.


Jun 21, 2024 11:00:00 in Software, Web Service, Video, Posted by log1i_yk