OpenAI's CEO posts on social media that 'o2 achieved a 105% score on a PhD-level benchmark', roughly double the 53.6% that GPT-4o recorded on the same high-difficulty test
OpenAI CEO Sam Altman posted on his X account on Sunday, November 3, 2024, 'I heard that o2 achieved a score of 105% on GPQA.' The official name of 'o2' is unknown, but the post implies that the next-generation AI model being developed by OpenAI will have phenomenal performance.
i heard o2 gets 105% on GPQA
— Sam Altman (@sama) November 2, 2024
GPQA is a benchmark for measuring AI performance, consisting of 448 multiple-choice questions written by experts in biology, physics, and chemistry. The questions are quite difficult: non-experts who are allowed to use Google searches answer only 34% of them correctly, and people who hold or are pursuing a PhD answer only 65% correctly.
GPQA is already used as a benchmark for various high-performance AIs, with GPT-4o scoring 53.6%, Claude 3 Opus scoring 50.4%, and Llama 3 400B scoring 48.0%. The '105%' figure mentioned by CEO Altman far exceeds the scores of existing AIs, suggesting that the AI model called 'o2' has answering capabilities that exceed those of PhD holders.
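The comparison behind the 'about twice the score' claim can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the figures quoted in this article; the function and variable names are hypothetical and chosen for illustration, and the code is not an official GPQA scoring tool. It also shows that a score computed as the fraction of correct answers on 448 questions cannot exceed 100%.

```python
# Scores quoted in this article (percent of GPQA questions answered correctly).
reported_scores = {
    "GPT-4o": 53.6,
    "Claude 3 Opus": 50.4,
    "Llama 3 400B": 48.0,
}
claimed_o2_score = 105.0  # the figure from Altman's post

GPQA_QUESTION_COUNT = 448  # number of questions, as stated in the article


def accuracy_percent(correct: int, total: int = GPQA_QUESTION_COUNT) -> float:
    """Return the percentage of questions answered correctly (hypothetical helper)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return 100.0 * correct / total


def ratio_to_claim(score: float) -> float:
    """Return how many times larger the claimed o2 score is than the given score."""
    return claimed_o2_score / score


for model, score in reported_scores.items():
    print(f"{model}: {score:.1f}% (claimed o2 score is {ratio_to_claim(score):.2f}x)")

# Answering every question correctly gives the highest score this formula allows.
print(f"Maximum possible accuracy: {accuracy_percent(GPQA_QUESTION_COUNT):.1f}%")
```

Under these assumptions, the ratio for GPT-4o comes out just under 2, which matches the 'about twice' wording in the headline.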
The 'o2' mentioned by CEO Altman can be assumed to be a successor to 'GPT-4o' and 'OpenAI o1'. It is unclear when 'o2' will be released, but CEO Altman said on Friday, November 1, 2024 that 'multiple products will be released in 2024'. However, CEO Altman also said that 'there are no plans to release GPT-5 in 2024', and OpenAI has denied reports by some media that a next-generation AI model called 'Orion' would be released in 2024.
OpenAI's CEO descends on an online bulletin board and drops important information one after another, such as 'GPT-5 will not be released in 2024,' 'Multiple products will be announced in 2024,' and 'AGI can be realized with existing hardware' - GIGAZINE
Immediately after posting the message, Altman wrote, 'Damn, wrong account.'
Damn, wrong account
— Sam Altman (@sama) November 2, 2024
in Software, Posted by log1o_hf