OpenAI creates scale to evaluate 'how close large language models are to human intelligence'



OpenAI, the developer of large language models such as GPT-4o, has revealed that it has created an evaluation scale to show how close the intelligence of large language models has come to human level.

OpenAI Sets Levels to Track Progress Toward Superintelligent AI - Bloomberg

https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai



Here's how OpenAI will determine how powerful its AI systems are - The Verge
https://www.theverge.com/2024/7/11/24196746/heres-how-openai-will-determine-how-powerful-its-ai-systems-are

An OpenAI spokesperson told the business news outlet Bloomberg that the new AI evaluation scale was shared at an all-hands meeting for OpenAI employees.

The scale consists of five levels, from level 1 to level 5, and the higher the level, the closer the AI is judged to be to human capability.

OpenAI states that its large language models are at level 1 at the time of writing and are approaching level 2. According to OpenAI, a level 2 system has basic problem-solving ability equivalent to that of a human with a doctoral-level education. Level 3 is defined as a system that can 'act on behalf of the user,' level 4 as one that 'can create new innovations,' and the highest, level 5, as one that 'can perform the work of the entire organization.'



OpenAI's new rating scale was introduced shortly after the company signed a partnership with Los Alamos National Laboratory.

OpenAI and Los Alamos National Laboratory collaborate to strengthen AI safety - GIGAZINE



OpenAI aims to develop artificial general intelligence (AGI), which it describes as 'a system that is highly autonomous at a level that exceeds humans in most economically valuable tasks.' CEO Sam Altman said in October 2023 that 'it will take about five more years to complete AGI,' but achieving AGI equivalent to level 5 will require enormous computing power and funding.

However, this rating scale is provisional and may be subject to adjustment in the future based on feedback from employees, investors, and the board of directors.

in Software, Posted by log1i_yk