OpenAI is facing problems in the development of its next-generation AI model 'GPT-5,' and costs are rising
OpenAI's Next Big AI Effort, GPT-5, Is Behind Schedule and Crazy Expensive - WSJ
https://www.wsj.com/tech/ai/openai-gpt5-orion-delays-639e7693
Since OpenAI released ChatGPT in 2022 and shocked the world, AI has been improving rapidly and is already becoming more prevalent in our lives. Analysts estimate that major companies could spend $1 trillion (about 157 trillion yen) on AI projects in the next few years, and at the center of this is OpenAI, which is 'ground zero of the AI boom.'
OpenAI has been working on GPT-5, intended to be a major advance over the technology underlying ChatGPT, for more than 18 months, since the first half of 2023. In July 2023, the company filed a trademark application for 'GPT-5' with the United States Patent and Trademark Office (USPTO), and it was reported that GPT-5 would be released in the summer of 2024. Microsoft, OpenAI's most important investor, also expected GPT-5 to appear in 2024.
However, OpenAI CEO Sam Altman appeared in a question-and-answer session on Reddit on November 1, 2024, and commented, 'We have some great products coming out in the second half of 2024! But that doesn't include a product called "GPT-5,"' making clear that GPT-5 would not be released in 2024.
OpenAI's CEO shows up on an online forum and drops important information one after another, such as 'GPT-5 will not be released in 2024,' 'Multiple products will be announced in 2024,' and 'AGI can be realized with existing hardware' - GIGAZINE
GPT-5 is expected to handle everyday human tasks, such as making an appointment with someone or booking a flight. Existing AI models also suffer from hallucinations, in which the model asserts incorrect information as if it were fact, and researchers hope that GPT-5 will significantly reduce hallucinations, or at least allow the AI to flag information it is uncertain about.
There is no fixed standard for how capable an AI model must be to merit the name 'GPT-5,' but so far the model does not seem to have performed to OpenAI's expectations. OpenAI has conducted at least two large-scale training runs for GPT-5, but each time new problems have arisen. The Wall Street Journal reports that the results so far do not justify the enormous cost of building and running the model.
Many AI models, including GPT-4, are trained over months on huge amounts of data in data centers equipped with tens of thousands of high-performance chips. According to CEO Altman, training GPT-4 cost more than $100 million, and future AI models are expected to cost more than $1 billion to train. When training such a large-scale AI model fails, it can be as damaging and disappointing as a space rocket exploding shortly after launch.
Therefore, researchers try to limit the damage by running small-scale tests before committing to large-scale training. OpenAI launched such a test run in mid-2023 to try out new designs for GPT-5, but the results suggested that a full-scale training run would consume an enormous amount of time and money.
It is generally believed that the more data an AI model is trained on, the better it performs, and OpenAI had previously relied on data scraped from the Internet, such as news articles, social media posts, and scientific papers. However, following the disappointing 2023 test run, OpenAI concluded that the public Internet alone would not provide enough data to train GPT-5 and that a more diverse, higher-quality dataset was needed.
To solve this problem, OpenAI adopted a method of creating data from scratch: it hired software engineers and mathematicians to write new software code and solve mathematical problems for GPT-5 to learn from. Jonathan Siddharth, CEO and co-founder of Turing, an AI infrastructure company that partners with OpenAI and Meta, commented, 'We are transferring human intelligence from the human mind to the machine mind.'
According to Turing, a software engineer might be asked to write a program that efficiently solves a complex logic problem, while a mathematician might be asked to calculate the height of a pyramid built from one million basketballs. The answers, along with the reasoning that leads to them, are then incorporated into the AI's training data in the hope of improving its performance.
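The basketball problem is concrete enough to sketch. The following back-of-envelope Python calculation is a hypothetical illustration, assuming a square-pyramid stack of regulation (roughly 24 cm) basketballs with close-packed layers; the article does not report what assumptions a hired mathematician would actually use.

```python
import math

BALL_DIAMETER_M = 0.24   # a regulation basketball is roughly 24 cm across (assumption)
TOTAL_BALLS = 1_000_000

# A square pyramid with n layers holds 1^2 + 2^2 + ... + n^2 balls.
def balls_in_pyramid(n: int) -> int:
    return n * (n + 1) * (2 * n + 1) // 6

# Find the smallest number of layers that holds at least a million balls.
layers = 1
while balls_in_pyramid(layers) < TOTAL_BALLS:
    layers += 1

# In close packing, each layer's centers sit d / sqrt(2) above the layer below;
# add one full diameter for the bottom and top half-balls.
height = BALL_DIAMETER_M + (layers - 1) * BALL_DIAMETER_M / math.sqrt(2)

print(f"layers needed: {layers}")             # -> 144
print(f"approximate height: {height:.1f} m")  # -> ~24.5 m
```

A record pairing the problem statement with this kind of worked solution and final answer is, per the article, exactly the sort of reasoning data being fed into training.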
But this manual process has the disadvantage of being very slow. GPT-4 was trained on an estimated 13 trillion tokens, a scale that hand-written problems and solutions cannot realistically match.
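To get a feel for the scale mismatch, here is a rough, illustrative calculation. All throughput figures below are assumptions made for the sake of the sketch, not reported numbers.

```python
# Back-of-envelope: how long would hired experts take to hand-write
# a GPT-4-scale corpus? Every figure here is an assumption.
TOKENS_PER_WORD = 4 / 3           # common rule of thumb: ~0.75 words per token
WORDS_PER_PERSON_PER_DAY = 5_000  # a very prolific writer
PEOPLE = 1_000

tokens_per_day = PEOPLE * WORDS_PER_PERSON_PER_DAY * TOKENS_PER_WORD
target_tokens = 13e12             # GPT-4's estimated training set

days = target_tokens / tokens_per_day
print(f"{days:,.0f} days (~{days / 365:,.0f} years)")  # -> 1,950,000 days, ~5,342 years
```

Even under generous assumptions, hand-written data covers a vanishing fraction of a frontier-scale corpus, which is why it can only supplement, not replace, scraped data.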
In addition to these already difficult problems, OpenAI has also been hit by internal turmoil, such as the brief dismissal of CEO Altman, and by headhunting from competitors, who have reportedly offered millions of dollars (hundreds of millions of yen) to lure away talent. In 2024, co-founder and chief scientist Ilya Sutskever left to found the AI company Safe Superintelligence, and more than 20 key executives and researchers departed OpenAI.
By 2024, about a year after the release of GPT-4, competitors such as Anthropic and Google had released AI models whose performance approached GPT-4's. Meanwhile, OpenAI's executives began allocating resources to new products built on GPT-4, and a dispute arose between the GPT-5 development team and the new-product teams over limited computing resources.
Nevertheless, the GPT-5 development team, after small-scale training in early 2024, began a large-scale run in May. The run was initially expected to continue until November, but once it was underway a problem surfaced: the training data was less diverse than expected, which threatened to limit how much the model could learn. The issue had not appeared in the small-scale tests, and OpenAI, having already invested too much time and money to start over, scrambled to find new data while training was in progress. It is not clear whether this strategy has borne fruit.
These issues suggest that the 'more-is-more' strategy that once drove the success of AI models is losing steam. 'We have one Internet, so data is not growing,' Sutskever said at an AI conference after founding Safe Superintelligence. 'You could even say that data is the fossil fuel of AI.'
Faced with these problems, OpenAI researchers are turning to reasoning as a new way to advance AI models. The idea is that by spending more time 'thinking' at inference time, a model may be able to solve difficult problems it was never trained on. In September 2024, OpenAI released 'o1,' an AI model that works through problems step by step, but Apple researchers have raised doubts about this approach, claiming that reasoning models, including o1, are not actually solving new problems but merely imitating patterns from their training data.
Apple's AI researchers announce research results showing that current AI language models have lower reasoning ability for arithmetic word problems than elementary school students - GIGAZINE
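OpenAI has not published how o1 spends its extra 'thinking' time, but the underlying intuition that more inference-time computation can buy accuracy is easy to demonstrate with a toy self-consistency experiment: sample a noisy solver repeatedly and take a majority vote. This is a generic illustration of the idea, not a description of o1's internals.

```python
import random
from collections import Counter

random.seed(0)

CORRECT = 42           # the true answer to a hypothetical problem
SOLVER_ACCURACY = 0.6  # a noisy solver that is right 60% of the time

def noisy_solver() -> int:
    """Return the right answer 60% of the time, a wrong one otherwise."""
    return CORRECT if random.random() < SOLVER_ACCURACY else random.choice([7, 13, 99])

def majority_vote(n_samples: int) -> int:
    """Sample the solver n times and return the most common answer."""
    votes = Counter(noisy_solver() for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples: int, trials: int = 10_000) -> float:
    return sum(majority_vote(n_samples) == CORRECT for _ in range(trials)) / trials

for n in (1, 5, 25):
    print(f"samples per question: {n:>2}  accuracy: {accuracy(n):.3f}")
# More samples per question (i.e., more inference-time compute) yields
# higher accuracy: roughly 0.60 at n=1, rising toward 1.0 as n grows.
```

The simulation shows why spending more compute at answer time can pay off even when the underlying model is unchanged, which is the bet behind the o1 line of models.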
The Wall Street Journal reported that 'a more advanced and efficient reasoning model could become the basis for Orion (GPT-5),' and that OpenAI researchers are pursuing that approach, hoping to combine it with the more traditional method of adding data, some of which could come from OpenAI's other AI models.
In addition, CEO Altman announced a new reasoning model, 'o3,' on December 20, 2024, but revealed nothing about GPT-5.
Altman then referenced the Wall Street Journal's article, writing sarcastically, 'The Wall Street Journal is, I think, the best newspaper in the entire United States right now, but just hours after we announced o3, they published an article saying that AI's next leap is behind schedule and becoming prohibitively expensive.'
i think the wsj is the overall best us newspaper right now, but they published an article called 'The Next Great Leap in AI Is Behind Schedule and Crazy Expensive' many hours after we announced o3?! https://t.co/JkA83TNTfO
— Sam Altman (@sama) December 22, 2024