Mar 17, 2023 12:00:00

OpenAI co-founder says "We were wrong": citing the dangers of AI, a major shift to a policy of not disclosing data

OpenAI, the AI research organization behind the language model ChatGPT and the image generation AI DALL·E, was launched in 2015 as a non-profit organization with the aim of preventing abuse of AI technology and developing it in a socially beneficial way. Ilya Sutskever, one of the founding members of OpenAI, said in an interview with The Verge, an IT news site, that the original idea of making AI open was a mistake.

OpenAI co-founder on company's past approach to openly sharing research: 'We were wrong' - The Verge
https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview

On March 14, 2023, OpenAI released GPT-4, the latest version of the language model used for ChatGPT. GPT-4 has already caused a stir in the AI community with its performance and flexibility, such as scoring in the top 10% on the bar exam and getting past a CAPTCHA-style Turing test by pretending to be a visually impaired person. However, some AI experts and researchers have expressed disappointment that too little information about GPT-4 has been disclosed.

According to Ben Schmidt, vice president of information design at AI company Nomic AI, it is essential to publish training data in order to check and correct biases and errors of AI.

However, OpenAI has not disclosed the dataset or training methods used to build GPT-4. Schmidt pointed out that the paper introducing GPT-4 declares that OpenAI is disclosing nothing about the contents of its training set.

I think we can call it shut on 'Open' AI: the 98 page paper introducing GPT-4 proudly declares that they're disclosing *nothing* about the contents of their training set. pic.twitter.com/dyI4Vf0uL3
— Ben Schmidt / @benmschmidt@vis.social (@benmschmidt) March 14, 2023

In the technical report (PDF file) cited by Schmidt, OpenAI states: "Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."

Amid criticism of OpenAI's decision, Ilya Sutskever, OpenAI's chief scientist and co-founder, told The Verge that GPT-4's training data is being kept private for reasons of competition and safety, and that the choice to do so was "obvious."

First, regarding competition in AI development, Sutskever said, "GPT-4 was not easy to develop. There are many other companies that want to make the same thing, and in terms of competition you can see this as the field maturing."

On safety, Sutskever said, "These models are very powerful, and they will become even more powerful. At some point it will be quite easy, if one wanted, to cause a great deal of harm with these models. That is why, as the capabilities get higher, it makes sense that you don't want to disclose them."

Keeping AI closed is a major policy change for OpenAI, which was established with the aim of conducting AI research openly. Sutskever himself wrote in a blog post when OpenAI launched in December 2015, "As a non-profit, our aim is to build value for everyone rather than shareholders."

Asked by The Verge why OpenAI changed its initial stance on sharing AI research, Sutskever said, "We were completely wrong. If you believe that artificial general intelligence is going to be incredibly powerful, then open-sourcing it makes no sense and is a bad idea. I think that in a few years it will be obvious to everyone that open-sourcing AI is not wise."

Legal liability has been pointed out as another reason why OpenAI is keeping the details of GPT-4 private. Language models are trained on large amounts of text data, most of which has been scraped from the internet, so the dataset may contain copyrighted material. This problem is shared not only by language models but also by image generation AI trained on paintings and illustrations.

When asked about this, Sutskever said, "In my view, training data is technology. It may not look like it, but it is. And the reason we don't disclose the training data is pretty much the same reason we don't disclose the number of parameters." However, he did not answer a question about whether OpenAI's training data infringes copyright.

As AI develops rapidly, major IT companies are rushing to incorporate it into their products, often leaving discussions about safety and ethics behind. For example, Microsoft, which revealed that its conversational AI Bing Chat is based on GPT-4, laid off an internal team specializing in research on the risks of AI technology in March 2023.

It turns out that Microsoft fired the "team that warns of the risks of AI technology" - GIGAZINE

Jess Whittlestone, who is in charge of AI policy at the British think tank The Center for Long-Term Resilience, expressed some understanding of OpenAI's decision not to share details of GPT-4, but also noted that there are concerns about the centralization of AI.

Whittlestone told The Verge, "The choice of whether or not to make AI public should not be left to individual companies. I believe it is necessary to scrutinize whether it makes sense to release it to the world."

Mar 17, 2023 12:00:00 in Software, Posted by log1l_ks