OpenAI reportedly offering media outlets $1 million to $5 million per year in licensing fees to use their content to train AI
OpenAI, which develops generative AI such as ChatGPT, has been accused of using news media content to train its AI, and some media outlets have filed lawsuits against it. In response, OpenAI is reportedly in talks to pay news media outlets licensing fees of $1 million to $5 million per year.
OpenAI Offers Publishers as Little as $1 Million a Year — The Information
OpenAI In Talks With Dozens of Publishers to License Content - Bloomberg
https://www.bloomberg.com/news/articles/2024-01-04/openai-in-talks-with-dozens-of-publishers-to-license-content
OpenAI's news publisher deals reportedly top out at $5 million a year - The Verge
https://www.theverge.com/2024/1/4/24025409/openai-training-data-lowball-nyt-ai-copyright
OpenAI offering media outlets as little as $1 million to use news articles for AI models: report
https://nypost.com/2024/01/04/business/openai-offering-media-outlets-as-little-as-1-million-to-use-news-articles-for-ai-models-report/
2023 was a year of remarkable progress in generative AI, including OpenAI's ChatGPT, Google's Bard, and Microsoft's Copilot. These generative AIs are built on large language models (LLMs), which are known to be trained on a wide variety of data available on the Internet. However, the inclusion of books by famous authors in the training datasets has become an issue, and there have been calls for regulators to take strict action.
Google, which develops the chat AI Bard, also initially announced that it would 'scrape everything published online for its AI' to promote the development of AI tools. However, after receiving backlash from content creators, the company announced an option that lets website owners prevent their sites from being used to train generative AI.
Google announces option to prevent your website from being used to train generative AI, but some point out that it's too late - GIGAZINE
Media outlets have pushed back against their content being used to train AI. The New York Times, the third-largest daily newspaper in the United States, has not only blocked crawlers from using its content to train generative AI, but has also sued OpenAI and Microsoft for copyright infringement.
Major daily newspaper New York Times sues OpenAI and Microsoft for copyright infringement - GIGAZINE
The New York Times claims that GPT-4, the LLM underlying generative AI such as OpenAI's ChatGPT and Microsoft's Copilot, was trained on New York Times content, 'enabling the AI to output content that mimics the New York Times' writing style, resulting in the AI generating content that directly competes with The New York Times.'
In fact, one of the documents the New York Times submitted to the court as evidence showed that the 'text about 2019 Pulitzer Prize winners' (left) output by GPT-4, the LLM on which ChatGPT is based, was almost a direct copy of the 'New York Times article' (right). The passages in red are those copied verbatim from the New York Times article, and they make up nearly the entire output.
In response to this backlash, it was also reported that OpenAI was in discussions with major American media outlets about the price and terms for licensing their content. According to the report, OpenAI is in licensing discussions with major American media companies such as Gannett, publisher of USA Today; News Corp, publisher of The Wall Street Journal; and IAC, operator of The Daily Beast. Microsoft, OpenAI's largest investor, also appears to have participated in the discussions between OpenAI and the media.
Inside the News Industry's Uneasy Negotiations With OpenAI - The New York Times
https://www.nytimes.com/2023/12/29/business/media/media-openai-chatgpt.html
Some media companies have already agreed to let OpenAI use their content in exchange for payment. Axel Springer, a major German media company that owns outlets such as Politico and Business Insider, signed a contract with OpenAI in December 2023 to allow ChatGPT to obtain data directly from Politico and Business Insider. The Associated Press also signed a contract to allow OpenAI to train AI models on its news articles.
AP, Open AI agree to share select news content and technology in new collaboration: https://t.co/MnqRD3HBHe
— AP CorpComm (@AP_CorpComm) July 13, 2023
And now, The Information has reported that OpenAI is offering to pay between $1 million and $5 million per year in licensing fees to use news media content to train its AI. The Verge, which covered the report, said, 'This is one of the first indicators of how much OpenAI plans to spend on data to train its AI.'
A similar case in which a technology company's use of news content became an issue was the 'News' tab introduced on Facebook in 2019. Meta reportedly paid media outlets up to $3 million (approximately 435 million yen) per year in licensing fees for the articles shown in this tab. Google also agreed to pay a total of 100 million Canadian dollars (approximately 11 billion yen) per year to Canadian news organizations under the Online News Act, which requires large online platforms that distribute news to pay usage fees to Canadian news organizations. Based on these cases, The Verge pointed out that licensing fees of '$1 million to $5 million' are roughly in line with existing contracts. Meanwhile, some Reddit users argued that the licensing fees offered to media outlets are too low.
Google agrees to pay $100 million a year to Canadian government to resume news distribution - GIGAZINE
It has also been reported that OpenAI's annual revenue has reached $1.6 billion (approximately 230 billion yen), with monthly revenue reaching up to $130 million (approximately 19 billion yen). OpenAI's annual revenue in 2022 was only $28 million (approximately 4 billion yen), so revenue grew about 57 times compared to the previous year. In addition, OpenAI's annual revenue in 2024 is expected to reach $5 billion (approximately 720 billion yen), so the licensing fees that OpenAI plans to pay to the media are unlikely to be much of a burden for the company.
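The comparison above can be checked with some quick arithmetic. The following is only a rough sketch in Python using the figures reported in this article; the function names are made up for illustration, and the percentages are per publisher, since the total number of deals OpenAI might sign is not known.

```python
# Figures as reported in the article, in US dollars.
REVENUE_2022 = 28_000_000        # reported 2022 annual revenue
REVENUE_2023 = 1_600_000_000     # reported current annual revenue
LICENSE_FEE_LOW = 1_000_000      # low end of the reported yearly offer
LICENSE_FEE_HIGH = 5_000_000     # high end of the reported yearly offer


def growth_multiple(previous: float, current: float) -> float:
    """Return how many times larger `current` is than `previous`."""
    return current / previous


def share_of_revenue(fee: float, revenue: float) -> float:
    """Return a single licensing fee as a percentage of annual revenue."""
    return fee / revenue * 100


multiple = growth_multiple(REVENUE_2022, REVENUE_2023)
low_share = share_of_revenue(LICENSE_FEE_LOW, REVENUE_2023)
high_share = share_of_revenue(LICENSE_FEE_HIGH, REVENUE_2023)

print(f"Revenue growth: about {multiple:.1f}x")
print(f"One license as a share of revenue: {low_share:.4f}% to {high_share:.4f}%")
```

Under these figures, revenue grew roughly 57-fold, and a single publisher's license would amount to between about 0.06% and 0.31% of the reported annual revenue.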
Apple has also reportedly been in talks with multiple media outlets for a multi-year deal worth more than $50 million to train its AI using news content.
It turns out that Apple has discussed multi-year deals worth over $50 million with various media outlets to train generative AI on news articles - GIGAZINE
in Software, Posted by logu_ii