Jan 09, 2024 11:26:00

OpenAI argues that "The New York Times tricked ChatGPT into copying its articles" and claims that training is fair use

After The New York Times, a major American daily newspaper, filed a lawsuit against OpenAI and Microsoft for copyright infringement in December 2023, OpenAI published a post countering that the lawsuit is without merit.

OpenAI and journalism
https://openai.com/blog/openai-and-journalism

OpenAI claims The New York Times tricked ChatGPT into copying its articles - The Verge
https://www.theverge.com/2024/1/8/24030283/openai-nyt-lawsuit-fair-use-ai-copyright

OpenAI warns copyright crackdown could doom ChatGPT
https://www.telegraph.co.uk/business/2024/01/07/openai-warns-copyright-crackdown-could-doom-chatgpt/

In a blog post published on January 8, 2024, OpenAI revealed that it had been negotiating with The New York Times over the use of its articles until December 19, 2023. OpenAI said that learning about the lawsuit by reading the newspaper on December 27 was "a surprise and a disappointment to us."

The main thrust of The New York Times' complaint is that ChatGPT plagiarized its articles. In response, OpenAI contends that The New York Times intentionally made the model reproduce its training data so that it would appear the articles had been plagiarized.

According to OpenAI, the New York Times articles presented as evidence of plagiarism by ChatGPT are old articles that have been reprinted on multiple sites. OpenAI claims that even if ChatGPT used them as training data, it has taken measures against "regurgitation," in which training data is output as is, including removing the ChatGPT feature called "Browse," so articles are not duplicated in normal use.

OpenAI says it appears that The New York Times intentionally manipulated its prompts, including long excerpts of articles, in order to force the model to regurgitate them. Even with such prompts, OpenAI says, its models do not usually behave the way The New York Times describes, which suggests that the newspaper either instructed the model to regurgitate or cherry-picked its examples from many attempts.

While denying The New York Times' copyright infringement claim, OpenAI emphasizes that the use of copyrighted content is essential to the development of AI.

In a submission to the UK House of Lords Communications and Digital Committee, OpenAI said that because copyright today covers virtually every kind of expression, including blog posts, photographs, forum posts, pieces of software code, and government documents, it is impossible to train major AI models without using copyrighted material.

OpenAI also points out that jurisdictions such as the EU, Japan, Singapore, and Israel have laws that allow copyrighted content to be used to train AI models, and expressed the view that such use is fair use. The Japanese law that OpenAI pointed to refers to "minor use incidental to information processing by computer and provision of the results," as stipulated in Article 47-5 of the Copyright Act.

In light of this, OpenAI said that it believes The New York Times' lawsuit is without merit, but that it nonetheless looks forward to a constructive partnership with The New York Times. OpenAI added that it is excited about the transformative potential of AI and looks forward to continued collaboration with news organizations to support their ability to produce high-quality journalism.


Jan 09, 2024 11:26:00 in AI, Software, Posted by darkhorse_log