Authors sue OpenAI, claiming that ``pirated books were used to train ChatGPT''



ChatGPT, which generates highly accurate text in an interactive format, is trained on a large amount of text data, allowing it to hold natural, human-like conversations, summarize articles and stories, and generate sophisticated text across genres. On the other hand, because the content and origin of its training dataset are unclear, copyright problems are increasing, including class action lawsuits alleging that it infringes the copyrights and privacy of countless people. Now two novelists are also suing OpenAI, the developer of ChatGPT, saying that ``ChatGPT obtains pirated copyrighted works and uses them for training.''

Authors Accuse OpenAI of Using Pirate Sites to Train ChatGPT * TorrentFreak
https://torrentfreak.com/authors-accuse-openai-of-using-pirate-sites-to-train-chatgpt-230630/



(PDF file) authors-vs-openai.pdf
https://torrentfreak.com/images/authors-vs-openai.pdf



On June 28, 2023, California-based Clarkson Law Firm filed a complaint alleging that OpenAI violated federal and state privacy laws by training ChatGPT on vast amounts of text found on the Internet without asking or notifying users. The complaint calls for a temporary suspension of ChatGPT until safeguards are in place, and seeks damages for scraping personal information without consent.

OpenAI, the developer of ChatGPT, faces a class action lawsuit over AI training data - GIGAZINE



In addition, a class action lawsuit has been filed against the image generation AIs Stable Diffusion and Midjourney, claiming that they were trained on billions of copyrighted images that were downloaded and used without compensation to, or consent from, the artists.

A class action lawsuit is filed against image generation AI ``Stable Diffusion'' and ``Midjourney''-GIGAZINE



In addition, California-based Joseph Saveri Law Firm filed a complaint on June 28, 2023, claiming that ``OpenAI used pirated books to train the ChatGPT model.'' The plaintiffs are novelists Paul Tremblay and Mona Awad, who write science fiction and horror, and who claim that their works, although protected by copyright, were used to train ChatGPT without permission. The complaint states, ``When you enter a prompt into ChatGPT, it can generate a synopsis or summary of a work even if the work is copyrighted. That is possible only if it was trained on the work.''

A distinctive feature of the complaint by Joseph Saveri Law Firm is its claim that ``OpenAI used pirate sites for training.'' In particular, it notes that Z-Library, the world's largest pirated e-book site, has come back online and its user base has grown, even though its operator has been arrested and criminally prosecuted by the US Department of Justice. The complaint points to the possibility that ``Z-Library's pirated books were acquired as training data for ChatGPT by scraping.''

The pirated e-book repository ``Z-Library'' is used by more than 600,000 people worldwide, including students and teachers, despite repeated closures-GIGAZINE



The complaint also identifies as a major problem the fact that OpenAI does not disclose the dataset used to train ChatGPT. Earlier papers related to ChatGPT say only that the training data included web pages and two book databases, 'Books1' and 'Books2'. ``OpenAI must have used pirated resources,'' Tremblay and Awad argue, since there is no official database containing so many books. In conclusion, the complaint asserts that OpenAI has infringed copyright and seeks statutory damages of $150,000 (about 21.7 million yen) for each work.

At the ``House Hearing on Copyright and AI'' held in May 2023, the U.S. Congress indicated that it is ``not going to overreact to copyright issues related to generative AI.'' Even so, this lawsuit has attracted the attention of both users of generative AI and rights holders of various images and documents. It is expected to clarify whether using copyrighted works to train AI constitutes copyright infringement.

in Web Service, Posted by log1e_dh