How is the copyright issue of image generation AI interpreted overseas and in Japan?

Image generation AI such as Stable Diffusion, Midjourney, and DALL·E can automatically generate images after being trained on datasets made up of huge numbers of photos and illustrations. However, the debate over the copyright of the images in these datasets has not kept pace with the progress of AI technology. The Verge, an IT news site, has summarized the copyright issues surrounding image generation AI.

The scary truth about AI copyright is nobody knows what will happen next - The Verge

Q: Can I copyright what an AI model creates?
The US Copyright Office does not recognize copyright protection for purely machine-generated works.

"There is no copyright in the work of art created by AI," the US Copyright Office rejects AI's copyright - GIGAZINE

However, if the creator can prove that there was human input, copyright may be recognized. In September 2022, the US Copyright Office approved the copyright registration of a comic created using the image generation AI Midjourney. The comic is an 18-page work in a conventional comic format, with characters, dialogue, and panel layouts.

Copyright registration is recognized for the first time for art created by AI - GIGAZINE

Kristina Kashtanova, the artist who created the work, said the US Copyright Office had asked for details of the production process to demonstrate that there was substantial human involvement in the creation of the comic.

Andres Guadamuz, a legal scholar at the University of Sussex who specializes in AI and intellectual property, says that whether to grant copyright to works generated with the help of AI will be an ongoing problem. For example, a picture generated just by entering "Van Gogh's cat" would not involve enough human input to be copyrightable in the United States, Guadamuz said. But if you go further, for example by entering prompts to generate multiple images, adjusting settings, and exploring seed values, the resulting work would be copyrightable.

In September 2022, news that a picture created with Midjourney had won first place at an art fair became a hot topic. The work involved a lot of human effort: the artist himself says it took weeks to refine the prompts and then hand-edit the finished piece.

The picture drawn by the image generation AI "Midjourney" took first place at the art competition and the human artist was furious - GIGAZINE

In the United Kingdom, where Stability AI, the developer of Stable Diffusion, is headquartered, copyright protection is recognized even for works generated solely by computers, and the author is defined as "the person who made the necessary arrangements for the creation of the work." There are different interpretations as to whether this author is the developer of the AI model or the person who set up the AI model, but there is no doubt that the AI-generated work itself is subject to copyright protection.

However, Guadamuz said, "The US Copyright Office is not a court. To sue someone for copyright infringement, you need to register with the Copyright Office, but it is the courts that decide whether that registration is legally enforceable."

Q: Can copyrighted data be used to train AI models?
Most AI models are trained on vast amounts of content collected from the internet, including text, code, and images. Stable Diffusion, for example, is trained on a dataset of hundreds of millions of images collected from hundreds of sites, including photos and illustrations uploaded to personal blogs, illustration sites like DeviantArt, and stock image sites like Shutterstock and Getty Images. AI research groups and startups justify using the images in these datasets as "fair use."

Some people criticize this, saying, "By entrusting the creation of datasets to universities and non-profit organizations, AI companies are evading legal liability."

Criticism that "companies are escaping legal liability by leaving the creation of datasets for AI learning to universities and non-profit organizations" - GIGAZINE

Professor Daniel Gervais, who specializes in intellectual property law at Vanderbilt University Law School, said that two questions are very important in determining whether a use is fair use: "What is the purpose of the use?" and "What is the impact on the market?" Whether the use threatens the original author's livelihood by competing with the original work is also emphasized.

In Gervais's view, training AI models on copyrighted data is "relatively likely" to qualify as fair use. However, whether content generation qualifies as fair use is another matter. In other words, even if it is permissible to train an AI model on other people's data, generating a work with that model may still be copyright infringement.

For example, when an AI model trained on a huge dataset generates an image, the output is generally not a copy of any image in the dataset but an intricate blend of many elements. It is therefore highly unlikely that the output will directly threaten the existing market. However, The Verge explains that you could be sued for copyright infringement if a model trained on a particular artist's paintings produces a painting that copies that artist's style.

Gervais said, "If you train an AI on 10 Stephen King novels and instruct it to write a Stephen King novel, the result will be in direct competition with Stephen King. Can you call that fair use? Probably not."

Ryan Khurana of Wombo, which develops image generation AI, commented, "Deliberately using prompts that draw on copyrighted works to generate output violates the terms of service of every major company." According to Khurana, many AI companies seem to be aware that fair use applies differently to "training on datasets containing copyrighted works" and to "generating images with AI trained on those datasets." He also says, "Rather than restricting training datasets, companies are trying to devise ways to keep AI models from being used for piracy."

In addition, Gervais said that the legal interpretation of fair use may change significantly depending on the outcome of the dispute over the artist Andy Warhol's use of a photo of the musician Prince in his work.

What is the Andy Warhol case that is being discussed about fair use and cultural development? -GIGAZINE

by Andy Warhol: 32 Campbell's Soup Cans

Q: How can artists and AI companies reconcile?
Even if AI model training turns out to be fair use, the problem will not be solved. The anger of artists whose work has been used to train AI will not subside, and the fair use interpretation does not necessarily apply to every AI.

The easiest solution is to license the data and pay the artists. Matthew Butterick, a lawyer who is suing a company that built an AI model dataset, told The Verge, "Napster was completely illegal in the early 2000s, but now we have Spotify, iTunes, and so on. How did these systems come into being? Companies entered into licensing agreements and brought in content legally. All the parties came together and made it work. I think the same thing will happen with AI."

Khurana also commented, "Music involves various licenses and rights holders, and its copyright rules are overwhelmingly complicated. I think generative AI will evolve to have a similar licensing system."

Efforts to adopt licensing agreements for AI datasets are already underway. Shutterstock has partnered with OpenAI, which develops the image generation AI DALL·E, and has announced that it is building a system to pay rewards to the creators of the material used for training.

OpenAI, developer of the image generation AI "DALL·E," and Shutterstock, one of the largest stock photo providers, partner to offer image generation functions to users within the next few months and to build a mechanism to pay rewards to the authors of training source material - GIGAZINE

However, because training datasets are so huge, it is not realistic to license every image, video, audio file, and text they contain. Mark Lemley and Bryan Casey, authors of Fair Learning, a paper on the use of copyrighted material, argued that allowing any copyright claim is equivalent to saying not that copyright owners will be paid, but that the use will not be permitted at all.

Regarding the legal interpretation of AI training and generation in Japan, attorney Taichi Kakinuma of STORIA Law Office has published the following views.

Image automatic generation AI such as Midjourney, Stable Diffusion, mimic and copyright | STORIA Law Office

Image automatic generation AI such as Midjourney, Stable Diffusion, mimic and copyright (Part 2) | STORIA Law Office

Kakinuma explains as follows whether it is actually possible to prohibit an illustration from being used for training by declaring, "I prohibit the use of my illustrations for AI training."

To state the conclusion first, a "contract" is not formed just by making such a unilateral statement. For a contract to be formed, both parties to it must agree.
Therefore, a mere "declaration" is not enough to establish a contract stating that "the use of this image for AI training is prohibited." Some form of agreement from the user is required (e.g. making the images accessible only after the user accepts the terms of use).
However, even if a "contract" is effectively established by having users agree to the terms of use, its effect is limited. This is because a "contract" is valid only between the parties who entered into it.

In 2018, Kakinuma participated in the formulation of guidelines (PDF file) as a member of the Ministry of Economy, Trade and Industry's "Contract Guidelines for Use of AI and Data" and "Contract Guidelines for Manufacturing Startups."

in Software,   Art, Posted by log1i_yk