Is 'Copilot', the programming AI trained on GitHub source code, a copyright infringement?



In June 2021, the software development platform GitHub released 'GitHub Copilot', a feature that autocompletes the continuation of source code as it is being written. Copilot was trained on all the source code on GitHub regardless of license, and GitHub, a Microsoft subsidiary, plans to use it commercially. That has set off a debate over whether Copilot poses a copyright problem.

Julia Reda – GitHub Copilot is not infringing your copyright
https://juliareda.eu/2021/07/github-copilot-is-not-infringing-your-copyright/

Copilot, released by GitHub in collaboration with the artificial intelligence research organization OpenAI, is a feature that automatically completes the 'continuation' of partially written source code. Its features are summarized in detail in the following article.

'GitHub Copilot', a feature that automatically completes the 'continuation' of source code, has appeared on GitHub with the cooperation of OpenAI - GIGAZINE



Copilot has been described as 'much better than the controversial writing AI GPT-3', and there is also a debate over whether it will drive engineers out of work.

Will 'GitHub Copilot', which automatically writes the continuation of a programmer's source code, put engineers out of work? - GIGAZINE



Meanwhile, software developer Nora Tindall posted an exchange with GitHub about Copilot on Twitter. Tindall asked GitHub, 'Was the code in my GitHub account used to train Copilot?' GitHub replied, 'All the code published on GitHub was used for training. We did not distinguish the code by license type.' 'What the hell, they literally have no shame,' Tindall wrote in condemnation.



Unwilling to accept that GitHub had used its users' code to train an AI without permission and without regard for their copyrights, Tindall said, 'I have contacted the legal departments of the Free Software Foundation and the Electronic Frontier Foundation. If you are interested in taking part in a class action, please contact me. I can't do anything by myself.'



Tindall's tweet reporting the exchange with GitHub had received more than 4000 likes at the time of writing. The tweet was also picked up in a thread on the online bulletin board Hacker News, which drew many posts. One comment said, 'If the training data contains GPL-licensed code, Copilot should also be distributed under the GPL. Some people ask how this differs from a human learning by reading code, but AI and humans differ significantly: a human is not something that is meant to be copied and distributed.' Another said, 'Before this tweet drew attention, only lawyers and business owners cared about programmers copying short pieces of code. How did everyone suddenly start insisting on maximal copyright? Are people just being swept along?' The discussion was in disarray.

Regarding these discussions, Julia Reda, a former member of the European Parliament for the Pirate Party, said, 'Because a large company, Microsoft, the parent company of GitHub, is trying to analyze free software and profit from it, it may seem natural for copyleft advocates to try to use copyright to prevent it.' Reda thus showed understanding for those who object to seeing copyleft, an idea that permits the redistribution and modification of code, exploited by a large corporation.



On the other hand, Reda said, 'Under EU copyright law, scraping GPL-licensed code and other copyrighted works is legal regardless of license. In the United States, scraping also falls under fair use.' Reda pointed out that data mining code for AI training does not constitute copying that infringes copyright law. Reda also concluded that the code generated by Copilot is not a derivative work as defined by copyright law, because the generated output does not carry over what makes the original code original.

in Software, Posted by log1l_ks