GitHub Copilot, which auto-completes source code, is accused of outputting copyrighted code



In June 2022, the software development platform GitHub released "GitHub Copilot", a service that automatically completes the code a programmer is about to write, to all users. Used well, GitHub Copilot is expected to speed up development dramatically, but Tim Davis, a computer science professor at Texas A&M University, has pointed out that GitHub Copilot is outputting copyrighted code that he wrote.



GitHub Copilot is a service that automatically completes partially written source code and turns logic described in comments into code. It supports a wide range of frameworks and dozens of programming languages, and works particularly well with Python, JavaScript, TypeScript, Ruby, and Go.

Auto-completed code does not always follow best practices, and it may not work with the versions used in the code base or may include unnecessary code. GitHub Copilot is therefore unlikely to replace human developers completely, but it is said to be useful to experienced programmers as a "co-pilot".

Will "GitHub Copilot", which automatically writes the continuation of a programmer's source code, replace engineers? - GIGAZINE



On the other hand, GitHub Copilot was trained on source code hosted on GitHub regardless of its license, and it has been pointed out that this may raise copyright problems.

Code auto-completion AI "GitHub Copilot" is criticized as "a service that sells developers' code without permission" - GIGAZINE



Against this background, Davis tried GitHub Copilot himself and shared an example in which, he says, copyrighted code he wrote was output. The image below shows Davis's own sparse matrix code on the left, and on the right the code GitHub Copilot generated from the prompt "sparse matrix transpose, cs_". The code output by GitHub Copilot is indeed largely the same as the code Davis wrote.
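For readers unfamiliar with the operation named in the prompt, the following is a minimal, independently written Python sketch of a sparse matrix transpose. It is neither Davis's code nor GitHub Copilot's output. It assumes the matrix is stored in compressed sparse column (CSC) form, and the function name `csc_transpose` and all parameter names are hypothetical, chosen only for this illustration.

```python
def csc_transpose(n_rows, n_cols, col_ptr, row_idx, values):
    """Transpose an n_rows x n_cols matrix stored in CSC form.

    col_ptr has n_cols + 1 entries; column j holds the entries at
    positions col_ptr[j] .. col_ptr[j + 1] - 1 of row_idx and values.
    Returns (t_col_ptr, t_row_idx, t_values) for the transposed matrix.
    """
    nnz = len(row_idx)

    # Count the entries in each row; each row becomes a column of the transpose.
    counts = [0] * n_rows
    for r in row_idx:
        counts[r] += 1

    # A running sum of the counts gives the column pointers of the transpose.
    t_col_ptr = [0] * (n_rows + 1)
    for i in range(n_rows):
        t_col_ptr[i + 1] = t_col_ptr[i] + counts[i]

    # next_slot[r] is the next free position in column r of the transpose.
    next_slot = t_col_ptr[:-1].copy()
    t_row_idx = [0] * nnz
    t_values = [0] * nnz

    # Scatter every entry (r, j) of the input to position (j, r) of the output.
    for j in range(n_cols):
        for p in range(col_ptr[j], col_ptr[j + 1]):
            r = row_idx[p]
            q = next_slot[r]
            t_row_idx[q] = j
            t_values[q] = values[p]
            next_slot[r] += 1

    return t_col_ptr, t_row_idx, t_values


# The 2 x 3 matrix [[1, 0, 2], [0, 3, 0]] in CSC form.
result = csc_transpose(2, 3, [0, 1, 2, 3], [0, 1, 0], [1, 3, 2])
print(result)
```

Because the columns of the input are visited in order, the row indices within each column of the result come out sorted without a separate sorting step.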



Davis also ran an experiment with a prompt that included his own name but no function name: "sparse matrix transpose in the style of Tim Davis". GitHub Copilot reportedly output a slightly tweaked version of Davis's code, and Davis points out that "GitHub's AI knows that this is my code."



The code in question is released under the LGPL license, so anyone may obtain, use, and modify it, including for commercial purposes, and it is reportedly used in software such as Google Street View and in space development. However, the code output by GitHub Copilot carries no copyright notice, so anyone who incorporates it into their own software may end up redistributing it without complying with the LGPL, which is seen as a problem.



Ryan Salva, a member of the GitHub Copilot project team, responded to Davis's post, explaining that the suggestion may have been drawn from adjacent files open in the editor, and that code appearing frequently in public repositories is easily learned as a pattern and output. "I understand it's disturbing that code similar to yours appears in automated suggestions," Salva said, adding that the development team is still learning.



The case has also become a hot topic on the social news site Hacker News, with many comments such as "People complain about AI piracy in their own field of expertise, but are indifferent or positive about AI outside it" and "AI technology is certainly wonderful, but I do not agree with training these models on data whose use was never consented to, regardless of license."

GitHub Copilot, with “public code” blocked, emits my copyrighted code | Hacker News
https://news.ycombinator.com/item?id=33226515

in Software, Posted by log1h_ik