It is pointed out that GitHub Copilot, which automatically completes source code, 'outputs copyrighted code'
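In June 2022, GitHub, a software development platform, released 'GitHub Copilot', a service that automatically completes source code. Tim Davis, a developer who tried the service, has pointed out that it outputs large chunks of his copyrighted code with no attribution and no license notice.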
@github copilot, with 'public code' blocked, emits large chunks of my copyrighted code, with no attribution, no LGPL license. For example, the simple prompt 'sparse matrix transpose, cs_' produces my cs_transpose in CSparse. My code on left, github on right. Not OK. pic.twitter.com/sqpOThi8nf
— Tim Davis (@DocSparse) October 16, 2022
GitHub Copilot is a service that automatically completes source code you have written partway, and generates code from logic described in comments. It supports a wide range of frameworks and dozens of programming languages, and works particularly well with Python, JavaScript, TypeScript, Ruby, and Go.
Auto-completed code does not always follow best practices, and Copilot may generate code that does not work with the versions used in the code base, or output unnecessary code. For that reason, GitHub Copilot is unlikely to completely replace human developers, but it is said to be useful for experienced programmers as a 'co-pilot'.
Will 'GitHub Copilot', which automatically writes the continuation of a programmer's source code, drive out engineers? - GIGAZINE
On the other hand, GitHub Copilot was trained on publicly available source code on GitHub regardless of license, and it has been pointed out that this may raise copyright problems.
It is pointed out that the code-completion AI 'GitHub Copilot' is 'a service that sells developers' code without permission' - GIGAZINE
Against this background, Davis actually used GitHub Copilot and presented an example in which 'the copyrighted code I wrote was output'. The image Davis posted shows his own sparse matrix code on the left and, on the right, the code GitHub Copilot generated from the prompt 'sparse matrix transpose, cs_'. The code output by GitHub Copilot is clearly largely the same as the code Davis wrote.
Davis also tried a prompt that contained his name but no function name: 'sparse matrix transpose in the style of Tim Davis'. GitHub Copilot again output a slightly tweaked version of his code, and Davis points out that 'GitHub's AI knows that this is my code.'
Curious about the 'cs_' in the prompt, I tried no function name, just '/* sparse matrix transpose in the style of Tim Davis */'. And got my code again, slightly tweaked. So the @github AI 'knows' this is my code.
— Tim Davis (@DocSparse) October 16, 2022
Try it. Compare with https://t.co/Mtao5982hc pic.twitter.com/4Hwo1kPLH7
The code in question is released under the LGPL license, so anyone may obtain, use, modify, and commercially use it, and Davis says his code is also used in software such as Google Street View and in space development. However, the code output by GitHub Copilot carries no copyright notice, and the concern is that it may be embedded and used in other software without being redistributed under the LGPL license.
It is all humanity-advancing open source code. There are about 6 billion compiled copies of this particular function in the universe. Other codes of mine enable Google StreetView. The USGS sent me an email once thanking me for enabling future presence on Mars. goes on.
— Tim Davis (@DocSparse) October 16, 2022
Ryan Salva, a member of the GitHub Copilot project team, responded to Davis's post. He explained that the suggestion may have drawn on adjacent files open in the editor, and that code appearing frequently in public repositories is easily picked up as a pattern and output. 'I know it's alarming to see code similar to your own appear in an automated suggestion,' Salva said, adding that the development team is still learning.
5/n All that aside, I know it's alarming to see code similar to your own appear in an automated suggestion. This is a new area of development, and we're all learning. I'm personally spending a lot of time chatting with Devs to understand the most responsible way to leverage LLMs.
— ryanjsalva (@ryanjsalva) October 17, 2022
The case has also become a hot topic on the social news site Hacker News, with many comments such as 'People talk about AI piracy in their own field of expertise, but show indifference or positive reactions to AI outside their field' and 'AI technology is certainly wonderful, but I do not agree with training these models on data whose use was never consented to, regardless of license.'
GitHub Copilot, with “public code” blocked, emits my copyrighted code | Hacker News
https://news.ycombinator.com/item?id=33226515
in Software, Posted by log1h_ik