How often do people who visit programming Q&A sites copy and paste code and text?



When you run into a question or problem you cannot solve while programming, you often ask an online community for answers, or look up questions from someone who has already faced the same problem.

Stack Overflow, a site where users can ask and answer programming-related questions, has investigated how often people in the community reuse code and text by copying and pasting, and has published the results.

How often do people actually copy and paste from Stack Overflow? Now we know. --Stack Overflow Blog
https://stackoverflow.blog/2021/04/19/how-often-do-people-actually-copy-and-paste-from-stack-overflow-now-we-know/



Stack Overflow, now an important community for many software developers, was founded out of dissatisfaction with the fact that answers to coding questions were hidden behind paywalls and could only be read for a fee. Copying and pasting code from Stack Overflow is so common that it has inspired the joke 'What would happen to the world if Stack Overflow charged for copying code?'

So Stack Overflow decided to investigate how many people actually copy and paste code and text. The project, which started as a joke, was undertaken by David Gibson, a data analyst on Stack Overflow's product marketing team. Gibson and his team, who have themselves copied code from Stack Overflow for many years, used a web tracking tool developed in-house for the investigation.

The web tracking tool embedded in Stack Overflow does not record the copied content itself. Instead, it collects metadata such as whether the copy came from a question, an answer, or a comment; whether the copied part was a code block or plain text; what score the copied post had; and the region where the copying user lives.
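As a rough illustration of what such a metadata-only record might look like, here is a minimal sketch. The names `PostType` and `CopyEvent` and all field names are hypothetical, chosen for this example; nothing here reflects Stack Overflow's actual in-house tool. The point is only that the record carries the attributes listed above and has no field for the copied text.

```python
from dataclasses import dataclass, fields
from enum import Enum


class PostType(Enum):
    """Where the copy came from (hypothetical names)."""
    QUESTION = "question"
    ANSWER = "answer"
    COMMENT = "comment"


@dataclass(frozen=True)
class CopyEvent:
    """One copy action, described by metadata only (hypothetical schema)."""
    post_type: PostType   # question, answer, or comment
    is_code_block: bool   # True for a code block, False for plain text
    post_score: int       # score of the post that was copied from
    region: str           # coarse region where the copying user lives
    # Deliberately no field holding the copied content itself.


# Example: a copy of a code block from an answer with a score of 12.
event = CopyEvent(PostType.ANSWER, True, 12, "Europe")
field_names = [f.name for f in fields(CopyEvent)]
print(event)
```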

According to a two-week survey conducted from March 26 to April 9, 2021, one in four users who visited Stack Overflow copied something within five minutes of opening a page. While the number of questions, answers, and comments posted to Stack Overflow during the period was 7,305,042, the total number of copies was 40,623,987, more than five times as many. Blocks of code were also copied about 10 times more often than text, which suggests that a great many people copy Stack Overflow code for reference or reuse.
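The 'more than five times' figure can be checked directly from the two totals quoted above. The short sketch below does only that division; the constant and function names are made up for this example.

```python
# Totals quoted in the article for the two-week survey period.
TOTAL_COPIES = 40_623_987
POSTS_AND_COMMENTS = 7_305_042


def copies_per_item(copies: int, items: int) -> float:
    """Return how many copies were made per question, answer, or comment."""
    return copies / items


ratio = copies_per_item(TOTAL_COPIES, POSTS_AND_COMMENTS)
print(f"{ratio:.2f} copies per question, answer, or comment")
```

The result comes out between 5.5 and 5.6, consistent with the article's description.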



The frequency of copying across Stack Overflow tracks site traffic, with most copies occurring during weekday working hours in the region where the user lives. By place of residence, 33% of the users who made copies were in Asia, 30% in Europe, and 26% in North America.

The graph below shows the reputation of the users who made copies. Because a registered account automatically starts with a reputation of 1, the group with a reputation of 0, which makes the most copies, consists of anonymous users who do not have an account. According to Gibson, 86% of copies are made by anonymous users.



In addition, the graph below shows the average number of copies per person for users at each reputation level, excluding users with a reputation of 0. With the number of copies on the vertical axis and reputation on the horizontal axis, the plot shows that the higher the reputation, the lower the average number of copies. Gibson therefore suggested that newcomers may be accelerating their learning by copying, while higher-reputation users may be tackling harder problems that copying from Stack Overflow cannot solve.



Users who post a question on Stack Overflow can indicate that it has been resolved by accepting the best answer they received. The survey also examined how often accepted and non-accepted answers are copied. Accepted answers accounted for 47.56% of copies and non-accepted answers for 52.44%. Averaged per post, accepted answers were copied about 7 times each and non-accepted answers about 5 times each.

Regarding this result, Gibson pointed out that some questions have no accepted answer in the first place. On Stack Overflow, even if many users vote for an answer and it earns a high score, it remains a non-accepted answer unless the asker accepts it.

In addition, Gibson and colleagues measured the number of copies according to each post's score. Looking at 'Answer' on the left side of the graph below, the number of times answers with scores of 1-5, 6-25, 26-100, and 101-1000 were copied does not differ much. For 'Question' on the right, however, the number of copies from questions with a score of 1-5 is high. It is speculated that this is because answerers copy the contents of the question, reproduce it in their own environment, and then answer.



One possible reason the number of copies differs so little between answers with a score of 1-5 and those with a score of 101-1000 is that most answers posted on Stack Overflow have a score of 1-5. The graph below, which shows how many times an answer with a given score was copied on average, shows that posts with higher scores tend to be copied more often.



Ben Popper, content director at Stack Overflow, argues that reusing ideas someone else has already come up with is not a bad thing. 'This helps us learn, get code running faster, and reduce frustration. Our entire website runs on knowledge reuse, and it is this altruistic mentorship that makes Stack Overflow a strong community.'

On the other hand, he pointed out that users need to learn some fundamentals to keep bugs and security issues from creeping in when copying, and that reusing some code may require a specific license. With that in mind, he said he encourages all users to share in the benefits of what the Stack Overflow community has created.



in Software, Web Service, Posted by log1h_ik