The free internet encyclopedia "Wikipedia" is very convenient for quick research, and many people rely on it for both work and hobbies. However, Wikipedia is a product of collective intelligence, and because anyone can easily edit it, vandalism is an inherent problem. The Wikimedia Foundation, which runs Wikipedia, refers to the people who commit such vandalism as "vandals" (destroyers), and it is no exaggeration to say that the history of Wikipedia is the history of the fight against vandals.

Wikipedia was started in 2001 by Jimmy Wales and Larry Sanger, with "Nupedia", an online encyclopedia project written by experts, as its predecessor. People around the world embraced the idea that anyone could take part in editing. Within four years of launch more than 750,000 pages had been created, and Wikipedia eventually entered the mainstream of the internet.

However, as Wikipedia grew more popular and more people joined the editing work, pages with odd content suddenly multiplied. Inappropriate posts, such as using Wikipedia for advertising, became rampant and turned into a headache for the Wikimedia Foundation. This was a direct consequence of Wikipedia's open model, in which anyone can participate. Troubled by the problem, Wales gave a keynote speech in 2006 in which he asked Wikipedians (the nickname for people who edit Wikipedia), "How should we maintain the quality of articles?" As a result, the number of low-value new pages clearly decreased, although the pace at which pages were created also dropped significantly.

Around the same time, clearly malicious edits began to appear, such as intentionally falsifying the contents of a page, filling a page with meaningless words, or replacing a picture with a character image unrelated to the content. Such "vandalism" obviously undermined trust in Wikipedia, and dealing with it became an urgent matter.

In response to this crisis, four Wikipedians developed a troubleshooting bot called "Anti-Vandal Bot" (AVB). AVB monitored recent posts and edits, judged whether they were vandalism according to simple rules, and automatically restored the page to its appropriate pre-vandalism content. AVB handled only vandalism that would be obvious to anyone; in cases where it was unclear whether the content was appropriate, humans made the judgment and corrected the page if necessary. With this combination of AVB and human effort, Wikipedia succeeded in responding to exponentially increasing vandalism.
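The monitor, judge, and restore loop described above can be sketched roughly as follows. This is only an illustration, not AVB's actual code: the two rules (page blanking and long runs of one repeated character), the thresholds, and the function names `is_obvious_vandalism` and `review_edit` are all assumptions made for the example.

```python
import re

# Hypothetical pattern: the same character repeated 10 or more times in a row.
REPEATED_CHAR_RUN = re.compile(r"(.)\1{9,}")


def is_obvious_vandalism(previous: str, current: str) -> bool:
    """Apply two simple, illustrative rules to a single edit."""
    # Rule 1 (assumed): a substantial page was almost entirely blanked.
    if len(previous) >= 200 and len(current) < 0.1 * len(previous):
        return True
    # Rule 2 (assumed): the edit introduced a long run of one repeated character.
    if REPEATED_CHAR_RUN.search(current) and not REPEATED_CHAR_RUN.search(previous):
        return True
    return False


def review_edit(previous: str, current: str) -> str:
    """Return the text the page should show after the bot reviews the edit."""
    if is_obvious_vandalism(previous, current):
        return previous  # restore the pre-vandalism content
    return current  # anything not obviously bad is left for human reviewers
```

Anything the rules do not flag passes through unchanged, which mirrors the division of labor in the text: the bot handles only the obvious cases and leaves the ambiguous ones to people.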

Aaron Halfaker, who works at the Wikimedia Foundation, says that without AVB many Wikipedians would have been overwhelmed by vandals. "Every time I see AVB, I feel that it saved Wikipedia from a defeat that would have saddened a lot of people," he says.

In 2007, Jacob Carter developed "Cluebot" as a natural evolution of AVB. According to Carter, he built Cluebot after seeing that AVB had wrongly reverted many properly edited pages and had left clearly falsified content untouched. Carter was a high school student at the time.

Cluebot uses an algorithm that decides whether an edit is vandalism by encoding signals such as grammatical accuracy, inappropriate expressions, and personal attacks, and scoring them on a point system. Cluebot fixed more than 20,000 vandalized articles in its first two months of operation. It then remained in use for three years, helping to maintain the quality of Wikipedia.
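A point system of this kind might look something like the sketch below. It is not Cluebot's real rule set: the word list, the personal-attack pattern, the point values, and the threshold are hypothetical, chosen only to show how several weak signals can be added up into one decision.

```python
import re

# All values below are hypothetical and exist only for illustration.
BAD_WORDS = {"stupid", "idiot", "sucks"}
ATTACK_PATTERN = re.compile(
    r"\b(you|he|she) (is|are) (a |an )?(stupid|idiot|loser)\b", re.IGNORECASE
)
THRESHOLD = 5


def vandalism_score(added_text: str) -> int:
    """Add up points for each suspicious signal found in the added text."""
    score = 0
    words = re.findall(r"[A-Za-z']+", added_text)
    # Inappropriate expressions: 3 points per listed word.
    score += 3 * sum(word.lower() in BAD_WORDS for word in words)
    # Personal attack phrasing: 4 points.
    if ATTACK_PATTERN.search(added_text):
        score += 4
    # Crude grammar signals: text written entirely in capitals (2 points),
    # or text that starts with a lowercase letter (1 point).
    if len(words) >= 3 and all(word.isupper() for word in words):
        score += 2
    stripped = added_text.strip()
    if stripped and stripped[0].islower():
        score += 1
    return score


def is_vandalism(added_text: str) -> bool:
    """Flag the edit once the total score reaches the threshold."""
    return vandalism_score(added_text) >= THRESHOLD
```

The weakness of fixed points is visible even in this small sketch: someone has to pick every weight and threshold by hand, and a badly chosen value produces exactly the kind of wrong reverts and missed vandalism that motivated Cluebot in the first place.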

In 2010, Carter developed "Cluebot NG", the next-generation version of Cluebot, which incorporates machine learning to improve the accuracy of vandalism detection. The key to machine learning is the accumulation of data. Because the Wikimedia Foundation already had about 60,000 examples of vandalism that had been classified by hand, Cluebot NG was able to take full advantage of machine learning.
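To show why hand-labeled examples matter, here is a minimal sketch of learning from labeled edits. It uses a tiny naive Bayes text classifier as a stand-in; the article does not say which model Cluebot NG uses, so this should not be read as its actual method. The class name, labels, and training sentences are invented for the example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


class NaiveBayesEditClassifier:
    """Illustrative classifier; needs at least one training example per label."""

    def __init__(self) -> None:
        self.word_counts = {"vandalism": Counter(), "good": Counter()}
        self.doc_counts = Counter()

    def train(self, examples: list[tuple[str, str]]) -> None:
        # Each example is (edit text, human-assigned label).
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def predict(self, text: str) -> str:
        vocab = set(self.word_counts["vandalism"]) | set(self.word_counts["good"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = "", -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus log likelihood of each word, with add-one smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled edits standing in for the human-classified data.
examples = [
    ("asdf asdf lol lol", "vandalism"),
    ("this page is dumb lol", "vandalism"),
    ("the city was founded in 1820", "good"),
    ("the river flows through the city", "good"),
]
classifier = NaiveBayesEditClassifier()
classifier.train(examples)
```

Unlike a hand-tuned point system, the weights here come entirely from the labeled examples, so adding more human-classified edits changes the classifier's behavior without anyone rewriting rules.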

Cluebot and Cluebot NG can carry out more than 9,000 edits per minute, and their cumulative corrections exceeded 2 million at the beginning of 2013. However, although Cluebot NG drives out vandals without rest, 24 hours a day, 365 days a year, it cannot judge vandalism with 100% accuracy. The "false positive" problem, in which a good-faith edit is judged to be vandalism, still remains. The last line of defense against false positives is still human effort.

Cluebot NG has great difficulty detecting sophisticated tampering, such as changing a page's content little by little through repeated small edits. Most of the work of finding such subtle tampering and correcting it properly is done by humans. SeaPhoto, a longtime Wikipedian, is well known as a vandalism fighter and says he has made 55,000 corrections so far. "Fighting vandals is fun," SeaPhoto says, "but I want everyone who edits Wikipedia to take a moment to ask whether the edit will hurt someone."

Halfaker acknowledges that a dynamic of "vandals versus the alliance of humans and excellent tools" always exists on Wikipedia, but says that Wikipedia is not just a battle zone. The spirit of Wikipedia is to collaborate with strangers while complex social relationships take shape. To maintain an environment in which new users can take part in editing with peace of mind, Wikipedia will continue its endless fight against vandalism.

