What is the reason why Microsoft 's AI "Tay" got crazy on Twitter due to inappropriate remarks on Twitter?

The artificial intelligence "Tay" developed by Microsoft is an online bot that makes a meaningful reply to the contents spoken to by the user, and it was experimentally released on Twitter · GroupMe · Kik for conversation understanding research. However, due to malicious users Tay taught racial discrimination · sex discrimination · violence expression, repeating inappropriate remarksService stopped about 16 hours after releaseAlthough it became a situation that it was done, why Tay became crazy enough to be driven down the service, served as a security data expert at a financial institution, and a doctorate from George Mason University Department of Computational and Social Sciences Russell Thomas with a title is approaching the cause.

Exploring Possibility Space: Poor Software QA Is Root Cause of TAY-FAIL (Microsoft's AI Twitter Bot)

A lot of media have already reported on Microsoft's Tay's inappropriate remarks, and many media have already reported that "This one shows how these types of AIs work "WiredAnd "Tay is designed to learn from users, so Tay's behavior reflects the user's behavior"TechRepublicThere was also a report that criticizes AI itself as it is. However, Mr. Thomas argues, "It is wrong to accuse AI, especially in this special case."

According to Mr. Thomas, after the release of Tay, the bulletin board 4chan's "/ Pol /On the board called "Tay - New AI from MicrosoftA thread called "It was created. "/ Pol /" is a subtitle "Politically Incorrect", which is a board in which so-called vandalism often occurs. When Tay's thread was settled, residents started to do vandalism against Tay immediately.

The vandalism that started right after the release of Tay "teach the wrong knowledge directly". In the Wired and TechRepublic articles mentioned above, it is said that the act of teaching incorrect knowledge to Tay itself has caused inappropriate remarks, but Mr. Thomas says that vandalism itself is not a direct cause I will. In the image below, Tay is not making inappropriate remarks, although the user is trying to draw Tay's idea on Adolf Hitler. Mr. Thomas says, "From this picture, Tay said that Tay did not make inappropriate remarks during the time immediately after release where vandalism that teaches wrong knowledge occurred."

According to Mr. Thomas, Tay got crazy directly because it was "Repeat after meIt is said that Tay's hidden function of saying. The existence of this function was revealed by the following message posted to Tay's thread set to / pol /. In the message, the usage of "repeat after me", "Tay reply" okay "when said" repeat after me "is said and then tweet sentences that are reply after"POCIt was posted together.

If you exploit "repeat after me" it is easy to expect that Tay will make inappropriate remarks, the existence of that feature will spread within the thread and the user who tests on Twitter as in the following image Frequent occurrence. Mr. Thomas continued the investigation, and the text used in "repeat after me" was added to Tay's natural language processing system, which predicted that similar words and expressions would be reused. In other words, Tay uses the content learned from repeat after me for another conversation.

In the subsequent investigation, Mr. Thomas said, "Tay was" reapeat after me "was used as" Tay did not find any evidence of using the words learned from "repeat after me" in another conversation " I will repeat inappropriate remarks only when it is done. "Once further investigation was made, I also understood that Tay was doing inappropriate remarks without" repeat after me ". Nonetheless, according to Mr. Thomas, Tay's inappropriate utterance in the absence of "repeat after me" is similar to typical sentence construction made by natural language processing, he also said "repeat after Me "is the cause of this problem.

MicrosoftOfficial blogAlthough Tay believes that the reason why Tay began making inappropriate remarks is "There is a serious negligence to a specific attack", Mr. Thomas said "" Serious negligence against specific attacks is a vandalism We did not do enough of the tests against them. " In addition, Microsoft is concerned with vandalismQAFailure to do (quality assurance) was a factor of this time, "If repeat after me" is incorporated by the developer as an action based on software rules, QA will find a problem I will explain that there was nothing or that repeat after me was not deleted, "Tay told the reason why Tay made inappropriate saying" The "repeat after me" function that should have been deleted by QA is obvious There is a high possibility that it is in what has become. " Although it is said that it is not impossible to confuse AI with vandalism itself, it is hard to believe that Tay's case is the cause of vandalism itself.

As Tomas said, "Tay began violent remarks by repeating users' tweets"mediaAlsoMultipleThere is, there seems to be one thing in Mr. Thomas' analysis.

in Note,   Web Application, Posted by darkhorse_log