Survey results show that about 20% of Twitter's active accounts are spam / fake accounts

SparkToro, a website analysis tool, and Followerwonk, an analysis tool for Twitter, have jointly analyzed more than 40,000 public Twitter accounts and released a report summarizing the results. According to this survey, 19.42% of the accounts confirmed to be active on Twitter are spam / fake accounts.

SparkToro & Followerwonk Joint Twitter Analysis: 19.42% of Active Accounts Are Fake or Spam --SparkToro

Elon Musk, who is in the process of acquiring Twitter, said on May 13, 2022 that he would put the acquisition temporarily on hold until he receives data supporting Twitter's estimate that 'spam and fake accounts are less than 5% of users'.

In response, multiple media outlets and stakeholders have been researching Twitter using Fake Followers, SparkToro's free Twitter analysis tool. However, SparkToro wrote, 'Fake Followers is just a tool to support informal free surveys, and we provide another survey tool for businesses,' and expressed concern that results from analyses using Fake Followers may be inadequate.

So SparkToro, in collaboration with Followerwonk, analyzed the 'percentage of spam and fake accounts' on Twitter. In this survey, a 'spam / fake account' is defined as an account for which no human is confirmed to be regularly composing tweet content, consuming activity on the timeline (checking other people's tweets, etc.), or otherwise engaging in the Twitter ecosystem.

According to SparkToro, many fake accounts are benign and problem-free. Specific examples of fake accounts include @newsycombinator, which automatically posts a website's posts to Twitter, and @_restaurant_bot, which tweets about arbitrary restaurants found on Google Maps. SparkToro points out that there are many such auto-posting accounts on Twitter, and that because they are meaningful to users, they involve 'no malicious intent or problems'.

Spam accounts, on the other hand, are described as being used for advertising, spreading disinformation, phishing scams, malware distribution, stock and virtual currency manipulation, harassment and intimidation of users, and so on.

In addition, the definition of spam and fake accounts in this survey leans toward a relatively conservative interpretation: 'accounts that are irregularly operated by multiple people, like corporate accounts' and 'semi-automatically operated accounts' do not seem to be classified as fake accounts.

The following is a graph summarizing the results of this survey. Light blue is the spam / fake account rate among active accounts that have tweeted at least once in the last 90 days, red is the spam / fake account rate across all Twitter accounts, and green is the 'Twitter spam / fake account rate' stated in the data released by Twitter in the fourth quarter of 2021.

'Followerwonk Sample', at the left end of the above graph, is an analysis of 44,058 accounts randomly selected from the 130 million accounts that, out of the 1,047 million Twitter accounts indexed by Followerwonk, satisfy the conditions of being a 'public account' and an 'active account (an account that posted one or more tweets in the last 90 days)'. SparkToro explains that it emphasizes active accounts because the number of active accounts is close to mDAU (monetizable daily active usage), which Twitter uses as an index for monetization. As a result of the analysis, 8,555 (19.42%) of the active accounts were classified as spam / fake accounts. SparkToro writes, 'This data is the best data showing the spam / fake account rate of active Twitter users.'
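The 19.42% figure follows directly from the two counts quoted above. The short Python sketch below reproduces that arithmetic and, as an addition that is not part of the SparkToro report, estimates a 95% margin of error using the normal approximation for a proportion. That estimate assumes the 44,058 accounts are a simple random sample and covers sampling error only, not any misclassification by the identification model.

```python
import math

# Figures quoted in the article
sample_size = 44_058  # randomly selected active accounts
flagged = 8_555       # accounts classified as spam / fake

# Share of the sample classified as spam / fake
rate = flagged / sample_size

# Assumption (not from the report): normal-approximation 95% margin
# of error for a proportion estimated from a simple random sample.
standard_error = math.sqrt(rate * (1 - rate) / sample_size)
margin = 1.96 * standard_error

print(f"spam / fake rate: {rate:.2%}")
print(f"95% margin of error (sampling only): +/-{margin:.2%}")
```

With a sample this large, the sampling error is well under one percentage point, so the uncertainty in the result depends far more on how accurately accounts are classified than on the size of the sample.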

'SparkToro Fake Followers', second from the left in the graph, is an analysis of the spam / fake account rate of 501,532 Twitter accounts (regardless of whether they are active) collected by SparkToro's Fake Followers tool. 'This data is an analysis of the largest dataset of accounts available on Twitter, but it does not consider whether the accounts tweeted in the last 90 days,' SparkToro wrote. According to this data, the spam / fake account rate of Twitter accounts is 22.35%.

The third item from the left in the graph, 'Followers of @Twitter,' is an analysis of the 93.4 million followers of Elon Musk's Twitter account. According to this, 16.0% of the followers of Musk's Twitter account were spam / fake accounts.

The fourth item from the left in the graph, '@ElonMusk Active Followers,' is an analysis of the spam / fake account rate among the active accounts (26.8 million) that follow Musk's Twitter account. The resulting spam / fake account rate is 23.42%.

The fifth item from the left in the graph is the result of randomly extracting 100 users who follow Musk's Twitter account and analyzing their spam / fake account rate. The spam / fake account rate in this case was a striking 70.23%.

SparkToro gives three reasons why the spam / fake account rate exceeded 70% when Musk's Twitter followers were randomly extracted: 'Popular accounts tend to be followed by spam / fake accounts more than other accounts,' 'Accounts related to media coverage and public interest tend to be followed by spam and fake accounts more than other accounts,' and 'Mr. Musk's account is one of Twitter's recommended accounts for new users, and such accounts tend to attract more follows from spam and fake accounts.'

SparkToro's spam account identification model was trained by a machine learning process on a dataset of 85,000 accounts, consisting of 35,000 spam accounts that the company purchased from vendors in July 2018 and 50,000 non-spam accounts. According to this model, spam accounts seem to have features such as 'no profile image is set', 'abnormally few followers', 'few tweets', and 'an account name constructed from specific keywords and patterns'. In addition, SparkToro says the model can correctly identify spam accounts with an accuracy of 65% or more, and that while it may mistakenly identify spam / fake accounts as normal accounts, it rarely misidentifies normal accounts as spam / fake accounts.
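To make the listed features concrete, the following is a minimal Python sketch that counts how many of the four signals an account shows. It is a simple rule-based illustration, not SparkToro's trained model: the `Account` fields, the thresholds, the name pattern, and the `min_signals` cutoff are all hypothetical values chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical record; field names are assumptions for illustration.
    screen_name: str
    has_profile_image: bool
    followers: int
    tweets: int


# Hypothetical thresholds and pattern, not taken from SparkToro.
MIN_FOLLOWERS = 10
MIN_TWEETS = 5
SUSPICIOUS_NAME = re.compile(r"[a-z]+\d{6,}", re.IGNORECASE)


def spam_signals(account: Account) -> list[str]:
    """Return which of the four features named in the article are present."""
    signals = []
    if not account.has_profile_image:
        signals.append("no profile image")
    if account.followers < MIN_FOLLOWERS:
        signals.append("abnormally few followers")
    if account.tweets < MIN_TWEETS:
        signals.append("few tweets")
    if SUSPICIOUS_NAME.fullmatch(account.screen_name):
        signals.append("name matches a keyword / pattern")
    return signals


def looks_like_spam(account: Account, min_signals: int = 3) -> bool:
    """Flag an account only when several signals agree (assumed cutoff)."""
    return len(spam_signals(account)) >= min_signals


suspect = Account("user12345678", has_profile_image=False, followers=2, tweets=0)
regular = Account("jane_doe", has_profile_image=True, followers=350, tweets=1200)
print(looks_like_spam(suspect), spam_signals(suspect))
print(looks_like_spam(regular), spam_signals(regular))
```

Requiring several signals before flagging an account mirrors the conservative behavior described above, where missing some spam accounts is preferred over labeling normal accounts as spam.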

in Web Service, Posted by logu_ii