Jun 11, 2026 07:00:00

Research suggests that signs of a person being drawn into conspiracy theories appear years in advance, with machine learning models picking up characteristic patterns of language use.

A study has revealed that users involved in conspiracy theory communities use distinctive language even when not discussing conspiracy theories, and that these tendencies were present even before their involvement in the community.

[2506.05086] Among Us: Language of Conspiracy Theorists on Mainstream Reddit

https://arxiv.org/abs/2506.05086

Distinctive language reveals likely conspiracy-community users across 500 million Reddit comments
https://phys.org/news/2026-06-distinctive-language-reveals-conspiracy-community.html

When posting movie reviews on social media, commenting on food photos, or asking questions about science news, people appear to use completely different language depending on the topic. For movies, they talk about the actors and story; for food, the taste and recipe; for science, the mechanisms and evidence. In communities not directly related to conspiracy theories, a user's interest in conspiracy theories is therefore less apparent.

However, even when the topic changes, linguistic habits such as expressions that convey anxiety, expressions that contain anger, and the frequency of mentioning conflict, illness, or death may remain. A research team from Politecnico di Milano and other institutions investigated whether such linguistic clues remain even in situations where conspiracy theories themselves are not being discussed.

The research team focused on Reddit, a message board-style social networking service originating in the United States. Reddit has themed communities called 'subreddits,' and in addition to 'r/conspiracy,' which deals with conspiracy theories, there are large communities dealing with news, movies, music, cooking, science, jokes, photography, and more. The research team collected approximately 510 million comments from 22 major communities, mainly focusing on activity from 2013 to 2023. For r/conspiracy, they collected available comments from its launch on January 25, 2008, to December 31, 2023, analyzing approximately 25.7 million comments and over 980,000 users.

The study first defined users who had commented at least once on r/conspiracy as 'users involved in conspiracy theories,' and then examined comments they had written in general communities such as movies, cooking, science, and news, comparing them to comments from users who were not participating in r/conspiracy.
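The labeling rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as the article states it, not the researchers' code: the `(user, subreddit, text)` tuple layout and the function name `split_users` are assumptions made for the example.

```python
from collections import defaultdict

def split_users(comments, target="conspiracy"):
    """Label users by whether they ever commented in the target subreddit.

    `comments` is a list of (user, subreddit, text) tuples (hypothetical layout).
    Returns the comments each user wrote outside the target subreddit, plus a
    True/False label per user. Users with no outside comments are dropped,
    since there is nothing to compare for them.
    """
    # A single comment in the target subreddit is enough to count as involved.
    involved = {user for user, sub, _ in comments if sub == target}

    # Only comments written elsewhere are kept for the language comparison.
    mainstream = defaultdict(list)
    for user, sub, text in comments:
        if sub != target:
            mainstream[user].append((sub, text))

    labeled = {user: (user in involved) for user in mainstream}
    return mainstream, labeled


comments = [
    ("a", "conspiracy", "x"),
    ("a", "movies", "great film"),
    ("b", "movies", "nice"),
    ("c", "conspiracy", "y"),
]
mainstream, labeled = split_users(comments)
```

In this toy input, user "a" is labeled as involved and "b" as not involved, while "c" is left out because "c" never commented outside r/conspiracy.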

The research team extracted 115 linguistic features from each comment, and averaged 110 features (excluding 5 related to punctuation) on a per-user basis. Furthermore, for each community, they prepared an equal number of users involved in conspiracy theories and users who were not, and evaluated how well the classifier could distinguish between the two groups using training and test data.
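The evaluation steps above (per-user averaging, equal-sized groups, and a train/test split) can be sketched as follows. The article does not name the classifier the team used, so a simple nearest-centroid rule stands in for it here; the function names, the 25% test fraction, and the toy two-feature vectors are all assumptions for illustration.

```python
import random
from statistics import mean

def user_average(feature_rows):
    """Average per-comment feature vectors into one vector per user."""
    return [mean(col) for col in zip(*feature_rows)]

def balance(involved, others, rng):
    """Downsample the larger group so both groups have the same size."""
    n = min(len(involved), len(others))
    return rng.sample(involved, n), rng.sample(others, n)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [mean(col) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def evaluate(involved, others, test_fraction=0.25, seed=0):
    """Accuracy of a nearest-centroid stand-in classifier on held-out users."""
    rng = random.Random(seed)
    involved, others = balance(involved, others, rng)
    cut = max(1, int(len(involved) * test_fraction))
    # Hold out the first `cut` users of each group; train on the rest.
    test = [(v, True) for v in involved[:cut]] + [(v, False) for v in others[:cut]]
    c_inv = centroid(involved[cut:])
    c_oth = centroid(others[cut:])
    correct = sum(
        (sq_dist(v, c_inv) < sq_dist(v, c_oth)) == label for v, label in test
    )
    return correct / len(test)


# Toy data: two perfectly separated groups of per-user averages.
avg = user_average([[1.0, 3.0], [3.0, 5.0]])
accuracy = evaluate([[1.0, 1.0]] * 8, [[0.0, 0.0]] * 12)
```

Because the two groups are balanced before the split, a classifier that guesses at random would score about 50%, which is the baseline against which any reported accuracy should be read.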

Using only the language of comments posted in general-interest communities, the machine learning model distinguished users involved in conspiracy theories from those who were not with an average accuracy of 87%. Because each comparison used equal-sized groups, random guessing would score about 50%. Features that helped the classification included anger, anxiety, conflict, references to illness and death, and aggressive or emotional language.

An analysis over time revealed that these characteristics were also present in comments made years before users became involved in conspiracy theories. The research team observed a slight increase in classification accuracy as users approached the point of their involvement, but explained that, overall, the two groups could still be distinguished even from comments made in the early period well before joining. The team suggests that the results are not explained solely by a change in language use after joining a conspiracy theory community, and that individuals with pre-existing linguistic and psychological tendencies may gravitate toward such communities.

On the other hand, the research team emphasizes that the findings should not be used for individual surveillance or preemptive exclusion. The model is intended for group-level analysis and is not a tool for determining an individual's thoughts or psychological state. The team warns that misjudgments could lead to unfair surveillance and labeling, and that using it without transparency, consent, and objection mechanisms could cause serious harm.

Jun 11, 2026 07:00:00 in Science, Posted by log1d_ts