Sentences written by non-native English speakers are misidentified as 'created by AI'



With the emergence of chat AIs with advanced text generation capabilities such as ChatGPT and Bard, demand is growing for tools that can identify AI-generated text, and Japanese companies have reportedly developed software to detect text written by ChatGPT. However, new research has revealed that this type of tool tends to flag text written by non-native English speakers as 'created by AI.'

GPT detectors are biased against non-native English writers: Patterns
https://www.cell.com/patterns/fulltext/S2666-3899(23)00130-7



Programs to detect AI discriminate against non-native English speakers, shows study | The Guardian
https://www.theguardian.com/technology/2023/jul/10/programs-to-detect-ai-discriminate-against-non-native-english-speakers-shows-study



GPT detectors can be biased against non-native English writers
https://techxplore.com/news/2023-07-gpt-detectors-biased-non-native-english.html

With the advent of AI such as ChatGPT that can generate highly fluent text, there is demand for tools that detect AI-generated writing in order to prevent fraud and other misuse of AI. OpenAI, the developer of ChatGPT, has itself released a tool called 'AI Text Classifier' to identify 'text written with AI.'

OpenAI releases a tool to detect 'text written with AI' - GIGAZINE



Some of these AI text detection tools advertise high accuracy, such as '99% detection accuracy.' In practice, however, they turn out to produce a considerable number of false positives.

A research team led by James Zou, an associate professor in the Department of Biomedical Data Science at Stanford University, asked non-native English speakers to write essays for the TOEFL, a test of English proficiency. The team then ran the 91 resulting essays through seven AI text detection tools to see whether the writing would be judged as AI-generated.

As a result, more than half of the essays were judged to be AI-generated, and some of the tools flagged as many as 98% of the essays as 'created by AI.' By contrast, when compositions written by American eighth graders who are native English speakers were run through the same tools, they were judged to be 'human-written' with a probability of more than 90%.



These AI text detection tools rely on a metric called 'text perplexity': a measure of how 'surprised' or 'confused' a generative language model is when predicting the next word in a sentence. If the model can easily predict each next word, the text receives a low perplexity score; if the next word is hard to predict, the perplexity is high.
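As a rough illustration of how perplexity is computed (not the actual code of any detector), the snippet below scores text with the small open GPT-2 model via the Hugging Face transformers library; the model choice and the example sentences are assumptions made purely for demonstration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Small public model used only for illustration; real detectors
# may use different models and additional signals.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(average negative log-likelihood per token)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # cross-entropy loss over its next-token predictions.
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

simple = "The cat sat on the mat. The cat is big. The cat is nice."
varied = "Perched atop the threadbare rug, the tabby surveyed its modest dominion."
print(f"simple: {perplexity(simple):.1f}")  # predictable wording -> lower perplexity
print(f"varied: {perplexity(varied):.1f}")  # unusual wording -> higher perplexity
```

Predictable, repetitive phrasing generally yields a lower score than unusual word choices, and that is exactly the signal the detectors lean on.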

Large language models (LLMs) such as ChatGPT are trained to generate text with this kind of low perplexity. In other words, even text written by a human will be judged as 'created by AI' if the writer happens to choose predictable, unsurprising words. The research team therefore points out that 'non-native English speakers tend to choose simpler words, so their writing is at higher risk of being identified by these tools as AI-generated.'
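A minimal sketch of how such a perplexity-threshold heuristic could work is shown below; the cutoff value is an arbitrary assumption rather than a figure from any real tool, and it reuses the perplexity() helper from the previous sketch. It simply shows why plain, predictable human writing ends up on the 'AI' side of the line.

```python
# Arbitrary illustrative cutoff -- real detectors use more
# sophisticated statistics than a single fixed threshold.
AI_PERPLEXITY_THRESHOLD = 40.0

def naive_detector(text: str) -> str:
    """Label text as 'AI-generated' when its perplexity is low.

    Reuses perplexity() from the previous sketch. The failure mode
    described in the article follows directly: a human writer who
    sticks to common, predictable words also scores low and is
    misclassified.
    """
    ppl = perplexity(text)
    return "AI-generated" if ppl < AI_PERPLEXITY_THRESHOLD else "human-written"
```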

After identifying this bias in AI text detection tools, the research team asked ChatGPT to rewrite the subjects' essays 'using more sophisticated words.' When the rewritten essays were run through the AI text detection tools again, all of them were judged to be human-written.
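That rewriting step can be reproduced in spirit with the openai Python client; the model name and prompt wording below are assumptions for illustration, not the researchers' exact setup.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rewrite_with_sophisticated_vocabulary(essay: str) -> str:
    """Ask ChatGPT to rewrite an essay with more elaborate word choice."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the study's setup may differ
        messages=[{
            "role": "user",
            "content": "Rewrite the following essay using more sophisticated "
                       "vocabulary while keeping its meaning:\n\n" + essay,
        }],
    )
    return response.choices[0].message.content
```

Feeding the rewritten essay back into the perplexity sketch above would typically show a higher score, which is consistent with the detectors then treating it as human-written.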



Based on these findings, the research team warned, 'Because tools that detect ChatGPT-generated text can seriously affect people whose native language is not English, we need to think carefully about how such tools are used.' The researchers also raised the possibility that search engines such as Google could judge text written by non-native English speakers to be AI-generated and lower the ranking of that content.

The team also points out, 'In the education sector, which is the most important market for AI text detection tools, non-native students are at increased risk of being falsely accused of cheating, which can negatively affect their academic careers and mental health.'

in Software,   Science, Posted by logu_ii