17 Challenges with AI Detection Products
When AI turned into a buzzword earlier this year (2023), there was a lot of discussion about different AI detectors and their effectiveness, including one called GPTZero. Some of these tools claim to be up to 99% accurate, yet detectors have repeatedly flagged human-written text as the product of chatbots when it was not. In June, Turnitin publicly acknowledged that its software has a higher false positive rate than the company originally stated. In July, OpenAI pulled its detection tool, AI Classifier, because of its “low rate of accuracy” (Nelson 2023). False positives can have serious consequences for students, as in the case of a Texas A&M professor who suspected his students had used AI to cheat on their final essays. He pasted the essays into ChatGPT and asked whether it had written them — something ChatGPT cannot reliably determine — and then gave incomplete grades to students in his class. This caused serious problems for graduating seniors, many of whom had in fact not used AI on their assignments.
In addition to producing false positives, many AI detectors are biased against non-native English writers, as discussed by Liang et al. (2023). The book AI for Diversity, by Roger Søraa, examines bias in AI across many dimensions, including gender, age, and sexuality.
Some argue that it will be easy to “catch” students who use AI tools because AI-generated text doesn’t sound human. While that may have been true early on, these language models are improving rapidly, and their output is increasingly difficult to distinguish from human writing. An article in the Chronicle got a lot of attention a few months ago when a student described how many of their peers were using this technology, challenging assumptions about academic integrity policies. Consider how the story begins: “Submit work that reflects your own thinking or face discipline. A year ago, this was just about the most common-sense rule of Earth. Today, it’s laughably naive” (Terry, 2023). Faculty need to assume that at least some students will seek out this technology.
Some faculty may prefer a more hands-on approach to suspected AI-generated work. For example, if they believe a student has used AI to produce an assignment, they might invite that student to a one-on-one conversation and ask the student to explain their paper. In any case, it is especially important that faculty not accuse students outright: doing so erodes trust and can cause students to lose confidence and motivation to complete the course.
So what does this mean for AI detection software at this point? It means faculty can’t rely on detectors. Given all of this, it is even more important to design assignments with AI in mind: by integrating these tools into assignments, faculty can teach students how to use them ethically.