By Doug Ward

When Turnitin activated its artificial intelligence detector this month, it provided a substantial amount of nuanced guidance.

Trying to keep ahead of artificial intelligence is like playing a bizarre game of whack-a-mole.

The company did a laudable job of explaining the strengths and weaknesses of its new tool, saying that it would rather be cautious and have its tool miss some questionable material than falsely accuse someone of unethical behavior. It will make mistakes, though, and “that means you’ll have to take our predictions, as you should with the output of any AI-powered feature from any company, with a big grain of salt,” David Adamson, an AI scientist at Turnitin, said in a video. “You, the instructor, have to make the final interpretation.”

Turnitin walks a fine line between reliability and reality. On the one hand, it says its tool was “verified in a controlled lab environment” and renders scores with 98% confidence. On the other hand, it appears to have a margin of error of plus or minus 15 percentage points. So a score of 50 could actually be anywhere from 35 to 65.
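The arithmetic behind that caveat can be sketched in a few lines. This is an illustration only, not Turnitin's actual method; the function name and the clamping to a 0–100 scale are assumptions for the sketch:

```python
# Illustrative sketch (not Turnitin's method): apply a +/-15-point
# margin of error to a reported detector score, clamped to 0-100.
def score_interval(score: float, margin: float = 15.0) -> tuple[float, float]:
    """Return the (low, high) range a reported score could plausibly fall in."""
    low = max(0.0, score - margin)
    high = min(100.0, score + margin)
    return low, high

# A reported score of 50 could reflect anything from 35 to 65.
print(score_interval(50))  # (35.0, 65.0)
```

The point of the sketch is simply that a mid-range score is far less informative than it looks: the true value could sit 15 points in either direction.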

The tool was also trained on older versions of the language model used in ChatGPT, Bing Chat, and many other AI writers. The company warns users that the tool requires “long-form prose text” and doesn’t work with lists, bullet points, or text of less than a few hundred words. It can also be fooled by a mix of original and AI-produced prose.

There are other potential problems.

A recent study in Computation and Language argues that AI detectors are far more likely to flag the work of non-native English speakers than the work of native speakers. The authors cautioned “against the use of GPT detectors in evaluative or educational settings, particularly when assessing the work of non-native English speakers.”

The Turnitin tool wasn’t tested as part of that study, and the company says it has found no bias against English-language learners in its tool. Seven other AI detectors were included in the study, though, and, clearly, we need to proceed with caution.

So how should instructors use the AI detection tool?

As much as instructors would like to use the detection number as a shortcut, they should not. The tool provides information, not an indictment. The same goes for Turnitin’s plagiarism tool.

So instead of making quick judgments based on the scores from Turnitin’s AI detection tool on Canvas, take a few more steps to gather information. This approach is admittedly more time-consuming than just relying on a score. It is fairer, though.

  • Make comparisons. Does the flagged work differ in style, tone, spelling, flow, complexity, development of argument, or use of sources and citations from the student’s previous work? We often detect potential plagiarism that way. AI-created work often raises suspicion for the same reason.
  • Try another tool. Submit the work to another AI detector and see whether you get similar results. That won’t provide absolute proof, especially if the detectors are trained on the same language model. It will provide additional information, though.
  • Talk with the student. Students don’t see the scores from the AI detection tool, so meet with the student about the work you are questioning and show them the Turnitin data. Explain that the detector suggests the student used AI software to create the written work and point out the flagged elements in the writing. Make sure the student understands why that is a problem. If the work is substantially different from the student’s previous work, point out the key differences.
  • Offer a second chance. The use of AI and AI detectors is so new that instructors should consider giving students a chance to redo the work. If you suspect the original was created with AI, you might accept the resubmission for a reduced grade. If it seems clear that the student submitted AI-generated text and did no original work, give the assignment a zero or a substantial reduction in grade.
  • If all else fails … If you are convinced a student has misused artificial intelligence and has refused to change their behavior, you can file an academic misconduct report. Remember, though, that the Turnitin report has many flaws. You are far better off erring on the side of caution than devoting lots of time and emotional energy to an academic misconduct claim that may not hold up.

No, this doesn’t mean giving up

I am by no means condoning student use of AI tools to avoid the intellectual work of our classes. Rather, the lines of use and misuse of AI are blurry. They may always be. That means we will need to rethink assignments and other assessments, and we must continue to adapt as the AI tools grow more sophisticated. We may need to rethink class, department, and school policy. We will need to determine appropriate use of AI in various disciplines. We also need to find ways to integrate artificial intelligence into our courses so that students learn to use it ethically.

If you haven’t already:

  • Talk with students. Explain why portraying AI-generated work as their own is wrong. Make it clear to students what they gain from doing the work you assign. This is a conversation best had at the beginning of the semester, but it’s worth reinforcing at any point in the class.
  • Revisit your syllabus. If you didn’t include language in your syllabus about the use of AI-generated text, code or images, add it for next semester. If you included a statement but still had problems, consider whether you need to make it clearer for the next class.

Keep in mind that we are at the beginning of a technological shift that may change many aspects of academia and society. We need to continue discussions about the ethical use of AI. Just as important, we need to work at building trust with our students. (More about that in the future.)  When they feel part of a community, feel that their professors have their best interests in mind, and feel that the work they are doing has meaning, they are less likely to cheat. That’s why we recommend use of authentic assignments and strategies for creating community in classes.

Detection software will never keep up with the ability of AI tools to avoid detection. It’s like the game of whack-a-mole in the picture above. Relying on detectors does little more than treat the symptoms of a much bigger problem, and over-relying on them turns instructors into enforcers.

The problem is multifaceted, and it involves students’ lack of trust in the educational system, lack of belonging in their classes and at the university, and lack of belief in the intellectual process of education. Until we address those issues, enforcement will continue to detract from teaching and learning. We can’t let that happen.


Doug Ward is associate director of the Center for Teaching Excellence and an associate professor of journalism and mass communications at the University of Kansas.
