By Doug Ward

Not surprisingly, tools for detecting material written by artificial intelligence have created as much confusion as clarity.

Students at several universities say they have been falsely accused of cheating, with accusations delaying graduation for some. Faculty members, chairs, and administrators have said they aren’t sure how to interpret or use the results of AI detectors.

[Image: Giant white hand pokes through window of a university building as college students with backpacks walk toward it. Credit: Doug Ward, via Bing Image Creator]

I’ve written previously about using these results as information, not an indictment. Turnitin, the company that created the AI detector KU uses on Canvas, has been especially careful to avoid making claims of perfection in its detection tool. Last month, the company’s chief product officer, Annie Chechitelli, added to that caution.

Chechitelli said Turnitin’s AI detector was producing different results in daily use than it had in lab testing. For instance, work that Turnitin flags as 20% AI-written or less is more likely to have false positives. Introductory and concluding sentences are more likely to be flagged incorrectly, Chechitelli said, as is writing that mixes human and AI-created material.

As a result of its findings, Turnitin said it would now require that a document have at least 300 words (up from 150) before the document can be evaluated. It has added an asterisk when 20% or less of a document’s content is flagged, alerting instructors to potential inaccuracies. It is also adjusting the way it interprets sentences at the beginning and end of a document.

Chechitelli also released statistics about results from the Turnitin AI detector, saying that 9.6% of documents had 20% or more of the text flagged as AI-written, and 3.5% had 80% to 100% flagged. That is based on an analysis of 38.5 million documents.

What does this mean?

Chechitelli estimated that the Turnitin AI detector had incorrectly flagged 1% of overall documents and 4% of sentences. Even with that small percentage, that means roughly 385,000 documents (1% of 38.5 million) have been wrongly flagged, and their authors potentially falsely accused of submitting AI-written work.

I don’t know how many writing assignments students at KU submit each semester. Even if each student submitted only one, though, more than 200 could be falsely accused of turning in AI-written work every semester.
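For readers who want to check the arithmetic behind these estimates, here is a minimal sketch. The 38.5 million documents and the 1% false-positive rate come from Turnitin's figures cited above; the function name and the 20,000-submission campus are assumptions for illustration only, not KU data.

```python
def expected_false_flags(documents: int, false_positive_rate: float) -> int:
    """Estimate how many documents a detector would wrongly flag."""
    return round(documents * false_positive_rate)


# Figures reported by Turnitin: 38.5 million documents, about 1% flagged incorrectly.
turnitin_estimate = expected_false_flags(38_500_000, 0.01)

# Hypothetical campus: 20,000 submissions in a semester (an assumed number, not a KU figure).
campus_estimate = expected_false_flags(20_000, 0.01)

print(turnitin_estimate, campus_estimate)
```

The point of the sketch is simply that a small error rate multiplied by a large number of submissions still produces a large number of wrongly flagged documents.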

That’s unfair and unsustainable. It leads to distrust between students and instructors, and between students and the academic system. That sort of distrust often generates or perpetuates a desire to cheat, further eroding academic integrity.

We most certainly want students to complete the work we assign them, and we want them to do so with integrity. We can’t rely on AI detectors – or plagiarism detectors, for that matter – as a shortcut, though. If we want students to complete their work honestly, we must create meaningful assignments – assignments that students see value in and that we, as instructors, see value in. We must talk more about academic integrity and create a sense of belonging in our classes so that students see themselves as part of a community.

I won’t pretend that is easy, especially as more instructors are being asked to teach larger classes and as many students are struggling with mental health issues and finding class engagement difficult. By criminalizing the use of AI, though, we set ourselves up as enforcers rather than instructors. None of us want that.

To move beyond enforcement, we need to accept generative artificial intelligence as a tool that students will use. I’ve been seeing the term co-create used more frequently when referring to the use of large language models for writing, and that seems like an appropriate way to approach AI. AI will soon be built into Word, Google Docs, and other writing software, and companies are releasing new AI-infused tools every day. To help students use those tools effectively and ethically, we must guide them in learning how large language models work, how to create effective prompts, how to critically evaluate the writing of AI systems, how to explain how AI is used in their work, and how to reflect on the process of using AI.

At times, instructors may want students to avoid AI use. That’s understandable. All writers have room to improve, and we want students to grapple with the complexities of writing to improve their thinking and their ability to inform, persuade, and entertain with language. None of that happens if they rely solely on machines to do the work for them. Some students may not want to use AI in their writing, and we should respect that.

We have to find a balance in our classes, though. Banning AI outright serves no one and leads to over-reliance on flawed detection systems. As Sarah Elaine Eaton of the University of Calgary said in a recent forum led by the Chronicle of Higher Education: “Nobody wins in an academic-integrity arms race.”

What now?

We at CTE will continue working on a wide range of materials to help faculty with AI. (If you haven’t, check out a guide on our website: Adapting your course to artificial intelligence.) We are also working with partners in the Bay View Alliance to exchange ideas and materials, and to develop additional ways to help faculty in the fall. We will have discussions about AI at the Teaching Summit in August and follow those up with a hands-on AI session on the afternoon of the Summit. We will also have a working group on AI in the fall.

Realistically, we anticipate that most instructors will move into AI slowly, and we plan to create tutorials to help them learn and adapt. We are all in uncharted territory, and we will need to continue to experiment and share experiences and ideas. Students need to learn to use AI tools as they prepare for jobs and as they engage in democracy. AI is already being used to create and spread disinformation. So even as we grapple with the boundaries of ethical use of AI, we must prepare students to see through the malevolent use of new AI tools.

That will require time and effort, adding complexity to teaching and additional burdens on instructors. No matter your feelings about AI, though, you have to assume that students will move more quickly than you.


Doug Ward is an associate director of the Center for Teaching Excellence and an associate professor of journalism and mass communications.

By Doug Ward

When Turnitin activated its artificial intelligence detector this month, it provided a substantial amount of nuanced guidance.

[Image: Montage of gophers and men trying to hit moles that pop up from the ground at a university quad. Caption: Trying to keep ahead of artificial intelligence is like playing a bizarre game of whack-a-mole.]

The company did a laudable job of explaining the strengths and the weaknesses of its new tool, saying that it would rather be cautious and have its tool miss some questionable material than falsely accuse someone of unethical behavior. It will make mistakes, though, and “that means you’ll have to take our predictions, as you should with the output of any AI-powered feature from any company, with a big grain of salt,” David Adamson, an AI scientist at Turnitin, said in a video. “You, the instructor, have to make the final interpretation.”

Turnitin walks a fine line between reliability and reality. On the one hand, it says its tool was “verified in a controlled lab environment” and renders scores with 98% confidence. On the other hand, it appears to have a margin of error of plus or minus 15 percentage points. So a score of 50 could actually be anywhere from 35 to 65.
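To make that margin of error concrete, here is a small sketch that turns a reported score into the range it could plausibly represent. The 15-point margin is the figure mentioned above; the function name and the choice to clamp the range to 0 through 100 are my own assumptions, not anything Turnitin has documented.

```python
def score_range(score: float, margin: float = 15.0) -> tuple[float, float]:
    """Return the (low, high) range a reported AI score could represent."""
    # Assumption: a percentage score cannot fall below 0 or rise above 100.
    low = max(0.0, score - margin)
    high = min(100.0, score + margin)
    return low, high


print(score_range(50))
print(score_range(10))
```

Seen this way, a low score sits within the margin of error of zero, which is one more reason to treat the number as information rather than proof.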

The tool was also trained on older versions of the language model used in ChatGPT, Bing Chat, and many other AI writers. The company warns users that the tool requires “long-form prose text” and doesn’t work with lists, bullet points, or text of less than a few hundred words. It can also be fooled by a mix of original and AI-produced prose.

There are other potential problems.

A recent study in Computation and Language argues that AI detectors are far more likely to flag the work of non-native English speakers than the work of native speakers. The authors cautioned “against the use of GPT detectors in evaluative or educational settings, particularly when assessing the work of non-native English speakers.”

The Turnitin tool wasn’t tested as part of that study, and the company says it has found no bias against English-language learners in its tool. Seven other AI detectors were included in the study, though, and, clearly, we need to proceed with caution.

So how should instructors use the AI detection tool?

As much as instructors would like to use the detection number as a shortcut, they should not. The tool provides information, not an indictment. The same goes for Turnitin’s plagiarism tool.

So instead of making quick judgments based on the scores from Turnitin’s AI detection tool on Canvas, take a few more steps to gather information. This approach is admittedly more time-consuming than just relying on a score. It is fairer, though.

  • Make comparisons. Does the flagged work differ in style, tone, spelling, flow, complexity, development of argument, or use of sources and citations from the student’s previous work? We often detect potential plagiarism that way. AI-created work often raises suspicion for the same reason.
  • Try another tool. Submit the work to another AI detector and see whether you get similar results. That won’t provide absolute proof, especially if the detectors are trained on the same language model. It will provide additional information, though.
  • Talk with the student. Students don’t see the scores from the AI detection tool, so meet with the student about the work you are questioning and show them the Turnitin data. Explain that the detector suggests the student used AI software to create the written work and point out the flagged elements in the writing. Make sure the student understands why that is a problem. If the work is substantially different from the student’s previous work, point out the key differences.
  • Offer a second chance. The use of AI and AI detectors is so new that instructors should consider giving students a chance to redo the work. If you suspect the original was created with AI, you might offer the resubmission for a reduced grade. If it seems clear that the student did submit AI-generated text and did no original work, give the assignment a zero or a substantial reduction in grade.
  • If all else fails … If you are convinced a student has misused artificial intelligence and has refused to change their behavior, you can file an academic misconduct report. Remember, though, that the Turnitin report has many flaws. You are far better off erring on the side of caution than devoting lots of time and emotional energy to an academic misconduct claim that may not hold up.

No, this doesn’t mean giving up

I am by no means condoning student use of AI tools to avoid the intellectual work of our classes. Rather, the lines of use and misuse of AI are blurry. They may always be. That means we will need to rethink assignments and other assessments, and we must continue to adapt as the AI tools grow more sophisticated. We may need to rethink class, department, and school policy. We will need to determine appropriate use of AI in various disciplines. We also need to find ways to integrate artificial intelligence into our courses so that students learn to use it ethically.

If you haven’t already:

  • Talk with students. Explain why portraying AI-generated work as their own is wrong. Make it clear to students what they gain from doing the work you assign. This is a conversation best had at the beginning of the semester, but it’s worth reinforcing at any point in the class.
  • Revisit your syllabus. If you didn’t include language in your syllabus about the use of AI-generated text, code or images, add it for next semester. If you included a statement but still had problems, consider whether you need to make it clearer for the next class.

Keep in mind that we are at the beginning of a technological shift that may change many aspects of academia and society. We need to continue discussions about the ethical use of AI. Just as important, we need to work at building trust with our students. (More about that in the future.) When students feel part of a community, feel that their professors have their best interests in mind, and feel that the work they are doing has meaning, they are less likely to cheat. That’s why we recommend use of authentic assignments and strategies for creating community in classes.

Detection software will never keep up with the ability of AI tools to avoid detection. It’s like the game of whack-a-mole in the picture above. Relying on detectors does little more than treat the symptoms of a much bigger problem, and over-relying on them turns instructors into enforcers.

The problem is multifaceted, and it involves students’ lack of trust in the educational system, lack of belonging in their classes and at the university, and lack of belief in the intellectual process of education. Until we address those issues, enforcement will continue to detract from teaching and learning. We can’t let that happen.


Doug Ward is associate director of the Center for Teaching Excellence and an associate professor of journalism and mass communications at the University of Kansas.
