
Developing AI that doctors can trust

AI helps clinicians quickly find the information they need: it filters hundreds of reports and documents, automates processes, and prevents details from being overlooked. But traditional AI isn't safe enough for high-stakes tasks such as diagnosis. Here, explainability is crucial. Unlike non-explainable models, which operate as black boxes, explainable AI follows clinically acceptable reasoning and reveals how it arrives at a decision, so doctors can judge whether its output is trustworthy.

Image: University of Twente researcher analysing a hip fracture on an X-ray with AI support (photo: Adobe Stock)

The risks of AI

“AI makes mistakes, and we want to prevent that. My research group is building explainable AI: AI that shows how it comes to certain conclusions,” says Adjunct Professor Maurice van Keulen, who teaches in the Bachelor’s and Master’s programmes in Computer Science, among others, at the University of Twente. He has also developed the master’s course Data Science, which is open to all students across the university.

After his team developed an AI system to detect hip fractures on X-rays, they ran into a fundamental problem. The academic literature claimed a detection accuracy of 95%, and their independent study with data from the ZGT hospital found 93%. But when they manipulated the images by removing the fractures and ran them through the model again, the AI still flagged 25% of the altered images as showing a fracture, even though there was none.
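What the team did amounts to a counterfactual check: re-running the classifier on images from which the fracture has been edited out and counting how often it still predicts a fracture. The sketch below only illustrates that idea and is not the team's actual pipeline; the classifier and all names (predict_fracture, altered_images, shortcut_rate) are hypothetical.

from typing import Callable, Sequence
import numpy as np

def shortcut_rate(
    predict_fracture: Callable[[np.ndarray], bool],  # hypothetical classifier: X-ray -> fracture yes/no
    altered_images: Sequence[np.ndarray],            # X-rays with the fracture digitally removed
) -> float:
    """Fraction of fracture-free (edited) images still flagged as fractured.

    A high rate suggests the model relies on cues other than the fracture itself,
    for example skin texture or scanner context, rather than anatomical evidence.
    """
    flagged = sum(predict_fracture(image) for image in altered_images)
    return flagged / max(len(altered_images), 1)

# The article reports roughly 0.25 for the hip-fracture model:
# a quarter of fracture-free images were still labelled "fracture".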

It took a radiologist in their team to figure out what the model had done. “It used visual cues such as wrinkled skin. Over 90% of X-rays of elderly people show fractures. So, the model found a pattern: wrinkled skin was associated with a fracture. This is called shortcut learning, and it is a big risk in medical AI. Models make decisions that appear accurate but reason in a non-medical way,” van Keulen says.

Necessity of AI explainability in healthcare

Traditional AI operates like a black box. Van Keulen explains: “Take ChatGPT: you provide input and it generates an answer, but you don’t know how it came up with that output. When detecting fractures, a simple yes-or-no output isn’t enough. Without understanding the model’s reasoning, clinicians can’t verify the output and therefore shouldn’t trust it.”

To enable inspection of an AI model’s reasoning, his team proposed a new method for building explainable-by-design AI, called PIP-Net. They saw promising results when they used PIP-Net to develop an AI model for ankle fracture detection. The explainable model appeared to reason in a way similar to the Weber classification system, a standard framework that radiologists use to categorise ankle fractures. Remarkably, it had learned to detect those patterns on its own, purely by analysing examples.
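The key property of such an explainable-by-design model is that its prediction is built from human-inspectable pieces of evidence: the model detects learned image parts (“prototypes”) and combines their presence scores into a class score. The snippet below is a conceptual sketch of that decomposition only, not the actual PIP-Net implementation; the prototype names and all numbers are invented for illustration.

import numpy as np

# Illustrative prototypes a fracture model might learn (names and values are hypothetical).
prototype_names = [
    "fibula_fracture_line_below_syndesmosis",
    "fibula_fracture_line_above_syndesmosis",
    "widened_ankle_joint_space",
]
presence = np.array([0.9, 0.0, 0.3])          # how strongly each prototype fires on one X-ray
fracture_weights = np.array([2.0, 1.8, 1.1])  # learned evidence weights for the "fracture" class

# The class score decomposes into per-prototype contributions a clinician can inspect.
contributions = presence * fracture_weights
for name, contribution in zip(prototype_names, contributions):
    print(f"{name}: +{contribution:.2f}")
print(f"total fracture evidence: {contributions.sum():.2f}")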

But the explainability of the model also exposed that it had picked up other kinds of cues. It had associated visual elements, such as parts of an emergency-room bed visible in the X-ray, with the presence of a fracture. Since patients in emergency settings are more likely to be injured, the model was using contextual cues rather than anatomical evidence to guide its predictions. “Thanks to explainability, we could intervene and refine the model so that it no longer relied on such arguments,” van Keulen says.
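How exactly the team refined the model is not described here, but an evidence decomposition like the one above makes one simple kind of intervention possible: once a prototype is identified as a contextual shortcut, its contribution to the class score can be suppressed. The snippet below is a hedged illustration of that idea with made-up names and numbers, not the team's actual procedure.

import numpy as np

prototype_names = ["fibula_fracture_line", "emergency_bed_rail"]
presence = np.array([0.9, 0.8])  # activations on one X-ray (hypothetical)
weights = np.array([2.0, 1.2])   # learned evidence weights (hypothetical)

print(f"fracture score with the contextual cue: {float(presence @ weights):.2f}")
# Suppress the prototype that encodes context rather than anatomy.
weights[prototype_names.index("emergency_bed_rail")] = 0.0
print(f"fracture score after suppressing it: {float(presence @ weights):.2f}")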

AI literacy in healthcare

“If we hadn’t had a radiologist in our team, we wouldn’t have discovered that the AI used wrinkled skin to predict hip fractures. That’s why it is crucial that doctors are part of the AI development process. They need to sufficiently understand how those systems work: not just to use them responsibly but also to help develop them. Essentially, it is the doctors who will rely on those AI systems, so we need to design them together.”
