How Deepfake Medical Images Work—and Why They Fool Doctors

AI can now generate synthetic X-rays so realistic that radiologists and other AI systems struggle to tell them apart from authentic scans, raising urgent questions about fraud, patient safety, and digital trust in healthcare.


When AI Learns to Fake an X-Ray

Generative artificial intelligence can write essays, compose music, and produce photorealistic portraits. Now it can also fabricate medical images convincing enough to deceive trained radiologists. A study published in the journal Radiology found that doctors correctly distinguished AI-generated X-rays from real ones only about 75 percent of the time even after being warned that fakes were present; when no warning was given, most of the synthetic scans slipped past them entirely.

The finding has put a spotlight on a fast-emerging threat: deepfake medical images. Understanding how they are made, why they are so hard to detect, and what can be done about them matters for every patient, insurer, and clinician.

How Synthetic Medical Images Are Created

Modern deepfake X-rays rely on two main families of AI. Diffusion models, such as the open-source RoentGen system developed at Stanford Medicine, learn the statistical patterns of thousands of authentic scans and then generate new images by gradually refining random noise into a realistic picture. Multimodal large language models such as OpenAI's GPT-4o can now produce anatomically plausible radiographs from a simple text prompt, for instance "show a chest X-ray with a right-sided pneumothorax."

The result is an image that contains realistic bone density, lung markings, and soft-tissue contrast. Unlike crude photo edits, these synthetic scans are built from the ground up, making traditional manipulation-detection methods largely useless.
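
To make the mechanism concrete, here is a minimal sketch of how such a text-conditioned diffusion pipeline is typically prompted, using the open-source Hugging Face diffusers library. The model identifier is a placeholder rather than the actual RoentGen release, and the snippet illustrates the general workflow only, not the setup used in the study.

    # Minimal sketch: prompting a text-to-image diffusion pipeline.
    # The model ID below is a placeholder, not the actual RoentGen checkpoint.
    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained latent diffusion pipeline (hypothetical chest X-ray model).
    pipe = StableDiffusionPipeline.from_pretrained(
        "example-org/chest-xray-diffusion",  # placeholder model identifier
        torch_dtype=torch.float16,
    ).to("cuda")

    # A plain-language prompt is enough to steer the generation.
    prompt = "chest X-ray with a right-sided pneumothorax"
    image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

    # The output is an ordinary image file with no provenance metadata attached.
    image.save("synthetic_cxr.png")

The point is not the specific library but how low the barrier is: a few lines of code and one sentence of description yield an image that must then be judged on its pixels alone.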

Why Doctors and AI Struggle to Spot Fakes

The Mount Sinai–led study, conducted across 12 research centres with 17 radiologists, tested detection in phases. When participants did not know fakes were included, only 41 percent spontaneously flagged the AI-generated images. Even after being told synthetic scans were present, average accuracy reached just 75 percent. Individual performance ranged from 58 to 92 percent, and years of experience made no statistical difference.

AI fared no better. Four leading multimodal models—GPT-4o, GPT-5, Gemini 2.5 Pro, and Llama 4 Maverick—scored between 57 and 85 percent, according to Nature's reporting on the study. The fakes look "too perfect," researchers noted: bones appear overly smooth, spines unnaturally straight, and fracture lines suspiciously clean—subtleties easy to miss in a busy reading room.

Real-World Risks

The consequences extend well beyond an academic exercise:

  • Insurance fraud — A fabricated fracture or tumour image could support a bogus injury claim worth thousands.
  • Diagnosis tampering — If a hospital's picture archiving system is breached, an attacker could insert or alter images, potentially leading to unnecessary surgery or withheld treatment.
  • Scientific manipulation — Fake scans could contaminate research datasets or be submitted as evidence in clinical trials and legal proceedings.

"The ability to generate highly realistic medical images with minimal effort introduces new vulnerabilities in healthcare systems that were never designed to question the authenticity of a radiograph," the study's authors wrote.

How the Field Is Fighting Back

Researchers and standards bodies are pursuing several defensive layers:

  • Invisible watermarks — Digital markers embedded at the moment of image capture tie a scan to a specific machine and technologist, making post-hoc fabrication detectable.
  • Cryptographic signatures — Hash-based verification ensures that any pixel-level alteration breaks the signature chain, alerting systems to tampering (a minimal sketch of the idea follows this list).
  • AI-based detectors — Purpose-built neural networks such as the Dual-Stage Knowledge Infusing Detector (DSKI) and EfficientNetV2 architectures have achieved detection accuracy above 90 percent in controlled tests, outperforming both human readers and general-purpose AI.
  • Blockchain audit trails — Decentralised ledgers can log every access and modification event for a medical image, creating an immutable chain of custody.
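
As an illustration of the signature idea, here is a minimal sketch of hash-based tamper checking using Python's standard hmac and hashlib modules. The shared secret key and file names are simplifying assumptions for this example; real deployments would use asymmetric signatures provisioned on the imaging device and managed by the archive, not a key embedded in a script.

    # Minimal sketch: hash-based tamper detection for an image file.
    # The shared secret below is a hypothetical stand-in; production systems
    # would use asymmetric, device-bound keys rather than an HMAC in a script.
    import hmac
    import hashlib

    SECRET_KEY = b"device-provisioned-secret"  # hypothetical key material

    def sign_image(path: str) -> str:
        """Compute an HMAC-SHA256 tag over the raw image bytes at capture time."""
        with open(path, "rb") as f:
            return hmac.new(SECRET_KEY, f.read(), hashlib.sha256).hexdigest()

    def verify_image(path: str, expected_tag: str) -> bool:
        """Recompute the tag; any pixel-level change breaks the match."""
        return hmac.compare_digest(sign_image(path), expected_tag)

    # The tag is stored alongside the scan when it is acquired...
    tag = sign_image("chest_xray.dcm")
    # ...and checked whenever the image is read back from the archive.
    print("authentic" if verify_image("chest_xray.dcm", tag) else "tampered")

Because a single altered pixel changes the hash, the check fails loudly instead of relying on a reader to notice a too-smooth bone or a too-straight spine.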

What Comes Next

None of these safeguards are yet standard in clinical practice. Most hospital imaging systems still store files without cryptographic provenance, and regulatory frameworks have not caught up with generative AI's capabilities. The gap between what AI can fabricate and what institutions can verify is widening—and closing it will require coordinated action from technology vendors, hospital networks, and regulators alike.

For now, the lesson is straightforward: in an era when seeing is no longer believing, medical imaging needs the digital equivalent of a tamper-proof seal.
