How AI Medical Scribes Work—and Why They Spark Debate

AI medical scribes use ambient listening and large language models to auto-generate clinical notes from doctor-patient conversations, saving physicians hours of paperwork—but raising concerns about accuracy, upcoding, and rising healthcare costs.

Redakcia · 4 min read
The Documentation Crisis AI Scribes Aim to Solve

Physicians spend roughly two hours on paperwork for every hour of patient care. Electronic health records, billing codes, and clinical notes have turned doctors into typists—fueling burnout and eroding the quality of face-to-face medicine. AI medical scribes promise to reverse that trend by listening to appointments and writing notes automatically.

The technology has spread rapidly across hospitals and clinics, with major health systems deploying products from companies like Nuance (DAX Copilot), Abridge, and DeepScribe. But as adoption accelerates, a fierce debate has erupted over whether these tools are inflating healthcare costs—and whether the notes they produce are safe enough to trust.

How the Technology Works

An AI scribe runs as a background application during a clinical encounter—on a phone, tablet, or desktop. Using ambient listening, the system captures the natural conversation between doctor and patient without requiring anyone to press "record" or dictate into a microphone.

The audio passes through an automatic speech recognition (ASR) engine that converts speech to text. A large language model then analyzes the transcript, identifies clinically relevant information—symptoms, diagnoses, medications, exam findings—and organizes it into a structured medical note that follows standard formats like SOAP (Subjective, Objective, Assessment, Plan).
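The structuring step can be pictured with a toy sketch. The `SoapNote` type and the keyword-routing in `structure_note()` below are illustrative assumptions standing in for the LLM's clinical extraction, not any vendor's actual pipeline:

```python
# Illustrative sketch of the scribe pipeline described above. The keyword
# router below is a hypothetical stand-in for the LLM extraction step.
from dataclasses import dataclass, field


@dataclass
class SoapNote:
    subjective: list[str] = field(default_factory=list)  # patient-reported symptoms
    objective: list[str] = field(default_factory=list)   # exam findings, vitals
    assessment: list[str] = field(default_factory=list)  # diagnoses
    plan: list[str] = field(default_factory=list)        # treatment, follow-up


# Toy cue words; a real system infers sections from context, not keywords.
SECTION_CUES = {
    "subjective": ["reports", "complains of", "feels"],
    "objective": ["exam shows", "bp", "temp"],
    "assessment": ["diagnosis", "consistent with"],
    "plan": ["prescribe", "follow up", "refer"],
}


def structure_note(transcript_lines: list[str]) -> SoapNote:
    """Route each ASR transcript line to a SOAP section by keyword cue."""
    note = SoapNote()
    for line in transcript_lines:
        lower = line.lower()
        for section, cues in SECTION_CUES.items():
            if any(cue in lower for cue in cues):
                getattr(note, section).append(line)
                break
    return note
```

The real extraction step is the hard part: the model must distinguish clinically relevant statements from small talk and attribute each one to the right speaker.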

The draft note appears in the physician's electronic health record within minutes. The doctor reviews, edits if necessary, and signs off. According to the American Medical Association, AI scribes can save physicians up to an hour of documentation time per day, while cutting after-hours charting by 50 percent.

The Burnout Payoff

The benefits are well documented. A large-scale deployment tracked by the AMA found that generative AI scribes saved physicians an estimated 15,791 hours of documentation—equivalent to nearly 1,800 eight-hour workdays. Surveys show 84 percent of physicians reported improved communication with patients, while 82 percent said their overall work satisfaction increased.

Without screens and typing competing for attention, doctors maintain eye contact and listen more actively. For a profession where burnout rates exceed 50 percent, that shift matters.

The Upcoding Problem

The controversy centers on billing. AI scribes capture clinical detail more thoroughly than a rushed physician scribbling notes between appointments. That thoroughness translates into higher-complexity billing codes—and higher reimbursement. One health system reported a five-percent increase in the most expensive visit codes after deploying AI scribes, raising revenue by over $1,000 per provider per month.

A 2026 payer study estimated that AI-driven coding practices have added $2.3 billion in healthcare spending—$663 million in inpatient costs and at least $1.67 billion in outpatient costs. Crucially, the higher-acuity coding was not associated with increased clinical interventions, suggesting the notes capture complexity that was always present but previously underdocumented—or, depending on perspective, that the AI systematically inflates billing.

Insurers have responded with automated "downcoding" algorithms that reduce reimbursements, creating what an npj Digital Medicine policy brief calls a "coding arms race" between providers and payers.

Accuracy and Safety Risks

Reported error rates for AI scribes hover around one to three percent, but in medicine, even rare mistakes carry serious consequences. Research published in npj Digital Medicine identified several failure modes:

  • Hallucinations—the AI fabricates information, such as documenting a physical exam that never happened
  • Omissions—the most common error type, where critical details are left out of the note
  • Misattribution—patient statements get assigned to the clinician, or vice versa
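One mitigation is to cross-check the draft note against the source transcript. The sketch below is a deliberately simple token-overlap heuristic for flagging note sentences with no transcript support (possible hallucinations); production systems would use far stronger entailment or grounding models, so treat this purely as an illustration:

```python
# Illustrative hallucination check: flag note sentences whose words have
# little overlap with the transcript. A crude heuristic, shown only to
# make the failure mode concrete; not a production safety mechanism.
def unsupported_sentences(
    note_sentences: list[str], transcript: str, min_overlap: float = 0.5
) -> list[str]:
    """Return note sentences with < min_overlap word overlap vs. transcript."""
    transcript_tokens = set(transcript.lower().split())
    flagged = []
    for sentence in note_sentences:
        tokens = set(sentence.lower().split())
        overlap = len(tokens & transcript_tokens) / len(tokens)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged
```

A sentence like "Cardiac exam was normal" would be flagged if the transcript never mentions an exam, which is exactly the fabricated-physical-exam failure described above. Omissions are harder: checking what the note *left out* requires knowing which transcript statements were clinically material.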

In one documented case, an AI scribe generated a note for a pneumonia patient with sepsis that downgraded the treatment plan from immediate hospital admission to merely "considering" admission—an error researchers flagged as carrying a risk of death.

Speech recognition accuracy also varies across demographics. Studies show ASR systems perform worse on patients with non-standard accents or limited English proficiency, raising equity concerns about who gets accurately documented.

What Comes Next

The market for AI scribes is growing rapidly, with tools ranging from $99 to over $600 per physician per month. Regulatory bodies have yet to establish specific standards for ambient clinical AI, and adoption is outpacing formal validation. The core tension remains unresolved: a technology that genuinely reduces physician suffering may simultaneously drive up the cost of care for everyone. How the healthcare system balances those competing pressures will shape the future of clinical documentation.
