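NEJM AI Automation Bias RCT: ChatGPT Errors Pull Physicians' Accuracy Down 14%p. The Shadow Side of Establishing AI in Medicine
This is the era of ChatGPT-assisted diagnosis, and patients and doctors increasingly rely on LLMs. But an RCT published in NEJM AI in April 2026 found that even physicians trained in AI literacy are pulled toward wrong LLM answers, with a reported drop of 14 percentage points (14%p) in diagnostic accuracy. Automation bias has become visible in clinical practice.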
Key Announcement
NEJM AI RCT (April 2026): 44 physicians trained in AI literacy worked through diagnostic reasoning cases, either exposed to erroneous LLM output or not (control). Diagnostic accuracy was 73.3% in the error-exposed group versus 84.9% in the control group, a reported gap of -14%p (p<0.01).
JAMA comparison of 21 LLMs (2026): 21 models, including ChatGPT-4, Claude 3, and Gemini Pro, were evaluated on clinical cases. In more than 80% of cases they failed to produce an adequate differential diagnosis. Some answers were accurate, others dangerous.
Automation Bias
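Automation bias: over-reliance on the answers of an automated system, where people weight the system's answer above their own judgment. The pattern is the same in aviation, automotive, and medicine, and physicians are not exempt.
Clinical automation bias: the LLM's answer is perceived as “correct”, so self-doubt rises, fewer alternative diagnoses are considered, and the likelihood of a diagnostic error increases.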
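One way to make the definition above concrete is to count how often a physician adopts the LLM's answer in cases where that answer is wrong. The sketch below is illustrative only: the `CaseRecord` type, the `followed_wrong_llm_rate` function, and the sample cases are hypothetical and are not taken from the NEJM AI study or its analysis plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CaseRecord:
    """One diagnostic case (hypothetical structure for illustration)."""
    llm_answer: str        # diagnosis suggested by the LLM
    physician_answer: str  # physician's final diagnosis
    correct_answer: str    # reference diagnosis


def followed_wrong_llm_rate(records: List[CaseRecord]) -> Optional[float]:
    """Share of wrong-LLM cases in which the physician adopted the wrong answer.

    Returns None when the LLM was never wrong, since the rate is undefined.
    """
    wrong_llm = [r for r in records if r.llm_answer != r.correct_answer]
    if not wrong_llm:
        return None
    followed = sum(1 for r in wrong_llm if r.physician_answer == r.llm_answer)
    return followed / len(wrong_llm)


# Made-up example cases, not study data.
records = [
    CaseRecord("pneumonia", "pneumonia", "pulmonary embolism"),                 # followed the wrong answer
    CaseRecord("gastritis", "myocardial infarction", "myocardial infarction"),  # resisted the wrong answer
    CaseRecord("migraine", "migraine", "migraine"),                             # LLM correct, excluded
    CaseRecord("UTI", "UTI", "appendicitis"),                                   # followed the wrong answer
]

rate = followed_wrong_llm_rate(records)
print(f"Followed wrong LLM answer in {rate:.0%} of wrong-LLM cases")
```

Restricting the denominator to cases where the LLM was wrong separates automation bias from ordinary agreement with a correct suggestion.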
Study Design
Participants: 44 physicians (internal medicine, emergency medicine, general practice) who received AI literacy training beforehand and were aware of LLM limitations.
Cases: clinical scenarios requiring diagnostic and treatment reasoning, some paired with intentionally erroneous LLM answers.
Results: exposure to erroneous LLM output lowered diagnostic accuracy by a reported 14%p, while the control group maintained its judgment. Self-awareness was also reduced: physicians were unaware of the influence.
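The accuracy figures quoted for this trial can be checked with simple arithmetic. The sketch below uses only the numbers stated in this article (84.9%, 73.3%, and the 14%p gap). The raw difference between the two accuracies is smaller than the reported gap. One possible explanation is that the 14%p figure is a model-adjusted estimate, but that is an assumption and is not confirmed by this article.

```python
# Figures as quoted in this article.
control_accuracy = 84.9   # % diagnostic accuracy, control group
exposed_accuracy = 73.3   # % diagnostic accuracy, error-exposed group
reported_gap_pp = 14.0    # reported gap in percentage points

# Raw (unadjusted) difference in percentage points.
raw_gap_pp = control_accuracy - exposed_accuracy

# Relative drop: the raw gap as a fraction of control accuracy.
relative_drop = raw_gap_pp / control_accuracy

print(f"Raw gap: {raw_gap_pp:.1f} percentage points")
print(f"Relative drop vs control: {relative_drop:.1%}")
print(f"Reported gap minus raw gap: {reported_gap_pp - raw_gap_pp:.1f} percentage points")
```

A percentage-point gap and a relative percentage drop are different quantities, so the two should not be quoted interchangeably.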
JAMA 21 LLM
Mass General Brigham (2026): 21 LLMs were evaluated. In more than 80% of clinical cases the differential diagnosis was inadequate, with confident wrong answers (hallucination) and “absent clinical reasoning”.
LLM limits: statistical pattern matching is not clinical reasoning, patient context is lacking, integration of physical exam and lab findings is limited, and training data is biased.
L72 Digital Verification and Establishment Dimension: 2nd Axis
Digital medicine's light and shadow are becoming visible at the same time, marking an era of balanced establishment. Within L72, MamaLift represents the clinical establishment of DTx (the light), and the NEJM AI automation bias result represents the AI shadow.
Patient ChatGPT Self-Diagnosis
Current trend (2026): more than 50% of patients search for medical information, with a shift from Google to ChatGPT. Typical uses are the first inquiry in a family emergency, questions about drug side effects and symptoms, and pre-visit research.
Risks: automation bias (patients are affected too), wrong diagnosis or treatment decisions, reduced trust in doctors, and delayed emergency care.
AI Literacy - Patient Guide
Safe ChatGPT use: treat it as a first source of information and education, not as a diagnosis or treatment tool. A doctor's consultation is required. Do not depend on a single source, and cross-check multiple sources. In a medical emergency, go to the ER or call 911.
Where LLM errors are more likely: rare diseases, multi-factor interactions, the Korean healthcare system (training data is US-centric), the latest drugs and research, and personal history or drug combinations.
Physician·Clinic Guide
Appropriate uses of AI tools: diagnostic assistance (not confirmation), chart organization and summaries, patient education materials, medical literature search, and administrative work and coding.
Areas to avoid: diagnosis or treatment decisions made by AI alone, prescription decisions, emergency triage, and replacing the patient's own decisions.
Automation Bias Response - Clinical Guidelines
FDA and AAMI guidelines: mandatory verification of AI output, final decision by the physician, patient consent and education, and an error reporting system.
Implications for Korea: the Korean MFDS has had an AI medical device guideline since 2020, and clinical adoption is partial (imaging, pathology). Automation bias education is absent, and a patient education policy is needed.
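The guideline principles listed above (verify AI output, physician makes the final decision, report errors) can be pictured as a simple gate in software. The sketch below is a hypothetical illustration of that idea, not an implementation of any FDA, AAMI, or MFDS requirement. All names (`AISuggestion`, `physician_review`, `finalize`, `UnverifiedOutputError`) are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


class UnverifiedOutputError(Exception):
    """Raised when an AI suggestion is finalized without physician review."""


@dataclass
class AISuggestion:
    case_id: str
    ai_output: str                            # diagnosis proposed by the AI tool
    physician_decision: Optional[str] = None  # set only by physician review
    verified: bool = False
    error_reported: bool = False              # True when the physician overrode the AI


def physician_review(suggestion: AISuggestion, decision: str) -> AISuggestion:
    """Record the physician's own decision and flag a disagreement for reporting."""
    suggestion.physician_decision = decision
    suggestion.verified = True
    # A disagreement is treated here as a candidate AI error to report (assumption).
    suggestion.error_reported = decision != suggestion.ai_output
    return suggestion


def finalize(suggestion: AISuggestion) -> str:
    """Return the final diagnosis. Refuses to pass through unverified AI output."""
    if not suggestion.verified or suggestion.physician_decision is None:
        raise UnverifiedOutputError(f"Case {suggestion.case_id} lacks physician review")
    # The physician's decision, not the AI output, is what gets finalized.
    return suggestion.physician_decision


suggestion = AISuggestion(case_id="case-001", ai_output="gastritis")
physician_review(suggestion, "myocardial infarction")
final_diagnosis = finalize(suggestion)
print(final_diagnosis, "| error reported:", suggestion.error_reported)
```

In this sketch the AI output is only ever an input to review. The value returned by `finalize` is always the physician's decision, which mirrors the "physician final decision" principle.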
FAQ
Q. Is medical information from ChatGPT safe? A. It is fine as a first source of information and education, but not for diagnosis or prescription. A doctor's consultation is required, and it should not be your only source.
Q. Is it safe for doctors to use ChatGPT? A. As an adjunct, yes. Relying on it for solo decisions is dangerous: the NEJM AI trial shows that even doctors were pulled down by a reported 14%p. Self-verification and cross-checking are required.
Q. Does AI literacy training help? A. Somewhat, but trained physicians still showed automation bias. Systematic verification, peer review, and patient education are also needed.
Q. Will AI medicine advance? A. Definitely, but verification, balance, and education must advance with it. L72 frames this as a balanced view of the era in which AI becomes established.
Q. How can patients protect themselves? A. Consult a doctor or pharmacist, and go to the ER in an emergency. AI is an adjunct, and medical decisions should combine human judgment with AI.
Conclusion
The NEJM AI automation bias RCT makes the shadow side of establishing AI in medicine visible. Even doctors were pulled down by a reported 14%p, and LLMs failed to produce an adequate differential diagnosis in more than 80% of clinical cases. L72 = 45 pillars + the digital verification and establishment dimension (AI shadow as the 2nd axis). MamaLift Plus (DTx light) + NEJM AI (bias shadow) = a balanced digital medicine era. Patients and doctors both need AI literacy.