Dataconomy

ChatGPT Health fails to spot 52% of medical emergencies in study

Researchers at the Icahn School of Medicine at Mount Sinai conducted 960 tests, finding the AI struggled with "nuanced" cases like respiratory failure.

by Kerem Gülen
February 25, 2026
in Research

A study published in Nature Medicine on February 24 found that ChatGPT Health failed to direct users to emergency care in more than half of serious medical cases. Researchers at the Icahn School of Medicine at Mount Sinai conducted the evaluation, testing the consumer-facing tool across 960 interactions. The study highlights potential safety concerns regarding AI-powered triage as millions of users increasingly rely on chatbots for health guidance.

The research team designed 60 clinical scenarios spanning 21 medical specialties. These cases ranged from minor conditions suitable for home care to genuine emergencies. Three independent physicians established the correct level of urgency for each scenario, using guidelines from 56 medical societies. This consensus approach provided a standardized benchmark for evaluating the AI’s performance. Each scenario was then tested under 16 different contextual conditions, including variations in race, gender, social dynamics, and barriers to care such as lack of insurance. Crossing 60 scenarios with 16 conditions produced a total of 960 interactions with ChatGPT Health.
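The crossed design described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the numeric scenario and condition IDs and the `build_test_grid` helper are placeholders invented here, not identifiers from the study, and the article does not list the 16 conditions individually.

```python
from itertools import product

N_SCENARIOS = 60   # clinical scenarios reported in the article
N_CONDITIONS = 16  # contextual conditions reported in the article


def build_test_grid(n_scenarios=N_SCENARIOS, n_conditions=N_CONDITIONS):
    """Return every (scenario_id, condition_id) pair of a fully crossed design.

    IDs are hypothetical 1-based placeholders used only for illustration.
    """
    scenarios = range(1, n_scenarios + 1)
    conditions = range(1, n_conditions + 1)
    return list(product(scenarios, conditions))


grid = build_test_grid()
print(len(grid))  # 960, i.e. 60 scenarios x 16 conditions
```

Each pair in the grid corresponds to one interaction, which is how 60 scenarios yield 960 tests.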

The results revealed what the researchers described as an “inverted U-shaped” pattern of performance, with errors concentrated at both ends of the urgency range. ChatGPT Health handled textbook emergencies like stroke and anaphylaxis correctly. However, the tool under-triaged 52 percent of cases that physicians deemed true emergencies. For conditions such as diabetic ketoacidosis and impending respiratory failure, the AI directed patients toward evaluation within 24 to 48 hours instead of recommending immediate emergency department care. Additionally, the system misclassified 35 percent of non-urgent cases.

A significant finding concerned the tool’s susceptibility to anchoring bias. When family members or friends minimized symptoms within the prompts, triage recommendations shifted dramatically toward less urgent care. The study quantified this influence with an odds ratio of 11.7. Dr. Ashwin Ramaswamy, one of the study’s corresponding authors, commented on the specific limitations observed. “ChatGPT Health performed well in textbook emergencies such as stroke or severe allergic reactions,” Ramaswamy said. “But it struggled in more nuanced situations where the danger is not immediately obvious, and those are often the cases where clinical judgment matters most.”
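For readers unfamiliar with odds ratios, the following minimal sketch shows how one is computed from a simple two-by-two table. The counts are invented purely for illustration and are not data from the study; the `odds_ratio` helper is also a name introduced here. The study's reported figure of 11.7 may well come from a statistical model rather than a raw table like this one.

```python
def odds_ratio(exposed_events, exposed_non_events,
               control_events, control_non_events):
    """Odds ratio from a 2x2 table: (a / b) / (c / d) = (a * d) / (b * c)."""
    denominator = exposed_non_events * control_events
    if denominator == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (exposed_events * control_non_events) / denominator


# Hypothetical counts, NOT taken from the study:
# prompts where someone minimized symptoms: 30 less-urgent recommendations, 70 others
# prompts without minimization:             5 less-urgent recommendations, 95 others
example = odds_ratio(30, 70, 5, 95)
print(round(example, 2))  # 8.14
```

An odds ratio of 1 would mean the minimizing comments made no difference. Values well above 1, like the 11.7 reported in the study, indicate that the odds of a less urgent recommendation were many times higher when symptoms were played down.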

The study also exposed inconsistencies in the tool’s crisis intervention system. ChatGPT Health is designed to direct users to the 988 Suicide and Crisis Lifeline in high-risk situations. Researchers found that these alerts appeared more reliably when users described no specific method of self-harm than when they articulated a concrete plan. This observation effectively inverted the relationship between risk level and safeguard activation. Dr. Girish Nadkarni, Mount Sinai’s Chief AI Officer and the study’s other corresponding author, described the finding as going “beyond inconsistency.” Nadkarni noted that “the system’s alerts were inverted relative to clinical risk.”

The study’s publication coincides with rapid consumer adoption of AI health tools. OpenAI launched ChatGPT Health in January 2026. The company reported that roughly 40 million people were using ChatGPT daily for health-related questions. Earlier in 2026, the nonprofit patient safety organization ECRI ranked misuse of AI chatbots in healthcare as the top health technology hazard. ECRI warned that these tools “can provide false or misleading information that could result in significant patient harm.”

The Mount Sinai team analyzed the influence of demographic and socioeconomic factors on triage outcomes. The data showed no statistically detectable effects from patient race, gender, or barriers to care. However, the study’s confidence intervals did not rule out the possibility of clinically meaningful differences. The researchers indicated plans to continue evaluating updated versions of ChatGPT Health and other consumer AI tools. Future research will expand into pediatric care, medication safety, and non-English-language use.


