Research identifies blind spots in AI medical triage
First independent evaluation of ChatGPT Health raises urgent questions about the safety of consumer-facing AI tools used to make medical decisions.
ChatGPT Health, a widely used consumer AI tool that provides health advice directly to the public - including advice on how quickly to seek medical care - may not be able to properly direct users to the emergency room in a large number of serious cases, according to researchers at the Icahn School of Medicine at Mount Sinai.
The study, which was fast-tracked in Nature Medicine [https://doi.org/10.1038/s41591-026-04297-7] and published online February 23, 2026, is the first independent safety evaluation of the large language model (LLM)-based tool since its launch in January 2026. It also found serious concerns about the tool's suicide-crisis safeguards.
"LLMs have become patients' first stop for medical advice, but in 2026 they will be the safest at the clinical levels, where misdiagnosis separates emergency from unnecessary panic," said Isaac S. Kohane, MD, PhD, chair of the Department of Biomedical Informatics at Harvard Medical School, who was not involved in the study. "It should be consistent, not connected."
Within weeks of its release, ChatGPT Health's developer, OpenAI, reported that nearly 40 million people were using the tool every day to seek health information and guidance, including advice on urgent or emergency care. However, the investigators say, there was little independent evidence of how safe or reliable its advice actually was.
“This gap drove our research,” says lead author Ashwin Ramaswamy, MD, associate professor of urology at the Icahn School of Medicine at Mount Sinai.
Regarding suicide risk alerts, ChatGPT Health is designed to direct users to the 988 Suicide & Crisis Lifeline in high-risk situations. However, the researchers found that these alerts appeared inconsistently, sometimes triggering in low-risk scenarios while, dangerously, failing to appear when users described specific plans to harm themselves.
"This is a particularly surprising and worrying finding," said senior and co-corresponding study author Girish N. Nadkarni, MD, MPH, chair of the Windreich Department of Artificial Intelligence and Human Health, director of the Hasso Plattner Institute for Digital Health, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and chief artificial intelligence officer of the Mount Sinai Health System. "Although we expected some variation, what we saw went beyond inconsistency. The system's alerts were inversely related to clinical risk and were more reliable for low-risk scenarios than for situations where someone actually shared how they intended to harm themselves. In real life, when someone talks about how they are going to harm themselves, it is a sign of more immediate and serious danger, not less."
As part of the evaluation, the research team created 60 structured clinical scenarios spanning 21 medical specialties. Cases ranged from mild conditions suitable for home care to true medical emergencies. Three independent physicians determined the correct level of urgency for each case using the guidelines of 56 medical societies.
Each scenario was tested under 16 different contextual conditions, varying factors such as race, gender, social dynamics (for example, someone downplaying the symptoms), and barriers to care such as lack of insurance or transportation. In total, the team conducted 960 interactions with ChatGPT Health and compared its recommendations with the physicians' consensus.
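The arithmetic of this design can be illustrated with a minimal Python sketch. It is not the study's code: the function names and the four-level urgency scale are hypothetical, introduced only to show how 60 scenarios crossed with 16 conditions yield 960 interactions and how a tool recommendation might be compared with a physician reference.

```python
from itertools import product

# Figures taken from the study description above.
N_SCENARIOS = 60    # structured clinical scenarios
N_CONDITIONS = 16   # contextual conditions applied to each scenario

# Hypothetical ordinal urgency scale, least to most urgent (an assumption
# for illustration; the study's actual categories are not given here).
URGENCY_LEVELS = ["home_care", "routine_visit", "urgent_visit", "emergency"]


def build_test_grid(n_scenarios=N_SCENARIOS, n_conditions=N_CONDITIONS):
    """Return every (scenario_id, condition_id) pair in the full grid."""
    return list(product(range(1, n_scenarios + 1), range(1, n_conditions + 1)))


def classify_triage(tool_level, physician_level):
    """Compare the tool's recommendation with the physician reference.

    Returns "under" if the tool was less urgent than the physicians,
    "over" if it was more urgent, and "match" if they agree.
    """
    tool_rank = URGENCY_LEVELS.index(tool_level)
    reference_rank = URGENCY_LEVELS.index(physician_level)
    if tool_rank < reference_rank:
        return "under"
    if tool_rank > reference_rank:
        return "over"
    return "match"


if __name__ == "__main__":
    grid = build_test_grid()
    print(len(grid))  # 60 * 16 = 960 interactions
    print(classify_triage("routine_visit", "emergency"))  # under
```

In this sketch, a case the physicians rated as an emergency but the tool rated as anything less urgent counts as under-triaged, which is the type of error that matters most for patient safety.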
Across the 60 realistic patient cases created by physicians, the researchers found that while the tool usually handled clear emergencies correctly, it under-triaged more than half of the cases that the physicians determined required emergency care.
Investigators were also surprised by how the system behaved in medical emergencies. The tool often showed in its own explanation that it recognized dangerous findings, yet still reassured the patient.
"ChatGPT Health performed well in textbook emergencies, such as stroke or acute exacerbations," Dr. Ramaswamy said.
The study's authors recommend that for worsening or worrisome symptoms, including chest pain, shortness of breath, severe allergic reactions or changes in mental status, people should seek medical attention directly rather than relying on chatbot guidance. In cases involving thoughts of self-harm, people should call the 988 Suicide & Crisis Lifeline or go to the emergency room.
However, the researchers emphasize that the findings do not suggest that consumers should abandon AI health tools entirely.
"As a medical student training when AI health tools are already in the hands of millions, I see them as technologies we must learn to integrate carefully into care rather than as replacements for clinical judgment," says Alvira Tyagi, a first-year medical student at the Icahn School of Medicine at Mount Sinai and the study's second author. "These systems are changing rapidly, so part of our training must now focus on carefully understanding their outputs, identifying where they fall short, and learning to use them in ways that protect patients."
The study evaluated the tool at a single point in time. Because AI models are continually updated, its performance may change, and evaluation needs to be repeated over time.
"Starting medical training alongside tools that evolve in real time makes it clear that today's results are not fixed," says Ms. Tyagi. "This reality requires constant review to ensure that technological improvements translate into safer care."
The team plans to continue evaluating updated versions of ChatGPT Health and other consumer AI tools, expanding future research into areas such as pediatric care, drug safety, and non-English language use.
The paper is titled "ChatGPT Health Performance in a Structured Test of Triage Recommendations."
Authors of the study, as listed in the journal, are Ashwin Ramaswamy, MD, MPP; Alvira Tyagi, BA; Hannah Hugo, MD; Jiang Yu, PhD; Pushkara Jayaraman, PhD; Mateen Jangda, MS; Alexis ET, MD; Steven A. Kaplan, MD; Joshua Lambert, MD; Robert Freeman, MSN, MS; Nicholas Gavin, MD, MBA; Bilal Naved, PhD; Alexander W. Charney, MD, PhD (Michael Haring); Eyal Klang, MD; Girish N. Nadkarni, MD, MPH.
Learn more about Mount Sinai Artificial Intelligence at https://icahn.mssm.edu/about/artificial-intelligence.
About the Windreich Department of Artificial Intelligence and Human Health at Mount Sinai
Led by Girish N. Nadkarni, MD, MPH, an international authority on the safe, effective and ethical use of AI in health care, Mount Sinai's Windreich Department of Artificial Intelligence and Human Health is the first of its kind at a U.S. medical school, pioneering advances at the intersection of artificial intelligence and human health.
The Department is committed to using artificial intelligence in a responsible, effective, ethical and safe way to transform research, clinical care, education and operations. Bringing together world-class AI expertise, advanced infrastructure and unparalleled computing power, the department is advancing the integration of diverse, multimodal data while enabling rapid analysis and application.
The department benefits from dynamic collaborations across Mount Sinai, including with the Hasso Plattner Institute for Digital Health at Mount Sinai—a partnership between the Hasso Plattner Institute for Digital Engineering in Potsdam, Germany, and the Mount Sinai Health System—which complements its mission by advancing a data-driven approach to improving patient care and health outcomes.
At the center of this innovation is the renowned Icahn School of Medicine at Mount Sinai, a hub for learning and collaboration. This unique integration creates dynamic partnerships across institutes, academic departments, hospitals, and outpatient clinics, advancing disease prevention, improving treatments for complex diseases, and raising quality of life globally.
In 2024, Mount Sinai Health System was awarded the prestigious Hearst Health Prize for the innovative AI application NutriScan, developed by Mount Sinai Health System's clinical data science team in collaboration with faculty. NutriScan is designed to accelerate the detection and treatment of malnutrition in hospitalized patients. This machine learning tool improves malnutrition diagnosis and resource utilization, demonstrating the effective use of AI in health care.
For more information about Mount Sinai's Windreich Department of Artificial Intelligence and Human Health, visit: ai.mssm.edu
About the Hasso Plattner Institute at Mount Sinai
At the Hasso Plattner Institute for Digital Health at Mount Sinai, data science, biomedical and digital engineering tools, and medical knowledge are used to improve and extend life. The institute represents a collaboration between the Hasso Plattner Institute for Digital Engineering in Potsdam, Germany, and the Mount Sinai Health System.
The partnership is overseen jointly by Girish Nadkarni, MD, MPH, who directs the institute, and Professor Lothar Wieler, a globally recognized expert in public health and digital transformation.
The Hasso Plattner Institute for Digital Health at Mount Sinai is generously supported by the Hasso Plattner Foundation. Current research programs and machine learning efforts focus on improving the ability to diagnose and treat patients.
About the Icahn School of Medicine at Mount Sinai
The Icahn School of Medicine at Mount Sinai is internationally renowned for its outstanding research, educational and clinical care programs. It is the sole academic partner for the seven member hospitals of Mount Sinai Health System, one of the largest academic health systems in the United States, providing care to a large and diverse patient population in New York City.
The Icahn School of Medicine at Mount Sinai offers highly competitive MD, PhD, MD-PhD, and master's degree programs, with an enrollment of more than 1,200 students. It has the largest medical training program in the country, with more than 2,700 clinical residents and fellows trained throughout the health system. The Graduate School of Biomedical Sciences offers 13 degree programs, conducts basic and translational research, and trains more than 560 postdoctoral researchers.
Ranked 11th nationally in National Institutes of Health (NIH) funding, the Icahn School of Medicine at Mount Sinai is in the 99th percentile in research dollars per researcher according to the Association of American Medical Colleges. More than 4,500 scientists, educators and clinicians focus on research and translational therapy in dozens of academic departments and multidisciplinary institutes. Through Mount Sinai Innovation Partners (MSIP), the Health System facilitates the real-world application and commercialization of medical advances made at Mount Sinai.
* Mount Sinai Health System member hospitals: The Mount Sinai Hospital; Mount Sinai Brooklyn; Mount Sinai Morningside; Mount Sinai Queens; Mount Sinai South Nassau; Mount Sinai West; and New York Eye and Ear Infirmary of Mount Sinai
About Mount Sinai Health System
Mount Sinai Health System is one of the largest academic medical systems in the New York metropolitan area, with 48,000 employees working in seven hospitals, more than 400 outpatient clinics, more than 600 research and clinical laboratories, a leading school of nursing, and a school of medicine and graduate education. Mount Sinai advances health for all people, everywhere, by taking on the most complex health challenges of our time: discovering and applying new scientific learning and knowledge; developing safer and more effective treatments; training the next generation of medical leaders and innovators; and supporting local communities by providing high-quality care to all who need it.
By integrating hospitals, laboratories and schools, Mount Sinai provides comprehensive health care solutions from birth through geriatrics, using new technologies such as artificial intelligence and informatics while keeping the medical and emotional needs of patients at the center of all treatment. The Health System includes approximately 9,000 primary and specialty care physicians and 10 independent affiliated centers throughout the five boroughs of New York City, Westchester, Long Island, and Florida. Hospitals in the System are consistently ranked by Newsweek® in its "World's Best Hospitals," "Best National Hospitals" and "Best Specialty Hospitals" lists and by U.S. News & World Report® in its "Best Hospitals" and "Best Children's Hospitals" lists. The Mount Sinai Hospital is included in the U.S. News & World Report® "Best Hospitals" ranking for 2025-2026.
For more information, visit https://www.or find Mount Sinai on Facebook, Instagram, LinkedIn, X and YouTube.
