Foundation models that can process and generate multi-modal data have transformed AI’s role in medicine. However, a major limitation of their reliability is hallucination, in which models produce inaccurate or fabricated information that can affect clinical decisions and patient safety, according to a study posted on the preprint server medRxiv.
In the study, researchers defined medical hallucination as any instance in which a model generates misleading medical content.
Researchers aimed to study the unique characteristics, causes and implications of medical hallucinations, with a special emphasis on how these errors manifest themselves in real-world clinical scenarios.
The researchers focused on three contributions: a taxonomy for understanding and addressing medical hallucinations; a benchmark of models using a medical hallucination dataset and physician-annotated large language model (LLM) responses to real medical cases, providing direct insight into the clinical impact of hallucinations; and a multi-national survey of clinicians on their experiences with medical hallucinations.
“Our results reveal that inference techniques such as chain-of-thought and search augmented generation can effectively reduce hallucination rates. However, despite these improvements, non-trivial levels of hallucination persist,” the authors wrote.
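The quote refers to two inference-time techniques: chain-of-thought prompting and search-augmented (retrieval-augmented) generation. As a minimal, purely illustrative sketch, and not code from the study, the Python snippet below shows how a prompt combining the two ideas might be assembled; the evidence snippets, prompt wording and function names are assumptions made for illustration.

```python
# Illustrative sketch only: assembles a retrieval-augmented, chain-of-thought
# prompt. The evidence snippets and wording are hypothetical, not study data.

def build_prompt(question: str, retrieved_snippets: list[str]) -> str:
    """Combine retrieved evidence with a step-by-step reasoning instruction."""
    evidence = "\n".join(f"- {s}" for s in retrieved_snippets)
    return (
        "You are assisting with a medical question. Use ONLY the evidence below.\n"
        f"Evidence:\n{evidence}\n\n"
        f"Question: {question}\n"
        "Think step by step, cite which evidence item supports each step, and "
        "answer 'insufficient evidence' if the evidence does not cover the question."
    )

if __name__ == "__main__":
    snippets = [
        "Guideline excerpt: first-line therapy for condition X is drug A.",  # hypothetical
        "Trial summary: drug B showed no benefit for condition X.",          # hypothetical
    ]
    print(build_prompt("What is the first-line therapy for condition X?", snippets))
```

Grounding the prompt in retrieved evidence and asking the model to reason step by step constrains it to supported claims, which is consistent with the reduction, though not elimination, of hallucination rates that the authors report.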
Researchers said that data from the study underscore the ethical and practical imperative for “robust detection and mitigation strategies,” establishing a foundation for regulatory policies that prioritize patient safety and maintain clinical integrity as AI becomes more integrated into healthcare.
“The feedback from clinicians highlights the urgent need for not only technical advances but also for clearer ethical and regulatory guidelines to ensure patient safety,” the authors wrote.
THE LARGER TREND
The authors noted that as foundation models become more integrated into clinical practice, their findings should serve as a critical guide for researchers, developers, clinicians and policymakers.
“Moving forward, continued attention, interdisciplinary collaboration and a focus on robust validation and ethical frameworks will be paramount to realizing the transformative potential of AI in healthcare, while effectively safeguarding against the inherent risks of medical hallucinations and ensuring a future where AI serves as a reliable and trustworthy ally in enhancing patient care and clinical decision-making,” the authors wrote.
Earlier this month, David Lareau, CEO and president of Medicomp Systems, sat down with HIMSS TV to discuss mitigating AI hallucinations to improve patient care. Lareau said 8% to 10% of AI-captured information from complex encounters may not be correct, but his company’s tool can flag those issues for clinicians to review.
The American Cancer Society (ACS) and healthcare AI company Layer Health announced a multi-year collaboration aimed at using LLMs to expedite cancer research.
ACS will use Layer Health’s LLM-powered data abstraction platform to pull clinical data from thousands of medical charts of patients enrolled in ACS research studies.
Those studies include the Cancer Prevention Study-3, a population study of 300,000 participants, several thousand of whom have been diagnosed with cancer and provided their medical records.
Layer Health’s platform will provide data in less time, with the aim of improving the efficiency of cancer research and allowing ACS to obtain deeper insights from medical records. The platform was built specifically for healthcare: it examines a patient’s longitudinal medical record and answers complex clinical questions, using an evidence-based method that justifies every answer with direct quotes from the chart.
This approach prioritizes transparency and explainability and removes the problem of “hallucination” that is periodically observed with other LLMs, the companies said.
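For readers wondering what justifying every answer with direct quotes from the chart can look like mechanically, below is a hedged, hypothetical Python sketch of quote-grounded verification. It is not Layer Health’s implementation or API; the data class, field names and example chart text are invented for illustration.

```python
# Hypothetical sketch of quote-grounded extraction: keep an answer only if its
# supporting quote appears verbatim in the chart. Not Layer Health's actual API.

from dataclasses import dataclass

@dataclass
class Extraction:
    field: str   # e.g. "cancer_stage" (invented field name)
    value: str   # extracted value
    quote: str   # verbatim supporting quote claimed by the model

def verify_extractions(chart_text: str, extractions: list[Extraction]) -> list[Extraction]:
    """Return only the extractions whose supporting quote appears verbatim in the chart."""
    return [e for e in extractions if e.quote in chart_text]

if __name__ == "__main__":
    chart = "Pathology: invasive ductal carcinoma, stage IIA. No distant metastases."
    candidates = [
        Extraction("cancer_stage", "IIA", "stage IIA"),
        Extraction("metastasis", "present", "widespread metastases"),  # unsupported; dropped
    ]
    for e in verify_extractions(chart, candidates):
        print(f"{e.field} = {e.value} (supported by: '{e.quote}')")
```

Discarding any answer that cannot be tied to a verbatim quote is one simple way to make outputs auditable, in the spirit of the transparency and explainability the companies describe.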