In the field of radiology, where a correct diagnosis is critical to ensure proper patient care, large language models such as ChatGPT may improve accuracy or offer a second opinion in assessing brain tumor magnetic resonance imaging (MRI) reports, according to a recent study published by Mitsuyama et al in European Radiology.
Study Methods and Results
In the study, researchers compared the diagnostic performance of a GPT-4–based ChatGPT model with that of radiologists on 150 preoperative brain tumor MRI reports. Based on these daily clinical notes, which were written in Japanese, ChatGPT, two board-certified neuroradiologists, and three general radiologists were asked to provide differential diagnoses and a final diagnosis. Their accuracy was then calculated against the actual diagnosis of the tumor after its removal.
The researchers found that the ChatGPT model exhibited an accuracy of 73% compared with an average of 72% among neuroradiologists and 68% among general radiologists. Additionally, the ChatGPT model’s final diagnostic accuracy varied depending on whether the clinical report was written by a neuroradiologist or a general radiologist: accuracy was 80% with neuroradiologist reports vs 60% with general radiologist reports.
Conclusions
“These results suggest that ChatGPT can be useful for preoperative MRI diagnosis of brain tumors,” said lead study author Yasuhito Mitsuyama, MD, a graduate student at the Osaka Metropolitan University Graduate School of Medicine. “In the future, we intend to study large language models in other diagnostic imaging fields with the aims of reducing the burden on physicians, improving diagnostic accuracy, and using AI to support educational environments,” he concluded.
Disclosure: For full disclosures of the study authors, visit link.springer.com.