Incorrect advice offered by an artificial intelligence (AI)-based decision support system could impair the performance of radiologists at every level of expertise when reading mammograms, according to a new study published by Dratsch et al in Radiology.
Background
Often touted as a “second set of eyes” for radiologists, AI-based decision support systems for mammograms may be one of the most promising applications of AI in radiology. However, as the technology expands, there are concerns that it may make radiologists susceptible to automation bias, the tendency of individuals to favor suggestions from automated decision-making systems. Several studies have already shown that the introduction of computer-aided detection into the mammography workflow could impair radiologists’ performance. However, none of those studies had examined the influence of AI-based decision support systems on the ability of radiologists to accurately read mammograms.
Breast Imaging Reporting and Data System (BI-RADS) categorizations are standard methods used by radiologists to describe and categorize breast imaging findings. While BI-RADS categories are not diagnoses, they can be crucial in helping physicians determine the next steps in care.
Study Methods and Results
In the new prospective study, researchers asked 27 radiologists to read 50 mammograms and provide their BI-RADS categories with assistance from AI-based decision support systems, in order to determine how automation bias affects radiologists at varying levels of experience.
The researchers presented the mammograms in two randomized sets: a training set of 10 mammograms in which the AI had suggested the correct BI-RADS categories, and a set of 40 mammograms in which 12 carried incorrect BI-RADS categories, purportedly suggested by AI.
After conducting their analyses, the researchers discovered that the radiologists were significantly worse at assigning the correct BI-RADS categorizations for the cases in which the AI-based decision support systems had suggested an incorrect BI-RADS category. For example, inexperienced radiologists assigned the correct BI-RADS categories in almost 80% of the cases in which the AI-based decision support systems had suggested the correct BI-RADS categories. When the AI systems suggested the incorrect categories, their accuracy fell below 20%. Experienced radiologists—with an average of more than 15 years of experience—saw their accuracy fall from 82% to 45.5% when the AI-based decision support systems suggested the incorrect categories.
Conclusions
“We anticipated that inaccurate AI predictions would influence the decisions made by radiologists in our study, particularly those with less experience,” explained lead study author Thomas Dratsch, MD, PhD, Professor of Medicine at the Institute of Diagnostic and Interventional Radiology at the University Hospital Cologne. “Nonetheless, it was surprising to find that even highly experienced radiologists were adversely impacted by the AI system’s judgments, albeit to a lesser extent than their less seasoned counterparts,” he added.
The researchers hope their findings demonstrate why the effects of human-machine interactions must be carefully considered to ensure safe deployment and accurate diagnostic performance when AI-based decision support systems are used.
“Given the repetitive and highly standardized nature of mammography screening, automation bias may become a concern when an AI system is integrated into the workflow. Our findings emphasize the need for implementing appropriate safeguards when incorporating AI into the radiologic process to mitigate the negative consequences of automation bias,” Dr. Dratsch stressed.
The researchers proposed potential safeguards to decrease automation bias—including presenting radiologists with the confidence levels of the AI-based decision support systems by displaying the probability of each output, teaching radiologists about the reasoning processes of the AI systems, and ensuring that the radiologists feel accountable for their own decisions.
The researchers plan to use tools such as eye-tracking technology to better understand the decision-making processes of radiologists using the AI systems.
“[W]e would like to explore the most effective methods of presenting AI output to radiologists in a way that encourages critical engagement while avoiding the pitfalls of automation bias,” Dr. Dratsch concluded.
Disclosure: For full disclosures of the study authors, visit pubs.rsna.org.