Machine-Learning Model for HCC Risk Prediction May Outperform Current Methods

An interpretable machine-learning framework, called PRE-Screen-HCC, may predict risk levels for developing hepatocellular carcinoma (HCC) more accurately than publicly available risk scores, according to findings from a large population-based multicentric study published in Cancer Discovery.

“Our study highlights the potential of a simple, easily utilized machine-learning model to improve risk stratification for HCC using only routinely collected clinical data,” said co-senior and corresponding author Carolin Schneider, MD, Assistant Professor at RWTH Aachen University in Germany. “If validated in additional populations, our model would enable primary care physicians to efficiently identify at-risk patients and refer them to liver cancer screening. This could enable earlier detection and improved outcomes for patients with this aggressive disease.”

Background and Study Methods

Imaging-based and blood-based screening for HCC is offered to individuals at higher risk, but some researchers believe current screening guidelines miss many other at-risk individuals.

“With so many factors impacting risk, there is an urgent need for effective tools to help clinicians identify high-risk patients,” said first study author Jan Clusmann, MD, a clinician-scientist at the Technical University of Dresden. “Machine-learning tools that can simultaneously work with different types of clinical data could be particularly useful for this major clinical challenge.”

Researchers drew prospective multimodal data on more than 900,000 individuals, including 983 cases of HCC, from two sources: the UK Biobank study, which served as the development cohort, and the All of Us Research Program, which was used for external testing. They used 80% of the UK Biobank data to train the model and held out the remaining 20% for internal validation.

They explored the individual and cumulative impacts of demographics, lifestyle, health records, blood, genomics, and metabolomics data to build machine-learning models that assess the risk of developing HCC. The models used a random forest architecture, which combines hundreds of decision trees for greater robustness, reliability, and interpretability. Additionally, the investigators trained a separate random forest model on each of the five kinds of clinical data, as well as on combinations of the data, to determine which inputs gave the best performance.

The models were also compared against previously available risk prediction models, including the FIB-4, APRI, and NFS scores.

Key Findings

The model that combined demographics, electronic health record, and blood test data resulted in the best performance of all tested models, with an area under the receiver operating characteristic curve of 0.88. The researchers found that adding genomics and/or metabolomics data did not increase the model's performance substantially.

“This showed that we can predict HCC risk using simple, readily available data without the need for complex and expensive genetic sequencing,” said Dr. Schneider, noting that this feature increases the model’s potential for widespread use, particularly in resource-limited settings.
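For readers less familiar with the area under the receiver operating characteristic curve reported above, the following is a minimal sketch of how that metric can be computed. It is not the study's code. The function name `auroc` and the toy labels and scores are illustrative assumptions, and the calculation uses the standard pairwise interpretation: the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a case (e.g., developed HCC), 0 for a non-case.
    scores: model risk scores, higher meaning higher predicted risk.
    Ties between a case and a non-case count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1]
print(round(auroc(toy_labels, toy_scores), 3))  # 5 of 6 case/non-case pairs are ranked correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a reported value of 0.88 means the model ranks a case above a non-case in roughly 88% of such pairs.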

Compared with prior risk prediction models, the PRE-Screen-HCC model performed better at detecting true cases of HCC and produced fewer false positives.
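The two quantities in this comparison, detection of true cases and false positives, can be illustrated with a short sketch. It is not drawn from the study. The function name `sensitivity_and_fpr`, the threshold, and the toy data are hypothetical, and the study's actual operating thresholds are not stated here.

```python
def sensitivity_and_fpr(labels, scores, threshold):
    """Return (sensitivity, false-positive rate) at a given risk threshold.

    A person is flagged as high risk when their score is >= threshold.
    Sensitivity is the share of true cases that are flagged.
    False-positive rate is the share of non-cases that are flagged.
    """
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1 and flagged:
            tp += 1
        elif y == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical toy data, not taken from the study.
example_labels = [1, 1, 0, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.2, 0.1]
sens, fpr = sensitivity_and_fpr(example_labels, example_scores, threshold=0.45)
print(sens, round(fpr, 3))
```

A model that improves on both measures at once, as described above, flags a larger share of people who go on to develop HCC while flagging a smaller share of those who do not.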

Additionally, in an ablation experiment, the researchers reduced the number of clinical features the model needed to assess. The simplified version of the model relied on 15 routinely collected clinical features to arrive at its risk prediction and still outperformed existing models.

The model also demonstrated strong generalizability when evaluated in non-White subgroups during external validation in the All of Us cohort. The authors noted, however, that the model's performance needs further validation in other populations.

DISCLOSURES: The study was supported by the German Cancer Aid; the German Federal Ministry of Research, Technology and Space; the German Research Foundation; the German Academic Exchange Service; the German Federal Joint Committee; the European Union Horizon Europe Research and Innovation Programme; the Breast Cancer Research Foundation; the National Institute for Health and Care Research; the German Federal Ministry of Education and Research; the German Federal Ministry of Health; the Interdisciplinary Centre for Clinical Research at RWTH Aachen University; the Junior Principal Investigator Fellowship Program of RWTH Aachen Excellence Strategy; the NRW Rueckkehr Programme of the Ministry of Culture and Science of the German State of North Rhine-Westphalia; and the National Institutes of Health. Dr. Clusmann has received honoraria from Johnson & Johnson. Dr. Kather declared ongoing consulting services for AstraZeneca and Bioptimus; holds shares in StratifAI, Synagen, and Spira Labs; and has received institutional research grants from GSK and AstraZeneca and honoraria from AstraZeneca, Bayer, Daiichi Sankyo, Eisai, Janssen, Merck, MSD, Bristol Myers Squibb, Roche, Pfizer, and Fresenius. Dr. Schneider declared no conflicts of interest. For full disclosures of the other study authors, visit aacrjournals.org.
