
External Validation Confirms Ability of AI Model to Stratify Recurrence Risk in Early-Stage Lung Cancer



A machine learning–based survival model incorporating preoperative CT images and routinely available clinical data outperformed standard clinical staging systems in predicting recurrence after surgery in patients with lung cancer, especially those with stage I disease, and its risk scores correlated with established pathologic risk factors. Results of the model’s external validation were presented at the European Society for Medical Oncology (ESMO) Congress 2025.1

Presenting author Ann Valter, MD, from the Chemotherapy Department of North Estonia Medical Centre (NEMC) Foundation, Tallinn, explained that conventional staging systems inadequately capture biologic heterogeneity, even within the same tumor-nodal-metastasis (TNM) stage. This limits their ability to guide treatment decisions and follow-up strategies for early-stage disease after surgery, especially for patients with stage I disease, who are largely excluded from trials of neoadjuvant and adjuvant therapies. Previous studies have demonstrated the potential of other artificial intelligence (AI) models for predicting survival and response to immunotherapy in non–small cell lung cancer (NSCLC) and for assessing lung cancer risk in screening populations, she added, offering a path toward addressing these limitations.

Highlighting the potential clinical impact of their findings, Dr. Valter and colleagues stated: “This model could be used to identify high-risk stage I patients for more personalized treatment decisions and follow-up strategies based on risk-assessment.”

Study Details

The investigators analyzed CT scans and clinical data from 1,267 patients with clinical stage I to IIIA lung cancer who underwent surgical resection, selected from the U.S. National Lung Screening Trial (NLST), NEMC, and the Stanford NSCLC Radiogenomics databases. Preoperative CT scans from NLST and NEMC were extensively (re)curated to ensure consistency with clinical metadata and outcomes.

A total of 1,015 patients were used for algorithm development, with 725 reserved for internal validation, and 252 patients from NEMC served as an external validation cohort. All patients in the validation cohorts were required to have a documented clinical TNM stage and at least 2 years of follow-up after primary treatment. Most patients in both cohorts had stage I disease; recurrence rates were 27.8% in the internal cohort and 6.3% in the external cohort.

The preoperative survival model was trained to predict the likelihood of recurrence using CT radiomic features and clinical variables through an eightfold cross-validation strategy. A machine learning–derived risk score threshold was set to optimize the identification of high-risk patients with stage I disease, and performance was evaluated using the concordance index and disease-free survival across the full cohort and within patients with stage I disease alone. The relationship between machine learning–derived risk scores and known pathologic risk factors for recurrence—such as tumor cell grade, lymphovascular invasion, perineural invasion, visceral pleural invasion, spread through air spaces, and PD-L1 tumor proportion score—was assessed using a t-test.
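The report does not describe how the investigators computed the concordance index. As an illustration only, the following minimal Python sketch shows one common definition (Harrell's C-index): among patient pairs whose ordering can be determined, the fraction in which the patient with the earlier recurrence also received the higher risk score. The function name, the simplified handling of tied times, and the toy data are assumptions made for this example and are not taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       -- follow-up time for each patient
    events      -- 1/True if recurrence was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also not
        # comparable when the patient with the shorter time was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (months to recurrence or censoring), not study data.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.7]
toy_c_index = concordance_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.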

The primary objective was to externally validate the prognostic AI model for enhancing individualized recurrence risk estimations using presurgical CT images and clinical features. A secondary objective was to investigate the correlation between the machine learning–derived risk scores and the aforementioned pathologic features.

Key Findings

The machine learning–based survival model demonstrated superior performance, as compared with conventional staging by tumor size (≤ 2 cm and > 2 cm), for stratifying patients with stage I lung cancer into high- and low-risk groups, as evidenced by its higher hazard ratios (HR) for disease-free survival across both internal (HR = 1.71 vs 1.22) and external (HR = 3.34 vs 1.98) validation datasets. Likewise, it appeared to more effectively stratify those with stage I to III disease into high- and low-risk groups compared with conventional clinical TNM staging (stage I and II–III; internal, HR = 1.85 vs 1.76; external, HR = 3.55 vs 3.12).

In both the internal and external validation cohorts, the machine learning–derived risk scores were significantly higher in tumors with vs without poor differentiation (P < .0001), lymphovascular invasion (P < .0001), pleural invasion (P < .0001), and spread through air spaces (P = .01). Significantly higher risk scores were also observed among patients with PD-L1 tumor proportion scores of at least 50% (P < .0001), but only in the external validation cohort.
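The report states only that a t-test was used for these comparisons and does not specify the variant. As a hedged illustration, the sketch below computes the Welch (unequal-variance) t statistic and its approximate degrees of freedom for risk scores in two groups, such as tumors with and without a given pathologic feature. The choice of Welch's test, the function name, and the sample values are assumptions for this example, not details from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical risk scores, invented for illustration only.
scores_with_feature = [0.62, 0.71, 0.58, 0.80, 0.67]
scores_without_feature = [0.41, 0.35, 0.52, 0.47, 0.39]
t_stat, df = welch_t(scores_with_feature, scores_without_feature)
```

Converting the statistic to a P value requires the t distribution, which is available in statistical packages but is left out of this standard-library sketch.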

Insights and Opportunities 

“The machine learning model predicts recurrence more accurately than TNM staging using preoperative imaging and routinely available clinical data,” Dr. Valter concluded. “It outperforms clinical stage in patients with lung cancer, especially stage I, and is correlated with known pathologic risk factors for recurrence.”

Regarding clinical implications, she stated, “This model could be used to identify high-risk stage I patients for more personalized treatment decisions and follow-up strategies…,” adding that the ultimate goal is to reduce lung cancer–related mortality. 

DISCLOSURE: Dr. Valter has received support to attend conferences from AstraZeneca, Pfizer, and MSD; has been on the advisory board for Medison Pharma; and has received honoraria for lectures by MSD and AstraZeneca. Drs. Gasimova, Heames, Waterfield-Price, and Freitag are employed at Optellum. Dr. Vanakesa has received support to attend conferences from MSD and has received honoraria for lectures by MSD. Dr. Almre has received honoraria for lectures by MSD. Dr. Hodgkinson is employed at Medtronic. Dr. Vachani reported personal fees as a scientific advisor to Johnson & Johnson; grants to his institution from MagArray, Inc., Broncus Medical, Inc., DELFI Diagnostics, and PreCyte, Inc.; and he is an unpaid advisory board member of the LUNGevity Foundation. Dr. Carbone reports receiving institutional grants from Merck to Ohio State University; participating in data safety monitoring boards for European Organisation for Research and Treatment of Cancer (EORTC), AbbVie, and Eli Lilly and Company; and participating in advisory boards for GlaxoSmithKline, Iovance Biotherapeutics, Arcus Biosciences, Roche, AbbVie, Regeneron, Genentech, Novocure, OncoHost, AstraZeneca, Amgen, Daiichi Sankyo, Eli Lilly and Company, Johnson & Johnson, Pfizer, Bristol Myers Squibb, and Merck KGaA. Dr. Oselin has received support to attend conferences from MSD, and has received honoraria for lectures by MSD and Roche. The other study authors reported no conflicts of interest.

REFERENCE

1. Valter A, Kordemets T, Gasimova A, et al: External validation of AI for early-stage lung cancer recurrence prognosis using CT radiomics. ESMO Congress 2025. Abstract 1786O. Presented October 20, 2025.

 

EXPERT POINT OF VIEW 

Commenting on the study of an artificial intelligence (AI) model for prognostic prediction of early-stage lung cancer recurrence using CT radiomics, invited discussant Arsela Prelaj, MD, PhD, of the Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy, emphasized that it involved “really a large dataset, and I couldn’t see another one larger [in this context].” She also congratulated the authors for employing “very good” methodology, including development, internal validation, and external validation—“what we wanted to see.”

Alongside this praise, Dr. Prelaj provided a balanced assessment of the following aspects of the study, acknowledging both strengths and limitations:

  • Dataset: Leveraged a large, externally validated dataset, but it was heterogeneous across stages.
  • Methods: Applied a proven radiomic approach, though novel foundation models may offer new opportunities.
  • Simple biomarker: Used real-world data collected in clinical practice, which is cost-effective but may be less informative regarding underlying biology.

Previous radiomics studies in non–small cell lung cancer (NSCLC) have reported higher concordance (C)-indexes than that observed with the machine learning–based survival model for distinguishing stage I low- vs high-risk patients in external validation, Dr. Prelaj noted, but they were conducted in smaller cohorts. She further highlighted the potential of incorporating digital pathology, reporting that a time-binned deep neural network model in NSCLC achieved a higher external validation C-index within a dataset of more comparable size.1

Other Approaches 

In discussing approaches to recurrence risk prediction using circulating tumor DNA (ctDNA) and other blood-based material, Dr. Prelaj posed the question: “Is recurrence a baseline or longitudinal task?” She further emphasized that “this is a thing we need to think about.” Additionally, she advocated for a multimodal strategy, stressing the importance of it “as [long] as [it is] cost-effective.”

A recent report highlighted how the phenomenon of tumor-infiltrating clonal hematopoiesis (CHIP mutations with high variant-allele frequencies detected in tumor tissue) could improve prediction using “small, simple material,” which Dr. Prelaj described as cost effective.2 She also pointed out that biologic signals for recurrence may originate centrally (eg, from bone marrow), emphasizing the value of organ-on-chip systems in capturing them—a global effort she is coordinating with Sabina Sangaletti, PhD, also of the Fondazione IRCCS Istituto Nazionale dei Tumori.

Looking ahead, Dr. Prelaj recommended exploring the use of foundation models, noting that they may improve results and “can be used to understand more.” She added, “We need to see how we can integrate this [machine learning–based survival model] with ctDNA.” 

DISCLOSURE: Dr. Prelaj has been an invited speaker for AstraZeneca, Daiichi Sankyo, Gilead, IQVIA, Janssen, Lilly, MEDSIR, Novartis, Pfizer, and Roche; has served on advisory boards for Amgen, AstraZeneca, Bayer, BMS, Johnson & Johnson, MSD, and Pfizer; has served as a local principal investigator for AstraZeneca, Spectrum, Bayer, BMS, Lilly, MSD, and Roche; and has nonfinancial interests as Project Lead for APOLLO 11 and I3LUNG, as well as President of the European Interdisciplinary Society of AI in Cancer Research (ESAC).

REFERENCES

1. Lee B, Chun SH, Hong JH, et al: DeepBTS: prediction of recurrence-free survival of non-small cell lung cancer using a time-binned deep neural network. Sci Rep 10:1952, 2020.

2. Pich O, Bernard E, Zagorulya M, et al: Tumor-infiltrating clonal hematopoiesis. N Engl J Med 392:1594-1608, 2025.

The content in this post has not been reviewed by the American Society of Clinical Oncology, Inc. (ASCO®) and does not necessarily reflect the ideas and opinions of ASCO®.