Electronic health record–based artificial intelligence may help uncover new risk factors for early-onset colorectal cancer, according to study findings presented by Parker et al at the AACR Virtual Special Conference: Artificial Intelligence, Diagnosis, and Imaging (Abstract PR-10). Machine-learning models identified disorders involving chronic immunosuppression (eg, human immunodeficiency virus [HIV]) or inflammation (eg, obesity, asthma, sinusitis, and dermatitis) as potential new risk factors for young-onset colorectal cancer.
Although colorectal cancer incidence and mortality rates among people older than 65 have declined by 3.3% and 3% annually, respectively, the incidence rate among individuals younger than 50 has risen by about 2% annually, and the death rate has increased by 1.3% annually. In 2020, approximately 147,950 people were diagnosed with colorectal cancer and 53,200 died from the disease in the United States, including 17,930 cases and 3,640 deaths in individuals younger than age 50.
While obesity, diet, inactivity, inflammatory bowel disease, and family history are often cited as contributing factors in early onset of this cancer, they alone do not explain the rising trends in young-onset colorectal cancer.
Study Methodology
Researchers retrieved data from the electronic health records of 1,227 patients with colorectal cancer and 34,157 matched controls, all under the age of 50, from the OneFlorida Clinical Data Research Network, a clinical research network contributing to the national PCORnet. They then applied four machine-learning algorithms to the data; colon cancer and rectal cancer were modeled separately.
For each case patient, the researchers created prediction windows running from the first recorded encounter in the patient’s electronic health record to end dates of 0, 1, 3, and 5 years before the colon or rectal cancer index date. Each control patient was matched to cases based on age at an encounter date close to the case index date. The data were split into a training set (80%) for fitting the models and a testing set (20%) for measuring model performance. SHapley Additive exPlanations (SHAP) was used to analyze significant features.
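The windowing and splitting steps described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the toy encounter dates, the inclusive treatment of the window end date, and the choice to split at the patient level with a fixed random seed are all assumptions made for illustration.

```python
from datetime import date
import random


def window_end(index_date, years_before):
    """Return the date `years_before` years ahead of the index date."""
    try:
        return index_date.replace(year=index_date.year - years_before)
    except ValueError:
        # Index date is Feb 29 and the target year is not a leap year.
        return index_date.replace(year=index_date.year - years_before, day=28)


def encounters_in_window(encounter_dates, index_date, years_before):
    """Keep encounters from the first record up to the window end date.

    Assumption: an encounter on the end date itself is included.
    """
    end = window_end(index_date, years_before)
    return [d for d in sorted(encounter_dates) if d <= end]


def train_test_split(patient_ids, test_fraction=0.2, seed=0):
    """Shuffle patient IDs and split them into training and testing lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Hypothetical encounter history for one case patient.
encounters = [date(2012, 3, 1), date(2015, 6, 15),
              date(2017, 9, 30), date(2018, 1, 10)]
index_date = date(2018, 2, 1)

for years in (0, 1, 3, 5):
    kept = encounters_in_window(encounters, index_date, years)
    print(f"{years}-year window: {len(kept)} encounters available")

train_ids, test_ids = train_test_split(range(100))
print(len(train_ids), "training patients,", len(test_ids), "testing patients")
```

In this toy example, moving the end date further back from the index date leaves fewer encounters per patient for the model to learn from.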
Results
The researchers noted that model sensitivity decreased across prediction windows as the amount of data per patient decreased, in both the rectal cancer and colon cancer cohorts. For 0-year and 1-year prediction, the area under the curve (AUC) was significant, ranging from 0.64 to 0.75 for all algorithms across rectal cancer and colon cancer. As the lead time before the index date lengthened, prediction performance dropped to as low as 0.35 (ie, for 5-year prediction). The best-performing algorithm across all experiments was the support vector machine.
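For readers less familiar with the metric, AUC can be read as the probability that a model scores a randomly chosen case higher than a randomly chosen control. A value of 0.5 corresponds to chance-level ranking, so a value such as 0.35 indicates ranking worse than chance. The sketch below computes AUC directly from that pairwise definition; the function name and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Compute AUC as the fraction of case/control pairs ranked correctly.

    labels: 1 for a case, 0 for a control.
    scores: model output, higher meaning more likely to be a case.
    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")

    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical scores for two cases and two controls.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for illustration; rank-based implementations are typically used on cohorts of this study's size.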
The top predictors the researchers identified in the colon cancer cohort included hypertension, cough/asthma, chronic sinusitis, anxiety disorder, and atopic dermatitis. The top predictors in the rectal cancer cohort included obesity, female sex, HIV, anxiety disorder, and asthma.
“Disorders with chronic immunosuppression (eg, HIV) or inflammation (eg, obesity, asthma, sinusitis, dermatitis) may represent immune-axis derangements contributing to a favorable state for colorectal cancer. This preliminary study provides early insight into the capacity of artificial intelligence to uncover new risk factors in the population of patients with onset of young-onset colorectal cancer, with more algorithm refinement and risk factor exploration underway,” concluded the study authors.