
Eliezer Van Allen, MD
Last October, the Cancer AI Alliance (CAIA) announced the launch of its collaborative artificial intelligence (AI) platform. The platform uses federated learning to train AI models with millions of de-identified patient datasets from participating cancer centers, while maintaining patient security, privacy, and adherence to regulatory and ethical standards. Formed a year ago with the goal of developing projects that use AI to foster precision cancer care and reduce siloed research insights to accelerate cancer discovery, CAIA currently comprises four National Cancer Institute (NCI)-designated cancer centers: Dana-Farber Cancer Institute, Fred Hutchinson Cancer Center, Memorial Sloan Kettering Cancer Center, and The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins and the Johns Hopkins Whiting School of Engineering.
The collaborative AI platform, powered by federated learning, allows AI models to travel securely to data sources located at the four cancer centers without private patient data ever leaving their home institutions. The information gained from training the model on each cancer center’s de-identified dataset is then aggregated centrally to strengthen the AI models, helping them discover patterns and maximize the value of the collective knowledge base, including by revealing trends across more diverse patient populations and rare cancers. Currently, a Gen 1 dataset in the federated learning platform includes clinical data from over 1 million patients, and plans are underway to increase the number of cancer institutions involved in the alliance.
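For readers unfamiliar with the mechanics, the train-locally-then-aggregate loop described above can be illustrated with a minimal sketch. It assumes federated averaging, a common federated learning scheme; the article does not say which algorithm CAIA uses. The four sites, their data, the tiny linear model, and every function name below are hypothetical and synthetic, chosen only to show that raw records stay at each site while model weights alone are shared.

```python
from typing import List, Tuple

# Hypothetical illustration only: synthetic (feature, outcome) pairs, not patient data.
Dataset = List[Tuple[float, float]]
Weights = Tuple[float, float]  # (slope, intercept) of a tiny linear model


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[Weights, int]:
    """Train a copy of the global model on one site's data.

    Only the updated weights and the sample count are returned;
    the raw records never leave this function (the "site").
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradient of mean squared error for the model y = w * x + b.
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine site models centrally, weighting each by its sample count."""
    total = sum(n for _, n in updates)
    w = sum(wt[0] * n for wt, n in updates) / total
    b = sum(wt[1] * n for wt, n in updates) / total
    return (w, b)


def train_federated(sites: List[Dataset], rounds: int = 300) -> Weights:
    """Repeat: send the global model to every site, then average the results."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates)
    return global_weights


# Four hypothetical sites whose synthetic data all follow y = 2x + 1.
sites: List[Dataset] = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(0.5, 2.0), (1.5, 4.0)],
    [(2.5, 6.0), (3.5, 8.0)],
]
final_w, final_b = train_federated(sites)
print(f"slope={final_w:.3f}, intercept={final_b:.3f}")
```

In this toy setup no single site holds enough points to cover the whole range of the feature, yet the averaged model recovers the shared trend. Production systems typically add protections this sketch omits, such as secure aggregation and governance controls.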
“With the launch of CAIA, we have laid a critical foundation in the effort to accelerate new discoveries, and the combined data from our cancer centers can now power these innovative AI models,” said Eliezer Van Allen, MD, Chief, Division of Population Sciences, Chandra Nohria Family Chair for AI in Cancer Research at Dana-Farber Cancer Institute; and Associate Professor of Medicine at Harvard Medical School. “We are excited to share these models with research centers across the nation and exponentially expand access to the data that will drive progress toward better diagnosis, treatment, and outcomes for patients with cancer everywhere.”
Using the federated platform, over the last year, CAIA has launched eight pilot projects targeting four complex areas of research, including:
- Predicting treatment response;
- Identifying novel biomarkers;
- Analyzing trends in rare cancers, providing the scale necessary to uncover new therapies; and
- Fine-tuning large language models on patient data to predict future diagnoses.
In a wide-ranging interview with The ASCO Post, Dr. Van Allen discussed the potential for AI technology to transform cancer care for patients, reduce disparities in care, and help oncologists reach the goal of providing precision medicine for every patient with the disease.
Overcoming the Historical Challenges That Have Limited Progress in Cancer Care
Please talk about the impetus for forming the CAIA.
Now that we have AI technology, we wanted to address related challenges in cancer research and in clinical care that have historically limited our ability to accelerate more effective cancer therapies for patients. Those challenges include limited data sharing across domains, which is needed to ensure that any AI model or algorithm developed generalizes accurately; and the difficulty of executing collaboration across institutions to facilitate data sharing and answer important scientific questions.
The goal of launching CAIA is to try to solve those issues by creating an alliance across four institutions and with technology partnerships to develop an AI model that can impact patient care. The four institutions currently in CAIA each have their own data systems and research ecosystems in place, but each [institution] understood that we could not accelerate cancer discovery and transform clinical care unless we worked together.
Accelerating Drug Discoveries
How might this AI-driven federated research platform expedite more effective treatments for patients with cancer?
Federated learning refers to a way of doing AI model development and AI research that addresses some of the fundamental challenges that are preventing faster development of treatments for cancer. We want to develop AI models that connect historically siloed cancer research projects and that allow for seamless collaboration among researchers.
Federated learning gives us the ability to train AI models locally at each of the four cancer center sites, without sacrificing patient privacy, and still learn from that huge amount of patient data to more rapidly develop generalizable AI models and, hopefully, increase research discoveries. With federated learning, we can instantly go from one cancer center’s amount of patient data to the amount of patient data from four cancer centers.
As we have seen in other domains of AI, so much of the innovation in cancer research is driven by the sheer scale of data. Having patient data from four institutions from different regions of the country gives us the ability to train AI models that generalize to a diverse patient population, and that will build ethical AI tools as well as reduce inequalities in care.
Finding Answers to Previously Unanswerable Questions
The Cancer AI Alliance’s federated platform is supporting eight unique pilot projects that will endeavor to answer questions posed by scientists, oncologists, clinicians, and machine learning researchers from the four participating cancer centers. What are examples of the eight projects being investigated using the federated platform?
One project represents a perfect example of the challenge we face in oncology today, which is understanding why some patients respond well to cancer immunotherapy and others don’t, and where this platform could be useful in addressing the challenge head on.
My colleague, Sasha Gusev, PhD, Associate Professor of Medicine and Lead, Clinical Computational Oncology Group at Harvard Medical School and at Dana-Farber, is leading this investigation. This is a question that has been very hard for us in the field to study because each of us tends to approach it using the data resources of our own cancer center, and those datasets are too small, not representative enough of the patient population, and too underpowered to answer it. But by putting that question into a federated platform, Dr. Gusev has the opportunity to ask it at scale and potentially develop an answer that could be generalized across the patient populations receiving immunotherapy treatment.
Similarly, another one of my colleagues, Sylvan C. Baca, MD, PhD, a physician-scientist at Dana-Farber, is researching the risk profiles that might indicate which patients with prostate cancer will have more clinically aggressive subtypes of the disease than others. And, here again, we’ve been hindered in trying to answer this question because we were limited by small patient datasets from a single institution.
Now, using this platform with increased patient datasets from four institutions, we can develop AI algorithms to ask these questions in new ways and, hopefully, gain new insights into these issues.
Realizing the Goal of Precision Medicine for Every Patient
It sounds like what you are describing is truly realizing the goal in cancer care of providing each patient with personalized treatment based on the specific genetic and molecular profile of a patient’s tumor.
You are painting a picture of what we aspire to build with these kinds of AI platforms. We want to be able to learn from every patient’s cancer experience, so that when the next patient comes in the door, we can do an even better job of successfully treating that patient.
But AI can only help us get to precision medicine if we are able to access a full compendium of every patient’s experience with cancer, and that includes patients with rare cancers who face great challenges. We can’t have precision cancer care if we’re limited by data from a single institution, but if we can pool all of those data together and get a clearer understanding of how drugs might work in a variety of patients with similar cancers and symptoms, we have a chance to accelerate our ability to tailor different cancer approaches for a specific patient’s needs. That is how we achieve precision medicine in cancer care. That’s my dream.
Measuring Advances in Care Using AI
Do you expect that we'll see better outcomes for patients with cancer, especially patients with advanced-stage disease, with AI support? What do you expect that progress will look like—more cancer cures?
The honest answer is, I don’t know. As with any new and transformative technology, sometimes the impact is instantaneous, and sometimes it is only felt many years later or in ways that are hard to measure. For example, the way that I take care of patients with prostate cancer today vs when I started practicing as a medical oncologist more than a decade ago (resulting from the introduction of therapeutics and diagnostics built upon new technologies) has improved outcomes, but it’s still not good enough. We still have a long way to go toward reducing the death rate in this cancer, which remains the second leading cause of cancer death in American men.¹
I think the sooner we embrace these new technologic innovations and invest in pursuing the questions that you are asking—the where and how and for whom—the faster we will get the answers, and the faster we will know where the impact will be felt the most.
Making Care More Equitable for All Patients
Will using AI technology in cancer care help address health disparities in minority patients and in patients with low-socioeconomic status to improve outcomes for these patients?
The blue-sky thinking is that we now have these incredible AI platforms that have enabled us to learn from many more patients, including those who have been historically underrepresented in clinical trials. We now have the potential to generalize data across many patient populations and help all patients with cancer. Our hope is that AI is a transformative technology that will make cancer care more equitable for all patients.
The risk is that AI gets deployed and cultivated in narrower patient populations, which would be a bad outcome for the field and is unacceptable. At Dana-Farber, we are investing in data collection and capture and in AI model deployment capabilities across our various systems to mitigate that risk and make sure that anything we do with this technology can be generalized to a broader population. This is an absolute necessity for our field.
DISCLOSURE: Dr. Van Allen has an advisory/consulting role at Novartis Institute for Biomedical Research, Serinus Bio, and TracerBio, and is on the board of Science Advances. He receives research support from Novartis, BMS, Sanofi, and Nextpoint Therapeutics; has equity in Tango Therapeutics, Genome Medical, Genomic Life, Enara Bio, Manifold Bio, Microsoft, Monte Rosa, Riva Therapeutics, Serinus Bio, Syapse, and TracerBio; and receives speaking fees from TD Cowen. Dr. Van Allen holds institutional patents filed on chromatin mutations and immunotherapy response, and methods for clinical interpretation; and performs intermittent legal consulting on patents for the law firm of Foley Hoag.
REFERENCE
1. American Cancer Society: Key Statistics for Prostate Cancer. Available at www.cancer.org/cancer/types/prostate-cancer/about/key-statistics.html.

