Creating a Comprehensive Catalog of Cancer Genes to Improve Patient Outcomes

A Conversation With Eric S. Lander, PhD

In January, Eric S. Lander, PhD, Director of the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, and his colleagues published the results of their landmark study,1 which explored the feasibility of creating a comprehensive catalog of cancer genes. The researchers collected and analyzed whole-exome sequencing data from nearly 5,000 human cancers and their matched normal-tissue samples across 21 cancer types.

The results of their analysis revealed essentially all known cancer genes in these cancer types and identified 33 genes—a 25% increase—that were not previously known to be significantly mutated in cancer, including genes associated with proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing, and protein homeostasis. In addition, the study shows that many key cancer genes still remain to be discovered.

Dr. Lander estimates that creating a catalog comprehensive enough to let physicians select the best combination therapy for each patient’s cancer will require analyzing about 100,000 tumor samples, 10 times as many as The Cancer Genome Atlas (TCGA) has analyzed so far, to find most of the genes involved in 50 different cancer types.

The ASCO Post talked with Dr. Lander about his study results, what it will take to create a comprehensive catalog of cancer genes, and how such a catalog could guide new drug development, alter clinical trial design, and usher in the era of true precision medicine.

Surprising Findings

Your study identified 33 genes that were previously unknown to be significantly mutated in cancer. Was that surprising to you?

Yes. The idea that there were still very large numbers of mutated cancer genes that were unknown was very surprising. We used stringent statistical methods to enumerate candidate cancer genes and then inspected each gene to identify those genes with a strong biologic connection to cancer.

These were genes in all sorts of very clear cancer pathways, with just the mutational patterns that you would expect given their function. So, the fact that there were 33 more genes, which was about a 25% increase in the number of genes known to be significant in those 21 cancer types, was hugely surprising.

The study told us that there are a lot of different cancer genes that we still don’t know about but that can be easily found, simply by looking at larger numbers of tumor samples.


You are estimating that you will need to analyze 100,000 tumor samples to find most of the genes involved in 50 cancer types?

Yes, so that’s about 2,000 samples for each tumor type, which isn’t such a terrifying number. But the number of samples needed will, of course, be different according to the background mutation rate.

For example, in lung cancers and melanomas, where the background mutation rate is high, you need about 7,000 tumor samples for each. In cancers where the background rate is much lower, you may only need 1,000, or, more typically, about 2,000 tumor samples. That 2,000-sample estimate is where the 100,000 samples to analyze 50 cancer types comes from.
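
The arithmetic behind that round figure can be written out as a short sketch. The function name and the `overrides` parameter are illustrative assumptions, not anything from the study; only the figures of 50 cancer types, about 2,000 samples for a typical type, and about 7,000 for lung cancers and melanomas come from the interview.

```python
def total_samples_needed(n_cancer_types, typical_per_type, overrides=None):
    """Sum per-type sample requirements.

    `overrides` (hypothetical) maps a cancer type name to its own requirement;
    every other type is assumed to need `typical_per_type` samples.
    """
    overrides = overrides or {}
    n_typical = n_cancer_types - len(overrides)
    return n_typical * typical_per_type + sum(overrides.values())


# The round figure from the interview: 50 types at about 2,000 samples each.
print(total_samples_needed(50, 2_000))  # 100000

# Hypothetical variation: count lung cancer and melanoma at about 7,000 each.
print(total_samples_needed(50, 2_000, {"lung": 7_000, "melanoma": 7_000}))  # 110000
```

The second call is only an illustration of how high-background tumor types would shift the total; the interview itself treats 100,000 as an approximate target.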

Catalog of Cancer Genes

Please talk about the importance of developing a complete catalog of genes involved in cancer.

Having a complete catalog of cancer genes will enable us to look at the biologic pathways of each cancer type and find targets for therapeutic intervention. It lets us recognize what is actually going on in a tumor. For example, there were a number of small GTPase relatives of RAS that were mutated in these cancers. That tells us that there is an important driving force that we did not know about.

To provide effective combination therapy in cancers, we are going to need to know what pathways are being activated and how. These days, genomic sequencing of tumors is such a straightforward process, I have no doubt that it will become standard of care at some point.

After all, if we are currently spending a couple hundred thousand dollars on the care of a patient with cancer, then spending what will probably become $1,000 to understand the mutational pattern of their cancer, rather than relying on guesswork and making therapeutic decisions in the dark, strikes me as a no-brainer.


How soon will it be before genomic sequencing becomes standard of care in oncology?

It is hard to predict exactly, but perhaps within 5 years. I believe that the field should move carefully. What I can say is that if someone I loved had cancer, I would want to have the genomic information on that cancer, and I think that most people are going to feel that way.

However, in order to accurately interpret the genomic information on an individual patient’s cancer and prescribe effective therapy, we need a complete catalog of cancer genes and rigorous analytical methods.

Saturation Analysis

How will you know when the cancer gene catalog is completed?

You keep fishing until the curve flattens out and you don’t find any more genes. This is called saturation analysis. An effective test is to perform “down-sampling” to study how the number of discoveries increases with sample size by repeating the analysis on random subsets of samples of various smaller sizes.

In our study, for example, we found that genes mutated at a frequency of 20% or higher have largely been mined out—the curve has flattened out and is not continuing to rise with sample size. However, for genes mutated at lower frequencies, we are finding that the curve is still going up with sample size. So, when the curve bends over and becomes flat, you are done.
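
The down-sampling idea described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the method used in the study: the cohort is simulated, the gene frequencies are invented, and the "discovery" rule (a gene counts as found once it is mutated in at least `min_mutated` samples) is a deliberately crude stand-in for a real significance test. All function names are hypothetical.

```python
import random


def simulate_cohort(n_samples, gene_freqs, seed=1):
    """Toy cohort: each sample is the set of gene indices mutated in it."""
    rng = random.Random(seed)
    return [
        {g for g, freq in enumerate(gene_freqs) if rng.random() < freq}
        for _ in range(n_samples)
    ]


def count_discoveries(samples, n_genes, min_mutated):
    """Count genes mutated in at least `min_mutated` of the given samples."""
    counts = [0] * n_genes
    for mutated_genes in samples:
        for g in mutated_genes:
            counts[g] += 1
    return sum(1 for c in counts if c >= min_mutated)


def down_sample_curve(samples, n_genes, subset_sizes,
                      min_mutated=3, repeats=20, seed=0):
    """Average number of discoveries over random subsets of each size."""
    rng = random.Random(seed)
    curve = {}
    for size in subset_sizes:
        total = 0
        for _ in range(repeats):
            subset = rng.sample(samples, size)  # draw without replacement
            total += count_discoveries(subset, n_genes, min_mutated)
        curve[size] = total / repeats
    return curve


if __name__ == "__main__":
    # Invented frequencies: 5 genes mutated in 20% of tumors, 20 in 0.5%.
    freqs = [0.20] * 5 + [0.005] * 20
    cohort = simulate_cohort(400, freqs)
    curve = down_sample_curve(cohort, len(freqs), [25, 50, 100, 200, 400])
    for size, found in curve.items():
        print(size, round(found, 1))
```

In a run like this, one would expect the high-frequency genes to be found at small subset sizes while the count of low-frequency genes keeps climbing as subsets grow, which is the pattern of a curve that has not yet flattened.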

Clinical Trial Design

How might your study findings impact the design of clinical trials?

Most clinical trials will involve genomic analysis for lots of reasons. Companies testing a new therapy in a trial will want to be able to look at the genomic data and determine whether the presence of mutations in any gene correlates with effectiveness of a specific drug. Then patients could be enrolled based on the cellular pathways disrupted in their tumor.

It might also be possible to get U.S. Food and Drug Administration approval on new therapies based on many fewer patients’ participation in clinical trials by targeting those patients who will actually respond to the therapy. I think this is the future for clinical trials.

Takeaway Message

Do you have any closing thoughts on your research?

The big takeaway message from our study is that we clearly have not reached the saturation level in cancer gene identification, and there are a lot of very, very sensible cancer genes still to be found. The good news is it is not going to be that hard to do. The genomic studies of large numbers of tumor samples are no longer prohibitive due to a decrease in the cost of DNA sequencing.

Given the devastating effects of cancer on patients and their family members, completing the genomic analysis of this disease should be a biomedical priority.

Disclosure: Dr. Lander reported no potential conflicts of interest.


1. Lawrence MS, Stojanov P, Mermel CH, et al: Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505:495-501, 2014.