Project Data Sphere: Megadata in the Cloud Could Speed Clinical Trials Here on Earth



Organizations That Have Contributed Financially or in Kind to Project Data Sphere


Project Data Sphere, which launched on April 8, is a “giant digital laboratory, an enormous library containing data about tens of thousands of patients and hundreds of clinical trials, all of which will be in the public domain,” said Martin J. Murphy, Jr, DMedSc, PhD, FASCO, Chief Executive Officer of the CEO Roundtable on Cancer.

The Roundtable is composed of executives from more than 30 domestic companies in diverse profit and nonprofit industries, as well as several National Cancer Institute (NCI)-designated cancer centers. It was founded in 2001.

Project Data Sphere is an independent endeavor of the Roundtable and its affiliate Life Sciences Consortium, founded in 2005. Its major goal is to advance cancer research, primarily by speeding drug discovery and development via cancer clinical trials, and secondarily by contributing to the education of oncologists.

The data sphere has been designed to provide a place where the cancer community can share, integrate, and analyze historical data about individual patients (with identifying information removed) and comparator arm results from phase III trials conducted by industry and academic institutions.

How It Will Work

“Project Data Sphere, LLC, will enable the research community to bring to light previously unrecognized insights buried within vast amounts of cancer clinical data,” said Howard I. Scher, MD, Chief of the Genitourinary Oncology Service at Memorial Sloan Kettering Cancer Center. “The benefits of sharing comparator arm data could lead to a better understanding of disease progression and endpoints, and maximize a patient’s contribution beyond a single trial to the benefit of others.”

The data repository will be “course changing” in pharmaceutical development, said Dr. Murphy, because it builds on others’ work. He explained that researchers can pool historical data from many previous trials instead of repeating work that has already been done. This could be especially useful in studying childhood cancers, rare cancers, and genetic mutations.

This approach has the potential for other benefits as well, including:

Ready-made comparisons of various treatment options

Creation of pseudo-experimental treatment and control groups to study the effect of risk factors

Use of multiple small-sample studies to develop a valid population estimate for epidemiologic work

Resolution, through enhanced data security and anonymization strategies, of the legal and technical problems that have plagued such efforts in the past

The vast majority of clinical trial data now belong to private companies, academic institutions, and cooperative groups and are thus not usable by outside researchers or the general public. But, added Dr. Murphy, those organizations have a “moral responsibility” to share data, as well as an obligation to help other patients with cancer. He believes that most people who participate in clinical trials want this too and that it is, in fact, one of the reasons they agree to participate.

Nancy Roach, Founder and Chair of the Board of Fight Colorectal Cancer, expressed a similar sentiment. “This is a huge win for the patient community, first because aggregation of big data is always helpful, and also because it extends the reach and benefit of every patient who has ever participated in a clinical trial.”

Project Data Sphere charges no user fees, and anyone can log onto its website (

Will It Work?

The ASCO Post asked Richard L. Schilsky, MD, ASCO Chief Medical Officer, and Clifford A. Hudis, MD, Chief of the Breast Cancer Medicine Service, Memorial Sloan Kettering Cancer Center, and ASCO President, three questions about Project Data Sphere:

1. How useful in practice do you think the data platform will be? That is, will researchers actually use it as they design clinical trials?

Dr. Schilsky said, “Time will tell. The utility of the data will depend on their volume and diversity—the more data, the more useful. Most of them right now come from prostate cancer trials, with a relatively small number so far loaded into the sphere. As the database grows, understanding of the natural history of the disease will increase, as will the heterogeneity of outcomes in populations with varying characteristics. Such insight will stimulate new hypotheses and facilitate design of future clinical trials.”

Dr. Hudis agreed. “The ability to access large databases of well annotated, high-quality information from completed trials should allow researchers to make even better estimates of control arm performance as they design new studies.”

2. Will it make a significant difference in the duration of trials and the speed with which FDA analyzes data?

Dr. Schilsky said that’s hard to know. “I think it can help refine trial design, particularly by identifying groups of patients at high risk of disease progression. Enriching future trials with such patients could potentially shorten them.”

Again, Dr. Hudis agreed. “Anything that makes a study more efficient from beginning to end should accelerate the entire drug development and approval process.”

3. What will the data sphere mean for nonresearch oncologists in the community?

Dr. Schilsky said again that it’s hard to say. “If community oncologists can access the database, they could glean insights into how standard regimens perform across different groups of patients, or how different regimens perform in a particular patient group. Clearly, such inferences would not be as reliable as head-to-head comparisons of different treatments, but they still might prove useful. To do this, however, community oncologists would need assistance from data analysis experts.”

Dr. Hudis took a slightly different approach: “Apart from offering greater speed and efficiency for the entire drug development system, it allows study participants to know that they are contributing to scientific progress in ways that stretch beyond the study question. Eventually, these kinds of data may be useful for quality measures and comparative effectiveness research as well. All this is directly relevant to all oncologists.”

Stocking the Sphere and Paying the Bills

Numerous organizations are already participating in the switch from privately held data to the public sphere (see sidebar on page 95). Other life sciences organizations, health advocacy groups, medical data standards organizations, researchers, universities, and technology providers will be brought into the fold.

Dr. Murphy said, “Funding for Project Data Sphere is being borne entirely by members of the CEO Roundtable, who have made significant financial and pro bono contributions.” The law firm of Hogan Lovells gave free legal advice, and SAS is providing the analytic tools and security systems. Project management support is being given by Celgene and Sanofi US.

“We also are grateful for the counsel, commentary, and encouragement we have received over the years from many cancer centers and other organizations, including Memorial Sloan Kettering Cancer Center, Duke University, the Institute of Medicine, NCI, the U.S. Food and Drug Administration, and the White House Office of Science and Technology,” said Dr. Murphy. ■

Disclosure: Drs. Murphy, Schilsky, and Hudis reported no potential conflicts of interest.