On April 12, St. Jude Children’s Research Hospital launched the St. Jude Cloud, an online data-sharing and collaboration platform that provides researchers access to the world’s largest public repository of pediatric cancer genomics data. Developed as a partnership among St. Jude, DNAnexus, and Microsoft, the St. Jude Cloud provides accelerated data mining, analysis, and visualization capabilities in a secure cloud-based environment.
“Sharing research and scientific discoveries is vital to advancing cures and saving lives, especially in rare diseases like pediatric cancer,” said James R. Downing, MD, St. Jude President and Chief Executive Officer. “St. Jude has shared data and resources since its founding, and collaboration with researchers across the world is at the core of our mission. St. Jude Cloud offers researchers access to genomics data and analysis tools that will drive faster progress toward cures for catastrophic diseases of childhood.”
Pediatric Cancer Genomics Data Repository
The interactive data-sharing platform allows scientists to explore more than 5,000 whole-genome, 5,000 whole-exome, and 1,200 RNA-sequencing data sets from more than 5,000 pediatric cancer patients and survivors. By 2019, St. Jude expects to make 10,000 whole-genome sequences available on St. Jude Cloud.
These data have been generated from three large St. Jude–supported genomics initiatives: the St. Jude–Washington University Pediatric Cancer Genome Project, designed to understand the genetic origins of childhood cancers; the Genomes for Kids clinical trial, focused on moving whole-genome sequencing into the clinic; and the St. Jude Lifetime Cohort study (St. Jude LIFE), which conducts comprehensive clinical evaluations on thousands of pediatric cancer survivors throughout their lives.
Access to data is simple and fast, and does not require downloading prior to exploration. Researchers may also upload their own data to a private, password-protected environment and explore them using the tools available on the St. Jude Cloud platform.
In addition to high-quality next-generation sequencing data, St. Jude Cloud features a collection of bioinformatics tools to help both experts and nonspecialists gain novel insights from genomics data. These tools include validated data-analysis pipelines and interactive visualization tools that simplify discovery in large data sets. Data and results can be securely shared with collaborators within the platform.
Collaboration to Advance Cures
The data available on the St. Jude Cloud represent a key resource for understanding the genetic roots of childhood cancer. By opening access to these data, St. Jude’s partnership with DNAnexus and Microsoft harnesses the collective power of the global research community to advance precision medicine for rare pediatric diseases such as cancer.
The data available through St. Jude Cloud are stored on Microsoft Azure, which can handle data sets on the massive scale required for large genomics studies such as those developed by St. Jude. Microsoft understands the complexities of large-scale genomics data and has processed half a petabyte of data for St. Jude Cloud to date.
Peter Lee, PhD, Corporate Vice President of AI and Research at Microsoft, said, “Health and technology partnerships are central to the advancement of scientific breakthroughs. We are extremely proud to collaborate with our research partners at St. Jude and DNAnexus, and we look forward to the progress St. Jude Cloud will bring.”
“Collaboration fuels scientific advancements,” said Richard Daly, Chief Executive Officer at DNAnexus. “Whether you are working together across hallways or international borders, researchers need a secure space to foster collaboration and share data and tools.”
“St. Jude Cloud is a powerful resource to drive global research and discovery forward,” said Jinghui Zhang, PhD, Chair of the St. Jude Department of Computational Biology and Co-Leader of the St. Jude Cloud project. “Providing genomic sequencing data to the global research community and making complex computational analysis pipelines easily accessible will lead to progress in eradicating childhood cancer.”