St. Jude Children’s Research Hospital is offering cloud-based access to the fully sequenced genomes of 10,000 pediatric patients with cancer, in the hopes that sharing the information will lead to the highest possible number of treatment breakthroughs.
Called the St. Jude Cloud, the growing set of data, categorized by cancer type, is meant to help researchers at the Memphis facility and beyond understand the genetic mutations that drive pediatric cancers and find new drugs to treat the diseases.
In whole-genome sequencing, a child’s normal and tumor genes are sequenced and then compared. Mutations that are present in a child’s tumor but not his or her normal genes may be driving the disease, and could be good candidates to target with drugs, said Jinghui Zhang, Ph.D., chair of the Department of Computational Biology at St. Jude.
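The tumor-versus-normal comparison described above can be pictured as a set difference. The following is only a minimal sketch, not St. Jude's actual pipeline: the `Variant` record, the `somatic_candidates` function, and the toy coordinates are all hypothetical, and real variant calling also has to account for sequencing errors, read depth, and tumor purity.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal record for one sequence variant."""
    chrom: str  # chromosome name, e.g. "chr1"
    pos: int    # 1-based position on the chromosome
    ref: str    # reference allele
    alt: str    # observed alternate allele


def somatic_candidates(tumor: Iterable[Variant],
                       normal: Iterable[Variant]) -> List[Variant]:
    """Return variants seen in the tumor sample but not in the normal sample.

    This is a plain set difference, used here only to illustrate the idea
    of comparing a child's tumor genome against his or her normal genome.
    """
    normal_set = set(normal)
    return sorted(v for v in set(tumor) if v not in normal_set)


# Toy, made-up data: one inherited variant shared by both samples,
# plus two variants found only in the tumor.
normal_calls = [Variant("chr1", 100, "A", "G")]
tumor_calls = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 200, "C", "T"),
    Variant("chr3", 300, "G", "A"),
]

candidates = somatic_candidates(tumor_calls, normal_calls)
for v in candidates:
    print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt}")
```

In this toy case the shared chr1 variant is filtered out as inherited, leaving the two tumor-only variants as candidates that may be driving the disease.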
So far, 300 research teams around the globe have tapped into St. Jude’s repository, one of the largest in the world that compiles pediatric cancer data.
A resource containing so many samples is unusual in the pediatric cancer arena, because these diseases are rare. It’s also helpful to researchers because cloud-based sharing avoids the need to request voluminous genomic data by mail or electronically, a difficult and time-consuming task that often causes computers to crash. Finally, the institution’s system offers another advantage: It includes a variety of analytic tools that researchers can use privately to evaluate St. Jude’s information or their own data that they upload.
St. Jude introduced its cloud at last year’s annual American Association for Cancer Research conference and will offer an update in a presentation at the gathering again this year. The conference will take place in Atlanta, from March 29 to April 3.
St. Jude set up the cloud with the help of DNAnexus, which provides cloud platforms for the global genomics industry. In addition, the institution is partnering with Microsoft, which is providing free storage of an amount of data that would otherwise “cost six figures to store each year,” said Scott Newman, Ph.D., group lead for bioinformatics in the Department of Computational Biology.
The beauty of the project is that it allows researchers to use the data without having to set up a storage system on their own or come up with analytical tools, Zhang said. “The majority of researchers in the world don’t write program code,” she said.
The tools allow researchers to conduct tasks such as finding abnormal fusions between genes that could be targetable with drugs; identifying abnormally regulated genes whose expression can lead to disease; determining whether a mutated gene is a cancer driver or a bystander; searching for indications that a cancer expresses the protein PD-L1, making it likely to respond to immunotherapy; and determining whether inherited susceptibility to cancer is present in a subset of patients. Additional tools allow researchers to visualize their results using interactive graphics powered by ProteinPaint, the genomic visualization engine developed at St. Jude.
Seven hundred of the genomes in St. Jude’s cloud came from patients treated in the past, sequenced as part of the St. Jude Children’s Research Hospital-Washington University Pediatric Cancer Genome Project. The remaining 300 were added to the data set right after they were sequenced. Because St. Jude now routinely conducts whole-genome sequencing on all its patients, it plans to continue adding this information — with consent from families — to the cloud in real time. Even as it does its own research, St. Jude is committed to sharing all its findings in the hopes that other researchers will make discoveries that help children. With that in mind, the institution is seeking to break down firewalls between countries so that its cloud will be available to scientists everywhere.
Has the St. Jude Cloud led to any new discoveries yet in labs outside St. Jude? The leaders at the facility don’t know but hope to find out.
“We have no infrastructure to track our success yet,” Zhang said. “It would be good to figure out how to track discoveries made using our data.”