With the help of Chan Zuckerberg Initiative, U of T researchers develop data tools to accelerate science
Which genetic changes predispose to disease? How do characters in a novel relate to each other? Which wine and cheese go well together?
Turns out, there’s an app for that – and it’s about to become far more versatile as University of Toronto researchers work to release it to a wider community with the support of the Chan Zuckerberg Initiative.
Called Cytoscape, the software in question is already an essential tool for viewing networks in biology, including gene networks that hold clues about how different genes co-operate to sustain health and how these networks change in disease. But like most research software, it’s currently a desktop application that has to be installed and updated, and doesn’t work on phones or tablets.
Today, the Chan Zuckerberg Initiative announced it is providing U of T’s Gary Bader and Hannes Röst, both researchers at the Donnelly Centre for Cellular and Biomolecular Research, with US$150,000 each to create cloud-based versions of Cytoscape and OpenMS, respectively. Co-founded by Facebook chief executive Mark Zuckerberg and his spouse, Priscilla Chan, the initiative seeks to harness technology to accelerate progress in science.
“The future of data analytics should be that it is easier to do, easier to share information and it should be easier for people to collaborate,” says Bader, a professor of computational biology who is cross-appointed to the department of molecular genetics in the Faculty of Medicine and the department of computer science in the Faculty of Arts & Science, and holds the Ontario Research Chair in Biomarkers of Disease.
Just as web-based cloud computing has transformed how we listen to music and store data, Bader, whose team is developing the web-based Cytoscape Explorer, says that freedom from having to keep track of files and e-mail them back and forth will boost creativity and speed up science.
“Because your document lives on the cloud, the latest version is already there, and you can access it anytime, anywhere. It makes it easier to see what everyone else is doing and you’re exposed to more ideas that changes the way you do things in a positive way.”
Initially designed for genomics researchers, Cytoscape incorporates the basic principles of network theory and can be easily adapted for other applications. Besides biology, it has been used in business, social studies and marketing, as well as mapping how characters in an epic science fiction novel relate to each other.
Bader even adapted the software to find optimal wine and cheese combinations for a dinner party.
Research analytics have been slow to move to the cloud because it is difficult to obtain funding purely for software development unless it promises to reveal new insights. Yet cloud analytics are desperately needed to support increasingly collaborative research – often involving teams scattered around the world.
“We are building the foundation for other people to do research,” says Röst, an assistant professor of computational biology who is also cross-appointed to the departments of molecular genetics and computer science, and whose team is developing OpenMS, a free tool for biomarker analysis.
With more than one million downloads since launching in 2001, Cytoscape’s popularity is only likely to grow with the move to the cloud.
“We really think that making this available on the web will allow users who never previously discovered the software, and never used it on the desktop, to easily access it,” says Bader, who joined the Cytoscape team in the early 2000s and is leading the newly funded project with Dexter Pratt, a software engineer in the group of Trey Ideker, a professor at the University of California, San Diego, and a co-founder of Cytoscape.
Biomarker tracking
If scientists knew what “healthy” looked like at the molecular level, they might be able to spot disease as it begins to develop and potentially halt it.
Molecular profiling of human tissue – blood, for example – produces vast amounts of complex data that call for sophisticated analysis tools. One such tool is OpenMS, a leading free software package for analyzing data produced by mass spectrometry, a technique that identifies and counts molecules based on their mass-to-charge ratio.
Composed of a set of algorithms that can be rearranged into different workflows, OpenMS can be tailored to individual user data. But in its current form, it requires a certain level of coding knowledge, which discourages uptake among users without programming experience.
The cloud version will have no such obstacles.
“We want to make OpenMS user-friendly, using a graphic user interface where users can click on buttons to start their analysis instead of typing commands on the command line,” says Röst, who holds the Canada Research Chair in Mass Spectrometry-based Personalized Medicine.
Programming-savvy users will be able to inspect and modify the source code to their needs.
To set up OpenMS on the cloud, Röst will take advantage of so-called Docker containers, which package software together with everything it needs to run so that it behaves the same way on any platform.
The software will be hosted on Niagara, a supercomputer cluster at U of T that is part of Compute Canada, the high-performance computing infrastructure established by the federal government.
The overarching goal of Röst’s research is to identify early biomarkers of diabetes and cancer.
“We want to take people’s body fluids and generate a metabolic profile that we can track over time how people change,” he says.
His team recently acquired a state-of-the-art mass spectrometry instrument worth $1 million, with support from the Canada Foundation for Innovation and U of T’s Faculty of Medicine. The instrument, referred to among lab members as “the space ship” for its futuristic look, can detect trace amounts of biomolecules for more accurate profiling.