DeepBind crunches data to find patterns behind origins of disease
The origin story of DeepBind begins like that of so many inventions: with a question and then a leap.
Co-creators Babak Alipanahi and Andrew Delong were chewing the fat and wondering why no one had thought to marry deep machine learning with computational biology to figure out why some mutations lead to disease and why others don’t.
“We were both talking about this idea, and both wondering why no one seems to have tried it [but] Babak was the one with the confidence to say ‘You want to do it? Well, let’s just do it!’” recalls Delong.
Delong is the guy in the lab who questions everything, Alipanahi explains. “He’s a curious researcher and very meticulous. If a question is likely to be asked, he wants to be able to answer it.”
The end product of those conversations – DeepBind – took more than a year to create with the support of professor Brendan Frey of U of T’s department of electrical and computer engineering. All three work together at Deep Genomics, one of the University of Toronto’s best-known and most successful startups of recent years.
On May 17, DeepBind was among four products recognized as U of T Inventions of the Year. The awards, which recognize inventions for their uniqueness, potential for global impact and commercial appeal, were presented at the university’s third annual U of T Celebrates Innovation event in front of an estimated 200 guests, including Ontario Lt.-Gov. Elizabeth Dowdeswell.
The award is “a great honour,” says Alipanahi.
“It’s really encouraging to see a computational technique regarded as an invention,” Delong says. “We view DeepBind as just a proof of concept. It’s a conversation starter in the community. There will be more exciting stuff to come. I’m working on some of it but I’m just one small person at the leading edge of a growing wave.”
DeepBind, which combines artificial intelligence and genomic medicine, is the first-ever deep learning application to study mutations linked to diseases, such as haemophilia and skin cancer, that have proven difficult to analyse in the past because of their complexity.
For example, skin cancer is caused by more than one gene. However, having these genes does not necessarily mean a person will develop melanoma. Scientists must also consider UV exposure, which can damage the DNA in skin cells, leading to cancer-causing mutations.
The software modules, which are available free for academic use, are able to handle millions of sequences per experiment and can create “mutation maps” to reveal how genetic variations can cause disease.
The goal of their work, Alipanahi says, was to create a powerful algorithm that was fast and accessible to biologists studying these diseases.
Sometimes this type of work is viewed as a “black box,” he says. “Like a magical tool that is very powerful but you don’t know much about how it works. We tried to help biologists peer into the box and understand it.”
“We were trying to strike a balance,” Delong elaborates. “We wanted a model that was familiar enough so that the results could be interpreted and the biologists could have confidence. But we also wanted to be innovative. Now that more people are onboard with this research direction, we can really let loose creatively.”
Key to their work was a tremendous amount of public data generated by professor Tim Hughes and associate professor Quaid Morris of molecular genetics, not to mention the general atmosphere at the university, where you can sit down and learn from world-renowned innovators like Geoffrey Hinton, considered by many to be the “godfather” of deep learning, says Alipanahi.
“Being around them just gives you ideas. It gives you direction.”
It also helps to have great colleagues close at hand to bounce ideas off, adds Delong.
“One important lesson in all this is just having the right people sitting together, even if they’re working on different things,” he says. “Babak and I got to know each other and find common interest mainly because we were side-by-side every day, eventually finding a project we were both excited about.”