If Seattle’s prestigious Fred Hutchinson Cancer Research Center is going to use cloud computing power to help eradicate cancer, then Matthew Trunnell — a soft-spoken Midwesterner and former carpenter — is the guy who will lead the charge.
Trunnell is determined to move Fred Hutch into the modern age of data, using the cloud and big data to help beat back America’s second-largest cause of death after heart disease.
Running IT for an organization with 3,000 employees is a big job. Now, add directing data science for a top cancer research center and you’ve got a snapshot of the post held by Trunnell, who joined the Fred Hutchinson Cancer Research Center 17 months ago as chief information officer and vice president of IT. It’s a job that combines the familiar challenges of keeping desktop computers patched with more esoteric tasks like analyzing large genomic data sets on a cluster of 500 Linux-based computers.
The two sets of tasks aren’t as separate as they may seem.
“As a person with a background in large-scale computing, what I want to hear is, ‘We need a supercomputer.’ But what I hear instead is, ‘I’d like to be able to use the calendaring system to schedule this piece of lab equipment,’” said Trunnell. “So it becomes clear that research computing starts at the interface to technology.”
The 51-year-old St. Louisan has an exotic background for an IT guy.
He started his career in physics and oceanography, doing computational modeling of planetary waves — those caused by the earth’s rotation. Feeling too limited, he moved to a genomics company that was doing large-scale pathogen sequencing.
“The lead bio-informaticist had a background in medieval English literature, and the field was being transformed by people coming in from computational linguistics and from physics,” said Trunnell, who holds a degree in physics from Princeton University and a Master of Science degree in physical oceanography from the University of Washington. “So I figured, if he can have an impact, maybe I can too. I’ve spent my career since then straddling the line between IT infrastructure and bioinformatics.”
Before joining Fred Hutch, Trunnell spent nine years as CIO of the Broad Institute of MIT and Harvard, where he was responsible for all aspects of information technology and services. He also took a year off during college to work as a carpenter, a passion he still pursues: he is spending his spare time building a tiny house in the Seattle area.
The move to the cloud
Scientific computing in the biological realm has a confusing welter of names.
Rather than talking about bioinformatics, which the soft-spoken, articulate Trunnell thinks is too broad a term to be meaningful, he likes to break scientific computing into two halves: data engineering (processing data to make it computable) and data science (using that data to answer questions).
Scientific computing is “quite fascinating,” he said. And changes in IT, especially the move of infrastructure and app development to the cloud, are “interesting in a completely different way.”
“That’s much more a cultural evolution, trying to determine what the world will look like even in five years where the computers our researchers are using are not ones that we own and the tools that they’re using are ones that exist out on the internet,” he said.
At the Hutch, founded in 1975, Trunnell has a staff of roughly 90 people helping run IT and another 30 data scientists, data engineers and software engineers helping with scientific computing.
He quotes liberally from scientific papers that no ordinary IT chief would ever pick up, while emphasizing, “I’m not a biologist and certainly not a cancer biologist.”
Moving some computing to the cloud, where rented computing power is consumed over the internet, is “absolutely my strategic focus going forward,” Trunnell said.
But at the moment, Fred Hutch’s use of the cloud is limited.
In fact, over the years, the institution has developed a culture of science, rather than IT. And while Fred Hutch will certainly maintain those roots in science and research, Trunnell is determined to also harness the power of technology, specifically big data and the cloud, to wipe out cancer.
“It’s been very laboratory-intensive, with a lot of really good basic science, but it’s only now getting to the point where we’re starting to deal with larger data,” he said.
During Trunnell’s time on the job, Fred Hutch has moved “one or two small production research processes” into the cloud. The heaviest computing loads are imposed by genomics research, which is handled by roughly 500 generic Linux-based servers running as a cluster, with conventional load management.
“In the genomics field, the computation is tied directly to individual bits of data, so we can cut up the computation into an arbitrary number of arbitrarily small pieces and put 500 loosely coupled computers to work on them. We don’t need specialized computers,” he said.
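The pattern Trunnell describes, cutting a job into independent pieces and handing them to loosely coupled workers, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy reads, the GC-counting task, and the helper names `split`, `count_gc` and `parallel_gc` are hypothetical, and a thread pool stands in for a cluster of servers. It is not a description of Fred Hutch’s actual pipeline.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, n_pieces):
    """Cut a list into n_pieces contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def count_gc(reads):
    """Per-chunk work: count G and C bases. Needs no data from other chunks."""
    return sum(read.count("G") + read.count("C") for read in reads)

def parallel_gc(reads, n_workers):
    """Fan chunks out to workers, then add up the independent partial results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(count_gc, split(reads, n_workers)))

# Toy, made-up reads (hypothetical data, for illustration only).
reads = ["GATTACA", "CCGGTTAA", "ACGT", "GGGCCC", "TTTT"]
total = parallel_gc(reads, n_workers=3)
```

Because each chunk is processed without reference to any other, the total is the same whether the work is cut into one piece or many. That independence is why generic machines suffice for this kind of workload.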
Fred Hutch’s research focuses on curative or preventative therapies for human cancers, especially immunotherapeutics. Several compute-heavy projects are under way there, including a study of the molecular evolution of proteins and viruses, a computational biology public health project, and computer modeling in service of vaccine development.
Building ‘big data’
But Fred Hutch is seeking ever-larger data sets to analyze, as they tend to yield more interesting results. Its own data generation is relatively small, because patient-treatment data is usually created only for billing purposes and because even counting its affiliated institutions — the University of Washington, Seattle Children’s Hospital and the Seattle Cancer Care Alliance — it has had only 400,000 patients.
“One colleague said you need at least 100 million data points from a data-science perspective,” Trunnell said.
So it hopes to add in data shared from consortia created for just that purpose, such as ORIEN and GENIE. ORIEN (the Oncology Research Information Exchange Network) is a research partnership and data-sharing network. GENIE (Genomics Evidence Neoplasia Information Exchange) is a project of the American Association for Cancer Research that aggregates cancer genomic data and clinical outcomes from tens of thousands of cancer patients.
Big data initiatives such as those are giving hope that new therapies might be around the corner, one of the reasons why Trunnell last year declared at a conference in Washington state that “we have entered the age of data.”
Fred Hutch might add genomic data to every patient’s record. It could even add patients’ browsing histories, which have been shown to pre-diagnose those pancreatic cancer patients who searched over time for a progression of symptoms that matched that of the cancer.
Those histories could also include environmental factors, such as whether patients are filling their prescriptions or engaging in risky behaviors like smoking and drinking.
Fred Hutch is also following mobile applications that could swell its data.
For example, a project at nonprofit research organization Sage Bionetworks is testing an iOS app harnessing the iPhone’s sensors to monitor dexterity, gait and balance as part of learning whether smartphones can help measure the progression of Parkinson’s Disease.
Eventually, the cloud — or rather, several of them — will definitely play a role in analyzing whatever big data the Hutch can compile.
“I don’t expect we will ever be monogamous with our cloud providers,” Trunnell said.
Microsoft’s Azure Data Lake offers appealing services for big data-style computing. “You can build something similar in Amazon yourself, by plugging together the services they have, but this is something Microsoft already offers,” he said.
Amazon Web Services “is far and away the leader in terms of technical capabilities. They have so many different services that are well thought out and run very well. The business model is very much about expecting the end user to assemble these Legos into components. For certain problems, that’s easier than other problems.”
As to Google Cloud, “it’s pretty late to the cloud space and is arguably playing catch-up. (But) it has Bigtable (a NoSQL database), BigQuery (an analytics data warehouse) and TensorFlow (an open-source library for numerical computation), which are tremendous resources. There really aren’t equivalents.”
The economics of cloud computing within a research institution are complex.
To help defray administrative costs, Fred Hutch imposes some fees on top of the costs of cloud computing. That makes it “hard to make the economic argument to go to the cloud,” Trunnell said, “not because the numbers aren’t compelling but it’s very hard to do the math in a way that is consistent.”
Fred Hutch’s computing loads tend to be “bursty,” varying widely in duration and intensity. That’s because researchers may run large batches of computing one day and then none for several days, rather than generating steady, consistent loads. That bursty quality means that buying enough hardware to accommodate compute-intensive periods would leave the expensive gear under-used for much of the time. That is “the compelling argument for me” to move toward the cloud, he said.
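The arithmetic behind that argument can be sketched in a few lines of Python. The demand figures below are invented for illustration and are not Fred Hutch data; the names `nodes_needed_for_peak` and `utilization` are likewise hypothetical. The sketch sizes an owned cluster for the busiest day and then measures how much of that capacity is actually used.

```python
# Hypothetical daily demand, in node-hours, over ten days: two bursts, mostly idle.
demand = [0, 0, 800, 0, 0, 0, 2400, 0, 0, 0]

NODE_HOURS_PER_DAY = 24  # one node running around the clock

def nodes_needed_for_peak(demand):
    """Nodes an owned cluster needs to finish the busiest day within that day."""
    peak = max(demand)
    return -(-peak // NODE_HOURS_PER_DAY)  # ceiling division

def utilization(demand, nodes):
    """Fraction of the owned cluster's available node-hours actually used."""
    capacity = nodes * NODE_HOURS_PER_DAY * len(demand)
    return sum(demand) / capacity

nodes = nodes_needed_for_peak(demand)  # sized for the 2,400 node-hour day
used = utilization(demand, nodes)      # share of purchased capacity consumed
```

With these made-up numbers, a cluster sized for the peak sits idle most of the time. Under a rental model, by contrast, the bill would in principle track only the node-hours consumed, which is the trade-off the passage describes.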
To resolve the cost issue, “I have to commit to a different cost model, a shift from capital equipment to operating costs.” That’s a familiar refrain among IT professionals seeking to move more into the cloud, where hardware is rented rather than owned.
Satya Nadella brings ‘fantastic motivation’
Having tech-company leaders — Microsoft CEO Satya Nadella and Amazon worldwide sales VP Mike Clayville — on Fred Hutch’s board of trustees is “a reflection of the fact that we’re being recognized in the technology community,” Trunnell said. “And they are recognizing that they have the potential to impact the work we are doing.”
Nadella, one of five recent appointees to the board, “is one of the most data-driven people I have ever met, and as such, he’s a fantastic motivation for the changes that I’m trying to make here,” Trunnell said.
Nadella “told me he starts every morning looking at data from the enormous analytics group Microsoft has built out, analyzing sentiment analysis based on facial-expression analysis in their stores. It’s really going to shape a lot of what I’m doing here.”
Matt McIlwain, managing director of Seattle’s Madrona Venture Group, is vice chair of the board of trustees. He described Trunnell as “a savvy technology leader who also has passion for the intersection of data science, computer science and the life sciences.”
Trunnell’s job gives him “a sense of urgency I haven’t felt before,” he said. It stems from “tantalizing early clinical results from immunotherapy, where we can start talking about curing cancers, not just treating it but actually making it go away.”
Keeping abreast of rapid advances in technology is another source of anxiety. “I worry I’m not going to be fast enough to keep up with transformative developments like applying machine learning to problems of genomic analysis.”
Trunnell said he believes the prediction made last year by Fred Hutch president Dr. Gary Gilliland: that it is actually plausible that cures and therapies for most, if not all, human cancers will be developed in 10 years.
Some cultural impediments, such as vendors of electronic health-record systems refusing to share their data, have fallen away, thanks to President Obama’s Precision Medicine Initiative, announced in January 2015, Trunnell observed.
“There are pieces of this problem that have been holding us back at all different levels, whether it be interpreting notes from clinical records or curating data sets or integrating data sets,” he said. “I now see that we can start to bring these new computational tools to bear on (those). The pieces are there. The will is there.”
- On Thursday evening from 6:30 to 8 p.m., Trunnell will make a free, public presentation titled “Biomedical Data Science – Leveraging Big Data in the Understanding and Treatment of Disease” at Fred Hutch’s Weintraub Auditorium, 1100 Fairview Avenue North in Seattle. Seating is limited and an RSVP is required.
- GeekWire will broadcast its weekly podcast from the Fred Hutchinson Cancer Research Center on Nov. 10th from 3 p.m. to 4:30 p.m. We’ll be joined on stage by Dr. Jim Olson, the pioneering research scientist at Fred Hutch who is using scorpion venom to fight cancer. Tickets and details here.