It was a major coup last summer when the University of Washington landed computer science professor Carlos Guestrin, a leading data scientist and expert in machine learning.
What wasn’t widely known at the time was that the Guestrin-led open source project, an ambitious undertaking known as GraphLab.org, would soon sprout into its very own startup.
But that’s how it’s playing out.
Just 10 months after arriving in Seattle with his wife, University of Washington statistician Emily Fox, Guestrin is announcing that he’s formed a new startup company by the name of GraphLab that builds off the initial research of the open source project that he helped start five years ago.
Guestrin also is announcing that he’s raised $6.75 million in series A funding from Madrona Venture Group — one of Seattle’s leading venture firms — and NEA — the Silicon Valley behemoth that’s best known around the Northwest as the primary backer of Tableau Software.
Guestrin is serving as CEO of the new company, which will operate from the year-old technology incubator space in Fluke Hall on the University of Washington campus. He will continue to hold on to his day job as the Amazon Professor of Machine Learning at the University of Washington, balancing his work on the startup with his professor duties. A number of new staffers are already being added to the team, including experts in big data and machine learning.
“We are at a very exciting stage. I am building my dream,” said Guestrin, who previously worked at Carnegie Mellon University. “This is a project that is exciting, and we can change the world and have a lot of impact.”
So, what’s the big idea behind this big data startup?
GraphLab, like its open source sister GraphLab.org, is designed to help large and small organizations make better sense of data. Mining social networking graphs, GraphLab makes sense of relationships between people, the products they buy or the activities they participate in.
For example, the technology could be used to uncover communities of people in a social network who don’t yet know one another, forming groups around common interests.
“That idea of connecting different sets of people together is something that you can discover by analyzing the social graphs,” Guestrin tells GeekWire.
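To make that idea concrete, here is a minimal sketch in plain Python of one way such groups could be surfaced. It is not GraphLab's code or API; the function name `suggest_groups`, the sample people and interests, and the rule of keeping only groups that would introduce at least one new pair are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def suggest_groups(interests, friendships, min_size=2):
    """Group people by shared interest (hypothetical helper).

    Keeps only groups that contain at least one pair of people
    who are not already connected in the social graph.
    """
    # Store existing connections as unordered pairs.
    known = {frozenset(pair) for pair in friendships}

    # Invert the person -> interests map into interest -> people.
    by_interest = defaultdict(set)
    for person, topics in interests.items():
        for topic in topics:
            by_interest[topic].add(person)

    groups = {}
    for topic, members in by_interest.items():
        if len(members) < min_size:
            continue
        # Pairs in this group that do not know each other yet.
        strangers = [
            pair
            for pair in combinations(sorted(members), 2)
            if frozenset(pair) not in known
        ]
        if strangers:
            groups[topic] = {"members": sorted(members), "new_links": strangers}
    return groups


# Made-up sample data: who likes what, and who already knows whom.
interests = {
    "ana": {"jazz", "cycling"},
    "ben": {"jazz"},
    "cho": {"cycling", "jazz"},
    "dev": {"chess"},
}
friendships = [("ana", "ben")]

groups = suggest_groups(interests, friendships)
for topic in sorted(groups):
    print(topic, groups[topic])
```

A production system working on millions of users would need a distributed graph engine rather than in-memory dictionaries, which is the scale problem the article describes GraphLab as targeting.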
The core technology is already at work in some very familiar places.
Pandora, the wildly popular online music service, uses GraphLab.org to make better song recommendations. Online retailers use it to recommend specific products that others may want to buy, while financial institutions use it to get a better handle on fraud prevention.
In a release, Pandora’s Chief Scientist, Eric Bieschke, called the GraphLab team “second to none” and noted that they “consistently raise the bar for graph based machine learning.”
Of course, successfully forming a for-profit business on the back of an open source technology can be tricky. But Guestrin believes he can navigate that path.
“I love the open source community. I am excited about it, and I have benefitted from it, and I think we have done a lot of contributing back to that community,” he said. “I believe it is possible to maintain a for-profit company, and still be supportive of the open source community. And there are other examples of that.”
He said the new for-profit GraphLab will remain closely connected to the open source effort, but it also plans to offer additional tools and support on top of what is freely available.
Guestrin declined to disclose specifics on how that might work, but he said in some cases GraphLab could work with individual companies to develop new tools.
Guestrin said that they’ve had “tens of thousands of downloads” of the GraphLab open source code, including some users from major corporations. “It’s been widely, widely used,” he said.
Companies such as Amazon.com and Facebook certainly come to mind as potential users of the new GraphLab service, and Guestrin didn’t rule out that possibility. He has talked informally to representatives at both companies, and noted that GraphLab’s tools could add value beyond what they have already built internally.
“In terms of scalability, the system that we have built and designed is significantly stronger than any other system that I am familiar with, even from hints I get from these private companies,” he said. “I think we can provide significant new value from what they have already done internally.”
As a result of the funding round, Madrona’s Matt McIlwain and NEA’s Greg Papadopoulos have joined the board. McIlwain called Guestrin an “exceptional talent,” and added that the “graph specific solutions” could help “answer some of the bigger questions of our time.”
As an academic, Guestrin is not typically on the VC fundraising path. But he met with success after what he described as “a lot of pounding.”
“It is an interesting experience because at the same time — based on my visibility in the academic community — I get a combination of automatic credibility and skepticism. It is a balance of saying we have something unique, but yes, we understand that there are some significant business questions that have to be answered in the development of this project in the long run. It was a healthy dosage of both.”
In addition to the board of directors, GraphLab has formed an advisory board that includes:
—Hank Levy: Chairman of the Department of Computer Science & Engineering at the University of Washington.
—Kai Li: Co-founder of Data Domain (acquired by EMC for $2.1 billion in 2009).
—Joe Hellerstein: Co-Founder and CEO of Trifacta Inc., and a Chancellor’s Professor of Computer Science at Berkeley.
—Sujal Patel: Co-founder and former CEO of Isilon Systems (acquired by EMC for $2.25 billion in 2010).
—Chris Stolte: Co-founder and Chief Development Officer at Tableau Software.