Seattle’s Allen Institute for Artificial Intelligence says its academic search engine, Semantic Scholar, is now in high gear — thanks to a power boost from Microsoft that helped expand its reach to every field of science.
Over the course of just a few months, Semantic Scholar’s database has gone from indexing 40 million research papers in computer science and biomedicine to taking in more than 175 million papers. The database covers not only the time-honored physical sciences, but also political science and sociology, art and philosophy.
“That’s enabled us to take the research that we’ve done in making AI a tool for overcoming information overload in science [and turn it into] a tool that is now usable by essentially all scholars around the world,” Doug Raymond, general manager of Semantic Scholar, told GeekWire.
Raymond said the Allen Institute for Artificial Intelligence, or AI2, has also partnered with Impactstory and Unpaywall to improve its coverage of open-access publications. Those partnerships have enabled AI2 to analyze how open-access publications are changing the way scientific findings are distributed.
Semantic Scholar has been pursuing its expansion plan gradually since the first of the year, but based on usage statistics, the search engine’s fans have already noticed the difference.
“Our high-water mark in 2018 was a bit over 2 million monthly active users,” Raymond said. “We’re now at over 6 million. Due in part to our expansion to other areas of science — and also due to, I believe, the novel features that we’re able to support with our AI research — we’re seeing consistent growth in usage throughout the year, which is a validation of our goal to provide something that benefits the advancement of science in general.”
Raymond said the foundation for the expansion was laid late last year, with Microsoft as a key partner.
“This is a fruition of some of our work, starting with the Microsoft Academic Graph and the Microsoft research group that developed the Microsoft Academic Graph, to expand our scope at high quality,” he said. “That’s one of the reasons why we wouldn’t have done this one year or two years ago. We wanted to make sure that the experience was as good or better than it is in the domains we initially covered.”
AI2 had to make sure that the computer models it had trained to rank computer science and biomedicine papers by their relevance to a given search would work just as well for more than a dozen other fields of research. The institute also partnered with more than a dozen academic publishers to expand the search engine’s reach to previously closed-off corners.
In addition to seeking out relevant research papers, Semantic Scholar now provides links to the news reports, blog items and videos that were sparked by those papers — which comes in especially handy for putting text originally written for an academic audience into more easily understood terms.
Raymond said today’s announcement marks the culmination of AI2’s months-long campaign. “We’ve now hit the bar where we think our experience is high-quality across all domains,” he said.
Semantic Scholar can be used as a tool for conducting original research as well as for finding previously published research. Here are some examples:
- One study determined that it may take 118 years for female computer scientists to reach parity with male counterparts in publishing computer science papers.
- Another study reported evidence of gender-based bias in clinical studies, based on an analysis of data from more than 43,000 published articles and 13,000 clinical trial records.
- Yet another study found that China has already surpassed the United States in the raw volume of AI research published, and is on track to edge out the United States in high-quality AI research by 2025.
More recently, Semantic Scholar has been helping AI2 track the rise of open-access research. Raymond said the total number of open-access research papers has risen at an average annual rate of 9.9% over the past 10 years, to the point that they now represent 29% of all published articles.
AI2’s analysis also shows disparities in the adoption of open-access publishing, depending on academic discipline: Raymond said 45% of the research papers on biological topics are available via open access, compared with about 15% of papers focusing on the humanities.
The search engine can also be used as a springboard for new applications and online services. An early example is Supp.AI, a search tool that helps doctors and consumers find out how drugs and nutritional supplements could interact.
“All universities, all scientists today should be looking at the many ways that AI and deep learning can advance research and scientific progress,” AI2 CEO Oren Etzioni said in a news release.
Like the other software tools developed at AI2, Semantic Scholar is free to the public, in line with the open-science philosophy that guided the late Microsoft co-founder and philanthropist Paul Allen when he created the institute in 2013.
“We’ve never charged, and we have no intention of ever making our services subscription-based or charging for them,” Raymond said. “We think of Semantic Scholar as a free tool for any universities or scholars to use. … We’re a non-profit, and our mission is to make science more accessible.”