Microsoft is pushing beyond image recognition and conventional machine learning to make artificial intelligence smarter, building systems that can read text, comprehend the context behind it and even ask and answer questions.
This effort is in part the fruit of its January acquisition of Maluuba, a Montreal-based company that uses deep learning to develop natural-language understanding. According to a Microsoft blog post, two research teams, one at the company’s Redmond, Wash., headquarters and one in its Beijing research lab, are competing in a challenge run by Stanford University that uses information from Wikipedia to test how well AI systems can answer questions.
“We’re trying to develop what we call a literate machine: A machine that can read text, understand text and then learn how to communicate, whether it’s written or orally,” Kaheer Suleman, a co-founder of Maluuba, said in the blog post.
This kind of knowledge can help people quickly get through dense material such as libraries of legal documents, automotive manuals and medical studies.
In the blog post, Microsoft’s Allison Linn writes that machine reading is harder than tasks like image recognition because it requires knowledge from outside the specific set of data the system is trying to interpret.
Microsoft said this kind of technology could revolutionize the modern search engine. It could also feed into the voice assistant work that Microsoft, Amazon, Google, Apple and others are investing in heavily. Such context-heavy learning could give Cortana a boost as it seeks to unseat Amazon in the digital voice market.
According to the blog post, the roots of Microsoft’s current machine reading efforts go back more than 20 years. The increased power that comes from cloud computing, large swaths of available data to test and advances in deep learning algorithms are enabling breakthroughs in this field.
While Microsoft cautions that its AI still falls short of a knowledgeable human’s comprehension, it is making great strides. Microsoft said last fall that its researchers had tested software that recognizes human speech as well as humans do, with a record-low error rate of 6.3 percent. Twenty years ago, the best rate achieved was greater than 43 percent.