Microsoft and Alibaba AI programs beat humans in Stanford reading comprehension test for 1st time

Harry Shum, EVP for Microsoft AI & Research

Machines can already outplay us in chess, poker and other games, and now they are becoming better readers as well.

AI programs from both Microsoft and Alibaba outperformed humans earlier this month on a reading comprehension data set developed at Stanford. “Crowdworkers” scraped more than 500 Wikipedia articles to produce more than 100,000 question-and-answer sets for the test.

Here’s a sample question: “What year did Genghis Khan die?” (Spoiler alert: It’s 1227.)

“This is the first time that a machine has outperformed humans on such a test,” Alibaba said in a statement.

Microsoft’s score of 82.6 and Alibaba’s score of 82.4 edged out the human benchmark of 82.3. Other AI programs that took the test and are closing in on the human score come from the Allen Institute for Artificial Intelligence, Tencent, Salesforce and others.

A strong start to 2018 with the first model (SLQA+) to exceed human-level performance on @stanfordnlp SQuAD's EM metric! Next challenge: the F1 metric, where humans still lead by ~2.5 points! https://t.co/Uq10Dm2Ss5

— Pranav Rajpurkar (@pranavrajpurkar) January 11, 2018
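
For readers wondering what the EM (exact match) and F1 metrics in that tweet measure, here is a rough Python sketch. It assumes the text normalization commonly used for this kind of scoring (lowercasing, dropping punctuation and the articles "a," "an" and "the"), and it compares a prediction against a single reference answer. The function names are illustrative, and this is not the official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Counter intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("1227", "1227"))     # identical answers
print(exact_match("in 1227", "1227"))  # extra word breaks the exact match
print(f1("in 1227", "1227"))           # partial credit for the shared token
```

Under this sketch, an answer of "in 1227" to the Genghis Khan question scores 0 on exact match but about 0.67 on F1, which is why the two metrics can rank systems and humans differently.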

As noted by CNN, Alibaba has put the technology into practice already, using it to answer customer service questions during its massive Singles Day holiday shopping event. In a statement, Alibaba said the technology can be “gradually applied to numerous applications such as customer service, museum tutorials and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way.”

Microsoft’s top entry comes out of Microsoft Research Asia. Here’s how the AI program called R-Net works, according to Microsoft.

“We first match the question and passage with gated attention-based recurrent networks to obtain the question-aware passage representation. Then we propose a self-matching attention mechanism to refine the representation by matching the passage against itself, which effectively encodes information from the whole passage. We finally employ the pointer networks to locate the positions of answers from the passages.”
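
The three steps in that description can be loosely illustrated with a toy Python sketch. This is not Microsoft's code: it swaps the gated recurrent networks for plain dot-product attention, uses no learned weights, and assumes the start and end probabilities for the answer are already given. The function names (`attend`, `best_span`) and the `max_len` cutoff are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """For each query vector, return the attention-weighted average of keys."""
    dim = len(keys[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        out.append([sum(w * k[d] for w, k in zip(weights, keys))
                    for d in range(dim)])
    return out


def best_span(start_probs, end_probs, max_len=15):
    """Pick (start, end) maximizing start_probs[i] * end_probs[j], i <= j."""
    best, best_score = (0, 0), -1.0
    for i, p_start in enumerate(start_probs):
        for j in range(i, min(i + max_len, len(end_probs))):
            score = p_start * end_probs[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


# Toy vectors standing in for word representations (assumed values).
passage = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question = [[0.0, 2.0], [1.0, 0.5]]

# Step 1: each passage word looks at the question.
question_aware = attend(passage, question)
# Step 2: the passage is matched against itself.
refined = attend(question_aware, question_aware)
# Step 3: a pointer picks the answer span from start/end probabilities.
print(best_span([0.1, 0.7, 0.2], [0.1, 0.2, 0.7]))  # (1, 2)
```

In the real system the start and end probabilities would come from a trained network reading the refined passage representation; here they are hard-coded so that the span-selection step can be followed by hand.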

Microsoft has invested heavily in artificial intelligence, with its AI and Research group formed by Microsoft CEO Satya Nadella in 2016 as a fourth engineering division at the company. In its first year, the group grew by 60 percent to more than 8,000 people.
