Microsoft has learned a lot about chatbot technology since the unfortunate rollout of the short-lived “Tay” chatbot one year ago this week, when Internet users were able to teach Tay to make racist and misogynistic remarks. This weekend, the company offered insights on the lessons learned from that experience, as well as the huge amount of artificial intelligence work Microsoft is now undertaking.
With encouragement from CEO Satya Nadella, the company’s artificial intelligence team moved on from Tay and started offering a new chatbot aimed at millennials called Zo late last year. Zo is based on the company’s popular XiaoIce Chinese-language bot (which Microsoft rolled out on WeChat in 2014).
“Tay is gone, Zo is the one we are embracing and supporting,” said Xuedong Huang, Microsoft technical fellow of artificial intelligence, during a presentation on Saturday at the AI NEXT tech conference in Bellevue, Wash. “AI is (about) learning from data. We learned from what happened. (With Tay), we didn’t do a super, super good job. With Zo, we are doing a much better job.”
Microsoft doubled down on artificial intelligence last fall with the formation of a new 5,000-person AI and Research Group.
Huang demonstrated a number of major applications of Microsoft’s AI and chat technologies, including a new implementation Microsoft is now using on its company-wide support website. He showed how you can just click on the “Get started” button to start an immediate chatbot session. Once in the chat, you can ask questions such as “How do I upgrade from Windows 8?”
In the demonstration, not all the answers provided the best possible resolution to the question, but Huang says it is a work in progress and does demonstrate the promise of the technology — and the way Microsoft is committed to using bots to meet mainstream enterprise business requirements.
Li Deng, Microsoft’s chief scientist of artificial intelligence, told the conference that today’s AI and chat solutions are the culmination of several decades of evolution in artificial intelligence — with each stage of AI marking a new generation of solutions.
Deng said that the first generation of AI lasted from the early 1990s until almost the turn of the century, and was primarily centered on rules and templates.
Such systems offered limited function in terms of the inputs allowed and the outputs provided (think of early voice-based train schedule information systems) and relied on those with expert domain knowledge to design them. They were hard to scale to more than one area of expertise, or domain. You couldn’t, for example, use the framework for a flight-booking system to architect a telephone banking system. Data was only used to design the rules that created the system — not to learn and evolve the way the system operated.
“The problem with this approach was that it was very brittle,” he said. “There are still many systems today based on this approach because it’s easy to interpret the results.”
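To make the brittleness concrete, here is a minimal Python sketch of a rules-and-templates system in the spirit of the train schedule example. It is an illustration only, not anything Deng presented: the rule list, the schedule table, and the `answer` function are all hypothetical names invented for this sketch.

```python
import re

# Hypothetical hand-written rules: each pairs a pattern with a response template.
# A domain expert has to write every one of these by hand.
RULES = [
    (
        re.compile(r"next train from (\w+) to (\w+)", re.IGNORECASE),
        "The next train from {0} to {1} leaves at {time}.",
    ),
]

# Hypothetical schedule data, also entered by hand.
SCHEDULE = {("seattle", "portland"): "9:15 AM"}


def answer(utterance: str) -> str:
    """Return a templated reply if a rule matches, else a fixed fallback."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            origin, destination = match.group(1), match.group(2)
            time = SCHEDULE.get((origin.lower(), destination.lower()))
            if time is None:
                return "Sorry, I have no schedule for that route."
            return template.format(origin.title(), destination.title(), time=time)
    # Any phrasing the rule author did not anticipate ends up here.
    return "Sorry, I did not understand that."


print(answer("When is the next train from Seattle to Portland?"))
print(answer("How do I get to Portland by rail?"))
```

The first question matches the single rule and gets a filled-in template. The second asks for much the same thing in different words and falls through to the fallback, which is the kind of brittleness Deng describes. The results are easy to interpret, though, because every reply traces back to one explicit rule.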
The next generation of AI came in the late 1990s and was based on “data-driven, shallow learning” used in conjunction with speech recognition, he said. Its goals were to use data to reduce the cost of hand-crafting complex dialogues and to make systems less sensitive to speech recognition errors. It also leaned on data far more heavily than the first generation did.
“If you have a lot of data, you automatically learn a lot of things,” he said. “It pushed the state of the art in a very nice direction because it was data-driven.”
But Deng explained that this second generation of AI produced results that were not always easy to interpret or debug — and these systems were hard to update. They also didn’t really replace first-generation systems, but rather ran in parallel with them.
All of that leads to today’s third generation of artificial intelligence, from which Microsoft’s current chat solutions came. Deng says that in these systems, the key difference is data-driven deep learning.
Deng says this deep learning approach powers today’s conversational bots, which fall into four categories: social chatbots (such as Microsoft’s Zo and XiaoIce); InfoBots (aimed primarily at retrieving information); task completion bots (which help you accomplish a particular task, such as booking a flight or troubleshooting a technical issue); and personal assistant bots (which can combine information retrieval and task completion with recommendations, such as suggesting the best Italian restaurant near you).
Microsoft faces tough competition in all of these bot categories, notably in voice-driven AI. Deng acknowledged the broad field by identifying Amazon’s Echo, Apple’s Siri, Google Now, VocalIQ and IBM Watson Analytics as competitors that play in the same space as Microsoft’s Cortana.
Deng also talked about how Microsoft has combined AI with speech recognition and machine translation technologies. By using speech recognition technology to turn spoken words into text, Microsoft can then use its machine translation tools to translate those words from one language to another.
Speech synthesis technology then allows those newly translated words to be converted into spoken words in another language. All of those functions have now been folded into the latest features of Microsoft Translator, which rolled out late last year for Android, Amazon, iOS and Windows devices.
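The three stages Deng describes (speech recognition, machine translation, speech synthesis) chain together as a simple pipeline. The Python sketch below shows only that chaining. It does not use or represent Microsoft Translator's actual API: every function name here is hypothetical, and the three stand-in stages are toy stubs that pass text around as bytes in place of real audio.

```python
from typing import Callable

# Each stage is just a function; real systems would plug in trained models here.
Recognizer = Callable[[bytes], str]    # audio in, source-language text out
Translator = Callable[[str], str]      # source text in, target text out
Synthesizer = Callable[[str], bytes]   # target text in, audio out


def speech_to_speech(
    audio: bytes,
    recognize: Recognizer,
    translate: Translator,
    synthesize: Synthesizer,
) -> bytes:
    """Run the three stages in order: recognize, translate, synthesize."""
    source_text = recognize(audio)
    target_text = translate(source_text)
    return synthesize(target_text)


# Toy stand-ins (assumptions for illustration, not real speech components).
def fake_recognize(audio: bytes) -> str:
    # Pretend the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


PHRASEBOOK = {"hello": "hola"}  # hypothetical one-entry English-to-Spanish table


def fake_translate(text: str) -> str:
    # Look up the phrase; pass unknown text through unchanged.
    return PHRASEBOOK.get(text.lower(), text)


def fake_synthesize(text: str) -> bytes:
    # Pretend encoding text to bytes is speech synthesis.
    return text.encode("utf-8")


result = speech_to_speech(b"hello", fake_recognize, fake_translate, fake_synthesize)
print(result)
```

Because each stage only consumes the previous stage's output, an error made during recognition carries through translation and synthesis, which is one reason the accuracy of the first stage matters so much in a design like this.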