Amazon is crafting its Alexa voice-based digital assistant to be like one of its employees, and now, with the introduction of several new services at the Amazon Web Services re:Invent conference, it’s taking steps to put that “employee” almost anywhere.
“We have a very long-term view of how we think she (Alexa) should interact with our customers,” said Mike George, VP of Alexa, Echo and the Appstore, in an interview with GeekWire at the conference. “We want her to possess the same traits that we like in our employees. We want her to be smart, and helpful, and a little humble, and we want her to be funny once in a while. . . . We want her to be able to answer any questions that you ask.”
Alexa got her start in 2014 when Amazon released the Echo, a hands-free, voice-controlled device powered by speech technology in the cloud. Alexa will play music, provide information, control smart-home devices and allow Prime members to re-order items. She learns speech patterns and personal preferences and over time increases her vocabulary. Then came Alexa in the Fire TV and Fire TV Stick, the Echo Dot (a compact version of the Echo), the Tap (a touch-to-activate variant on the Echo) and the Fire Tablet.
But those are only Amazon’s own products. In June 2015, Amazon made the Alexa APIs available, letting developers incorporate her capabilities into both Amazon and non-Amazon devices, such as the Triby smart speaker. Developers have so far published to an online marketplace more than 5,400 “skills” that Alexa can execute, such as playing Jeopardy, hailing a Lyft ride and keeping a shopping list current. Amazon envisions Alexa being available in “things like consumer electronics, appliances, automobiles and wearables,” George said.
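To give a sense of what building one of those skills involves: a custom skill is essentially a web service that receives a JSON request describing what the user asked for (an “intent”) and returns a JSON response in the Alexa Skills Kit format telling Alexa what to say. The sketch below is illustrative only; the shopping-list intent name, slot name, and handler are hypothetical, not taken from any real skill mentioned in the article.

```python
def build_alexa_response(speech_text, end_session=True):
    """Assemble a reply in the Alexa Skills Kit JSON response format (v1.0)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }

def handle_request(event):
    """Hypothetical handler for a shopping-list skill's AddItemIntent."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # User opened the skill without asking for anything yet.
        return build_alexa_response("What would you like to add?", end_session=False)
    if request["type"] == "IntentRequest" and request["intent"]["name"] == "AddItemIntent":
        # The "Item" slot carries the value Alexa's speech recognition extracted.
        item = request["intent"]["slots"]["Item"]["value"]
        return build_alexa_response("Okay, I added " + item + " to your list.")
    return build_alexa_response("Sorry, I didn't catch that.")

# Simulated request for "Alexa, add milk to my shopping list"
event = {"request": {"type": "IntentRequest",
                     "intent": {"name": "AddItemIntent",
                                "slots": {"Item": {"value": "milk"}}}}}
print(handle_request(event)["response"]["outputSpeech"]["text"])
```

In production such a handler typically runs as an AWS Lambda function, with Amazon’s cloud handling the speech recognition before the request ever reaches the developer’s code.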
He declined to specify how many non-Amazon devices now have Alexa capabilities but said, “We have a great deal of activity going on right now,” including with startups. One is Nucleus, a maker of home intercoms.
“We’re seeing interest from pretty much everyone . . . (even) spaces we didn’t necessarily set out to interact with,” George said. “We’re seeing that once someone interacts with Echo in their home, they want to interact in their car, at their work, in their hotel and in their whatever. We’re investing a great deal of time and energy in (voice interaction), because every data point we get from our customers is that it’s taking things they’ve found complicated and making them simple.”
Amazon hopes Alexa will help it win the battle for home-control networks, a field littered with the remains of companies and projects that have tried and failed over the years. “We’re getting great feedback from customers that what we’ve done to control smart-home devices has simplified it, so that it’s within the reach of mere humans to actually set up” a controllable system, George said.
At re:Invent, Amazon took steps to further Alexa’s reach. It released Lex, a service that lets developers build natural-language conversational interfaces, spoken or typed, into applications; Lex is based on the same deep-learning technologies behind Alexa. Another new offering, Polly, gives developers the text-to-speech capabilities that voice Alexa, using deep learning to synthesize speech that sounds much like a real human voice. The draw of these services is that developers can take advantage of Amazon’s machine-learning work without having to do that complex, difficult work themselves.
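The Polly side is easy to picture in code. As a minimal sketch, assuming the AWS SDK for Python (boto3) and a configured account, a developer hands the service plain text and gets back synthesized audio; the voice name and output file here are illustrative choices, not anything specified in the article.

```python
def polly_request(text, voice="Joanna", audio_format="mp3"):
    """Assemble the parameters for Polly's SynthesizeSpeech operation."""
    return {"Text": text, "VoiceId": voice, "OutputFormat": audio_format}

params = polly_request("He had read the book before it was read aloud.")

# With AWS credentials configured, the actual call would be roughly:
#   import boto3
#   audio = boto3.client("polly").synthesize_speech(**params)["AudioStream"].read()
#   with open("speech.mp3", "wb") as f:
#       f.write(audio)
print(params["VoiceId"])
```

The deep-learning heavy lifting, including pronunciation choices like the two readings of “read” in the sample sentence, happens entirely on Amazon’s side of the API.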
Amazon CTO Werner Vogels offered a full description of the new services in a blog post.
“We’re seeing a great deal of activity and the simplification of smart home controls, in both things we’re doing and things we’re doing through third parties,” George said. “I think over time she’s just going to get more and more naturally conversational and be able to do much more, answer many more questions.”
Machine learning underlies Alexa’s technology. That means “it learns to better adapt to your speech, the things that people say, so as she gets smarter and smarter, understanding all the different variants or ways that someone might express a desire to know what time it is, or a desire to listen to a particular type of music,” said Al Lindsay, VP of Amazon speech and Echo software, during the interview.
Amazon is also using machine learning to improve the naturalness of Alexa’s spoken voice and her ability to navigate complexities like the differing pronunciations of “read,” depending on its use in a sentence.
George and Lindsay didn’t have much to say about competing voice services: Apple’s Siri, Google’s Assistant and Microsoft’s Cortana. “I’m glad we’re all investing in voice, though,” George said. “I think it’s going to help customers in general.”
What will Alexa be like by the time next year’s re:Invent rolls around?
“She’ll be smarter, able to answer more questions, probably a little more conversational,” George said. “She’ll have many more capabilities, from us and from third parties. You’ll see Alexa manifest herself on more and more of other people’s hardware, doing more and more things.”