G’day, Alexa: Amazon’s digital voice assistant is getting an Australian accent for its Down Under debut this week, but there’s more in store for users around the world.
Thanks to SSML, or Speech Synthesis Markup Language, Alexa developers can already make an Amazon Echo or other voice-controlled device whisper, speak faster or slower, or speak with a super-cheery voice. And Alexa’s users can change her speaking style to British English, or German, or Japanese.
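For a sense of what that markup looks like, here is a minimal sketch in Python that assembles an SSML response by hand. The `amazon:effect` and `prosody` tags are part of Alexa's SSML support; the helper names (`whisper`, `paced`, `speak`) and the sample sentences are made up for illustration and are not taken from any Amazon SDK.

```python
from xml.sax.saxutils import escape

# Rate keywords accepted by the SSML <prosody> tag.
ALLOWED_RATES = {"x-slow", "slow", "medium", "fast", "x-fast"}


def whisper(text: str) -> str:
    """Wrap text in Alexa's whispered-voice effect (hypothetical helper)."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def paced(text: str, rate: str = "medium") -> str:
    """Wrap text in a <prosody> tag to speed up or slow down speech."""
    if rate not in ALLOWED_RATES:
        raise ValueError(f"unsupported rate: {rate!r}")
    return f'<prosody rate="{rate}">{escape(text)}</prosody>'


def speak(*parts: str) -> str:
    """Join fragments into a single <speak> document."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = speak(
    whisper("I can keep a secret."),
    paced("But I can also talk quickly.", rate="fast"),
)
print(ssml)
```

A skill would return a string like this as its speech output, and the device would render the first sentence as a whisper and the second at a faster pace.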
Like Google’s AI assistant, Alexa can now associate voices with specific people: Users can follow the instructions in the Alexa mobile app to train devices to distinguish their voices from others’.
Outside developers will be getting access to that feature, known as the Your Voice API. That means voice identification could soon be popping up in third-party skills, said Nikko Strom, a senior principal scientist at Amazon and founding member of the Alexa team.
That’s likely to be a big deal, considering that there are more than 25,000 third-party skills already available for Alexa. Strom touched on the implications during a talk today at the AI NextCon conference in Bellevue, Wash. For example, could Alexa verify financial transactions using voice alone?
“Well, it depends,” Strom said. “It’s not perfect. It depends on how risk-averse you are. I don’t think we would enable this for the transfer of large sums of money between banks, for example.”
Voice authorization can be enabled for shopping at Amazon, however. “You don’t have to say your PIN number if we can identify you as the right person, but this is an optional feature,” Strom said. “In that case, there would be a message sent to you on your phone or somewhere, so you can cancel it if you think it was done in error.”
Generally speaking, “voiceprints are not safe enough for large, important transactions,” he said.
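The trade-off Strom describes can be pictured as a simple policy check. The Python sketch below is purely illustrative and is not Amazon's implementation: the function name, the match score, the score threshold and the purchase limit are all assumptions chosen to show the idea that a confident voice match on an opted-in account can stand in for a PIN on smaller purchases, with a phone notification as a safety net.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    approved: bool      # purchase can proceed without further input
    needs_pin: bool     # fall back to asking for the spoken PIN
    notify_phone: bool  # send a message so the user can cancel


def authorize_purchase(
    voice_score: float,
    amount: float,
    voice_auth_enabled: bool,
    score_threshold: float = 0.9,  # assumed confidence cutoff
    amount_limit: float = 100.0,   # assumed ceiling for voice-only approval
) -> Decision:
    """Decide whether a voice match alone is enough (hypothetical policy)."""
    voice_ok = (
        voice_auth_enabled
        and voice_score >= score_threshold
        and amount <= amount_limit
    )
    if voice_ok:
        # Voice alone suffices, but the user is told so they can cancel.
        return Decision(approved=True, needs_pin=False, notify_phone=True)
    # Otherwise keep the PIN requirement in place.
    return Decision(approved=False, needs_pin=True, notify_phone=False)


print(authorize_purchase(0.95, 25.0, voice_auth_enabled=True))
print(authorize_purchase(0.95, 5000.0, voice_auth_enabled=True))
```

Under these assumed numbers, the first call skips the PIN and triggers a notification, while the second falls back to the PIN because the amount is above the limit.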
Strom said Amazon is constantly fine-tuning Alexa’s AI smarts in an effort to make the user experience less clunky and more robust. “The common theme here is knowing the context,” he said.
For example, based on your personal profile, Alexa could respond differently to a request to “play ‘The Lost City of Z.’” For some users, Alexa will start reading the audiobook on Amazon Echo. For others, Alexa will show the movie version via Amazon Fire TV.
“This works now. … The exact same item will retrieve different types of content, depending on what device you’re on,” Strom said. “It would be very natural if there was a ‘Wizard of Oz’ human behind the scenes. They would know exactly what to do. But it’s actually not as easy to do this automatically with an AI. We solved this with machine learning on the back end.”
Alexa has come a long way since its introduction to the world, a little more than three years ago, but there are still lots of things it can’t help you with.
For example, there’s no use asking it where Amazon will be putting its second headquarters, known as HQ2. It takes a human to come up with a playful answer to the question of the day.
“Yeah, I know everything about it,” Strom joked. “I know who’s going to win.”