Now that Amazon has rolled out two back-to-back hardware products with voice recognition software, we can expect more as the Seattle retailer gears up for a smartphone unveiling in the next few months.
First, there was the Kindle Fire TV, which allows voice search from your remote control to find TV shows and movies. The second, more obscure launch was the Amazon Dash, which gives Amazon Fresh customers the ability to add items to a grocery list by either scanning a barcode or speaking into the device’s microphone.
Voice recognition is a key component of a smartphone, with the technology acting as a personal assistant and as a safety feature while driving. With these two launches, Amazon is clearly just getting started, and it could launch a competitor to Apple’s Siri or Microsoft’s Cortana when its smartphone finally arrives in the next few months.
What surprised people the most about Amazon’s voice recognition technology is how well it worked on the video-streaming set-top box. For example, in a GeekWire review, BuddyTV CEO Andy Liu wrote: “In my tests, the voice search worked really well — even with ambient noise in the background.”
So, how did Amazon come completely out of left field to roll out such a smooth product?
To answer that question, we did a little digging around, and our sources tell us that Amazon has made a handful of acquisitions over the past three years in order to own a fairly robust voice recognition system. As one source put it, Amazon now owns more technology in this field than Apple, and definitely owns more than Facebook, which will inevitably need it now that it has bought Oculus.
In fact, Amazon’s know-how may be on par with that of Google and Microsoft, which has also been showing off its technology with the recent launch of Cortana.
The three major components of Amazon’s technology are:
• Yap: Amazon acquired the Siri-like service in late 2011. Amazon never acknowledged the acquisition of the Charlotte, N.C.-based company, but there was enough of a paper trail for it to be confirmed.
• IVONA: Amazon bought the Polish company that was providing the technology for Amazon Kindle features including “text to speech,” in which a computerized voice automatically reads the words of a book aloud. The 2013 deal sparked speculation that Amazon was buying the tech for its smartphone lineup.
Together, all three provide for a fairly comprehensive offering, from recognizing someone’s speech, to translating what that person has said, to providing answers by either speaking or displaying text.
For good measure, Amazon also inked a deal with Nuance Communications, the standalone leader in speech recognition (it also has a deal with Apple). The deal was disclosed in an SEC filing by Nuance, so it’s not known to what extent Amazon is using the technology, if at all.
Some of Amazon’s research is being conducted in the Boston area, arguably the home of voice recognition. Other companies with offices there include Nuance, Microsoft and Apple, according to reports by The Boston Globe.
Amazon began building a stealthy office in 2011, hiring Bill Barton, who holds at least two patents in using different modalities. Also in that office is Jeff Adams, who previously worked at Yap and Nuance. LinkedIn now lists Adams as a senior manager at A2Z.com, where he works on assembling and managing a top-tier research group for speech and language technology.
That’s not to say that Amazon doesn’t still have some work cut out for it.
It neglected to say during its first pitch on the Kindle Fire TV that voice search only pulls up results from Amazon’s Instant Video collection and Vevo. That means you can’t use your voice to search for content from Netflix, Showtime or others, reports Business Insider.
The article also points out that whenever you perform a voice search using the Fire TV, it saves a recording of your voice. It does this to improve the quality of search results, according to Amazon’s website. (You can delete those recordings if you find that a little creepy.)
Finally, while the technology looks to be good enough for the mainstream, Amazon is not providing voice recognition capabilities to third-party developers to integrate into their applications for the Kindle Fire TV. It may have plans to do so eventually, but for now there are no voice APIs on the developer portal, and a search does not turn up any documentation.
With Amazon making what is clearly a multimillion-dollar investment in the space, you can bet this is just the beginning.