The listening capacity of digital assistants like Alexa and Siri has become a major privacy sticking point in the last year. A group of researchers from Northeastern University and Imperial College London has been studying smart speakers for the last six months to learn more about what triggers them and whether they are “listening” all the time.
The ongoing study found “no evidence to support” the possibility that digital assistants are always listening. The devices did get activated often in the study, primarily by words that sound similar to the phrases meant to wake them up — for Alexa, examples include words with “k” sounds such as “exclamation”, “kevin’s car” and “congresswoman” — but they only stay on for short intervals of a few seconds up to nearly a minute.
The researchers set up several smart speakers (a couple of Echos, a Google Home Mini, an Apple HomePod and a Microsoft Cortana-powered Harman Kardon Invoke) and built an area where they could monitor and record when and how the devices were activated. The team played 125 hours of dialog-heavy Netflix shows, including The Office, Gilmore Girls and Grey’s Anatomy, and recorded which phrases, outside of the traditional “wake words,” activated the devices.
The study found that the Echo Dot 2nd Generation and the Invoke stayed awake the longest, between 20 and 43 seconds. The rest of the devices had shorter activation periods, with nearly half of them lasting 6 seconds or less.
While words that sounded like the wake phrases triggered devices, it was challenging to get repeat results. The team repeated its experiments 12 times, and only 8.44 percent of the activations occurred consistently.
The definition of “listening” in this context can get confusing, even for the people who make the devices. Under questioning on an episode of PBS Frontline last week, Amazon devices chief Dave Limp was asked how Amazon could convince millions of people to install “listening devices” in their homes. Limp appeared to misstep when answering the question, insisting that Alexa isn’t a listening device, then describing how it’s “listening,” then backtracking.
“I would first disagree with the premise. It’s not a listening device,” Limp said. “The device in its core has a detector on it — we call it internally a ‘wake word engine’ — and that detector is listening — not really listening, it’s detecting one thing and one thing only, which is the word you’ve said that you want to get the attention of that Echo.”
The question of how these virtual assistants are monitoring for wake words will become even more important as they spread to different types of Internet of Things devices, such as a smart ring and Alexa-enabled eyeglasses unveiled by Amazon last year.
Throughout 2019, reports surfaced that teams of employees at tech giants including Google, Amazon and Apple listen to some audio clips of utterances made to their smart speakers in order to improve the digital assistants. Amid privacy concerns, the companies began allowing users to opt out of having their audio clips reviewed.
The report is only step one of a larger project. Future updates will look into how many activations lead to audio recordings being sent to the cloud versus processed only on the device; whether cloud providers accurately show all cases of recorded audio; whether the speakers adapt and adjust to what they’ve heard; and how gender, ethnicity, accent and other factors affect activations of the speakers.