Google continues to focus on ways it can use its artificial intelligence research to court cloud customers, announcing Tuesday that it will make the text-to-speech technology inside some of its more popular applications available to developers using Google Cloud.
Cloud Text-to-Speech could be used by news-app developers to read this article out loud to users, or as part of a connected device that needs to address its users with natural language. The technology, which can be found in Google Maps and Google Assistant, supports 32 different voices across 12 languages and gives developers a few knobs to tweak, such as speaking rate and pitch.
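In practice, developers reach the service through a simple synthesis API. Here is a minimal sketch using Google's Python client library (google-cloud-texttospeech); the specific voice name and parameter values are illustrative assumptions, not figures from the announcement:

```python
# pip install google-cloud-texttospeech
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# The text to be read aloud.
synthesis_input = texttospeech.SynthesisInput(
    text="Hello from Cloud Text-to-Speech."
)

# Pick a language and voice; "en-US-Wavenet-A" is an illustrative name.
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    name="en-US-Wavenet-A",
)

# The "knobs": speaking_rate (1.0 is normal speed) and pitch (in semitones).
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3,
    speaking_rate=0.95,
    pitch=-2.0,
)

response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response carries the raw audio bytes, ready to write or stream.
with open("output.mp3", "wb") as out:
    out.write(response.audio_content)
```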
Google is also using technology developed by its DeepMind AI unit in Cloud Text-to-Speech. WaveNet is designed to make computer-generated voices sound less like a computer, and a new version available in Cloud Text-to-Speech uses Google’s Cloud TPU machine-learning processors to generate realistic speech output after being trained on the characteristics of actual human voices.
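Because the WaveNet voices sit alongside the standard ones in the same API, a developer can enumerate what is available and filter by name. A short sketch, again using the Python client, and assuming the service's convention of marking WaveNet-backed voices with "Wavenet" in the voice name:

```python
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# list_voices returns every voice the service offers, across all languages.
for v in client.list_voices().voices:
    # Filter to WaveNet-backed voices by naming convention (an assumption).
    if "Wavenet" in v.name:
        print(v.name, list(v.language_codes))
```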
Amazon Web Services offers a similar service called Amazon Polly, and Microsoft Azure has its Bing Speech API for these types of applications. It wasn’t immediately clear how Cloud Text-to-Speech compares with those two services, but Google said human testers rated the output of the new WaveNet voices as sounding much more like a real person than the earlier version’s.
Voice input and output are becoming a huge part of consumer electronics and cloud services, thanks to the surge in sales of devices like Amazon’s Echo and Google Home. Voice is also becoming a bigger and bigger part of smartphone use as microphones get better, and accurately converting between text and speech is likely to become table stakes for cloud providers before long.