Starting today, third-party developers will have access to the same speech recognition technology that powers Google’s products. Available in Google Cloud, the Cloud Speech API has also been updated with new features and improved performance.
Launched in open beta last summer, Cloud Speech allows developers to convert audio to text with a simple-to-use API. Its neural network models can recognize over 80 languages and variants, with transcription available immediately after speaking.
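For a sense of what converting audio to text through the API involves, here is a minimal Python sketch that only assembles the JSON body of a synchronous recognition request, without sending it. The endpoint and field names follow Google's public v1 REST documentation rather than anything stated in this article, and the helper `build_recognize_request` is a hypothetical name used for illustration.

```python
import base64
import json

# Assumption: the documented v1 REST endpoint; the article does not name it.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Hypothetical helper: build the JSON body for a synchronous recognize call."""
    return {
        "config": {
            # LINEAR16 is uncompressed 16-bit PCM, the encoding typically found in WAV files.
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Inline audio is sent base64-encoded in the "content" field.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


# Placeholder bytes standing in for real audio samples.
body = build_recognize_request(b"\x00\x01" * 8)
print(json.dumps(body, indent=2))
```

Actually sending the request would also require credentials (an API key or OAuth token) and an HTTP client, both omitted here.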
The API is built on the core technology that powers speech recognition for Google Assistant, Search, and Now, though it has been adapted to better fit the needs of Cloud customers.
Developer feedback has helped Google improve transcription accuracy for long-form audio and process data 3x faster than the initial version. Additionally, more audio file formats are supported, including WAV, OPUS, and Speex.
The API offers context-aware recognition that tailors listening to the scenario. Google notes that early adopters have primarily used it to control apps and devices through voice search, voice commands, and Interactive Voice Response (IVR). Cloud Speech can run on a variety of IoT devices, including cars, TVs, and speakers, as well as phones and PCs.
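In Google's public API documentation, context-aware recognition is exposed as a list of words and phrases supplied alongside the recognition settings. The sketch below assumes that mechanism (the `speechContexts` field of the v1 REST config) and uses a hypothetical helper, `with_phrase_hints`, plus made-up TV commands, to show how an app might bias recognition toward its own vocabulary.

```python
def with_phrase_hints(config, phrases):
    """Hypothetical helper: return a copy of a recognition config with phrase hints added."""
    updated = dict(config)  # shallow copy so the caller's config is left untouched
    # Assumption: hints go in "speechContexts" as documented for the v1 REST API.
    updated["speechContexts"] = [{"phrases": list(phrases)}]
    return updated


base_config = {
    "encoding": "LINEAR16",
    "sampleRateHertz": 16000,
    "languageCode": "en-US",
}

# Illustrative commands for a voice-controlled TV app.
tv_config = with_phrase_hints(base_config, ["volume up", "next channel", "mute"])
print(tv_config["speechContexts"])
```

Keeping the hints separate from the base settings lets the same app swap vocabularies per screen or device without rebuilding the rest of the request.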
The second most common use case is speech analytics, which allows for “real-time insights from call centers.” Some businesses have used it to monitor customer interactions and increase sales.
Pricing details for the API are available on the Google Cloud Platform site.