Exploring the Cloud Speech API
Before you continue, please refer to the Setting up a rest client section in Chapter 3, Cloud Vision API, to set up a REST API client, either Postman or cURL. Now that we have all the required setup done, let's start exploring the API. In this section, we are going to upload a single-channel, Linear16-encoded audio file with a 44100 Hz sample rate, in base64 format, to the Cloud Speech API and get its transcription.
There are three ways we can convert audio to text using the Cloud Speech API:
- Synchronous speech recognition
- Asynchronous speech recognition
- Streaming speech recognition
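Whichever method we pick, the audio has to match the format described above. As a rough sketch, assuming the audio is stored as a WAV file (the text does not name a container format), the following Python helpers read the channel count, sample width, and sample rate from the header and base64-encode the bytes. The function names are hypothetical, and `make_silent_wav` exists only to produce a tiny test input.

```python
import base64
import io
import wave


def describe_wav(wav_bytes: bytes) -> dict:
    """Read the header fields of a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),             # 1 means single channel (mono)
            "sample_width_bytes": wav.getsampwidth(),   # 2 bytes means 16-bit samples
            "sample_rate_hertz": wav.getframerate(),    # for example, 44100
        }


def encode_audio(audio_bytes: bytes) -> str:
    """Base64-encode audio bytes as ASCII text that can be placed in a JSON body."""
    return base64.b64encode(audio_bytes).decode("ascii")


def make_silent_wav(seconds: float = 0.1, sample_rate: int = 44100) -> bytes:
    """Build a short mono, 16-bit WAV of silence (illustrative test input only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()
```

For a real recording, we would read the file's bytes from disk and pass them to `describe_wav` and `encode_audio` instead of using the generated silence.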
Synchronous speech recognition
If our audio file is shorter than 1 minute, synchronous speech recognition is a good fit. The results are near real-time; that is, the transcription is returned in the response to the same request. We are going to use synchronous speech recognition in our application and in this exploration.
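To make the request and response flow concrete, here is a minimal Python sketch of a synchronous call against the v1 `speech:recognize` REST endpoint. It rests on assumptions not stated in this section: the `en-US` language code, authentication with an API key passed as the `key` query parameter, and the hypothetical helper names `build_recognize_body` and `recognize`. The same JSON body could equally be sent from Postman or cURL.

```python
import json
import urllib.request

# v1 REST endpoint for synchronous recognition.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_body(audio_base64: str,
                         sample_rate_hertz: int = 44100,
                         language_code: str = "en-US") -> dict:
    """Build the JSON body for a synchronous recognition request."""
    return {
        "config": {
            "encoding": "LINEAR16",                 # 16-bit linear PCM
            "sampleRateHertz": sample_rate_hertz,   # must match the audio file
            "languageCode": language_code,          # assumed; adjust to the audio
        },
        "audio": {"content": audio_base64},         # base64-encoded audio bytes
    }


def recognize(audio_base64: str, api_key: str) -> dict:
    """POST the request and return the parsed JSON response (needs network access)."""
    body = json.dumps(build_recognize_body(audio_base64)).encode("utf-8")
    request = urllib.request.Request(
        f"{RECOGNIZE_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the call is synchronous, `recognize` blocks until the service replies, and the transcription arrives in that same response rather than through a later polling step.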
Asynchronous speech recognition
If our audio file is longer...