This directory contains samples for Google Cloud Speech API. Google Cloud Speech API enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Cloud Speech API service.
Authentication is typically done through Application Default Credentials, which means you do not have to change the code to authenticate as long as your environment has credentials. You have a few options for setting up authentication:
When running locally, use the Google Cloud SDK:
gcloud beta auth application-default login
When running on App Engine or Compute Engine, credentials are already set up. However, you may need to configure your Compute Engine instance with additional scopes.
You can create a Service Account key file. This file can be used to authenticate to Google Cloud Platform services from any environment. To use the file, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path to the key file, for example:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
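Before running a sample, it can help to confirm that the environment variable described above is set and points to an existing file. The following is a minimal sketch using only the standard library; the function name `credentials_status` and its return strings are hypothetical and not part of the samples, and it does not validate the contents of the key file.

```python
import os


def credentials_status(environ=None):
    """Report how GOOGLE_APPLICATION_CREDENTIALS is set in `environ`.

    Hypothetical helper: returns "unset", "missing-file", or "ok".
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        # Not necessarily an error: Application Default Credentials can
        # also come from the Cloud SDK login or from the App Engine /
        # Compute Engine environment.
        return "unset"
    if not os.path.isfile(path):
        # The variable is set, but no file exists at that path.
        return "missing-file"
    return "ok"


if __name__ == "__main__":
    print(credentials_status())
```

A result of "unset" only means the key-file option is not in use; the other authentication options listed above may still apply.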
Install PortAudio. This is required by the PyAudio library to stream audio from your computer's microphone. PyAudio depends on PortAudio for cross-platform compatibility, and PortAudio is installed differently depending on the platform.
For Mac OS X, you can use Homebrew:
brew install portaudio
Note: if you encounter an error when running pip install that indicates it can't find portaudio.h, try running pip install with the following flags:
pip install --global-option='build_ext' \
    --global-option='-I/usr/local/include' \
    --global-option='-L/usr/local/lib' \
    pyaudio
For Debian / Ubuntu Linux:
apt-get install portaudio19-dev python-all-dev
Windows may work without having to install PortAudio explicitly (it will get installed with PyAudio).
For more details, see the PyAudio installation page.
Install pip and virtualenv if you do not already have them.
Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.
$ virtualenv env
$ source env/bin/activate
Install the dependencies needed to run the samples.
$ pip install -r requirements.txt
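If a sample fails immediately after setup, the interpreter version is one of the first things to check. The sketch below encodes the compatibility statement above (Python 2.7 and 3.4+); the helper name `is_supported` is hypothetical and not part of the samples.

```python
import sys


def is_supported(version_info=None):
    """Return True for Python 2.7, or for Python 3.4 and later."""
    # Compare only the (major, minor) pair.
    version = tuple((version_info or sys.version_info)[:2])
    if version[0] == 2:
        return version == (2, 7)
    return version >= (3, 4)


if __name__ == "__main__":
    print("supported" if is_supported() else "unsupported")
```

Run it inside the activated virtualenv so that it reports on the interpreter the samples will actually use.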
To run this sample:
$ python transcribe.py
usage: transcribe.py [-h] [--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}]
[--sample_rate SAMPLE_RATE]
input_uri
Transcribes a FLAC audio file stored in Google Cloud Storage using GRPC.
Example usage:
python transcribe.py --encoding=FLAC --sample_rate=16000 gs://speech-demo/audio.flac
positional arguments:
input_uri
optional arguments:
-h, --help show this help message and exit
--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}
How the audio file is encoded. See https://github.com/
googleapis/googleapis/blob/master/google/cloud/speech/
v1beta1/cloud_speech.proto#L67
--sample_rate SAMPLE_RATE
To run this sample:
$ python transcribe_async.py
usage: transcribe_async.py [-h] [--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}]
[--sample_rate SAMPLE_RATE]
input_uri
Sample that transcribes a FLAC audio file stored in Google Cloud Storage,
using async GRPC.
Example usage:
python transcribe_async.py --encoding=FLAC --sample_rate=16000 gs://speech-demo/audio.flac
positional arguments:
input_uri
optional arguments:
-h, --help show this help message and exit
--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}
How the audio file is encoded. See https://github.com/
googleapis/googleapis/blob/master/google/cloud/speech/
v1beta1/cloud_speech.proto#L67
--sample_rate SAMPLE_RATE
To run this sample:
$ python transcribe_streaming.py
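The usage messages shown for transcribe.py and transcribe_async.py describe a small command-line interface: one positional input_uri plus the --encoding and --sample_rate options. A minimal sketch of such a parser with argparse follows. It is not the samples' actual source: the function name `build_parser`, the default values (taken from the example usage), and the integer type for --sample_rate are all assumptions.

```python
import argparse

# Encoding choices as listed in the usage message.
ENCODINGS = ("LINEAR16", "FLAC", "MULAW", "AMR", "AMR_WB")


def build_parser(prog="transcribe.py"):
    """Build a parser resembling the usage message (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument("input_uri")
    parser.add_argument(
        "--encoding",
        choices=ENCODINGS,
        default="FLAC",  # assumed default, not confirmed by the usage text
        help="How the audio file is encoded.",
    )
    parser.add_argument(
        "--sample_rate",
        type=int,       # assumed type
        default=16000,  # assumed default, taken from the example usage
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input_uri, args.encoding, args.sample_rate)
```

Because --encoding uses choices, argparse rejects any value outside the listed encodings before the sample would contact the service.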