
Google Cloud Speech API Python Samples

This directory contains samples for Google Cloud Speech API. Google Cloud Speech API enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Cloud Speech API service.

Setup

Authentication

Authentication is typically done through Application Default Credentials, which means you do not have to change the code to authenticate as long as your environment has credentials. You have a few options for setting up authentication:

  1. When running locally, use the Google Cloud SDK:

    gcloud beta auth application-default login
  2. When running on App Engine or Compute Engine, credentials are already set up. However, you may need to configure your Compute Engine instance with additional scopes.

  3. You can create a Service Account key file. This file can be used to authenticate to Google Cloud Platform services from any environment. To use the file, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path to the key file, for example:

    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
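
As a quick sanity check for option 3, the sketch below reads GOOGLE_APPLICATION_CREDENTIALS and confirms that it points to a readable service account JSON key. The helper name check_credentials_file is hypothetical and is not part of these samples. It uses only the standard library and does not contact Google Cloud, so a passing check does not prove the key is still valid.

```python
import json
import os


def check_credentials_file(environ=None):
    """Return the key file path if it looks like a service account key.

    Hypothetical helper for illustration only. Returns None when the
    variable is unset, the file is missing, or the file is not a service
    account JSON key.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path or not os.path.isfile(path):
        return None
    try:
        with open(path) as handle:
            key = json.load(handle)
    except ValueError:  # raised for malformed JSON
        return None
    # Service account key files carry a "type" field set to "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return None
    return path


if __name__ == "__main__":
    found = check_credentials_file()
    print(found or "No usable service account key file found")
```

Passing a dictionary instead of relying on os.environ makes the check easy to exercise without changing your shell environment.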

Install PortAudio

PortAudio is required by the PyAudio library to stream audio from your computer's microphone. PyAudio depends on PortAudio for cross-platform compatibility, and the installation steps differ by platform.

  • For Mac OS X, you can use Homebrew:

    brew install portaudio
    

    Note: if pip install fails with an error saying it cannot find portaudio.h, try running pip install with the following flags:

    pip install --global-option='build_ext' \
        --global-option='-I/usr/local/include' \
        --global-option='-L/usr/local/lib' \
        pyaudio
    
  • For Debian / Ubuntu Linux:

    apt-get install portaudio19-dev python-all-dev
    
  • On Windows, you may not need to install PortAudio explicitly, because it is installed along with PyAudio.

For more details, see the PyAudio installation page.
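
Once PortAudio and PyAudio are installed, a short check like the one below can confirm that the pyaudio module imports in the current environment. This is a minimal sketch: the function name pyaudio_available is hypothetical, and a successful import only shows that the module loads, not that a microphone is present.

```python
import importlib


def pyaudio_available():
    """Return True if the pyaudio module can be imported here."""
    try:
        importlib.import_module("pyaudio")
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if pyaudio_available():
        print("PyAudio is importable")
    else:
        print("PyAudio is not importable; check the PortAudio install steps")
```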

Install Dependencies

  1. Install pip and virtualenv if you do not already have them.

  2. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    $ virtualenv env
    $ source env/bin/activate
  3. Install the dependencies needed to run the samples.

    $ pip install -r requirements.txt

Samples

Transcribe

To run this sample:

$ python transcribe.py

usage: transcribe.py [-h] [--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}]
                     [--sample_rate SAMPLE_RATE]
                     input_uri

Transcribes a FLAC audio file stored in Google Cloud Storage using GRPC.

Example usage:
    python transcribe.py --encoding=FLAC --sample_rate=16000 gs://speech-demo/audio.flac

positional arguments:
  input_uri

optional arguments:
  -h, --help            show this help message and exit
  --encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}
                        How the audio file is encoded. See https://github.com/
                        googleapis/googleapis/blob/master/google/cloud/speech/
                        v1beta1/cloud_speech.proto#L67
  --sample_rate SAMPLE_RATE
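
The usage text above implies a command-line interface along the following lines. This is a sketch reconstructed from the help output only, not the sample's actual source: the default values for --encoding and --sample_rate and the integer type for --sample_rate are assumptions, since the help output does not state them.

```python
import argparse

# Encoding choices as listed in the usage text.
ENCODINGS = ["LINEAR16", "FLAC", "MULAW", "AMR", "AMR_WB"]


def build_parser():
    """Build a parser that accepts the arguments shown in the usage text."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribes a FLAC audio file stored in "
                    "Google Cloud Storage using GRPC.",
    )
    parser.add_argument("input_uri")
    parser.add_argument(
        "--encoding",
        choices=ENCODINGS,
        default="FLAC",  # assumed default, not shown in the help output
        help="How the audio file is encoded.",
    )
    parser.add_argument(
        "--sample_rate",
        type=int,        # assumed type
        default=16000,   # assumed default, taken from the example usage
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input_uri, args.encoding, args.sample_rate)
```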

Transcribe async

To run this sample:

$ python transcribe_async.py

usage: transcribe_async.py [-h] [--encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}]
                           [--sample_rate SAMPLE_RATE]
                           input_uri

Sample that transcribes a FLAC audio file stored in Google Cloud Storage,
using async GRPC.

Example usage:
    python transcribe_async.py --encoding=FLAC --sample_rate=16000 gs://speech-demo/audio.flac

positional arguments:
  input_uri

optional arguments:
  -h, --help            show this help message and exit
  --encoding {LINEAR16,FLAC,MULAW,AMR,AMR_WB}
                        How the audio file is encoded. See https://github.com/
                        googleapis/googleapis/blob/master/google/cloud/speech/
                        v1beta1/cloud_speech.proto#L67
  --sample_rate SAMPLE_RATE
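
An asynchronous request typically returns before the transcript is ready, so the caller has to check back for the result. The sketch below shows one generic way to do that by polling. It makes no claim about how transcribe_async.py is implemented: poll_until_done and the fetch callable are hypothetical names, and the dictionary with a "done" key is an assumed stand-in for whatever status object the service returns.

```python
import time


def poll_until_done(fetch, interval=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until it returns a dict with done=True, then return it.

    fetch is a hypothetical zero-argument callable standing in for the
    request that retrieves the operation status; it is not a Speech API
    function. Raises RuntimeError if the operation has not finished
    after max_attempts calls.
    """
    for _ in range(max_attempts):
        operation = fetch()
        if operation.get("done"):
            return operation
        sleep(interval)  # wait before asking again
    raise RuntimeError(
        "operation did not finish after %d attempts" % max_attempts)


if __name__ == "__main__":
    # Simulated status sequence: two pending checks, then a finished one.
    states = iter([{"done": False}, {"done": False}, {"done": True}])
    print(poll_until_done(lambda: next(states), interval=0.1))
```

The sleep parameter is injectable so the loop can be exercised without real delays.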

Transcribe streaming

To run this sample:

$ python transcribe_streaming.py