Spoken Language Processing in Python Chapter2

This document discusses the SpeechRecognition library in Python for processing spoken language. It provides an overview of the library and how to use its main features. The Recognizer class can be used to recognize speech from audio files or data using built-in functions that interface with speech APIs like Google, Bing, etc. Examples are given for recognizing speech from different languages, dealing with non-speech audio, showing all recognition results, handling multiple speakers, and adjusting for noisy audio. The document aims to help users get started with speech recognition in Python.

SpeechRecognition Python library
SPOKEN LANGUAGE PROCESSING IN PYTHON

Daniel Bourke
Machine Learning Engineer/YouTube Creator

Why the SpeechRecognition library?
Some existing Python libraries

CMU Sphinx

Kaldi

SpeechRecognition

Wav2letter++ by Facebook

Getting started with SpeechRecognition
Install from PyPI:

$ pip install SpeechRecognition

Compatible with Python 2 and 3

We'll use Python 3

Using the Recognizer class
# Import the SpeechRecognition library
import speech_recognition as sr

# Create an instance of Recognizer

recognizer = sr.Recognizer()

# Set the energy threshold

recognizer.energy_threshold = 300
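
The library's documentation describes the energy threshold as a loudness cut-off: values below it are considered silence and values above it are considered speech. The sketch below only illustrates that idea and is not the library's own code. The names `rms_energy`, `quiet_chunk` and `tone_chunk` are hypothetical, and it assumes loudness is measured as the root-mean-square of 16-bit samples.

```python
import math

ENERGY_THRESHOLD = 300  # same value as set on the Recognizer above


def rms_energy(samples):
    """Root-mean-square energy of a sequence of signed PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Two hypothetical 0.1-second chunks of 16 kHz audio:
# pure silence and a 440 Hz tone with a peak amplitude of 1000
quiet_chunk = [0] * 1600
tone_chunk = [round(1000 * math.sin(2 * math.pi * 440 * n / 16000))
              for n in range(1600)]

for name, chunk in [("quiet", quiet_chunk), ("tone", tone_chunk)]:
    is_loud = rms_energy(chunk) > ENERGY_THRESHOLD
    print(name, "speech-like" if is_loud else "silence")
```

A higher threshold makes the recognizer stricter about what counts as speech, and a lower one makes it more sensitive.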

Using the Recognizer class to recognize speech
The Recognizer class has built-in functions that interact with speech APIs:
recognize_bing()

recognize_google()

recognize_google_cloud()

recognize_wit()

Input: audio_file

Output: transcribed speech from audio_file

SpeechRecognition Example
Focus on recognize_google()

Recognize speech from an audio file with SpeechRecognition:

# Import SpeechRecognition library
import speech_recognition as sr

# Instantiate Recognizer class
recognizer = sr.Recognizer()

# Transcribe speech using Google web API
recognizer.recognize_google(audio_data=audio_file,
                            language="en-US")

Learning speech recognition on DataCamp is awesome!

Your turn!
Reading audio files with SpeechRecognition

The AudioFile class
import speech_recognition as sr

# Setup recognizer instance

recognizer = sr.Recognizer()

# Read in audio file

clean_support_call = sr.AudioFile("clean-support-call.wav")

# Check type of clean_support_call

type(clean_support_call)

From AudioFile to AudioData
recognizer.recognize_google(audio_data=clean_support_call)

AssertionError: ``audio_data`` must be audio data

# Convert from AudioFile to AudioData
with clean_support_call as source:
    # Record the audio
    clean_support_call_audio = recognizer.record(source)

# Check the type
type(clean_support_call_audio)

Transcribing our AudioData
# Transcribe clean support call
recognizer.recognize_google(audio_data=clean_support_call_audio)

hello I'd like to get some help setting up my account please

Duration and offset
duration and offset both None by default

# Leave duration and offset as default
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source,
                                                 duration=None,
                                                 offset=None)

# Get the first 2 seconds of the clean support call
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source,
                                                 duration=2.0)

hello I'd like to get
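
To choose sensible duration and offset values, it helps to know how long the recording is. A minimal sketch using Python's standard `wave` module follows. The helper `wav_duration_seconds` and the generated `demo.wav` are hypothetical stand-ins for a real recording such as the support call.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path):
    """Length of a WAV file in seconds (frame count / frame rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


sample_rate = 16000  # assumed sample rate for the stand-in file

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write 3 seconds of silence (mono, 16-bit) as a stand-in recording
    demo_path = os.path.join(tmp_dir, "demo.wav")
    with wave.open(demo_path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * (sample_rate * 3))

    total_seconds = wav_duration_seconds(demo_path)

# Skip the first second and keep everything after it
offset = 1.0
duration = total_seconds - offset
print(total_seconds, offset, duration)  # 3.0 1.0 2.0

# These values would then be passed on, for example:
# recognizer.record(source, offset=offset, duration=duration)
```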

Let's practice!
Dealing with different kinds of audio

What language?
# Create a Recognizer instance
recognizer = sr.Recognizer()

# Pass the Japanese audio to recognize_google
text = recognizer.recognize_google(japanese_good_morning,
                                   language="en-US")

# Print the text
print(text)

Ohio gozaimasu

What language?
# Create a Recognizer instance
recognizer = sr.Recognizer()

# Pass the Japanese audio to recognize_google
text = recognizer.recognize_google(japanese_good_morning,
                                   language="ja")

# Print the text
print(text)

おはようございます

Non-speech audio
# Import the leopard roar audio file
leopard_roar = sr.AudioFile("leopard_roar.wav")

# Convert the AudioFile to AudioData
with leopard_roar as source:
    leopard_roar_audio = recognizer.record(source)

# Recognize the AudioData

recognizer.recognize_google(leopard_roar_audio)

UnknownValueError:

Non-speech audio
# Import the leopard roar audio file
leopard_roar = sr.AudioFile("leopard_roar.wav")

# Convert the AudioFile to AudioData
with leopard_roar as source:
    leopard_roar_audio = recognizer.record(source)

# Recognize the AudioData with show_all turned on
recognizer.recognize_google(leopard_roar_audio,
                            show_all=True)

[]

Showing all
# Recognizing Japanese audio with show_all=True
text = recognizer.recognize_google(japanese_good_morning,
                                   language="en-US",
                                   show_all=True)

# Print the text
print(text)

{'alternative': [{'transcript': 'Ohio gozaimasu', 'confidence': 0.89041114},
                 {'transcript': 'all hail gozaimasu'},
                 {'transcript': 'ohayo gozaimasu'},
                 {'transcript': 'olho gozaimasu'},
                 {'transcript': 'all Hale gozaimasu'}],
 'final': True}
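
With show_all=True the result is either a dictionary of alternatives, as above, or an empty list, as with the leopard roar. A small sketch of one way to handle both shapes follows. The helper `best_transcript` is hypothetical, and it assumes the first alternative is the most likely one, as in the output above.

```python
def best_transcript(result):
    """Return (transcript, confidence) for the first alternative, or None."""
    if not result:  # [] means nothing was recognized
        return None
    alternatives = result.get("alternative", [])
    if not alternatives:
        return None
    top = alternatives[0]
    # Only some alternatives carry a confidence score
    return top["transcript"], top.get("confidence")


# The show_all output from the Japanese example above
slide_result = {
    "alternative": [
        {"transcript": "Ohio gozaimasu", "confidence": 0.89041114},
        {"transcript": "all hail gozaimasu"},
        {"transcript": "ohayo gozaimasu"},
        {"transcript": "olho gozaimasu"},
        {"transcript": "all Hale gozaimasu"},
    ],
    "final": True,
}

print(best_transcript(slide_result))  # ('Ohio gozaimasu', 0.89041114)
print(best_transcript([]))            # None
```

Looking through the full list of alternatives can be worthwhile: here the third one, 'ohayo gozaimasu', is closer to the spoken Japanese than the top result.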

Multiple speakers
# Import an audio file with multiple speakers
multiple_speakers = sr.AudioFile("multiple-speakers.wav")

# Convert AudioFile to AudioData
with multiple_speakers as source:
    multiple_speakers_audio = recognizer.record(source)

# Recognize the AudioData

recognizer.recognize_google(multiple_speakers_audio)

one of the limitations of the speech recognition library is that it doesn't recognise different speakers and voices it will just return it all as one block of text

Multiple speakers
# Import audio files separately
speakers = [sr.AudioFile("s0.wav"), sr.AudioFile("s1.wav"), sr.AudioFile("s2.wav")]

# Transcribe each speaker individually
for i, speaker in enumerate(speakers):
    with speaker as source:
        speaker_audio = recognizer.record(source)
    print(f"Text from speaker {i}: {recognizer.recognize_google(speaker_audio)}")

Text from speaker 0: one of the limitations of the speech recognition library
Text from speaker 1: is that it doesn't recognise different speakers and voices
Text from speaker 2: it will just return it all as one block a text

Noisy audio
If you have trouble hearing the speech, so will the APIs

# Import audio file with background noise
noisy_support_call = sr.AudioFile("noisy_support_call.wav")

with noisy_support_call as source:
    # Adjust for ambient noise and record
    recognizer.adjust_for_ambient_noise(source,
                                        duration=0.5)
    noisy_support_call_audio = recognizer.record(source)

# Recognize the audio
recognizer.recognize_google(noisy_support_call_audio)

hello ID like to get some help setting up my calories

Let's practice!