ASR System: Rashmi Kethireddy
Acoustic Modeling
What does it do?
- HMM+GMM
- HMM+DNN
- Connectionist Temporal Classification (CTC)
- Sequence to sequence neural network model
Acoustic Modeling - Markov assumption
The Markov assumption states that the next state depends only on the current
state, not on any earlier states.
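Written as a formula, assuming the usual notation where q_i is the state at time step i (the symbol names are not from the slides):

```latex
P(q_i \mid q_1, q_2, \ldots, q_{i-1}) = P(q_i \mid q_{i-1})
```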
Example :
Here the state sequence is hidden. Let us find the probability of the observation sequence “3 1 3” given the state sequence “hot hot cold”:
P(3 1 3|hot hot cold) = P(3|hot)×P(1|hot)×P(3|cold) = 0.4 × 0.2 × 0.1 = 0.008
P(3 1 3,hot hot cold) = P(hot|start)×P(hot|hot)×P(cold|hot)×P(3|hot)×P(1|hot)×P(3|cold)
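The two products above can be checked with a short sketch. The emission values 0.4, 0.2 and 0.1 come from the example; the transition values are not given in the text, so the ones below are placeholders assumed only for illustration, and the function names are hypothetical.

```python
# Emission probabilities P(observation | state), taken from the example above.
emission = {
    ("hot", 3): 0.4,
    ("hot", 1): 0.2,
    ("cold", 3): 0.1,
}

# ASSUMPTION: transition probabilities P(next | previous) are not given in the
# text; these values are placeholders chosen only for illustration.
transition = {
    ("start", "hot"): 0.8,
    ("hot", "hot"): 0.6,
    ("hot", "cold"): 0.4,
}


def likelihood_given_states(observations, states):
    """P(observations | states): product of the emission probabilities."""
    p = 1.0
    for obs, state in zip(observations, states):
        p *= emission[(state, obs)]
    return p


def joint_probability(observations, states):
    """P(observations, states): transition terms times emission terms."""
    p = likelihood_given_states(observations, states)
    previous = "start"
    for state in states:
        p *= transition[(previous, state)]
        previous = state
    return p


observations = [3, 1, 3]
states = ["hot", "hot", "cold"]
conditional = likelihood_given_states(observations, states)
joint = joint_probability(observations, states)
```

With different transition values only `joint` changes; `conditional` depends on the emission probabilities alone.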
Initial state probabilities: Πh = ⅓, Πc = ⅔
The Baum-Welch algorithm solves this by iteratively estimating the counts. We will start with an estimate for the
transition and observation probabilities and then use these estimated probabilities to derive better and better
probabilities.
Solution 3 : Backward Algorithm
Acoustic modeling : Solution 3
Third problem for HMMs: learning the parameters of an HMM, that is, the A (transition) and B (observation) matrices
Forward-Backward Algorithm
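A minimal sketch of the forward and backward passes on the hot/cold example, in plain Python. The initial probabilities and the emission values P(3|hot), P(1|hot), P(3|cold) are taken from the slides; all other numbers, and the function names, are assumptions made so the example runs. The state posteriors `gamma` computed at the end are the kind of expected counts that Baum-Welch re-estimation builds on; the re-estimation step itself is not shown.

```python
def forward(obs, states, pi, A, B):
    """alpha[t][s] = P(o_1..o_t, state at t = s)."""
    alpha = [{s: pi[s] * B[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({
            s: sum(prev[r] * A[r][s] for r in states) * B[s][o]
            for s in states
        })
    return alpha


def backward(obs, states, A, B):
    """beta[t][s] = P(o_{t+1}..o_T | state at t = s)."""
    beta = [{s: 1.0 for s in states}]  # beta at the final time step
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, {
            s: sum(A[s][r] * B[r][o] * nxt[r] for r in states)
            for s in states
        })
    return beta


states = ["hot", "cold"]
pi = {"hot": 1 / 3, "cold": 2 / 3}  # initial probabilities from the slides
# ASSUMPTION: transition values are illustrative placeholders.
A = {"hot": {"hot": 0.6, "cold": 0.4},
     "cold": {"hot": 0.5, "cold": 0.5}}
# P(1|hot), P(3|hot), P(3|cold) are from the slides; the rest are assumed.
B = {"hot": {1: 0.2, 2: 0.4, 3: 0.4},
     "cold": {1: 0.5, 2: 0.4, 3: 0.1}}
obs = [3, 1, 3]

alpha = forward(obs, states, pi, A, B)
beta = backward(obs, states, A, B)

# The observation likelihood can be read off either pass.
likelihood_fwd = sum(alpha[-1].values())
likelihood_bwd = sum(pi[s] * B[s][obs[0]] * beta[0][s] for s in states)

# gamma[t][s] = P(state at t = s | all observations).
gamma = [
    {s: alpha[t][s] * beta[t][s] / likelihood_fwd for s in states}
    for t in range(len(obs))
]
```

Both passes should give the same likelihood, which is a convenient sanity check when implementing them.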
HMM-GMM
Pronunciation Modeling
Maps each word to a sequence of phones (phonemes). This mapping is called a pronunciation dictionary (lexicon).
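A pronunciation dictionary can be sketched as a lookup table from words to phone sequences. The entries below are illustrative, written with ARPAbet-style symbols, and are not taken from the example corpus; the helper name `pronounce` is hypothetical.

```python
# Each word maps to one or more pronunciations; each pronunciation is a
# list of phone symbols. Entries are illustrative assumptions.
lexicon = {
    "the": [["DH", "AH"], ["DH", "IY"]],
    "cat": [["K", "AE", "T"]],
    "sat": [["S", "AE", "T"]],
}


def pronounce(sentence, lexicon):
    """Return the first listed pronunciation of each word, concatenated.

    Raises KeyError for out-of-vocabulary words.
    """
    phones = []
    for word in sentence.lower().split():
        phones.extend(lexicon[word][0])
    return phones


phones = pronounce("The cat sat", lexicon)
```

Keeping a list of pronunciations per word leaves room for variants such as the two entries for "the".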
Example corpus :
WFSA (Weighted Finite State Automaton) : accepts an input string and outputs
a weight, which is the probability of that string.
WFST (Weighted Finite State Transducer) : maps an input string to an output
string over the output alphabet, together with a weight (probability).
● Input sequence abcd maps to XYZW with transition probabilities 0.1, 0.2,
0.5, 0.1
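The abcd example can be sketched as a tiny transducer. The linear chain of states 0 to 4 and the function name `transduce` are assumptions; only the labels and the four probabilities come from the bullet above. The path weight is taken as the product of the arc probabilities.

```python
# Arcs: (source_state, input_label) -> (output_label, weight, next_state).
# ASSUMPTION: a linear chain of states 0..4 with state 4 final.
arcs = {
    (0, "a"): ("X", 0.1, 1),
    (1, "b"): ("Y", 0.2, 2),
    (2, "c"): ("Z", 0.5, 3),
    (3, "d"): ("W", 0.1, 4),
}
final_states = {4}


def transduce(input_string, start=0):
    """Return (output_string, path_weight), or None if the input is rejected."""
    state, output, weight = start, [], 1.0
    for symbol in input_string:
        if (state, symbol) not in arcs:
            return None
        out_label, arc_weight, state = arcs[(state, symbol)]
        output.append(out_label)
        weight *= arc_weight  # probabilities multiply along the path
    if state not in final_states:
        return None
    return "".join(output), weight


result = transduce("abcd")
```

Dropping the output labels from the arcs would turn this into a WFSA, which returns only the weight.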
WFST Operations
● Composition
○ Compose two transducers
● Determinization
○ Unique start state
○ No two transitions leaving a state share the same input label
○ No epsilon transitions
● Minimization
○ Merging of equivalent states, so that the number of states is as small as possible.
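Of the three operations, composition is the easiest to sketch in a few lines. The version below assumes epsilon-free transducers with probability weights (weights multiply), and the dictionary layout and the name `compose` are hypothetical choices for this example, not a real toolkit API. An arc of the first transducer is paired with an arc of the second whenever the first arc's output label equals the second arc's input label.

```python
from collections import deque


def compose(t1, t2):
    """Compose two epsilon-free WFSTs.

    Each transducer is a dict with keys 'start', 'finals' and 'arcs', where
    'arcs' maps a state to a list of (input, output, weight, next_state).
    """
    start = (t1["start"], t2["start"])
    arcs, finals = {}, set()
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        q1, q2 = state
        arcs[state] = []
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals.add(state)
        for in1, out1, w1, next1 in t1["arcs"].get(q1, []):
            for in2, out2, w2, next2 in t2["arcs"].get(q2, []):
                if out1 == in2:  # labels must match in the middle
                    nxt = (next1, next2)
                    arcs[state].append((in1, out2, w1 * w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy example: t1 maps a -> x, t2 maps x -> P, so the composition maps a -> P.
t1 = {"start": 0, "finals": {1}, "arcs": {0: [("a", "x", 0.5, 1)]}}
t2 = {"start": 0, "finals": {1}, "arcs": {0: [("x", "P", 0.4, 1)]}}
composed = compose(t1, t2)
```

Only state pairs reachable from the paired start state are built, which keeps the result smaller than the full product of the two state sets.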
Composition
Determinization
Minimization