Predicting search term reliability for spoken term detection systems
Spoken term detection is an extension of text-based searching that allows users to type keywords and search audio files containing recordings of spoken language. Performance depends on many external factors such as the acoustic channel, language, ...
Continuous speech recognition using linear dynamic models
Hidden Markov models (HMMs) with Gaussian mixture distributions rely on an assumption that speech features are temporally uncorrelated, and often assume a diagonal covariance matrix where correlations between feature vectors for adjacent frames are ...
A nonlinear autoregressive model for speaker verification
Gaussian Mixture Models (GMMs) have been the most popular approach in speaker recognition and verification for over two decades. The inefficiencies of this model for signals such as speech are well documented and include an inability to model temporal ...
Wavelet basis selection for enhanced speech parametrization in speaker verification
We study the inherent properties of nine wavelet functions and subsequently evaluate their applicability as basis functions in a speech parametrization scheme that is advantageous for speaker verification. Particularly, the inherent properties of nine ...
A novel sub-band adaptive filtering for acoustic echo cancellation based on empirical mode decomposition algorithm
Acoustic echo cancellation is one of the most demanding requirements in hands-free telephony and teleconferencing. This paper proposes an Empirical Mode Decomposition (EMD)-based sub-band adaptive filtering structure, which applies the EMD-...
Robust regression fusion of GMM-UBM and GMM-SVM normalized scores using G729 bit-stream for speaker recognition over IP
A novel approach, based on robust regression with normalized score fusion (namely Normalized Scores following Robust Regression Fusion: NSRRF), is proposed for enhancement of speaker recognition over IP networks, which can be used both in Network ...
Speech enhancement with an adaptive Wiener filter
- Marwa A. Abd El-Fattah,
- Moawad I. Dessouky,
- Alaa M. Abbas,
- Salaheldin M. Diab,
- El-Sayed M. El-Rabaie,
- Waleed Al-Nuaimy,
- Saleh A. Alshebeili,
- Fathi E. Abd El-Samie
This paper proposes an adaptive Wiener filtering method for speech enhancement. The method adapts the filter transfer function from sample to sample based on the speech signal statistics: the local mean and the local variance. It ...
Film segmentation and indexing using autoassociative neural networks
In this paper, Autoassociative Neural Network (AANN) models are explored for segmenting and indexing films (movies) using audio features. A two-stage method is proposed for segmenting the film into a sequence of scenes, and then indexing them ...
Automatic detection of breathy voiced vowels in Gujarati speech
This paper proposes a method for automatic detection of breathy voiced vowels in continuous Gujarati speech. As breathy voice is a specific phonetic feature predominantly present in Gujarati among Indian languages, it can be used for identifying ...
Spoken keyword detection using autoassociative neural networks
Spoken keyword detection is essential for efficiently organizing many hours of audio content such as meetings, radio news, etc. These systems are developed to index large audio databases or to detect keywords in continuous ...