We show that signal flow graph theory provides a simple way to relate two popular algorithms used for adapting dynamic neural networks, real-time backpropagation and backpropagation-through-time. Starting with the flow graph for real-time backpropagation, we use a simple transposition to produce a second graph. The new graph is shown to be interreciprocal with the original and to correspond to the backpropagation-through-time algorithm. Interreciprocity provides a theoretical argument to verify that both flow graphs implement the same overall weight update.
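The equivalence claimed above can be checked numerically on a toy case. The following is a minimal sketch, not the paper's flow-graph construction: it assumes a hypothetical scalar recurrent unit h_t = tanh(w*h_{t-1} + u*x_t) with a summed squared-error loss, and computes the gradient with respect to w once by forward-in-time sensitivity propagation (the real-time style) and once by backward-in-time error propagation (the through-time style). All function and variable names are illustrative.

```python
import math

def forward(w, u, xs):
    """Run h_t = tanh(w*h_{t-1} + u*x_t) from h_0 = 0; returns [h_0, ..., h_T]."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, ds):
    """Summed squared error E = sum_t 0.5 * (h_t - d_t)^2."""
    hs = forward(w, u, xs)
    return sum(0.5 * (h - d) ** 2 for h, d in zip(hs[1:], ds))

def grad_real_time(w, u, xs, ds):
    """Forward-in-time: carry the sensitivity p_t = dh_t/dw along with the state."""
    hs = forward(w, u, xs)
    p, grad = 0.0, 0.0
    for t in range(1, len(hs)):
        p = (1.0 - hs[t] ** 2) * (hs[t - 1] + w * p)
        grad += (hs[t] - ds[t - 1]) * p
    return grad

def grad_through_time(w, u, xs, ds):
    """Backward-in-time: propagate the error signal from the last step to the first."""
    hs = forward(w, u, xs)
    delta_next, grad = 0.0, 0.0  # delta_next is dE/da_{t+1}, a = pre-activation
    for t in range(len(hs) - 1, 0, -1):
        dh = (hs[t] - ds[t - 1]) + w * delta_next  # direct error plus error from t+1
        delta = dh * (1.0 - hs[t] ** 2)
        grad += delta * hs[t - 1]
        delta_next = delta
    return grad

if __name__ == "__main__":
    xs, ds = [0.5, -1.0, 0.3, 0.8], [0.1, -0.2, 0.0, 0.4]
    print(grad_real_time(0.7, 0.4, xs, ds))
    print(grad_through_time(0.7, 0.4, xs, ds))
```

Both routines should return the same total gradient up to floating-point rounding. The forward version needs no stored history but carries a sensitivity per weight, while the backward version stores the state sequence and makes a single reverse pass.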
We present an approach to cluster the training data for automatic speech recognition (ASR). A relative-entropy-based distance metric between training data clusters is defined. This metric is used to hierarchically cluster the training data. The metric can also be used to select the closest training data clusters given a small amount of data from the test speaker. The selected clusters are then used to estimate a
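As an illustration of a relative-entropy-based distance, the sketch below assumes each cluster is summarized by a single diagonal-covariance Gaussian and uses the symmetrized Kullback-Leibler divergence between two such Gaussians. The abstract does not state how clusters are modeled or how the metric is symmetrized, so these choices and all names here are assumptions made for the example.

```python
import math

def kl_diag_gauss(m0, v0, m1, v1):
    """KL(N0 || N1) for diagonal Gaussians given lists of means and variances."""
    return 0.5 * sum(
        math.log(b / a) + (a + (p - q) ** 2) / b - 1.0
        for p, a, q, b in zip(m0, v0, m1, v1)
    )

def sym_kl(c0, c1):
    """Symmetrized relative entropy between two (means, variances) clusters."""
    (m0, v0), (m1, v1) = c0, c1
    return kl_diag_gauss(m0, v0, m1, v1) + kl_diag_gauss(m1, v1, m0, v0)

def closest_pair(clusters):
    """Names of the two closest clusters: the pair one agglomerative step would merge."""
    names = sorted(clusters)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return min(pairs, key=lambda ab: sym_kl(clusters[ab[0]], clusters[ab[1]]))

def closest_clusters(test_stats, clusters, k):
    """Names of the k training clusters nearest to statistics from the test speaker."""
    return sorted(clusters, key=lambda n: sym_kl(test_stats, clusters[n]))[:k]

if __name__ == "__main__":
    # Hypothetical one-dimensional clusters, each given as (means, variances).
    clusters = {"a": ([0.0], [1.0]), "b": ([5.0], [1.0]), "c": ([0.5], [1.0])}
    print(closest_pair(clusters))
    print(closest_clusters(([0.4], [1.0]), clusters, 2))
```

Repeating the `closest_pair` step and merging the selected pair would build the hierarchy bottom-up; how merged clusters are re-estimated is left out because the abstract does not specify it.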
Deriving backpropagation algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we present a unified approach to derive such algorithms via a set of simple block diagram manipulation rules.
We propose a technique to port channel characteristics from one language to another. This allows us to build acoustic models in a target language that are robust to an environment for which we have no data in that language. The approach consists in training broad phonetic class maximum likelihood linear regression (MLLR) transformations from a source language, and applying them in the target language. These transforms encapsulate the acoustic specificities of the environment without capturing language-specific characteristics that are difficult to port across languages. As a case study, we consider the problem of building in-the-car GSM models for UK English, assuming that we have no GSM, and no car data in UK English, but that we have such data in German. We show that this technique can greatly reduce the error rate of the recognition system on English GSM car data.
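The application step can be sketched as follows, under the standard MLLR mean-adaptation form in which each Gaussian mean mu is mapped to A*mu + b. The sketch assumes one (A, b) pair per broad phonetic class has already been estimated on source-language data; the estimation itself is not shown, and the phone labels, class names, and numbers are hypothetical.

```python
def apply_mllr(mean, A, b):
    """Affine mean transform: returns A @ mean + b using plain lists."""
    return [sum(a * m for a, m in zip(row, mean)) + bi for row, bi in zip(A, b)]

def adapt_means(phone_means, phone_to_class, class_transforms):
    """Adapt each target-language phone mean with its broad-class transform."""
    adapted = {}
    for phone, mean in phone_means.items():
        A, b = class_transforms[phone_to_class[phone]]
        adapted[phone] = apply_mllr(mean, A, b)
    return adapted

if __name__ == "__main__":
    # Hypothetical two-dimensional example with two broad classes.
    class_transforms = {
        "vowel": ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5]),
        "fricative": ([[2.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    }
    phone_to_class = {"iy": "vowel", "s": "fricative"}
    phone_means = {"iy": [1.0, 2.0], "s": [1.0, 3.0]}
    print(adapt_means(phone_means, phone_to_class, class_transforms))
```

Tying transforms to broad phonetic classes rather than to individual phones is what lets them be reused in a language whose phone inventory differs from that of the source language.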
Basic studies in denotational mathematics and mathematical engineering have led to the theory of abstract intelligence (aI), which is a set of mathematical models of natural and computational intelligence in cognitive informatics (CI) and cognitive computing (CC). Abstract intelligence has triggered recent breakthroughs in cognitive systems such as cognitive computers, cognitive robots, cognitive neural networks, and cognitive learning. This paper reports a set of position statements presented in the plenary panel (Part II) of IEEE ICCI*CC'16 on Cognitive Informatics and Cognitive Computing at Stanford University. The summary is contributed by invited panelists who are among the world's renowned scholars in the transdisciplinary field of CI and CC.
This paper describes a new approach to acoustic modeling for large vocabulary continuous speech recognition (LVCSR) systems. Each phone is modeled with a large Gaussian mixture model (GMM) whose context-dependent mixture weights are estimated with a sentence-level discriminative training criterion. The estimation problem is cast in a neural network framework, which enables the incorporation of the appropriate constraints on the mixture weight vectors and allows a straightforward training procedure based on steepest descent. Experiments conducted on the Callhome-English and Switchboard databases show a significant improvement in acoustic model performance, and a somewhat smaller improvement with the combined acoustic and language models.
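One common way to keep mixture weights positive and summing to one under steepest descent is to parameterize them as a softmax over unconstrained values. The sketch below assumes that parameterization, which the abstract does not confirm, and it substitutes a plain negative log-likelihood objective for the sentence-level discriminative criterion, which is not reproduced here. The per-frame component likelihoods are hypothetical inputs.

```python
import math

def softmax(z):
    """Map unconstrained values to weights that are positive and sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def neg_log_lik(z, frames):
    """Average negative log-likelihood of frames under weights softmax(z)."""
    w = softmax(z)
    return -sum(
        math.log(sum(wk * gk for wk, gk in zip(w, g))) for g in frames
    ) / len(frames)

def descent_step(z, frames, lr=0.5):
    """One steepest-descent step on the unconstrained parameters z."""
    w = softmax(z)
    grad = [0.0] * len(z)
    for g in frames:  # g holds the per-component likelihoods of one frame
        total = sum(wk * gk for wk, gk in zip(w, g))
        for j in range(len(z)):
            # Gradient of -log(sum_k w_k g_k) w.r.t. z_j: weight minus posterior.
            grad[j] += w[j] - w[j] * g[j] / total
    return [zj - lr * gj / len(frames) for zj, gj in zip(z, grad)]

if __name__ == "__main__":
    frames = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2]]
    z = [0.0, 0.0]
    for _ in range(5):
        z = descent_step(z, frames)
    print(softmax(z), neg_log_lik(z, frames))
```

Because the update acts on `z` rather than on the weights directly, the constraints hold after every step without any projection or renormalization.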
In the context of automatic speaker recognition, we propose a model transformation technique that renders speaker models more robust to acoustic mismatches and to data scarcity by appropriately increasing their variances. We use a stereo database containing speech recorded simultaneously under different acoustic conditions to derive a synthetic variance distribution. This distribution is then used to modify the variances of other speaker models from other telephone databases. The technique is illustrated with experiments conducted on a locally collected database and on the NIST'95 and '96 subsets of the Switchboard corpus.