Tony Robinson
Cambridge Alumnus, Engineering
This paper investigates the scaling properties of Recurrent Neural Network Language Models (RNNLMs). We discuss how to train very large RNNs on GPUs and address the questions of how RNNLMs scale with respect to model size, training-set size, computational cost and memory. Our analysis shows that, despite being more costly to train, RNNLMs obtain much lower perplexities on standard benchmarks than n-gram models. We train the largest known RNNs and present relative word error rate gains of 18% on an ASR task. We also report the lowest perplexities to date on the recently released billion-word language modelling benchmark, a 1 BLEU point gain in machine translation, and a 17% relative hit rate gain in word prediction.
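The core recurrence being scaled up can be sketched as a minimal Elman-style RNN language model. This is an illustrative reimplementation, not the paper's code: the vocabulary size, hidden size and random weights below are assumptions chosen only to make the sketch runnable.

```python
import numpy as np

# Minimal Elman-style recurrent network language model: at each step the
# hidden state summarises the word history and a softmax over the
# vocabulary predicts the next word.  Sizes and weights are illustrative.
rng = np.random.default_rng(0)
V, H = 10, 8                       # vocabulary size, hidden units (assumed)
W_ih = rng.normal(0, 0.1, (H, V))  # input (one-hot word) -> hidden
W_hh = rng.normal(0, 0.1, (H, H))  # hidden -> hidden recurrence
W_ho = rng.normal(0, 0.1, (V, H))  # hidden -> output logits

def step(h, word_id):
    """One time step: consume a word id, return new state and next-word distribution."""
    x = np.zeros(V)
    x[word_id] = 1.0
    h = np.tanh(W_ih @ x + W_hh @ h)
    logits = W_ho @ h
    p = np.exp(logits - logits.max())   # numerically stable softmax
    return h, p / p.sum()

def perplexity(word_ids):
    """Perplexity of a word sequence under the (untrained) model."""
    h, logp = np.zeros(H), 0.0
    for prev, nxt in zip(word_ids[:-1], word_ids[1:]):
        h, p = step(h, prev)
        logp += np.log(p[nxt])
    return np.exp(-logp / (len(word_ids) - 1))

ppl = perplexity([1, 4, 2, 7, 3])
# With untrained weights the perplexity sits near the uniform baseline V.
```

The cost structure the paper analyses is visible even in this sketch: the per-step work is dominated by the three matrix-vector products, so training cost grows with both sequence length and the square of the hidden size.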
This chapter contains sections titled: Objectives of the operator; Hydraulic ground-crop sprayers; Rotary atomizers; Granule applicators; Band sprayers; Knapsack sprayers; Sprayer faults; Errors in applying herbicides; Herbicide drift; Decontamination of sprayers and disposal of waste material; Storage of herbicides; References and further reading.
... DOLE [SIL] WHO ANNOUNCED HE IS RESIGNING FROM THE SENATE TO DEVOTE FULL TIME TO HIS QUEST FOR THE WHITE HOUSE [SIL ... Gethin Williams and Steve Renals at the University of Sheffield report using acoustic-based confidence measures derived from the ...
ABBOT is the hybrid connectionist-hidden Markov model large-vocabulary speech recognition system developed at Cambridge University. In this system, a recurrent network maps each acoustic vector to an estimate of the posterior probabilities of the phone classes. The maximum likelihood word string is then extracted using Markov models. As in traditional hidden Markov models, the Markov process is used to model the lexical and language model constraints. This paper describes the system which participated in the ...
ABBOT is the hybrid connectionist-hidden Markov model (HMM) large-vocabulary continuous speech recognition (CSR) system developed at Cambridge University. This system uses a recurrent network to estimate the acoustic observation probabilities within an HMM framework. A major advantage of this approach is that good performance is achieved using context-independent acoustic models with far fewer parameters than comparable HMM systems. This paper presents substantial performance improvements ...
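A defining feature of such hybrid systems is that the network estimates phone posteriors P(q|x), whereas HMM decoding expects observation likelihoods. The standard conversion divides the posteriors by the phone priors via Bayes' rule; a minimal sketch of that step, with illustrative numbers rather than values from the paper:

```python
import numpy as np

# The network outputs phone posteriors P(q|x).  By Bayes' rule,
# P(x|q) / P(x) = P(q|x) / P(q), so dividing by the class priors yields
# "scaled likelihoods" that can replace the HMM observation densities
# during decoding.  The numbers below are illustrative.
posteriors = np.array([0.7, 0.2, 0.1])  # P(q|x) from the recurrent network
priors     = np.array([0.5, 0.3, 0.2])  # P(q) estimated from training labels

scaled_likelihoods = posteriors / priors  # proportional to P(x|q)
# Decoding then chooses the word string maximising the product of these
# scaled likelihoods with the lexical and language model scores.
```

The shared P(x) factor cancels when comparing word-string hypotheses, which is why the scaling constant can be ignored.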
Research Interests: Computer Science, Performance, Markov Processes, Speech Recognition, Probability, Recurrent Neural Network, Hidden Markov Models, Performance Improvement, Decoding, System Development, Acoustical Engineering, Context Modeling, Merging, Language Model, Decoder, Grammars, and Acoustic Modeling
This paper describes the THISL system that participated in the TREC-7 evaluation, Spoken Document Retrieval (SDR) Track, and presents the results obtained, together with some analysis. The THISL system is based on the ABBOT speech recognition system and the thislIR text retrieval system. In this evaluation we were concerned with investigating the suitability for SDR of a recognizer running at less than ten times real time, the use of multiple transcriptions and word graphs, and the effect of simple query expansion ...
Automatic summarisation of spoken audio is a fairly new research pursuit, in large part due to the relative novelty of technology for accurately decoding audio into text. Techniques that account for the peculiarities and potential ambiguities of decoded audio (high error rates, lack of syntactic boundaries) appear promising for culling summary information from audio for content-based browsing and skimming. This paper combines acoustic confidence measures with simple information retrieval and extraction techniques in order to obtain accurate, readable summaries of broadcast news programs. It also demonstrates how extracted summaries, full-text speech recogniser output and audio files can be usefully linked together through an audio-visual interface. The results suggest that information extraction based on statistical information can produce viable summaries of decoded audio.

1. APPLICATION CONTEXT

Managing this contemporary explosion of audio and video materials calls for intelligent...
The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm. Standard initialisation schemes can lead to poor formant trajectories, yet these trajectories are important for vowel intelligibility. The aim of ...
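The E-step of EM for such a model runs a Kalman filter/smoother over the waveform. A minimal sketch of the filter forward pass on a one-dimensional linear state-space model, with all parameters (transition, observation, noise variances) chosen purely for illustration:

```python
# One-dimensional linear stochastic state-space model of the kind the
# abstract describes:  x_t = A x_{t-1} + w_t,  y_t = C x_t + v_t,
# with process noise variance Q and observation noise variance R.
# The E-step of EM would run this filter (plus a backward smoother);
# all parameter values here are illustrative assumptions.
A, C, Q, R = 0.9, 1.0, 0.1, 0.5

def kalman_filter(ys, x0=0.0, p0=1.0):
    """Forward pass: return filtered state means for the observations ys."""
    x, p, means = x0, p0, []
    for y in ys:
        # predict step
        x, p = A * x, A * p * A + Q
        # update step
        k = p * C / (C * p * C + R)   # Kalman gain
        x = x + k * (y - C * x)
        p = (1 - k * C) * p
        means.append(x)
    return means

means = kalman_filter([1.0, 1.2, 0.8, 1.1])
```

The initialisation problem the abstract raises corresponds to the choice of the model parameters and of `x0`/`p0` before the first EM iteration: a poor starting point can leave EM in a local optimum with badly placed formant trajectories.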
Research Interests: Speech Recognition, Speech Processing, Recurrent Neural Network, Hidden Markov Models, Real Time Systems, Spontaneous Speech, Error Analysis, Real Time, Decoding, System Development, Speech Signal Processing, Acoustical Engineering, Radio Broadcasting, System Performance, and Broadcast News
This thesis extends the error propagation network to deal with time varying or dynamic patterns. Examples are given of supervised, reinforcement driven and unsupervised learning.
Chapter 1 presents an overview of connectionist models.
Chapter 2 introduces the error propagation algorithm for general node types.
Chapter 3 discusses the issue of data representation in connectionist models.
Chapter 4 describes the use of several types of networks applied to the problem of the recognition of steady state vowels from multiple speakers.
Chapter 5 extends the error propagation algorithm to deal with time varying input. Three possible architectures are explored which deal with learning sequences of known length and sequences of unknown and possibly indefinite length. Several simple examples are given.
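Extending error propagation to time-varying input amounts to unrolling the recurrence over the sequence and applying the chain rule backwards through the stored states. A minimal sketch on a scalar recurrent unit, with the weight and inputs as illustrative assumptions, checked against a numerical gradient:

```python
import numpy as np

# Backpropagation through time on a scalar recurrent unit
# h_t = tanh(w * h_{t-1} + x_t): unroll the sequence forwards, storing
# states, then accumulate the gradient of the final output with respect
# to the shared weight w by walking the unrolled steps in reverse.
w, xs = 0.5, [1.0, -0.5, 0.25]   # illustrative weight and input sequence

# forward pass, storing every state for the backward sweep
hs = [0.0]
for x in xs:
    hs.append(np.tanh(w * hs[-1] + x))

# backward pass: d(h_T)/dw via the chain rule through every time step
grad, dh = 0.0, 1.0
for t in range(len(xs), 0, -1):
    dpre = dh * (1.0 - hs[t] ** 2)   # derivative through tanh
    grad += dpre * hs[t - 1]         # contribution of w at this step
    dh = dpre * w                    # propagate error to the earlier state

# numerical check of the analytic gradient by central differences
eps = 1e-6
def forward(wv):
    h = 0.0
    for x in xs:
        h = np.tanh(wv * h + x)
    return h
numeric = (forward(w + eps) - forward(w - eps)) / (2 * eps)
```

Storing the full state sequence handles sequences of known length directly; the chapter's other architectures address sequences of unknown or indefinite length, where full unrolling is not possible.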
Chapter 6 describes the use of two dynamic nets to form a speech coder. The popular method of Differential Pulse Code Modulation for speech coding employs two linear filters to encode and decode speech. Generalising these to non-linear filters, implemented as dynamic nets, reduces the noise imposed by a limited-bandwidth channel.
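The linear baseline being generalised can be sketched as classic first-order DPCM: the encoder transmits the quantised difference between each sample and a linear prediction from the previous decoded sample, and the decoder mirrors the predictor. The predictor coefficient and quantiser step below are illustrative assumptions; the chapter's contribution is replacing these linear predictors with recurrent (dynamic) nets.

```python
# First-order linear DPCM.  The encoder keeps its own decoded copy in
# sync with the receiver so prediction errors cannot accumulate; the
# reconstruction error per sample is bounded by half the quantiser step.
STEP = 0.25  # quantiser step size (assumed)

def dpcm_encode(samples, a=0.9):
    """Encode: quantise the prediction residual against the decoded signal."""
    codes, decoded = [], 0.0
    for s in samples:
        pred = a * decoded                # linear predictor
        code = round((s - pred) / STEP)   # quantised residual (the channel symbol)
        codes.append(code)
        decoded = pred + code * STEP      # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes, a=0.9):
    """Decode: run the same predictor and add back the quantised residuals."""
    out, decoded = [], 0.0
    for code in codes:
        decoded = a * decoded + code * STEP
        out.append(decoded)
    return out

signal = [0.1, 0.3, 0.5, 0.4]
restored = dpcm_decode(dpcm_encode(signal))
```

In the dynamic-net version, `pred = a * decoded` on both sides is replaced by the output of a trained recurrent network, while the quantised-residual channel stays the same.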
Chapter 7 describes the application of a dynamic net to the recognition of a large subset of the phonemes of English from continuous speech. The dynamic net is found to give a higher recognition rate than both a fixed-window net and the established k-nearest-neighbour technique.
Chapter 8 describes a further development of dynamic nets which allows them to be trained by a reinforcement signal which expresses the correctness of the output of the net. Two possible architectures are given and an example of learning to play the game of noughts and crosses is presented.