Ian McLoughlin

University of Kent, Computing, Faculty Member

Nanyang Technological University, School of Computer Engineering, Emeritus

University of Birmingham, Electrical and Computer Engineering, Alumnus

University of Science and Technology of China, EEIS@USTC, Adjunct

Professor Ian Vince McLoughlin worked with the electronics R

About Me by Ian McLoughlin

Background, CV, paper list, links to more information

Research by Ian McLoughlin

My communications research

Various topics in wireless communications and networking explained (with links to papers)

My speech research

This webpage overviews most of my recent speech work, with links to various papers.

Books by Ian McLoughlin

Applied Speech and Audio Processing (with MATLAB examples)

Applied Speech and Audio Processing is a MATLAB-based, one-stop resource that blends speech and hearing research in describing the key techniques of speech and audio processing. This practically oriented text provides MATLAB examples throughout to illustrate the concepts discussed and to give the reader hands-on experience with important techniques. Chapters on basic audio processing and the characteristics of speech and hearing lay the foundations of speech signal processing, which are built upon in subsequent sections explaining audio handling, coding, compression, and analysis techniques. The final chapter explores a number of advanced topics that use these techniques, including psychoacoustic modelling, a subject which underpins MP3 and related audio formats. With its hands-on nature and numerous MATLAB examples, this book is ideal for graduate students and practitioners working with speech or audio systems.

Papers by Ian McLoughlin

Nonintrusive quality assessment of noise suppressed speech with Mel-Filtered energies and support vector regression

Speech Recognition for Smart Homes

Improvements Relating to Radio Communication Systems

Evaluation of ITU-T G.728 as a voice over IP codec for Chinese speech

by Krzysztof Pawlikowski and Ian McLoughlin

Abstract: Voice-over-IP is expected to become a popular service offered by the internet. Thus, it is important to ensure high quality of service. In this paper, we look at two standards proposed for evaluating the intelligibility of Chinese speech. Adopting the philosophy and methodology of the Diagnostic Rhyme Test (DRT) for testing English speech, the Chinese Diagnostic Rhyme Test (CDRT) evaluates the six elementary phonemic attributes of Chinese words.

Reconfigurable Processing Framework for Space-Time Block Codes

by Kishore Mehrotra and Ian McLoughlin

LSP-based speech modification for intelligibility enhancement

Proceedings of 13th International Conference on Digital Signal Processing, 2000

CELP coders commonly use line spectral pairs (LSP) to represent linear prediction parameters, giving stable filters and efficient coding. However, manipulation of LSPs can alter frequencies within the represented signals. This paper describes two computationally efficient LSP-based processing methods designed to enhance the intelligibility of speech degraded by acoustic interference.

Channel prediction for mitigating feedback link issues in transmit antenna selection systems

2009 IEEE 20th International Symposium on Personal Indoor and Mobile Radio Communications, Sep 1, 2009

Low-Cost Space-Borne Processing on a Reconfigurable Parallel Architecture

ERSA, 2004

Reconstruction of pitch for whisper-to-speech conversion of Chinese

The 9th International Symposium on Chinese Spoken Language Processing, Sep 1, 2014

Reconfigurable, Fault Tolerant and High Performance Payload for Space Missions

Square-rich fixed point polynomial evaluation on FPGAs

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays - FPGA '14, 2014

Deep bottleneck features for spoken language identification

PLoS ONE, 2014

A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore they may be susceptible to the variations caused by different speakers, specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF) for spoken LID, motivated by the success of Deep Neural Networks (DNN) in speech recognition. We show that DBFs can form a low-dimensio...

FPGA implementation of space-time block coding systems

Proceedings of the IEEE 6th Circuits and Systems Symposium on Emerging Technologies: Frontiers of Mobile and Wireless Communication (IEEE Cat. No.04EX710), 2004

Mouth State Detection From Low-Frequency Ultrasonic Reflection

Circuits, Systems, and Signal Processing, 2014

Abstract: This paper develops, simulates and experimentally evaluates a novel method based on non-contact low frequency (LF) ultrasound which can determine, from airborne reflection, whether the lips of a subject are open or closed. The method is capable of accurately distinguishing between open and closed lip states through the use of a low-complexity detection algorithm, and is highly robust to interfering audible noise. A novel voice activity detector is implemented and evaluated using the proposed method and shown to detect voice activity with high accuracy, even in the presence of high levels of background noise. The lip state detector is evaluated at a number of angles of incidence to the mouth and under various conditions of background noise. The underlying mouth state detection technique relies upon an inaudible LF ultrasonic excitation, generated in front of the face of a user, either reflecting back from their face as a simple echo in the closed mouth state or resonating inside the open mouth and vocal tract, affecting the spectral response of the reflected wave when the mouth is open. The difference between echo and resonance behaviours is used as the basis for automated lip opening detection, which implies determining whether the mouth is open or closed at the lips. Apart from this, potential applications include use in voice generation prostheses for speech-impaired patients, or as a hands-free control for electrolarynx and similar rehabilitation devices. It is also applicable to silent speech interfaces and may have use for speech authentication.

Shaping Spectral Leakage for IEEE 802.11p Vehicular Communications

2014 IEEE 79th Vehicular Technology Conference (VTC Spring), 2014

Performance investigation and implementation of a real-time adaptive MIMO-DFE system

2006 IEEE Singapore International Conference on Communication Systems, ICCS 2006, 2006

Abstract: This paper presents the performance investigation and FPGA implementation aspects of a r...

Link layer error mitigation in rural UHF-MIMO linking systems

IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2007

A Perspective on the Experiential Learning of Computer Architecture

Proceedings - 2010 IEEE/ACM International Conference on Green Computing and Communications, GreenCom 2010, 2010 IEEE/ACM International Conference on Cyber, Physical and Social Computing, CPSCom 2010, 2010

Multi-touch wall displays for informational and interactive collaborative space

Communications in Computer and Information Science, 2012

Mobile Communications Using Source-Selected Multi-Antenna AF Relays Over Dual-Hop Nakagami-m Channels

Wireless Personal Communications, 2013

Abstract: This paper presents a scheme for dual-hop amplify-and-forward multi-antenna, multi-relay selection over Nakagami-m fading channels. A source-selected best relay performs maximal ratio combining on received data, applies variable gain, and then uses beamforming to transmit to a destination device. Such a configuration is beneficial for end-to-end communication using single antenna mobile terminals with a multi-antenna relay infrastructure. Closed form expressions for performance metrics are derived that cater for an arbitrary number of relays, arbitrary numbers of receive and transmit antennas, and different fading parameters. Results are verified through simulation. Furthermore, the influence of multiple antennas, the effects of fading, power imbalance between hops, and the beneficial impact of additional relays are explored.
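
The relay chain described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's derivation or its closed-form result. It assumes the textbook variable-gain amplify-and-forward relation for end-to-end SNR, assumes the "best" relay is the one with the largest end-to-end SNR, and models each antenna's channel power gain under Nakagami-m fading as a Gamma variate with unit mean. All function and parameter names are hypothetical.

```python
import random


def nakagami_power_gain(m, rng):
    # Squared envelope of a unit-power Nakagami-m channel is Gamma(shape=m, scale=1/m).
    return rng.gammavariate(m, 1.0 / m)


def combined_snr(avg_snr, m, n_antennas, rng):
    # Maximal ratio combining (receive) or matched beamforming (transmit)
    # adds the per-antenna SNRs.
    return sum(avg_snr * nakagami_power_gain(m, rng) for _ in range(n_antennas))


def end_to_end_snr(snr_hop1, snr_hop2):
    # Textbook variable-gain AF relation (an assumption, not taken from the paper).
    return snr_hop1 * snr_hop2 / (snr_hop1 + snr_hop2 + 1.0)


def best_relay_snr(n_relays, n_rx, n_tx, avg_snr1, avg_snr2, m1, m2, rng):
    # Assumed selection rule: choose the relay with the largest end-to-end SNR.
    return max(
        end_to_end_snr(
            combined_snr(avg_snr1, m1, n_rx, rng),
            combined_snr(avg_snr2, m2, n_tx, rng),
        )
        for _ in range(n_relays)
    )


def outage_probability(threshold, trials, rng, **link):
    # Fraction of channel draws whose best-relay SNR falls below the threshold.
    below = sum(best_relay_snr(rng=rng, **link) < threshold for _ in range(trials))
    return below / trials


if __name__ == "__main__":
    rng = random.Random(1)
    link = dict(n_rx=2, n_tx=2, avg_snr1=5.0, avg_snr2=5.0, m1=2.0, m2=2.0)
    for n_relays in (1, 2, 4):
        p = outage_probability(2.0, 20000, rng, n_relays=n_relays, **link)
        print(n_relays, p)
```

Running the script prints an estimated outage probability for one, two and four relays. Under these assumptions the estimate should fall as relays are added, which is consistent with the beneficial impact of additional relays mentioned in the abstract.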