A major issue in evaluating speech enhancement and hearing compensation algorithms is to come up with a suitable metric that predicts intelligibility as judged by a human listener. Previous methods such as the widely used Speech Transmission Index (STI) fail to account for masking effects that arise from the highly nonlinear cochlear transfer function. We therefore propose a Neural Articulation Index (NAI) that estimates speech intelligibility from the instantaneous neural spike rate over time, produced when a signal is processed by an auditory neural model. By using a well developed model of the auditory periphery and detection theory we show that human perceptual discrimination closely matches the modeled distortion in the instantaneous spike rates of the auditory nerve. In highly rippled frequency transfer conditions the NAI's prediction error is 8% versus the STI's prediction error of 10.8%.
Accurate models of normal and impaired neural representations of sound are useful tools in understanding how acoustic stimuli are encoded in the brain, predicting speech intelligibility, and developing and testing speech processing schemes for hearing aids. In this paper we review recent developments in modelling the effects of hair cell impairment on neural responses to speech stimuli in the auditory
Through the use of DSP chips and multiple microphones, hearing aids now offer the possibility of performing signal-to-noise enhancement. Evaluating different algorithms before they are instantiated on a hearing aid is essential. However, commercially available tests of hearing in noise do not allow for speech perception evaluation with a variety of signals, noise types, signal and noise locations, and reverberation.
Proceedings of the 3rd International IEEE EMBS Conference on Neural Engineering, 2007
A fall-off in speech intelligibility at higher-than-normal presentation levels has been observed for listeners with and without hearing loss (Studebaker et al., 1999; Dubno et al., 2005; Molis and Summers, 2003; Shanks et al., 2002). Speech intelligibility predictors based on the acoustic signal properties, such as the articulation index and speech transmission index, cannot directly account for the effects of
The Neurophysiological Bases of Auditory Perception, 2010
... J Assoc Res Otolaryngol 10 (3). 40.7 Reply Rasha Ibrahim Thanks for your comments. Our present results indicate Envelope recovery from TFS cues at the peripheral level. It is interesting to investigate if any Envelope recovery can occur at the central level also. ...
The Journal of the Acoustical Society of America, 2007
The temporal response of auditory-nerve (AN) fibers to a steady-state vowel is investigated using a computational auditory-periphery model. The model predictions are validated against a wide range of physiological data for both normal and impaired fibers in cats. The model incorporates two parallel filter paths, component 1 (C1) and component 2 (C2), which correspond to the active and passive modes of basilar membrane vibration, respectively, in the cochlea. The outputs of the two filters are subsequently transduced by two separate functions, added together, and then low-pass filtered by the inner hair cell (IHC) membrane, which is followed by the IHC-AN synapse and discharge generator. The C1 response dominates at low and moderate levels and is responsible for synchrony capture and multiformant responses seen in the vowel responses. The C2 response dominates at high levels and contributes to the loss of synchrony capture observed in normal and impaired fibers. The interaction between C1 and C2 responses explains the behavior of AN fibers in the transition region, which is characterized by two important observations in the vowel responses: First, all components of the vowel undergo the C1/C2 transition simultaneously, and second, the responses to the nonformant components of the vowel become substantial.
The Journal of the Acoustical Society of America, 2014
A phenomenological model of the auditory periphery in cats was previously developed by Zilany and colleagues [J. Acoust. Soc. Am. 126, 2390-2412 (2009)] to examine the detailed transformation of acoustic signals into the auditory-nerve representation. In this paper, a few issues arising from the responses of the previous version have been addressed. The parameters of the synapse model have been readjusted to better simulate reported physiological discharge rates at saturation for higher characteristic frequencies [Liberman, J. Acoust. Soc. Am. 63, 442-455 (1978)]. This modification also corrects the responses of higher-characteristic frequency (CF) model fibers to low-frequency tones that were erroneously much higher than the responses of low-CF model fibers in the previous version. In addition, an analytical method has been implemented to compute the mean discharge rate and variance from the model's synapse output that takes into account the effects of absolute refractoriness.
The Journal of the Acoustical Society of America, 2009
There is growing evidence that the dynamics of biological systems that appear to be exponential over short time courses are in some cases better described over the long-term by power-law dynamics. A model of rate adaptation at the synapse between inner hair cells and auditory-nerve (AN) fibers that includes both exponential and power-law dynamics is presented here. Exponentially adapting components with rapid and short-term time constants, which are mainly responsible for shaping onset responses, are followed by two parallel paths with power-law adaptation that provide slowly and rapidly adapting responses. The slowly adapting power-law component significantly improves predictions of the recovery of the AN response after stimulus offset. The faster power-law adaptation is necessary to account for the "additivity" of rate in response to stimuli with amplitude increments. The proposed model is capable of accurately predicting several sets of AN data, including amplitude-modulation transfer functions, long-term adaptation, forward masking, and adaptation to increments and decrements in the amplitude of an ongoing stimulus.
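To illustrate the contrast this abstract draws between exponential and power-law dynamics, here is a minimal sketch. The kernel forms, the function names, and the parameter values (`tau`, `t0`, `beta`) are illustrative assumptions and are not taken from the published synapse model; the sketch only shows that two decays that look similar over short time courses diverge strongly over long ones.

```python
import math

def exponential_decay(t, tau=0.1):
    """Exponential kernel exp(-t / tau); tau is a hypothetical time constant in seconds."""
    return math.exp(-t / tau)

def power_law_decay(t, t0=0.1, beta=1.0):
    """Power-law kernel (1 + t / t0) ** -beta; t0 and beta are hypothetical parameters."""
    return (1.0 + t / t0) ** (-beta)

if __name__ == "__main__":
    # Both kernels start at 1.0 and decay, but the power law keeps a long tail.
    for t in (0.0, 0.1, 1.0, 10.0):
        print(f"t = {t:5.1f} s   exponential = {exponential_decay(t):.3e}   "
              f"power law = {power_law_decay(t):.3e}")
```

In this toy comparison the exponential term is effectively zero after a few multiples of `tau`, while the power-law term is still measurable seconds later, which is the kind of long-lasting effect the abstract associates with recovery after stimulus offset.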
The Journal of the Acoustical Society of America, 2006
This paper presents a computational model to simulate normal and impaired auditory-nerve (AN) fiber responses in cats. The model responses match physiological data over a wider dynamic range than previous auditory models. This is achieved by providing two modes of basilar membrane excitation to the inner hair cell (IHC) rather than one. The two modes are generated by two parallel filters, component 1 (C1) and component 2 (C2), and the outputs are subsequently transduced by two separate functions. The responses are then added and passed through the IHC low-pass filter followed by the IHC-AN synapse model and discharge generator. The C1 filter is a narrow-band, chirp filter with the gain and bandwidth controlled by a nonlinear feed-forward control path. This filter is responsible for low and moderate level responses. A linear, static, and broadly tuned C2 filter followed by a nonlinear, inverted and nonrectifying C2 transduction function is critical for producing transition region and high-level effects. Consistent with Kiang's two-factor cancellation hypothesis, the interaction between the two paths produces effects such as the C1/C2 transition and peak splitting in the period histogram. The model responses are consistent with a wide range of physiological data from both normal and impaired ears for stimuli presented at levels spanning the dynamic range of hearing.
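The signal flow described in this abstract (two parallel filters, two separate transduction functions, summation, then an IHC low-pass filter) can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the one-pole filters are simple stand-ins for the C1 chirp filter and the broadly tuned C2 filter, the `tanh` and cubic functions are placeholder nonlinearities, and all names and coefficients are hypothetical. It is not the published model, and the synapse and discharge generator are omitted.

```python
import math

def one_pole_lowpass(x, alpha):
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, out = 0.0, []
    for sample in x:
        y += alpha * (sample - y)
        out.append(y)
    return out

def c1_transduction(v):
    # Placeholder saturating nonlinearity (assumption, not the published function).
    return math.tanh(v)

def c2_transduction(v):
    # Placeholder: nonlinear, inverted (minus sign) and nonrectifying (odd symmetric),
    # so it is negligible at low levels and grows at high levels.
    return -(v ** 3)

def ihc_output(stimulus, c1_alpha=0.3, c2_alpha=0.9, ihc_alpha=0.2):
    """Two parallel paths, separately transduced, summed, then low-pass filtered."""
    c1 = one_pole_lowpass(stimulus, c1_alpha)  # stand-in for the C1 filter
    c2 = one_pole_lowpass(stimulus, c2_alpha)  # stand-in for the C2 filter
    summed = [c1_transduction(a) + c2_transduction(b) for a, b in zip(c1, c2)]
    return one_pole_lowpass(summed, ihc_alpha)  # stand-in for the IHC low-pass filter

if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * n / 20) for n in range(100)]
    for level in (0.1, 3.0):
        response = ihc_output([level * s for s in tone])
        print(f"level {level}: peak |response| = {max(abs(r) for r in response):.3f}")
```

With these placeholder functions the C1 path dominates for small inputs and the inverted C2 path takes over for large ones, which loosely mirrors the level dependence the abstract describes, without reproducing any of its quantitative behavior.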
The Journal of the Acoustical Society of America, 2013
Léger et al. [J. Acoust. Soc. Am. (2012)] measured the intelligibility of speech in steady and spectrally or temporally modulated maskers for stimuli filtered into low- (<1.5 kHz) and mid-frequency (1-3 kHz) regions. Listeners with high-frequency hearing loss but near to clinically normal audiograms in the low- and mid-frequency regions showed poorer performance than a control group with normal hearing, but showed preserved spectral and temporal masking release. Here, we investigated whether a physiologically accurate model of the auditory periphery [Zilany et al., J. Acoust. Soc. Am. (2009)] can explain these masking release data. Intelligibility was predicted using the Neurogram SIMilarity (NSIM) metric of Hines and Harte [Speech Commun. (2010) and (2012)]. This metric can make use of either an "all-information" neurogram with small time bins or a "mean-rate" neurogram with large time bins. The average audiograms of the different groups of listeners from the study of Léger et al. were simulated in the model by applying different mixes of outer and/or inner hair cell impairment. Very accurate predictions of the human data for both normal-hearing and hearing-impaired groups were obtained from the all-information NSIM metric (i.e., taking into account phase-locking information) with threshold shifts produced predominantly by OHC impairment (and minimal IHC impairment).
Papers by Ian Bruce