A multichannel subspace approach with signal presence probability for speech enhancement

Hong, Jungpyo

doi:10.1007/s11045-019-00640-z


Published: 18 March 2019

Volume 30, pages 2045–2058, (2019)

Multidimensional Systems and Signal Processing

Jungpyo Hong (ORCID: orcid.org/0000-0001-6812-8877)


Abstract

For the last few decades, speech enhancement based on microphone arrays has relied primarily on prior information about the system model, e.g., array geometry and source location. However, estimation of the time delay needed to align the microphone inputs is strongly affected by reverberation and microphone mismatch. Time-alignment preprocessing, e.g., fixed beamforming (the first branch of the generalized sidelobe canceller), is therefore not desirable in general applications. Recently, interest has shifted to linear filtering, which requires only the second-order statistics of the noisy input and the estimated noise. This paper proposes a linear filter design based on a multichannel subspace approach for speech enhancement. The contribution of the proposed multichannel subspace methods is threefold. First, a linear filter is applied in the multichannel frequency domain using a spatiospectral correlation matrix. Second, three types of multichannel signal presence probability (MC-SPP) are derived in the subspace domain. Third, the MC-SPPs are incorporated into the gain modification of the linear filter, which further improves noise reduction performance. Of the gain modifications, the one using the subspace probability associated with the eigenvector of the maximum eigenvalue achieved the best noise reduction performance. In the evaluation, the proposed subspace-based methods improved the overall SNR by approximately 4 dB on average over the minimum variance distortionless response (MVDR) beamformer with state-of-the-art relative transfer function estimation, while maintaining a similar cepstral distance, in adverse noisy environments.
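As a rough illustration of the kind of filter the abstract describes, the sketch below builds a subspace-domain linear filter for a single frequency bin from second-order statistics alone. It is not the paper's algorithm: it assumes a plain per-bin spatial covariance rather than the spatiospectral correlation matrix, it uses a generic generalized-eigenvalue (noise-whitened) subspace gain, and the presence probability `p` is a hypothetical scalar input rather than one of the three MC-SPPs derived in the paper. The names `subspace_filter`, `mu`, and `p` are illustrative.

```python
import numpy as np

def subspace_filter(Ry, Rn, mu=1.0, p=1.0):
    """Return the M x M filter H for one frequency bin, so that x_hat = H @ y.

    Ry : noisy-input covariance (Hermitian, M x M)
    Rn : estimated noise covariance (Hermitian positive definite, M x M)
    mu : trade-off between noise reduction and speech distortion (assumed)
    p  : signal presence probability in [0, 1], used here as a simple
         multiplicative gain modification (an assumption, not the paper's rule)
    """
    # Whiten the noise: Rn = L L^H, A = L^{-1} Ry L^{-H}
    L = np.linalg.cholesky(Rn)
    L_inv = np.linalg.inv(L)
    A = L_inv @ Ry @ L_inv.conj().T
    A = 0.5 * (A + A.conj().T)          # enforce Hermitian symmetry numerically

    # Eigenvalues of A are the generalized eigenvalues of (Ry, Rn)
    lam, U = np.linalg.eigh(A)

    # Speech eigenvalues in the whitened domain; clip negative estimates to zero
    lam_x = np.maximum(lam - 1.0, 0.0)

    # Wiener-like gain per subspace component, scaled by the presence probability
    gain = p * lam_x / (lam_x + mu)

    # Map back from the whitened subspace: H = L U diag(gain) U^H L^{-1}
    return (L @ U) @ np.diag(gain) @ (U.conj().T @ L_inv)

# Synthetic example: one rank-1 source in correlated noise, M = 4 microphones
rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)      # hypothetical transfer vector
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Rn = B @ B.conj().T + M * np.eye(M)                           # noise covariance
Rx = 2.0 * np.outer(a, a.conj())                              # speech covariance
Ry = Rx + Rn                                                  # noisy covariance

H = subspace_filter(Ry, Rn)
H_wiener = Rx @ np.linalg.inv(Ry)   # multichannel Wiener filter for comparison
```

With `mu = 1` and `p = 1` this construction reduces to the multichannel Wiener filter `Rx @ inv(Ry)`, which gives a convenient sanity check; values of `p` below 1 simply attenuate the output further when speech is judged less likely to be present.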



Author information

Authors and Affiliations

School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
Jungpyo Hong


Corresponding author

Correspondence to Jungpyo Hong.


About this article

Cite this article

Hong, J. A multichannel subspace approach with signal presence probability for speech enhancement. Multidim Syst Sign Process 30, 2045–2058 (2019). https://doi.org/10.1007/s11045-019-00640-z


Received: 04 August 2018
Revised: 25 February 2019
Accepted: 26 February 2019
Published: 18 March 2019
Issue Date: October 2019
DOI: https://doi.org/10.1007/s11045-019-00640-z
