Abstract
Achieving efficient, high-quality speech recovery in underdetermined mixing systems remains difficult. To address this problem, this paper proposes a harmonics-extraction-based underdetermined speech recovery algorithm consisting of four stages. In the first stage, a spectrum correction technique is used to extract the harmonic components from the short-time Fourier transform (STFT) of the mixtures. In the second stage, a phase-coherence criterion is applied to these harmonic components to identify the single-source components. In the third stage, these single-source components are grouped by adaptive k-means clustering, and the mixing matrix is estimated from the resulting clusters. In the last stage, the estimated matrix is combined with the subspace projection algorithm to recover the sources. The efficiency of the method comes from combining harmonics extraction with single-source component identification. Speech recovery experiments show that the proposed method achieves higher recovery quality than the original subspace projection algorithm, which suggests potential applications in other harmonic-related fields.
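The second and third stages can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the mixing is real-valued and instantaneous, the harmonic peaks are already extracted (the spectrum correction step is not reproduced), the phase-coherence test is taken to mean "all channels share one phase up to a multiple of pi", and plain k-means with a known number of sources stands in for the adaptive k-means. All function names and the synthetic data generator are hypothetical.

```python
import numpy as np

def single_source_mask(X, tol_deg=1.0):
    """Flag points whose channels share one phase up to a multiple of pi.

    X: complex array (M channels, P candidate points), e.g. STFT values at
    extracted harmonic peaks. Assumes real-valued instantaneous mixing.
    """
    dphi = np.angle(X * np.conj(X[0]))             # phase relative to channel 0
    dev = np.abs(np.angle(np.exp(2j * dphi))) / 2  # distance to nearest k*pi
    return np.all(dev < np.deg2rad(tol_deg), axis=0)

def direction_vectors(X):
    """Map single-source points to unit-norm real vectors with v[0] >= 0."""
    V = np.real(X * np.exp(-1j * np.angle(X[0])))
    return V / np.linalg.norm(V, axis=0, keepdims=True)

def kmeans_columns(V, k, n_iter=20):
    """Plain Lloyd k-means with farthest-point initialisation; returns (M, k)."""
    pts = V.T                                      # (P, M)
    centers = [pts[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(pts - c, axis=1) for c in centers], axis=0)
        centers.append(pts[np.argmax(d)])
    C = np.array(centers)                          # (k, M)
    for _ in range(n_iter):
        dist = np.linalg.norm(pts[:, None] - C[None], axis=2)
        labels = np.argmin(dist, axis=1)
        # Keep the old centre if a cluster happens to be empty.
        C = np.array([pts[labels == j].mean(axis=0) if np.any(labels == j)
                      else C[j] for j in range(k)])
    return (C / np.linalg.norm(C, axis=1, keepdims=True)).T

def make_synthetic_points(A, n_single=300, n_multi=100, seed=0):
    """Hypothetical test data: the first n_single points have one active
    source; the rest have two sources that are 90 degrees out of phase."""
    rng = np.random.default_rng(seed)
    n_src = A.shape[1]
    P = n_single + n_multi
    S = np.zeros((n_src, P), dtype=complex)
    phase = np.exp(1j * rng.uniform(0, 2 * np.pi, P))
    mag = rng.uniform(0.5, 1.5, (2, P))
    idx = np.arange(P)
    first = idx % n_src
    S[first, idx] = mag[0] * phase
    second = (first + 1) % n_src
    multi = idx >= n_single
    S[second[multi], idx[multi]] = 1j * mag[1, multi] * phase[multi]
    return A @ S

# Two mixtures of three sources (underdetermined case).
theta = np.deg2rad([20.0, 70.0, 130.0])
A_true = np.vstack([np.cos(theta), np.sin(theta)])
X = make_synthetic_points(A_true)
mask = single_source_mask(X)
A_est = kmeans_columns(direction_vectors(X[:, mask]), k=3)
```

Under these assumptions, the columns of `A_est` match those of `A_true` only up to sign and ordering, which is the usual ambiguity in blind source separation. On real speech, the tolerance `tol_deg` and the number of clusters would need to be chosen from the data.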
Acknowledgements
This work was financially supported by Qingdao National Laboratory for Marine Science and Technology under Grant No. QNLM2016OPR0411.
Cite this article
Yang, L., Huang, X. Harmonics extraction based speech recovery for underdetermined mixing systems. Multimed Tools Appl 77, 22267–22280 (2018). https://doi.org/10.1007/s11042-018-5919-3
DOI: https://doi.org/10.1007/s11042-018-5919-3