Abstract
We apply sparse, fast and flexible adaptive lapped orthogonal transforms to underdetermined audio source separation using the time-frequency masking framework. Time-frequency masking normally requires the sources to overlap as little as possible in the time-frequency plane.
In this work, we apply our adaptive transform schemes to the semi-blind case, in which the mixing system is known but the sources are not. By assuming that exactly two sources are active at each time-frequency index, we determine both the adaptive transforms and the estimated source coefficients using ℓ1-norm minimisation. We report average performance of 12–13 dB SDR on speech and music mixtures, and show that the adaptive transform scheme offers improvements on the order of several tenths of a dB over transforms with constant block length. Comparison with previously studied upper bounds suggests that the potential for future improvements is significant.
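The coefficient-estimation step described above can be illustrated with a minimal sketch. It assumes a two-channel instantaneous mixture with a known real mixing matrix, which the abstract does not state explicitly. The function name `separate_two_active` and the variables `A`, `X` and `S` are hypothetical. For every time-frequency index the sketch tries each pair of sources, solves the resulting 2×2 system exactly, and keeps the pair whose coefficients have the smallest ℓ1 norm. It is not the authors' implementation.

```python
import itertools
import numpy as np

def separate_two_active(A, X):
    """Estimate source coefficients with two active sources per index.

    A : (2, N) known mixing matrix (assumed two-channel, instantaneous).
    X : (2, K) mixture transform coefficients, one column per index.
    Returns (S, cost): S is (N, K) with at most two non-zeros per column,
    cost is the l1 norm of each selected column.
    """
    n_src, n_bins = A.shape[1], X.shape[1]
    S = np.zeros((n_src, n_bins), dtype=np.result_type(A, X, float))
    cost = np.full(n_bins, np.inf)
    for j, k in itertools.combinations(range(n_src), 2):
        A_jk = A[:, [j, k]]
        # Skip pairs of (nearly) collinear mixing directions.
        if abs(np.linalg.det(A_jk)) < 1e-12:
            continue
        C = np.linalg.solve(A_jk, X)        # exact fit for this pair
        pair_cost = np.abs(C).sum(axis=0)   # l1 norm per index
        better = pair_cost < cost
        S[:, better] = 0                    # discard the previous best pair
        S[j, better] = C[0, better]
        S[k, better] = C[1, better]
        cost[better] = pair_cost[better]
    return S, cost
```

Summing `cost` over all indices gives a single ℓ1 figure for one transform. Comparing that figure across candidate block lengths is one plausible way to select the adaptive transform, although the abstract does not spell out the selection procedure.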
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nesbit, A., Vincent, E., Plumbley, M.D. (2009). Extension of Sparse, Adaptive Signal Decompositions to Semi-blind Audio Source Separation. In: Adali, T., Jutten, C., Romano, J.M.T., Barros, A.K. (eds) Independent Component Analysis and Signal Separation. ICA 2009. Lecture Notes in Computer Science, vol 5441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00599-2_76
DOI: https://doi.org/10.1007/978-3-642-00599-2_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00598-5
Online ISBN: 978-3-642-00599-2
eBook Packages: Computer Science (R0)