DOI: 10.1145/1032604.1032620
Article

Automatic classification of speech and music using neural networks

Published: 13 November 2004

Abstract

Automatic discrimination between speech and music signals has grown in importance as a research topic in recent years. Classifying audio into categories such as speech or music is an important part of many multimedia document retrieval systems. Several approaches have previously been used to discriminate between speech and music data. In this paper, we propose using the mean and variance of the discrete wavelet transform in addition to other features that have been used previously for audio classification. We use Multi-Layer Perceptron (MLP) neural networks as the classifier. Our initial tests show encouraging results that indicate the viability of our approach.
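
As a rough illustration of the kind of feature the abstract describes, the sketch below computes the mean and variance of discrete wavelet transform coefficients for one audio frame and passes them through a small MLP. The abstract does not state the wavelet family, the number of decomposition levels, the frame length, or the network layout, so the Haar wavelet, `levels=3`, the function names, and the weights used here are all assumptions made for illustration and are not the authors' configuration.

```python
import math


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal[:-1]  # drop the last sample so samples pair up
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def mean_and_variance(values):
    """Population mean and variance of a non-empty list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v


def dwt_features(frame, levels=3):
    """Mean and variance of each detail band plus the final approximation band.

    Assumption: Haar wavelet and `levels` are illustrative choices only.
    Returns 2 * (levels + 1) values.
    """
    if len(frame) < 2 ** levels:
        raise ValueError("frame is too short for the requested number of levels")
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.extend(mean_and_variance(detail))
    features.extend(mean_and_variance(approx))
    return features


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with tanh units and a sigmoid output.

    The weights are hypothetical; in practice they would be learned from
    labelled speech and music frames (for example with backpropagation).
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


# Usage: a synthetic 64-sample frame stands in for real audio.
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = dwt_features(frame, levels=3)

# Untrained placeholder network with two hidden units and all-zero weights.
w_hidden = [[0.0] * len(features) for _ in range(2)]
score = mlp_forward(features, w_hidden, [0.0, 0.0], [0.0, 0.0], 0.0)
```

In a full system these wavelet statistics would be concatenated with the other audio features the abstract mentions, and the output score would be thresholded to choose between the speech and music classes.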


Published In

MMDB '04: Proceedings of the 2nd ACM international workshop on Multimedia databases
November 2004
118 pages
ISBN: 1581139756
DOI: 10.1145/1032604
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. audio features
  2. audio signal processing
  3. content-based indexing
  4. music speech classification
  5. neural networks

Qualifiers

  • Article

Conference

CIKM04