Abstract
Analyzing the information contained in audio depends critically on finding the right features. This paper analyzes several types of audio features from different points of view, including time series and sound engineering. In particular, the description of audio as a set of time series is uncommon in the literature, and it is one of the aspects studied here. The paper proposes an automated method for audio feature engineering that extracts, analyzes and selects the best features for a given context. Specifically, it develops a hybrid scheme for extracting audio descriptors based on different principles, and it defines an automatic approach for analyzing and selecting these descriptors in a given audio context. Finally, the approach was tested on grouping tasks and compared with previous work on audio classification problems, with encouraging results.
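The extract, analyze and select idea described above can be illustrated with a minimal sketch. This is not the method proposed in the paper, whose details the abstract does not give. It assumes three common frame-level descriptors (RMS energy, zero-crossing rate, spectral centroid), treats each as a time series over frames, summarizes each series with its mean and standard deviation, and uses a simple variance filter as a stand-in for descriptor selection. All function names, parameters and thresholds are hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def descriptor_series(x, sr, frame_len=1024, hop=512):
    """Return per-frame descriptors; each value is a time series over frames."""
    frames = frame_signal(x, frame_len, hop)
    # Root-mean-square energy of each frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
    signs = np.signbit(frames).astype(int)
    zcr = np.mean(np.abs(np.diff(signs, axis=1)), axis=1)
    # Spectral centroid: magnitude-weighted mean frequency of each frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)
    return {"rms": rms, "zcr": zcr, "centroid": centroid}

def summarize(series):
    """Collapse each descriptor time series into simple summary statistics."""
    feats = {}
    for name, s in series.items():
        feats[f"{name}_mean"] = float(np.mean(s))
        feats[f"{name}_std"] = float(np.std(s))
    return feats

def select_by_variance(X, names, threshold=1e-8):
    """Keep feature columns whose variance across clips exceeds a threshold.

    Assumption: a variance filter stands in for the paper's selection step.
    """
    keep = np.var(X, axis=0) > threshold
    return X[:, keep], [n for n, k in zip(names, keep) if k]

if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr
    # Two synthetic one-second clips: a 440 Hz tone and a 1200 Hz tone.
    clips = [np.sin(2 * np.pi * f * t) for f in (440.0, 1200.0)]
    rows = [summarize(descriptor_series(c, sr)) for c in clips]
    names = list(rows[0])
    X = np.array([[r[n] for n in names] for r in rows])
    X_sel, kept = select_by_variance(X, names)
    print("kept features:", kept)
```

In practice the summary statistics would be replaced by a richer set of time-series characteristics, and the variance filter by whatever selection criterion suits the task; the sketch only shows how the three stages fit together.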
Acknowledgements
This work has been supported by project 64366, “Contenidos de aprendizaje inteligentes a través del uso de herramientas de Big Data, Analítica Avanzada e IA” (Ministry of Science, Government of Antioquia, Republic of Colombia).
Cite this article
Jiménez, M., Aguilar, J., Monsalve-Pulido, J. et al. An automatic approach of audio feature engineering for the extraction, analysis and selection of descriptors. Int J Multimed Info Retr 10, 33–42 (2021). https://doi.org/10.1007/s13735-020-00202-1
DOI: https://doi.org/10.1007/s13735-020-00202-1