Abstract
In multimedia analysis, electronic voice disguise is a speech editing process that alters the characteristics of a voice. The frequency spectrum characteristics of the speech signal also change during electronic disguise. This paper proposes a method for evaluating the classification efficiency of a disguised voice relative to its normal voice. Voices are disguised using practical approaches that shift them by different numbers of semitones. Feature extraction based on Mel-frequency cepstral coefficients (MFCC), delta MFCC (ΔMFCC), and double-delta MFCC (ΔΔMFCC) computes the acoustic features and their statistical moments, namely the mean and the correlation coefficient. The acoustic features and their statistical moments are then passed to classifiers based on different algorithms, and the efficiency of disguised-voice classification is determined for each classifier.
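The feature steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regression window of two frames, the edge handling, and the toy matrix standing in for real MFCC frames are all assumptions made for illustration. It shows the frequency scaling implied by a semitone shift, a regression-style delta computation (applying it twice would give the double delta), and the mean and correlation-coefficient moments.

```python
import math


def semitone_factor(semitones):
    """Frequency scaling for a shift of `semitones` (12-tone equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def delta(frames, width=2):
    """Regression-style delta of per-frame coefficient vectors.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum n^2),
    with edge frames repeated (an assumption of this sketch).
    """
    num_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation coefficient; 0.0 if either track is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def moment_features(frames):
    """Per-coefficient means, then pairwise correlations between coefficient tracks."""
    tracks = [list(col) for col in zip(*frames)]
    means = [mean(t) for t in tracks]
    corrs = [
        correlation(tracks[i], tracks[j])
        for i in range(len(tracks))
        for j in range(i + 1, len(tracks))
    ]
    return means + corrs


# Toy stand-in for MFCC frames: 5 frames, 2 coefficients each (not real data).
toy_frames = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
toy_delta = delta(toy_frames)
toy_features = moment_features(toy_frames)
```

In a fuller pipeline the moment vectors from the static, delta, and double-delta coefficients would presumably be concatenated and passed to the classifiers; that step is omitted here because the abstract does not specify the classifiers.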
Cite this article
Singh, M.K., Singh, A.K. & Singh, N. Multimedia analysis for disguised voice and classification efficiency. Multimed Tools Appl 78, 29395–29411 (2019). https://doi.org/10.1007/s11042-018-6718-6
DOI: https://doi.org/10.1007/s11042-018-6718-6