Multimedia analysis for disguised voice and classification efficiency

Multimedia Tools and Applications

Abstract

In multimedia analysis, electronic voice disguise is a speech-editing process that changes the characteristics of a voice. The frequency-spectrum characteristics of the speech signal also change during electronic disguise. This paper proposes a method for determining the classification efficiency of a disguised voice relative to its normal voice. The voices are disguised in practice by shifting them by different numbers of semitones. Feature extraction based on Mel-frequency cepstral coefficients (MFCC), delta Mel-frequency cepstral coefficients (ΔMFCC), and double-delta Mel-frequency cepstral coefficients (ΔΔMFCC) computes the acoustic features and their statistical moments, namely the mean and the correlation coefficient. The acoustic features and their statistical moments are then passed to several classifiers based on different algorithms, and the efficiency for disguised voice is obtained from each classifier.
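The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes the standard equal-temperament relation for a shift of k semitones (a frequency factor of 2^(k/12)), the common regression formula for delta features, and Pearson correlation between pairs of coefficient tracks. All function names are hypothetical, and the MFCC matrix is assumed to have been computed elsewhere (one row per frame).

```python
import math

def semitone_factor(k):
    """Frequency scaling for a shift of k semitones (equal temperament; assumed)."""
    return 2.0 ** (k / 12.0)

def delta(frames, width=2):
    """Regression-based delta of per-frame coefficient vectors.
    Edges are handled by repeating the first/last frame (an assumption)."""
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for j in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, n_frames - 1)][j]
                prv = frames[max(t - n, 0)][j]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out

def mean_vector(frames):
    """Mean of each coefficient over all frames."""
    n = len(frames)
    return [sum(f[j] for f in frames) / n for j in range(len(frames[0]))]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant track."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def correlation_coefficients(frames):
    """Correlation between every pair of distinct coefficient tracks."""
    dims = len(frames[0])
    tracks = [[f[j] for f in frames] for j in range(dims)]
    return [pearson(tracks[i], tracks[j])
            for i in range(dims) for j in range(i + 1, dims)]

def utterance_features(mfcc):
    """Concatenate means and correlations of MFCC, delta and double-delta."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    feats = []
    for stream in (mfcc, d1, d2):
        feats += mean_vector(stream) + correlation_coefficients(stream)
    return feats

if __name__ == "__main__":
    # Toy 6-frame, 3-coefficient matrix standing in for real MFCCs.
    toy = [[1.0, 0.5, 2.0], [2.0, 0.1, 1.0], [3.0, 0.9, 4.0],
           [4.0, 0.2, 3.0], [5.0, 0.7, 6.0], [6.0, 0.3, 5.0]]
    print(len(utterance_features(toy)), "features per utterance")
    print("factor for +4 semitones:", semitone_factor(4))
```

The resulting fixed-length vector per utterance is the kind of input a classifier would take; which classifiers the paper compares is not stated in the abstract, so none is shown here.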

Author information

Corresponding author

Correspondence to Mahesh K. Singh.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Singh, M.K., Singh, A.K. & Singh, N. Multimedia analysis for disguised voice and classification efficiency. Multimed Tools Appl 78, 29395–29411 (2019). https://doi.org/10.1007/s11042-018-6718-6

  • DOI: https://doi.org/10.1007/s11042-018-6718-6
