Underwater Acoustic Target Recognition: A Combination of Multi-Dimensional Fusion Features and Modified Deep Neural Network
Abstract
1. Introduction
2. Multi-Dimensional Fusion Features Method
2.1. Gammatone Frequency Cepstral Coefficient
2.2. Modified Empirical Mode Decomposition
2.3. Multi-Dimensional Fusion Features Algorithm
2.3.1. Multi-Dimensional Feature Extraction
2.3.2. Dimension Reduction Method
Algorithm 1 Multi-Dimensional Fusion Feature
Input: original underwater acoustic signal s(n);
Output: fusion feature vector;
Initialization: N (the number of points in the DFT), M (the number of filters), K (the maximum number of IMFs), I (the number of sifting iterations);
Procedure:
1: Let x(n) ← s(n);
2: Calculate the pre-emphasized signal using Equation (1);
3: Calculate the framed, windowed signal using Equation (2);
4: Calculate the energy spectrum using Equation (3);
5: Calculate the Gammatone filterbank log energy using Equation (6);
6: Calculate the GFCC using Equation (7);
7: Let r_0(n) ← x(n), h(n) ← r_0(n);
8: For k ← 1 to K, or until the number of extrema in r_{k−1}(n) is 2 or less, do:
9:   For i ← 1 to I do:
10:     Calculate the x-coordinates of the IPs, i.e., obtain the maxima and minima locations, using Equation (11);
11:     Calculate the y-coordinates of the IPs;
12:     P_max ← the maxima IPs, P_min ← the minima IPs;
13:     Create the maxima envelope, e_max(n), using cubic spline interpolation, with the IPs as P_max;
14:     Create the minima envelope, e_min(n), using cubic spline interpolation, with the IPs as P_min;
15:     Deduce the mean envelope, m(n) ← (e_max(n) + e_min(n))/2;
16:     h(n) ← h(n) − m(n);
17:   c_k(n) ← h(n), r_k(n) ← r_{k−1}(n) − c_k(n);
18: Decomposition results: x(n) ← c_1(n) + c_2(n) + … + c_K(n) + r_K(n) (c_k(n) is the kth-order IMF);
19: Apply dimension reduction to the IMFs;
20: Calculate the Hilbert transform of each retained IMF using Equation (8);
21: Calculate the IE and IF using Equations (10)–(12);
22: Construct the fusion feature vector using Equation (13).
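To make the two stages of Algorithm 1 concrete, the sketch below reimplements them in plain NumPy/SciPy. It is illustrative only, not the authors' code: the paper's Equations (1)-(13) are not reproduced, a triangular filterbank stands in for the Gammatone filterbank of Section 2.1, classical EMD sifting stands in for the modified IP selection of Section 2.2, and IE/IF are read here as the instantaneous energy and frequency of the analytic signal. All function names and parameter values are assumptions.

```python
import numpy as np
from scipy.fftpack import dct
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema, hilbert


def cepstral_features(x, n_fft=512, n_filters=24, n_ceps=12):
    """Steps 1-6: energy spectrum -> filterbank -> log -> DCT (cf. Eqs. (1)-(7))."""
    spectrum = np.abs(np.fft.rfft(x, n_fft)) ** 2            # energy spectrum
    bins = np.linspace(0, len(spectrum) - 1, n_filters + 2).astype(int)
    energies = np.empty(n_filters)
    for m in range(n_filters):                               # triangular stand-in filters
        lo, mid, hi = bins[m], bins[m + 1], bins[m + 2]
        weights = np.concatenate([np.linspace(0.0, 1.0, mid - lo, endpoint=False),
                                  np.linspace(1.0, 0.0, hi - mid, endpoint=False)])
        energies[m] = weights @ spectrum[lo:hi]
    return dct(np.log(energies + 1e-12), norm='ortho')[:n_ceps]


def emd(x, max_imfs=6, sift_iters=8):
    """Steps 7-18: classical EMD sifting with cubic-spline envelopes."""
    residue, imfs = x.astype(float), []
    t = np.arange(len(x))
    for _ in range(max_imfs):
        n_extrema = (len(argrelextrema(residue, np.greater)[0]) +
                     len(argrelextrema(residue, np.less)[0]))
        if n_extrema <= 2:                                   # stopping rule (step 8)
            break
        h = residue.copy()
        for _ in range(sift_iters):
            mx = argrelextrema(h, np.greater)[0]             # IP x-coordinates
            mn = argrelextrema(h, np.less)[0]
            if len(mx) < 4 or len(mn) < 4:                   # too few knots to spline
                break
            upper = CubicSpline(mx, h[mx])(t)                # maxima envelope (step 13)
            lower = CubicSpline(mn, h[mn])(t)                # minima envelope (step 14)
            h = h - (upper + lower) / 2.0                    # subtract mean envelope
        imfs.append(h)                                       # c_k(n) <- h(n)
        residue = residue - h                                # r_k <- r_{k-1} - c_k
    return imfs, residue


def instantaneous_features(imf, fs=1.0):
    """Steps 20-21: IE and IF of one IMF via the analytic signal."""
    analytic = hilbert(imf)
    ie = np.abs(analytic) ** 2                               # instantaneous energy
    phase = np.unwrap(np.angle(analytic))
    inst_f = np.diff(phase) * fs / (2.0 * np.pi)             # instantaneous frequency
    return ie, inst_f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(2048)                        # stand-in for one frame
    imfs, _ = emd(frame)
    stats = []
    for imf in imfs:                                         # summarize IE/IF per IMF
        ie, inst_f = instantaneous_features(imf)
        stats += [ie.mean(), ie.std(), inst_f.mean(), inst_f.std()]
    fused = np.concatenate([cepstral_features(frame), stats])  # cf. Eq. (13), step 22
    print(fused.shape)
```

On real data one would run this per frame and per class, apply the dimension reduction of Section 2.3.2 to the IMFs before computing IE/IF, and feed the fused vectors to the classifier of Section 3.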
3. Modified Deep Neural Network
4. Experiment Results and Analysis
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Yang, H.; Shen, S.; Yao, X.; Sheng, M.; Wang, C. Competitive Deep-Belief Networks for Underwater Acoustic Target Recognition. Sensors 2018, 18, 952.
- Wang, X.; Jiao, J.; Yin, J.; Zhao, W.; Han, X.; Sun, B. Underwater sonar image classification using adaptive weights convolutional neural network. Appl. Acoust. 2018, 146, 145–154.
- Liang, H.; Liang, Q.L. Target Recognition Based on 3-D Sparse Underwater Sonar Sensor Network. Lect. Notes Electr. Eng. 2019, 463, 2552–2564.
- Wu, Y.; Li, X.; Wang, Y. Extraction and classification of acoustic scattering from underwater target based on Wigner-Ville distribution. Appl. Acoust. 2018, 138, 52–59.
- González-Hernández, F.R.; Sánchez-Fernández, L.P.; Suárez-Guerra, S.; Sánchez-Pérez, L.A. Marine mammal sound classification based on a parallel recognition model and octave analysis. Appl. Acoust. 2017, 119, 17–28.
- Can, G.; Akbas, C.E.; Cetin, A.E. Recognition of vessel acoustic signatures using non-linear Teager energy based features. In Proceedings of the 2016 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), Reggio Calabria, Italy, 27–28 October 2016.
- Novitzky, M.; Pippin, C.; Collins, T.R.; Balch, T.R.; West, M.E. AUV behavior recognition using behavior histograms, HMMs, and CRFs. Robotica 2014, 32, 291–304.
- Murino, V. Reconstruction and segmentation of underwater acoustic images combining confidence information in MRF models. Pattern Recognit. 2001, 34, 981–997.
- Tegowski, J.; Koza, R.; Pawliczka, I.; Skora, K.; Trzcinska, K.; Zdroik, J. Statistical, Spectral and Wavelet Features of the Ambient Noise Detected in the Southern Baltic Sea. In Proceedings of the 23rd International Congress on Sound and Vibration: From Ancient to Modern Acoustics, International Institute of Acoustics and Vibration, Auburn, AL, USA, 10–14 July 2016.
- Parada, P.P.; Cardenal-López, A. Using Gaussian mixture models to detect and classify dolphin whistles and pulses. J. Acoust. Soc. Am. 2014, 135, 3371–3380.
- Lim, T.; Bae, K.; Hwang, C.; Lee, H. Classification of underwater transient signals using MFCC feature vector. In Proceedings of the 9th International Symposium on Signal Processing and Its Applications (ISSPA), Sharjah, United Arab Emirates, 12–15 February 2007.
- Lim, T.; Bae, K.; Hwang, C.; Lee, H. Underwater Transient Signal Classification Using Binary Pattern Image of MFCC and Neural Network. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2008, E91A, 772–774.
- Jankowski, C., Jr.; Quatieri, T.; Reynolds, D. Measuring fine structure in speech: Application to speaker identification. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, USA, 9–12 May 1995; IEEE: Piscataway, NJ, USA; pp. 325–328.
- Guo, Y.; Gas, B. Underwater transient and non-transient signals classification using predictive neural networks. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA, 10–15 October 2009; pp. 2283–2288.
- Wang, W.; Li, S.; Yang, J.; Liu, Z.; Zhou, W. Feature Extraction of Underwater Target in Auditory Sensation Area Based on MFCC. In Proceedings of the 2016 IEEE/OES China Ocean Acoustics Symposium (COA), Harbin, China, 9–11 January 2016.
- Sharma, R.; Vignolo, L.; Schlotthauer, G.; Colominas, M.; Rufiner, H.L.; Prasanna, S. Empirical mode decomposition for adaptive AM-FM analysis of speech: A review. Speech Commun. 2017, 88, 39–64.
- Holambe, R.S.; Deshpande, M.S. Advances in Non-Linear Modeling for Speech Processing; Springer Science and Business Media: Berlin, Germany, 2012.
- Lian, Z.; Xu, K.; Wan, J.; Li, G. Underwater acoustic target classification based on modified GFCC features. In Proceedings of the IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 25–26 March 2017; pp. 258–262.
- Mao, Z.; Wang, Z.; Wang, D. Speaker recognition algorithm based on Gammatone filter bank. Comput. Eng. Appl. 2015, 51, 200–203.
- Grimaldi, M.; Cummins, F. Speaker Identification Using Instantaneous Frequencies. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 1097–1111.
- Hasan, T.; Hansen, J.H. Robust speaker recognition in non-stationary room environments based on empirical mode decomposition. In Proceedings of Interspeech, Florence, Italy, 27–31 August 2011; pp. 2733–2736.
- Zeng, X.; Wang, S. Underwater sound classification based on Gammatone filter bank and Hilbert-Huang transform. In Proceedings of the 2014 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Guilin, China, 5–8 August 2014; pp. 707–710.
- Deshpande, M.S.; Holambe, R.S. Speaker Identification Based on Robust AM-FM Features. In Proceedings of the Second International Conference on Emerging Trends in Engineering & Technology, Nagpur, India, 16–18 December 2009; pp. 880–884.
- Deshpande, M.S.; Holambe, R.S. AM-FM based robust speaker identification in babble noise. Environments 2011, 6, 19.
- Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.-C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. A Math. Phys. Eng. Sci. 1998, 454, 903–995.
- Huang, N.E.; Wu, M.-L.C.; Long, S.R.; Shen, S.S.; Qu, W.; Gloersen, P.; Fan, K.L. A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. Proc. R. Soc. A Math. Phys. Eng. Sci. 2003, 459, 2317–2345.
- Huang, H.; Chen, X.-X. Speech formant frequency estimation based on Hilbert–Huang transform. J. Zhejiang Univ. Eng. Sci. 2006, 40, 1926.
- Sharma, R.; Prasanna, S.M. A better decomposition of speech obtained using modified Empirical Mode Decomposition. Digit. Signal Process. 2016, 58, 26–39.
- Huang, N.E.; Shen, S.S.P. Hilbert-Huang Transform and Its Applications; World Scientific: Singapore, 2005.
- Huang, H.; Pan, J. Speech pitch determination based on Hilbert-Huang transform. Signal Process. 2006, 86, 792–803.
- Hayakawa, S.; Itakura, F. Text-dependent speaker recognition using the information in the higher frequency band. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia, 19–22 April 1994; IEEE: Piscataway, NJ, USA; pp. 1–137.
- Kotari, V.; Chang, K.C. Fusion and Gaussian Mixture Based Classifiers for SONAR data. In Signal Processing, Sensor Fusion, and Target Recognition XX; International Society for Optics and Photonics: Orlando, FL, USA, 5 May 2011.
- Liu, S.; Sim, K.C. On combining DNN and GMM with unsupervised speaker adaptation for robust automatic speech recognition. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 195–199.
- Wang, Q.; Wang, L.; Zeng, X.; Zhao, L. An Improved Deep Clustering Model for Underwater Acoustical Targets. Neural Process. Lett. 2018, 48, 1633–1644.
- Ibrahim, A.K.; Zhuang, H.; Chérubin, L.M.; Schärer-Umpierre, M.T.; Erdol, N. Automatic classification of grouper species by their sounds using deep neural networks. J. Acoust. Soc. Am. 2018, 144, EL196–EL202.
- Li, J.; Dai, W.; Metze, F.; Qu, S.; Das, S. A comparison of Deep Learning methods for environmental sound detection. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 126–130.
- Dumpala, S.H.; Kopparapu, S.K. Improved speaker recognition system for stressed speech using deep neural networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1257–1264.
Experiment No. | MFCC-GMM | GFCC-GMM
---|---|---
1 | 0.85 | 0.94 |
2 | 0.88 | 0.90 |
3 | 0.88 | 0.90 |
4 | 0.86 | 0.93 |
5 | 0.87 | 0.92 |
6 | 0.89 | 0.92 |
7 | 0.89 | 0.95 |
8 | 0.88 | 0.92 |
9 | 0.87 | 0.92 |
10 | 0.90 | 0.90 |
Experiment No. | MFCC-GMM | GFCC-GMM | MFCC-MDNN | GFCC-MDNN | MM-MDNN | MFF-MDNN
---|---|---|---|---|---|---
1 | 0.59 | 0.61 | 0.74 | 0.78 | 0.91 | 0.94 |
2 | 0.60 | 0.62 | 0.72 | 0.76 | 0.91 | 0.95 |
3 | 0.62 | 0.65 | 0.76 | 0.79 | 0.91 | 0.94 |
4 | 0.64 | 0.60 | 0.74 | 0.79 | 0.92 | 0.94 |
5 | 0.58 | 0.64 | 0.73 | 0.78 | 0.91 | 0.95 |
6 | 0.58 | 0.62 | 0.73 | 0.79 | 0.91 | 0.95 |
7 | 0.60 | 0.65 | 0.72 | 0.79 | 0.91 | 0.94 |
8 | 0.61 | 0.64 | 0.71 | 0.79 | 0.91 | 0.94 |
9 | 0.58 | 0.61 | 0.72 | 0.77 | 0.92 | 0.94 |
10 | 0.58 | 0.63 | 0.74 | 0.78 | 0.91 | 0.95 |
11 | 0.60 | 0.64 | 0.74 | 0.78 | 0.91 | 0.93 |
12 | 0.58 | 0.65 | 0.72 | 0.76 | 0.91 | 0.94 |
13 | 0.58 | 0.65 | 0.75 | 0.78 | 0.91 | 0.95 |
14 | 0.58 | 0.59 | 0.74 | 0.76 | 0.91 | 0.94 |
15 | 0.59 | 0.60 | 0.72 | 0.80 | 0.92 | 0.94 |
16 | 0.58 | 0.62 | 0.73 | 0.79 | 0.90 | 0.94 |
17 | 0.60 | 0.61 | 0.74 | 0.78 | 0.90 | 0.92 |
18 | 0.58 | 0.62 | 0.72 | 0.79 | 0.90 | 0.94 |
19 | 0.57 | 0.61 | 0.75 | 0.78 | 0.92 | 0.94 |
20 | 0.59 | 0.60 | 0.74 | 0.77 | 0.91 | 0.94 |
21 | 0.58 | 0.63 | 0.72 | 0.79 | 0.91 | 0.94 |
22 | 0.58 | 0.61 | 0.71 | 0.76 | 0.91 | 0.95 |
23 | 0.58 | 0.60 | 0.73 | 0.78 | 0.91 | 0.94 |
24 | 0.60 | 0.62 | 0.74 | 0.79 | 0.92 | 0.93 |
25 | 0.61 | 0.63 | 0.72 | 0.79 | 0.91 | 0.93 |
26 | 0.58 | 0.61 | 0.74 | 0.79 | 0.90 | 0.94 |
27 | 0.58 | 0.60 | 0.70 | 0.79 | 0.91 | 0.94 |
28 | 0.60 | 0.62 | 0.72 | 0.78 | 0.91 | 0.94 |
29 | 0.57 | 0.64 | 0.74 | 0.78 | 0.92 | 0.94 |
30 | 0.59 | 0.64 | 0.72 | 0.80 | 0.91 | 0.94 |
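As a quick sanity check on the 30-trial table above (assuming, as the values suggest, that each entry is a per-trial recognition accuracy on a 0-1 scale), per-method means and standard deviations can be computed directly; a minimal sketch with the MFF-MDNN column transcribed, equally applicable to the other columns and to the 10-trial table:

```python
import numpy as np

# MFF-MDNN column transcribed from the 30-trial table above.
mff_mdnn = np.array([
    0.94, 0.95, 0.94, 0.94, 0.95, 0.95, 0.94, 0.94, 0.94, 0.95,
    0.93, 0.94, 0.95, 0.94, 0.94, 0.94, 0.92, 0.94, 0.94, 0.94,
    0.94, 0.95, 0.94, 0.93, 0.93, 0.94, 0.94, 0.94, 0.94, 0.94,
])
print(f"MFF-MDNN: mean={mff_mdnn.mean():.3f}, std={mff_mdnn.std(ddof=1):.3f}")
# -> mean is about 0.940, roughly 0.03 above the MM-MDNN column
#    and roughly 0.16 above the GFCC-MDNN column.
```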