Abstract
Because CCTV cameras are deployed in public spaces and are closely tied to public safety, surveillance systems must be highly robust and reliable. This paper develops a multimodal fusion technique that uses a Bayesian inference scheme to improve the reliability of surveillance systems. Building on this technique, an automatic object classifier is proposed for semantic indexing and classification in forensic applications. The proposed Bayesian-based Multimodal Fusion technique, and in particular the proposed object classifier, is evaluated against two state-of-the-art automatic object classifiers on the i-LIDS surveillance dataset.
The research was partially supported by the European Commission under contract FP7-SEC 261743 VideoSense.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fernandez Arguedas, V., Zhang, Q., Izquierdo, E. (2012). Bayesian Multimodal Fusion in Forensic Applications. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_47
DOI: https://doi.org/10.1007/978-3-642-33885-4_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science, Computer Science (R0)