Integration of Multi-Camera Video Moving Objects and GIS
Abstract
1. Introduction
2. Related Work
3. Fusion between Multiple Camera Objects and GIS
3.1. Extraction and Data Organization of Multi-Camera Video Moving Objects
3.2. Fusion between Video Moving Object and GIS
3.3. Data Organization of Spatial–Temporal Trajectory
4. Architecture of GIS-MCVO Surveillance System
4.1. Design Schematic of the System
1. Function layer: This layer is a server that provides data processing and analysis. It pre-processes GIS and video data and comprises functional modules for video data acquisition, video moving object extraction, geospatial mapping of video data, and cross-camera object recognition. It also provides the basic data support for real-time publishing.
2. Data layer: Built on the database, this layer stores, accesses, and manages geospatial data, video images, and video moving object data, and provides data services to clients.
3. Service layer: This layer publishes the data services of the underlying system database, including the video stream image, video moving object, and geospatial information services, and delivers real-time multisource data to terminal users and remote command centers.
4. Business layer: This layer selects the relevant data service content according to the demands of the system user; through analysis, it fetches the appropriate services and generates and transmits the corresponding results to the representation layer.
5. Representation layer: Through a common browser on any operating system platform, the user can apply multiple modes of MCVO and GIS fusion, along with the related application and analysis functions.
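The moving-object records that flow from the function layer into the data layer can be sketched as a simple data model. All class and field names below are hypothetical illustrations, not the system's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class TrajectoryPoint:
    """One geospatially mapped observation of a moving object (hypothetical fields)."""
    timestamp: float   # seconds since epoch
    camera_id: str     # which camera observed the object
    x: float           # planar map coordinate (easting)
    y: float           # planar map coordinate (northing)

@dataclass
class MovingObject:
    """A cross-camera moving object as the data layer might store it."""
    object_id: str
    object_type: str                                    # e.g. "pedestrian", "vehicle"
    trajectory: list = field(default_factory=list)      # ordered TrajectoryPoints
    subgraph_paths: list = field(default_factory=list)  # stored foreground sub-graph files

    def add_observation(self, point: TrajectoryPoint) -> None:
        """Append one observation; cross-camera recognition lets points
        from different cameras share the same object_id."""
        self.trajectory.append(point)

obj = MovingObject(object_id="obj-001", object_type="pedestrian")
obj.add_observation(TrajectoryPoint(0.0, "cam-1", 116.0, 226.0))
obj.add_observation(TrajectoryPoint(1.0, "cam-2", 118.5, 230.0))
print(len(obj.trajectory))  # 2
```

Keeping the trajectory in map coordinates rather than pixel coordinates is what lets the service layer publish one unified object stream across all cameras.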
4.2. Design of System Functions
1. Moving object extraction module: This module uses detection and tracking algorithms to extract moving objects, separates the video foreground from the background, recognizes the same object across different cameras, and stores the trajectory, type, set of sub-graphs, and other associated information of each moving object.
2. Video spatialization module: This module selects the appropriate image-to-geospace mapping model, calibrates the internal and external camera parameters, and constructs the mapping matrix used for video spatialization.
3. Virtual scene generation module: This module loads the virtual geographic scene, the virtual point of view, the positions of the surveillance cameras, and the sight lines of the video images. It forms the foundation of the fusion representation; most applications of the GIS-MCVO system depend on it.
4. Moving object spatial–temporal analysis module: This module synthesizes the information on the video moving objects with the geographic scene to support specific applications and produces the results passed to the representation module for output.
5. Fusion representation module: This module selects the fusion pattern between the moving objects and the virtual geographic scene and visually loads the video images, moving object trajectories, sub-graphs, and spatial–temporal analysis results.
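The mapping matrix the video spatialization module constructs can be sketched as a planar homography, a common way to map ground-plane pixels to map coordinates; the function name and matrix values below are hypothetical, and in practice the matrix would be estimated from at least four image-to-map control point pairs:

```python
def pixel_to_geo(H, u, v):
    """Map an image pixel (u, v) to planar map coordinates via 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w  # homogeneous divide

# Hypothetical homography for one calibrated camera
H = [[0.05,  0.0,  100.0],
     [0.0,  -0.05, 250.0],
     [0.0,   0.0,    1.0]]

# Map the foot point of a detected object's bounding box into the geographic scene
print(pixel_to_geo(H, 320, 480))  # (116.0, 226.0)
```

Mapping the bounding-box foot point (rather than its center) is the usual convention, since the foot point is the part of the object that actually touches the ground plane the homography models.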
5. Applications and Potential Benefits for GIS-MCVO Surveillance System
5.1. GIS-Based User Interface
5.2. Video Compression Storage
5.3. Trajectory Deduction in Visual Blind Zone
5.4. Retrieval of MCVO
5.5. Synopsis of Multiple Videos
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
| Fusion Pattern | Display Environment | Virtual View Browsing | Image–Virtual Scene Correlation | Foreground Object Highlighting |
|---|---|---|---|---|
| Image projection | 2D/3D | Range view | Yes | No |
| Foreground and background independent projection | 3D | Range view | Yes | No |
| Foreground projection | 3D | Range view | Yes | Yes |
| Foreground abstraction | 2D/3D | Arbitrary view | No | Yes |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Xie, Y.; Wang, M.; Liu, X.; Mao, B.; Wang, F. Integration of Multi-Camera Video Moving Objects and GIS. ISPRS Int. J. Geo-Inf. 2019, 8, 561. https://doi.org/10.3390/ijgi8120561