Abstract
Target detection and tracking are two fundamental steps in automatic video-based surveillance systems, whose goal is to provide intelligent recognition capabilities by analyzing target behavior. This paper presents a framework for video-based surveillance in which target detection is integrated with tracking to improve detection results. In contrast to methods that apply target detection and tracking sequentially and independently of each other, we feed the results of tracking back to the detection stage in order to adaptively optimize the detection threshold and improve system robustness. First, the initial target locations are extracted using background subtraction. To model the background, we employ Support Vector Regression (SVR), which is updated over time using an on-line learning scheme. Target detection is performed by thresholding the outputs of the SVR model. Tracking uses shape projection histograms to iteratively localize the targets and improve the confidence level of detection. For verification, additional information based on size, color and motion is used. Feeding the results of tracking back to the detection stage restricts the range of detection threshold values, suppresses false alarms due to noise, and allows continuous detection of small targets as well as targets undergoing perspective projection distortions. We have validated the proposed framework in two application scenarios: detecting vehicles at a traffic intersection using visible video, and detecting pedestrians on a university campus walkway using thermal video. Our experimental results and comparisons with frame-based detection and kernel-based tracking methods illustrate the robustness of our approach.
Cite this article
Wang, J., Bebis, G., Nicolescu, M. et al. Improving target detection by coupling it with tracking. Machine Vision and Applications 20, 205–223 (2009). https://doi.org/10.1007/s00138-007-0118-7
DOI: https://doi.org/10.1007/s00138-007-0118-7