Abstract
We propose a novel method for automatically detecting and tracking an Object of Interest (OOI) in videos actively acquired by uncalibrated cameras. The approach exploits the object-centered property of Active Video, which enables self-initialization of the tracker. We first use a color-saliency weighted Probability-of-Boundary (cPoB) map for keypoint filtering and salient region detection. Successive Classification and Refinement (SCR) is then used for tracking between two consecutive frames. A strong classifier trained on-the-fly by AdaBoost classifies the keypoints, and a subsequent Linear Programming step solves a maximum similarity problem to reject outliers. Experiments demonstrate the importance of Active Video during the data collection phase and confirm that the approach can automatically detect and reliably track the OOI in videos.
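To make the keypoint classification step more concrete, the following is a minimal sketch of discrete AdaBoost with decision stumps, trained on labeled descriptors from one frame and then applied to new descriptors. It is not the authors' MATLAB implementation: the function names (`stump_predict`, `train_adaboost`, `classify`), the 2-D toy descriptors standing in for real keypoint descriptors, and the number of boosting rounds are all assumptions made for illustration.

```python
import math
import random


def stump_predict(stump, x):
    """Return +1 or -1 for descriptor x under a (feature, threshold, sign) stump."""
    j, thr, sign = stump
    return sign if x[j] <= thr else -sign


def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost with axis-aligned stumps; labels y are +1 (object) or -1 (background)."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble = []
    for _ in range(rounds):
        best_err, best_stump = None, None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(d):
            for thr in sorted({x[j] for x in X}):
                for sign in (1, -1):
                    stump = (j, thr, sign)
                    err = sum(wi for wi, xi, yi in zip(w, X, y)
                              if stump_predict(stump, xi) != yi)
                    if best_err is None or err < best_err:
                        best_err, best_stump = err, stump
        # Clamp the error so the stump weight alpha stays finite.
        err = min(max(best_err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(best_stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((alpha, best_stump))
    return ensemble


def classify(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): object descriptors near (2, 2), background near (-2, -2).
random.seed(0)
obj = [(random.gauss(2, 0.5), random.gauss(2, 0.5)) for _ in range(30)]
bg = [(random.gauss(-2, 0.5), random.gauss(-2, 0.5)) for _ in range(30)]
X = obj + bg
y = [1] * len(obj) + [-1] * len(bg)

ensemble = train_adaboost(X, y)
labels = [classify(ensemble, x) for x in X]
```

In a tracking setting, keypoints labeled as object in the current frame would presumably be passed on to the outlier-rejection stage; that stage is not sketched here.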
Notes
The term “tracking” here refers to one of the traditional camera motions in filming, whereas in other parts of this paper it refers to the action of following a moving OOI, as the term is generally used in the Computer Vision literature.
We use the MATLAB code from http://research.graphicon.ru/machine-learning/gml-adaboost-matlab-toolbox.html.
We implemented the Online Feature Selection (OFS) tracker in MATLAB.
Acknowledgements
This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under grant RGP36726.
Cite this article
Huang, J., Li, ZN. Automatic Detection of Object of Interest and Tracking in Active Video. J Sign Process Syst 65, 49–62 (2011). https://doi.org/10.1007/s11265-010-0540-3
DOI: https://doi.org/10.1007/s11265-010-0540-3