Abstract
We review recent biological vision studies related to human motion segmentation. Our goal is to develop a practically plausible computational framework, guided by recent cognitive and psychological studies of the human visual system, for segmenting the human body in a video sequence. Specifically, we discuss the roles and interactions of bottom-up and top-down processes in visual perception, and how to combine them synergistically in a single computational model to guide human motion segmentation. We also examine recent research on biological movement perception, including the neural mechanisms and functions underlying biological movement recognition and two major psychological theories of tracking. We attempt to develop a comprehensive computational model that involves both bottom-up and top-down processing and is inspired by biological motion perception. In this model, object segmentation, motion estimation, and action recognition result from recurrent feedforward (bottom-up) and feedback (top-down) processes. We also raise and discuss some open technical questions for future research.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, C., Fan, G. (2006). What Can We Learn from Biological Vision Studies for Human Motion Segmentation?. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2006. Lecture Notes in Computer Science, vol 4292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11919629_79
DOI: https://doi.org/10.1007/11919629_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48626-8
Online ISBN: 978-3-540-48627-5
eBook Packages: Computer Science, Computer Science (R0)