Abstract
There is growing interest in studying the Human Visual System (HVS) to supplement and improve the performance of computer vision tasks. A major challenge for current visual saliency models is predicting saliency in cluttered scenes (i.e., a high false positive rate). In this paper, we propose a fixation patch detector that predicts image patches that contain human fixations with high probability. Our proposed model detects sparse fixation patches with an accuracy of 84% and eliminates non-fixation patches with an accuracy of 84%, demonstrating that low-level image features can indeed be used to short-list and identify human fixation patches. We then show how these detected fixation patches can be used as saliency priors for popular saliency models, thus reducing false positives while maintaining true positives. Extensive experimental results show that our proposed approach allows state-of-the-art saliency methods to achieve better prediction performance on benchmark datasets.
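The abstract's core idea, using detected fixation patches as a prior that suppresses saliency in non-fixation regions, can be sketched as a simple multiplicative re-weighting. The sketch below is illustrative only: the function name, the `keep_weight` attenuation factor, and the patch size are assumptions, not the paper's exact formulation.

```python
import numpy as np

def apply_fixation_prior(saliency_map, patch_mask, patch_size=32, keep_weight=0.1):
    """Down-weight saliency outside predicted fixation patches.

    saliency_map : 2D float array of per-pixel saliency from any base model.
    patch_mask   : 2D boolean array over the patch grid; True where the
                   (hypothetical) fixation patch detector fires.
    keep_weight  : residual weight kept outside fixation patches.
    """
    h, w = saliency_map.shape
    # Upsample the patch-level mask to pixel resolution by block replication.
    pixel_mask = np.kron(patch_mask.astype(float),
                         np.ones((patch_size, patch_size)))[:h, :w]
    # Full saliency inside fixation patches; attenuated elsewhere, which
    # suppresses false positives in cluttered background regions while
    # leaving true positives inside detected patches untouched.
    prior = keep_weight + (1.0 - keep_weight) * pixel_mask
    return saliency_map * prior
```

Any base saliency model's output can be passed through this prior unchanged, which matches the abstract's claim that the patches act as a plug-in prior rather than a replacement model.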
Notes
- 1.
Note that there are different versions of Itti’s model. Here, we used the best performing version in [14].
References
Ross, J., Burr, D., Morrone, C.: Suppression of the magnocellular pathway during saccades. Behav. Brain Res.
Itti, L., Koch, C.: Computational modelling of visual attention. Nat. Rev. Neurosci. 2, 194–203 (2001)
Rutishauser, U., Walther, D., Koch, C., Perona, P.: Is bottom-up attention useful for object recognition? In: CVPR (2004)
Walther, D., Itti, L., Riesenhuber, M., Poggio, T.A., Koch, C.: Attentional selection for object recognition - a gentle way. In: Bülthoff, H.H., Lee, S.-W., Poggio, T.A., Wallraven, C. (eds.) BMCV 2002. LNCS, vol. 2525, pp. 472–479. Springer, Heidelberg (2002)
Endres, I., Hoiem, D.: Category independent object proposals. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 575–588. Springer, Heidelberg (2010)
Shapovalova, N., Raptis, M., Sigal, L., Mori, G.: Action is in the eye of the beholder: eye-gaze driven model for spatio-temporal action localization. In: NIPS (2013)
Mikolajczyk, K., Schmid, C.: Scale & affine invariant interest point detectors. Int. J. Comput. Vis. 60, 63–86 (2004)
Dave, A., Dubey, R., Ghanem, B.: Do humans fixate on interest points? In: ICPR (2012)
Yang, L., Zheng, N., Yang, J., Chen, M., Chen, H.: A biased sampling strategy for object categorization. In: CVPR (2009)
Marchesotti, L., Cifarelli, C., Csurka, G.: A framework for visual saliency detection with applications to image thumbnailing. In: ICCV (2009)
Borji, A., Sihite, D., Itti, L.: Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study. IEEE Trans. Image Process. 22, 55–69 (2013)
Hou, X., Zhang, L.: Saliency detection: a spectral residual approach. In: CVPR (2007)
Zhang, L., Tong, M.H., Marks, T.K., Shan, H., Cottrell, G.W.: SUN: a Bayesian framework for saliency using natural statistics. J. Vis. 8(7), 1–20 (2008)
Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: NIPS (2007)
Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254–1259 (1998)
Garcia-Diaz, A., Fdez-Vidal, X.R., Pardo, X.M., Dosil, R.: Saliency from hierarchical adaptation through decorrelation and variance normalization. Image Vis. Comput. 30, 51–64 (2012)
Garcia-Diaz, A., Leborán, V., Fdez-Vidal, X.R., Pardo, X.M.: On the relationship between optical variability, visual saliency, and eye fixations: a computational approach. J. Vis. 12(6), 1–22 (2012)
Avraham, T., Lindenbaum, M.: Esaliency (extended saliency): meaningful attention using stochastic image modeling. IEEE Trans. Pattern Anal. Mach. Intell. 32, 693–708 (2010)
Li, Y., Zhou, Y., Yan, J., Niu, Z., Yang, J.: Visual saliency based on conditional entropy. In: Maybank, S., Taniguchi, R., Zha, H. (eds.) ACCV 2009, Part I. LNCS, vol. 5994, pp. 246–257. Springer, Heidelberg (2010)
Zhang, J., Sclaroff, S.: Saliency detection: a Boolean map approach. In: ICCV (2013)
Itti, L., Baldi, P.: Bayesian surprise attracts human attention. In: NIPS (2006)
Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: ICCV (2009)
Borji, A., Tavakoli, H., Sihite, D., Itti, L.: Analysis of scores, datasets, and models in visual saliency prediction. In: ICCV (2013)
Borji, A., Itti, L.: State-of-the-art in visual attention modeling. IEEE Trans. Pattern Anal. Mach. Intell. 35, 185–207 (2012)
Soto, D., Humphreys, G.W., Heinke, D.: Working memory can guide pop-out search. Vis. Res. 46, 1010–1018 (2006)
Sheinberg, D.L., Logothetis, N.K.: Noticing familiar objects in real world scenes: the role of temporal cortical neurons in natural vision. J. Neurosci. 21, 1340–1350 (2001)
Yang, Y., Song, M., Li, N., Bu, J., Chen, C.: What is the chance of happening: a new way to predict where people look. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 631–643. Springer, Heidelberg (2010)
Poirier, F.J., Gosselin, F., Arguin, M.: Perceptive fields of saliency. J. Vis. 8, 14 (2008)
Scharfenberger, C., Wong, A., Fergani, K., Zelek, J.S., Clausi, D.A.: Statistical textural distinctiveness for salient region detection in natural images. In: CVPR (2013)
Le Meur, O., Le Callet, P., Barba, D.: Predicting visual fixations on video based on low-level visual features. Vis. Res. 47, 2483–2498 (2007)
Jaakkola, T., Haussler, D.: Exploiting generative models in discriminative classifiers. In: NIPS (1998)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39, 1–38 (1977)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the fisher kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: CVPR (2007)
Deselaers, T., Ferrari, V.: Global and efficient self-similarity for object classification and detection. In: CVPR (2010)
Zhao, Q., Koch, C.: Learning a saliency map using fixated locations in natural scenes. J. Vis. 11, 1–15 (2011)
Judd, T., Durand, F., Torralba, A.: A benchmark of computational models of saliency to predict human fixations. Technical report (2012)
Peters, R.J., Iyer, A., Itti, L., Koch, C.: Components of bottom-up gaze allocation in natural images. Vis. Res. 45, 2397–2416 (2005)
Bruce, N., Tsotsos, J.: Saliency based on information maximization. In: NIPS (2006)
Einhäuser, W., Spain, M., Perona, P.: Objects predict fixations better than early saliency. J. Vis. 8, 18 (2008)
Rahtu, E., Kannala, J., Salo, M., Heikkilä, J.: Segmenting salient objects from images and videos. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 366–379. Springer, Heidelberg (2010)
Jiang, B., Zhang, L., Lu, H., Yang, C., Yang, M.H.: Saliency detection via absorbing markov chain. In: ICCV (2013)
Margolin, R., Tal, A., Zelnik-Manor, L.: What makes a patch distinct? In: CVPR (2013)
Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.H.: Saliency detection via graph-based manifold ranking. In: CVPR (2013)
Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC 2007) Results. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html (2007)
Acknowledgement
Research reported in this publication was supported by competitive research funding from King Abdullah University of Science and Technology (KAUST).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dubey, R., Dave, A., Ghanem, B. (2015). Improving Saliency Models by Predicting Human Fixation Patches. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science(), vol 9005. Springer, Cham. https://doi.org/10.1007/978-3-319-16811-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16810-4
Online ISBN: 978-3-319-16811-1