Abstract
Studying the visual attention of patients with dementia, such as Parkinson’s Disease Dementia and Alzheimer’s Disease, is a promising approach to non-invasive diagnostics. Past research has shown that people suffering from dementia do not react to degradations in still images. Attempts are now being made to study their visual attention with respect to video content, where delays are expected in their reactions to novelty and “unusual” novelty in the visual scene. However, large-scale screening of the population is possible only if sufficiently robust automatic prediction models can be built. In medical protocols, dementia-related behavior in the observation of visual content is always detected by comparison with healthy, “normal control” subjects. Hence, developing automatic prediction models for the specific visual content used in psycho-visual experiments involving Patients with Dementia (PwD) is a research question per se. The difficulty of such prediction lies in the very small amount of training data. In this paper, the reaction of healthy normal control subjects to degraded areas in videos was studied. Furthermore, a deep learning architecture was designed to build an automatic prediction model of salient areas in intentionally degraded videos for PwD studies. An optimal transfer learning strategy was deployed to train the model with a very small amount of training data. The model was compared with gaze fixation maps and with classical visual attention prediction models. The results are of interest with regard to the reaction of normal control subjects to degraded areas in videos.
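To make the comparison with gaze fixation maps more concrete, the following is a minimal sketch of one common way to do it: build a fixation density map by placing a Gaussian on each recorded fixation, then score a predicted saliency map against it with the Pearson correlation coefficient. The abstract does not state which metric or smoothing was used, so the function names, the `sigma` value, the frame size, and the sample fixations are all illustrative assumptions rather than the authors' protocol.

```python
import numpy as np


def fixation_density_map(fixations, height, width, sigma=10.0):
    """Sum an isotropic Gaussian centred on each (x, y) fixation.

    The result is normalised so that its maximum is 1.
    `sigma` (in pixels) is an assumed smoothing width, not a value
    taken from the paper.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        sq_dist = (xs - fx) ** 2 + (ys - fy) ** 2
        density += np.exp(-sq_dist / (2.0 * sigma ** 2))
    peak = density.max()
    return density / peak if peak > 0 else density


def pearson_cc(map_a, map_b):
    """Pearson correlation between two maps of the same shape.

    Returns 0.0 when either map is constant (correlation undefined).
    """
    a = map_a.ravel() - map_a.mean()
    b = map_b.ravel() - map_b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


# Hypothetical fixations (x, y) on a 64x48 frame, for illustration only.
fixations = [(20, 15), (22, 17), (40, 30)]
gfm = fixation_density_map(fixations, height=48, width=64, sigma=4.0)

# Stand-in for a model output; a real saliency map would come from the network.
predicted = gfm.copy()
score = pearson_cc(gfm, predicted)
```

In practice `predicted` would be the output of the saliency model or of a classical baseline for the same frame, and the score would be averaged over frames; other measures such as AUC or NSS could be substituted for the correlation.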
Notes
Available at http://www.di.ens.fr/~laptev/actions/hollywood2/
Acknowledgements
This research has been supported by the University of Bordeaux, the University of Sfax, and the UNetBA grant.
Cite this article
Chaabouni, S., Benois-Pineau, J., Tison, F. et al. Prediction of visual attention with deep CNN on artificially degraded videos for studies of attention of patients with Dementia. Multimed Tools Appl 76, 22527–22546 (2017). https://doi.org/10.1007/s11042-017-4796-5
DOI: https://doi.org/10.1007/s11042-017-4796-5