DOI: 10.5555/2919332.2919806
Article

SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

Published: 07 December 2015

    Abstract

    Saliency in Context (SALICON) is an ongoing effort that aims at understanding and predicting visual attention. Conventional saliency models typically rely on low-level image statistics to predict human fixations. While these models perform significantly better than chance, there is still a large gap between model prediction and human behavior. This gap is largely due to the limited capability of models in predicting eye fixations with strong semantic content, the so-called semantic gap. This paper presents a focused study to narrow the semantic gap with an architecture based on Deep Neural Networks (DNNs). It leverages the representational power of high-level semantics encoded in DNNs pretrained for object recognition. Two key components are fine-tuning the DNNs fully convolutionally with an objective function based on saliency evaluation metrics, and integrating information at different image scales. We compare our method with 14 saliency models on 6 public eye-tracking benchmark datasets. Results demonstrate that our DNNs can automatically learn features for saliency prediction that surpass the state of the art by a large margin. In addition, our model ranks first to date under all seven metrics on the MIT300 challenge set.
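    The abstract's key idea of an objective function based on saliency evaluation metrics can be illustrated with the KL divergence between a predicted saliency map and a human fixation map, one of the standard metrics in this benchmark family. The sketch below is illustrative only (the function name and epsilon handling are assumptions, not the authors' code); both maps are normalized to probability distributions before comparison.

    ```python
    import numpy as np

    def saliency_kld(pred, fix_map, eps=1e-7):
        """KL divergence between a predicted saliency map and a
        ground-truth fixation map, each normalized to sum to 1.
        Lower is better; 0 means the distributions are identical."""
        p = pred / (pred.sum() + eps)        # predicted distribution
        q = fix_map / (fix_map.sum() + eps)  # human fixation distribution
        return float(np.sum(q * np.log(eps + q / (p + eps))))
    ```

    Because this quantity is differentiable in the predicted map, it can serve directly as a training loss for a fully convolutional network, which is the role the abstract assigns to the saliency metrics.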



    Published In

    ICCV '15: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV)
    December 2015
    4730 pages
    ISBN: 9781467383912

    Publisher

    IEEE Computer Society, United States


    Qualifiers

    • Article

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 09 Aug 2024


    Cited By

    • (2023) What do deep saliency models learn about visual attention? Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 9543-9555. DOI: 10.5555/3666122.3666540. Online publication date: 10-Dec-2023.
    • (2022) LiteReconfig. Proceedings of the Seventeenth European Conference on Computer Systems, pp. 334-351. DOI: 10.1145/3492321.3519577. Online publication date: 28-Mar-2022.
    • (2021) Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), pp. 1-24. DOI: 10.1145/3479594. Online publication date: 18-Oct-2021.
    • (2021) JDMAN: Joint Discriminative and Mutual Adaptation Networks for Cross-Domain Facial Expression Recognition. Proceedings of the 29th ACM International Conference on Multimedia, pp. 3312-3320. DOI: 10.1145/3474085.3475484. Online publication date: 17-Oct-2021.
    • (2021) Attention Transition Prediction Based on Multi-Scale Spatiotemporal Features. The 2nd International Conference on Computing and Data Science, pp. 1-4. DOI: 10.1145/3448734.3450892. Online publication date: 28-Jan-2021.
    • (2021) OpenNEEDS: A Dataset of Gaze, Head, Hand, and Scene Signals During Exploration in Open-Ended VR Environments. ACM Symposium on Eye Tracking Research and Applications, pp. 1-7. DOI: 10.1145/3448018.3457996. Online publication date: 25-May-2021.
    • (2021) Learning the Relation Between Interested Objects and Aesthetic Region for Image Cropping. IEEE Transactions on Multimedia, 23, pp. 3618-3630. DOI: 10.1109/TMM.2020.3029882. Online publication date: 1-Jan-2021.
    • (2020) Exploring Language Prior for Mode-Sensitive Visual Attention Modeling. Proceedings of the 28th ACM International Conference on Multimedia, pp. 4199-4207. DOI: 10.1145/3394171.3414008. Online publication date: 12-Oct-2020.
    • (2020) SalGCN. Proceedings of the 28th ACM International Conference on Multimedia, pp. 682-690. DOI: 10.1145/3394171.3413733. Online publication date: 12-Oct-2020.
    • (2020) Object-level Attention for Aesthetic Rating Distribution Prediction. Proceedings of the 28th ACM International Conference on Multimedia, pp. 816-824. DOI: 10.1145/3394171.3413695. Online publication date: 12-Oct-2020.
    • Show More Cited By
