DOI: 10.5555/2919332.2919806
Article

SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

Published: 07 December 2015

    Abstract

    Saliency in Context (SALICON) is an ongoing effort that aims at understanding and predicting visual attention. Conventional saliency models typically rely on low-level image statistics to predict human fixations. While these models perform significantly better than chance, there is still a large gap between model prediction and human behavior. This gap is largely due to the limited capability of models in predicting eye fixations with strong semantic content, the so-called semantic gap. This paper presents a focused study to narrow the semantic gap with an architecture based on Deep Neural Networks (DNNs). It leverages the representational power of high-level semantics encoded in DNNs pretrained for object recognition. Two key components are fine-tuning the DNNs fully convolutionally with an objective function based on saliency evaluation metrics, and integrating information at different image scales. We compare our method with 14 saliency models on 6 public eye-tracking benchmark datasets. Results demonstrate that our DNNs can automatically learn features for saliency prediction that surpass the state of the art by a large margin. In addition, our model ranks first to date under all seven metrics on the MIT300 challenge set.
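    The abstract's key idea of an objective function based on saliency evaluation metrics can be illustrated with the KL divergence between a predicted saliency map and a human fixation map, one of the standard metrics in this benchmark family. The sketch below is illustrative only (the function name and epsilon handling are assumptions, not the authors' code); both maps are normalized to probability distributions before comparison.

    ```python
    import numpy as np

    def saliency_kld(pred, fix_map, eps=1e-7):
        """KL divergence between a predicted saliency map and a
        ground-truth fixation map, each normalized to sum to 1.
        Lower is better; 0 means the distributions are identical."""
        p = pred / (pred.sum() + eps)        # predicted distribution
        q = fix_map / (fix_map.sum() + eps)  # human fixation distribution
        return float(np.sum(q * np.log(eps + q / (p + eps))))
    ```

    Because this quantity is differentiable in the predicted map, it can serve directly as a training loss for a fully convolutional network, which is the role the abstract assigns to the saliency metrics.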



    Published In

    ICCV '15: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV)
    December 2015
    4730 pages
    ISBN: 9781467383912

    Publisher

    IEEE Computer Society, United States


    Qualifiers

    • Article

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 09 Aug 2024


    Cited By

    • (2023) What do deep saliency models learn about visual attention? Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 9543-9555. DOI: 10.5555/3666122.3666540. Online publication date: 10-Dec-2023.
    • (2022) LiteReconfig. Proceedings of the Seventeenth European Conference on Computer Systems, pp. 334-351. DOI: 10.1145/3492321.3519577. Online publication date: 28-Mar-2022.
    • (2021) Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), pp. 1-24. DOI: 10.1145/3479594. Online publication date: 18-Oct-2021.
    • (2021) JDMAN: Joint Discriminative and Mutual Adaptation Networks for Cross-Domain Facial Expression Recognition. Proceedings of the 29th ACM International Conference on Multimedia, pp. 3312-3320. DOI: 10.1145/3474085.3475484. Online publication date: 17-Oct-2021.
    • (2021) Attention Transition Prediction Based on Multi-Scale Spatiotemporal Features. The 2nd International Conference on Computing and Data Science, pp. 1-4. DOI: 10.1145/3448734.3450892. Online publication date: 28-Jan-2021.
    • (2021) OpenNEEDS: A Dataset of Gaze, Head, Hand, and Scene Signals During Exploration in Open-Ended VR Environments. ACM Symposium on Eye Tracking Research and Applications, pp. 1-7. DOI: 10.1145/3448018.3457996. Online publication date: 25-May-2021.
    • (2021) Learning the Relation Between Interested Objects and Aesthetic Region for Image Cropping. IEEE Transactions on Multimedia, 23, pp. 3618-3630. DOI: 10.1109/TMM.2020.3029882. Online publication date: 1-Jan-2021.
    • (2020) Exploring Language Prior for Mode-Sensitive Visual Attention Modeling. Proceedings of the 28th ACM International Conference on Multimedia, pp. 4199-4207. DOI: 10.1145/3394171.3414008. Online publication date: 12-Oct-2020.
    • (2020) SalGCN. Proceedings of the 28th ACM International Conference on Multimedia, pp. 682-690. DOI: 10.1145/3394171.3413733. Online publication date: 12-Oct-2020.
    • (2020) Object-level Attention for Aesthetic Rating Distribution Prediction. Proceedings of the 28th ACM International Conference on Multimedia, pp. 816-824. DOI: 10.1145/3394171.3413695. Online publication date: 12-Oct-2020.
    • Show More Cited By
