Research Article

Retargeting Semantically-Rich Photos

Published: 01 September 2015

Abstract

Semantically-rich photos contain a variety of semantic objects (e.g., pedestrians and bicycles). Retargeting such photos is challenging because each semantic object has fixed geometric characteristics, and shrinking these objects simultaneously during retargeting is prone to distortion. In this paper, we propose to retarget semantically-rich photos by detecting photo semantics from image tags, which are predicted by a multi-label SVM. The key technique is a generative model termed latent stability discovery (LSD), which robustly localizes the semantic objects in a photo by exploiting the predicted, noisy image tags. Based on LSD, a feature fusion algorithm detects salient regions at both the low and high levels. These salient regions are then linked sequentially into a path that simulates human visual perception. Finally, we learn the prior distribution of such paths from aesthetically pleasing training photos; this prior enforces the path of a retargeted photo to be maximally similar to the paths of the training photos. In the experiment, we collect 217 photos at 1600 × 1200 resolution, each containing more than seven salient objects. Comprehensive user studies demonstrate the competitiveness of our method.


Cited By

  • (2019) "CMAIR: content and mask-aware image retargeting," Multimedia Tools and Applications, vol. 78, no. 15, pp. 21731–21758, Aug. 2019. DOI: 10.1007/s11042-019-7462-2
  • (2018) "A Survey on Content-Aware Image and Video Retargeting," ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 14, no. 3, pp. 1–28, Jul. 2018. DOI: 10.1145/3231598
  • (2016) "Unified Photo Enhancement by Discovering Aesthetic Communities From Flickr," IEEE Transactions on Image Processing, vol. 25, no. 3, pp. 1124–1135, Mar. 2016. DOI: 10.1109/TIP.2016.2514499

Publisher

IEEE Press
