Research Article

Retargeting Semantically-Rich Photos

Published: 01 September 2015

Abstract

Semantically-rich photos contain a variety of semantic objects (e.g., pedestrians and bicycles). Retargeting such photos is challenging because each semantic object has fixed geometric characteristics, and shrinking these objects simultaneously during retargeting is prone to distortion. In this paper, we propose to retarget semantically-rich photos by detecting photo semantics from image tags, which are predicted by a multi-label SVM. The key technique is a generative model termed latent stability discovery (LSD), which robustly localizes the semantic objects in a photo by exploiting the predicted, noisy image tags. Based on LSD, a feature fusion algorithm detects salient regions at both the low and high levels. These salient regions are then linked sequentially into a path that simulates human visual perception. Finally, we learn the prior distribution of such paths from aesthetically pleasing training photos; this prior enforces the path of a retargeted photo to be maximally similar to the paths of the training photos. In the experiment, we collect 217 photos at 1600 × 1200 resolution, each containing more than seven salient objects. Comprehensive user studies demonstrate the competitiveness of our method.


Cited By

  • (2019) "CMAIR: content and mask-aware image retargeting," Multimedia Tools and Applications, vol. 78, no. 15, pp. 21731–21758, Aug. 2019. DOI: 10.1007/s11042-019-7462-2
  • (2018) "A Survey on Content-Aware Image and Video Retargeting," ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 14, no. 3, pp. 1–28, Jul. 2018. DOI: 10.1145/3231598
  • (2016) "Unified Photo Enhancement by Discovering Aesthetic Communities From Flickr," IEEE Transactions on Image Processing, vol. 25, no. 3, pp. 1124–1135, Mar. 2016. DOI: 10.1109/TIP.2016.2514499

Publisher

IEEE Press
