survey

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement, and Retrieval

Published: 06 June 2016

Abstract

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise on three closely linked problems (i.e., image tag assignment, refinement, and tag-based image retrieval) is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, that is, estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this article introduces a two-dimensional taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison with the state of the art, a new experimental protocol is presented, with training sets containing 10,000, 100,000, and 1 million images, and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.

        Published In

        ACM Computing Surveys  Volume 49, Issue 1
        March 2017
        705 pages
        ISSN:0360-0300
        EISSN:1557-7341
        DOI:10.1145/2911992
        Editor: Sartaj Sahni
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 06 June 2016
        Accepted: 01 March 2016
        Revised: 01 December 2015
        Received: 01 March 2015
        Published in CSUR Volume 49, Issue 1

        Author Tags

        1. Social media
        2. content-based image retrieval
        3. social tagging
        4. tag assignment
        5. tag refinement
        6. tag relevance
        7. tag retrieval

        Qualifiers

        • Survey
        • Research
        • Refereed

        Funding Sources

        • Research Funds of Renmin University of China
        • NSFC
        • SRF for ROCS, SEM
        • SRFDP
        • STW STORY project, Telecom Italia PhD
        • Dutch national program COMMIT
        • AQUIS-CH
        • EC's FP7
        • Fundamental Research Funds for the Central Universities
        • Tuscany Region (Italy)
