Abstract
With the rapid growth of multimedia data, cross-modal retrieval has received considerable attention. Learning semantic correlation is the primary means of bridging the heterogeneity gap between modalities. Existing approaches usually focus on modeling cross-modal correlation and category correlation, which cannot fully capture semantic correlation in social multimedia data. In fact, the diverse link information in such data is complementary and provides rich hints about semantic correlation. In this paper, we propose a novel cross-modal correlation learning approach based on subspace learning that takes both heterogeneous social link information and content information into account. Intra-modal and inter-modal correlations are considered simultaneously by explicitly modeling link information. These correlations are then incorporated into the final representation, which further improves cross-modal retrieval performance. Experimental results demonstrate that the proposed approach outperforms several state-of-the-art cross-modal correlation learning approaches.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61501050).
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Zhang, L., Liu, F., Zeng, Z. (2018). Combining Link and Content Correlation Learning for Cross-Modal Retrieval in Social Multimedia. In: Zu, Q., Hu, B. (eds) Human Centered Computing. HCC 2017. Lecture Notes in Computer Science, vol 10745. Springer, Cham. https://doi.org/10.1007/978-3-319-74521-3_54
DOI: https://doi.org/10.1007/978-3-319-74521-3_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74520-6
Online ISBN: 978-3-319-74521-3
eBook Packages: Computer Science (R0)