research-article

Improving the representation of image descriptions for semantic image retrieval with RDF

Published: 25 June 2024

Abstract

    The past few years have witnessed a surge of interest in topics at the intersection of natural language processing and computer vision. In particular, using objects together with their attributes and relations to represent images or interpret language has proved useful across a wide variety of applications. The goal of this work is to provide an improved RDF-based model for representing images in order to enhance text-based image retrieval. We use natural language processing tools to obtain a set of objects, attributes, and relations, and then model them as graph structures with an RDF-based model. We also conduct preliminary experiments to show how to handle text-based image retrieval for complex or multilingual queries. The experimental results show that our approach improves the representation of image descriptions and is suitable for enhancing image retrieval with high-level semantics.
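
The pipeline described in the abstract (objects, attributes, and relations turned into an RDF-style graph that can then be queried) might look roughly like the sketch below. It is not the authors' implementation. The namespaces `IMG` and `VOC`, the predicates `contains` and `hasAttribute`, and the helper names are all hypothetical, and the parsed caption is written by hand rather than produced by an NLP tool. Plain tuples stand in for an RDF library, and `match` is a simple triple-pattern lookup, not SPARQL.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespaces for illustration only; they are not from the paper.
IMG = "http://example.org/image/"
VOC = "http://example.org/vocab/"


def node(image_id: str, name: str) -> str:
    """IRI for an object mentioned in one image's description."""
    return f"{IMG}{image_id}/{name}"


def description_to_triples(
    image_id: str,
    objects: Iterable[str],
    attributes: Iterable[Tuple[str, str]],
    relations: Iterable[Tuple[str, str, str]],
) -> Set[Triple]:
    """Turn parsed objects, attributes and relations into subject-predicate-object triples."""
    triples: Set[Triple] = set()
    for obj in objects:
        # Link the image to every object found in its description.
        triples.add((IMG + image_id, VOC + "contains", node(image_id, obj)))
    for obj, attr in attributes:
        triples.add((node(image_id, obj), VOC + "hasAttribute", VOC + attr))
    for subj, pred, obj in relations:
        triples.add((node(image_id, subj), VOC + pred, node(image_id, obj)))
    return triples


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples as N-Triples lines (every term here is an IRI)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    # Hand-written parse of the caption "a brown dog chases a red ball".
    graph = description_to_triples(
        "img1",
        objects=["dog", "ball"],
        attributes=[("dog", "brown"), ("ball", "red")],
        relations=[("dog", "chases", "ball")],
    )
    print(to_ntriples(graph))
    # Which objects take part in a "chases" relation?
    for subj, _, obj in match(graph, p=VOC + "chases"):
        print(subj, "chases", obj)
```

Scoping each object IRI to its image keeps the "dog" in one description distinct from the "dog" in another, while attributes and predicates share a common vocabulary so that one pattern can match across images.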

    Information

    Published In

    Journal of Visual Communication and Image Representation  Volume 73, Issue C
    Nov 2020
    236 pages

    Publisher

    Academic Press, Inc.

    United States

    Author Tags

    1. 41A05
    2. 41A10
    3. 65D05
    4. 65D17

    Author Tags

    1. Image representation
    2. RDF
    3. Image retrieval
    4. Cross-lingual retrieval
    5. Semantic image retrieval
