research-article

Semantic Hilbert Space for Interactive Image Retrieval

Authors:

Amit Kumar Jaiswal,

Ingo Frommholz

ICTIR '21: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval

Pages 307 - 315

https://doi.org/10.1145/3471158.3472253

Published: 31 August 2021

Abstract

This paper introduces a model for interactive image retrieval that builds on the geometrical framework of information retrieval (IR). We tackle image retrieval driven by an expressive user information need given in the form of a textual-visual query, where the user is trying to find an image similar to the picture they have in mind while querying. The information need is expressed through guided visual feedback based on Information Foraging, which lets the user's perception be embedded in the model via a semantic Hilbert space (SHS). The framework rests on the mathematical formalism of quantum probabilities and aims to capture the relationship between the user's textual and image input, where the input image is treated as a form of visual feedback. We propose SHS, a quantum-inspired approach in which the textual-visual query is regarded as analogous to a physical system, which allows us to model different system states and their dynamic changes based on observations (such as queries and relevance judgements). The model learns a multimodal representation of the input and the relationships between textual-image queries for retrieving images. Experiments on the MIT States and Fashion200k datasets demonstrate the effectiveness of the approach in finding particular images automatically when the user inputs are semantically expressive.
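
The state-and-observation analogy in the abstract can be illustrated with a toy example. The sketch below is a generic illustration of the geometrical IR view, not the SHS model itself. It assumes the text and image inputs are already available as small feature vectors, composes them into a unit state by a weighted superposition, scores candidates with the Born rule, and updates the state after relevance judgements by projecting onto the subspace spanned by the judged-relevant items. All function names, the `weight` parameter, and the three-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def inner(u, v):
    """Hermitian inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that it can act as a state vector."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    if norm == 0:
        raise ValueError("cannot normalise the zero vector")
    return [x / norm for x in v]


def superpose(text_vec, image_vec, weight=0.5):
    """Assumed query composition: weighted superposition of text and image."""
    t, i = normalize(text_vec), normalize(image_vec)
    a, b = math.sqrt(weight), math.sqrt(1.0 - weight)
    return normalize([a * x + b * y for x, y in zip(t, i)])


def born_probability(state, candidate):
    """Born rule: probability |<candidate|state>|^2 of observing the candidate."""
    return abs(inner(normalize(candidate), state)) ** 2


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = [complex(x) for x in v]
        for b in basis:
            c = inner(b, w)
            w = [x - c * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        if norm > tol:  # skip vectors already contained in the span
            basis.append([x / norm for x in w])
    return basis


def update_on_feedback(state, relevant_items):
    """Project the state onto the span of judged-relevant items, renormalise."""
    projected = [0j] * len(state)
    for b in orthonormal_basis(relevant_items):
        c = inner(b, state)
        projected = [p + c * y for p, y in zip(projected, b)]
    return normalize(projected)  # raises if the state is orthogonal to the span


if __name__ == "__main__":
    # Toy three-dimensional feature space with made-up vectors.
    query = superpose(text_vec=[1, 0, 0], image_vec=[0, 1, 0])
    candidates = {"img_a": [1, 0, 0], "img_b": [0, 1, 0], "img_c": [0, 0, 1]}
    ranking = sorted(candidates,
                     key=lambda k: born_probability(query, candidates[k]),
                     reverse=True)
    print("before feedback:", ranking)
    query = update_on_feedback(query, [candidates["img_a"], candidates["img_c"]])
    print("P(img_a) after feedback:", born_probability(query, candidates["img_a"]))
```

In this toy setting the initial query gives `img_a` and `img_b` a probability of about 0.5 each and `img_c` a probability of zero. After the user marks `img_a` and `img_c` as relevant, the projected state lies entirely along `img_a`, which shows how an observation changes the system state and therefore the ranking.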

Cited By

Taheri F, Rahbar K, and Salimi P. 2022. Effective features in content-based image retrieval from a combination of low-level features and deep Boltzmann machine. Multimedia Tools and Applications, Vol. 82, 24, 37959--37982. Online publication date: 30-Aug-2022.
https://dl.acm.org/doi/10.1007/s11042-022-13670-w

Index Terms

Semantic Hilbert Space for Interactive Image Retrieval
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval
        Image search
    2. Users and interactive retrieval
      1. Personalization

Published In

ICTIR '21: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval

July 2021

334 pages

ISBN:9781450386111

DOI:10.1145/3471158

General Chair: Faegheh Hasibi (Radboud University, Netherlands)
Program Chairs: Yi Fang (Santa Clara University, USA), Akiko Aizawa (National Institute of Informatics, Japan)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

MSCA European Union's H2020

Conference

ICTIR '21

Sponsor:

SIGIR

ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval

July 11, 2021

Virtual Event, Canada

Acceptance Rates

Overall Acceptance Rate 235 of 527 submissions, 45%
