DOI: 10.1145/3471158.3472253
research-article

Semantic Hilbert Space for Interactive Image Retrieval

Published: 31 August 2021

Abstract

This paper introduces a model for interactive image retrieval that builds on the geometrical framework of information retrieval (IR). We tackle image retrieval driven by an expressive user information need in the form of a textual-visual query, where the user is trying to find an image similar to the picture they have in mind while querying. The information need is expressed through guided visual feedback based on Information Foraging, which lets the user's perception be embedded in the model via a semantic Hilbert space (SHS). The framework is based on the mathematical formalism of quantum probabilities and aims to capture the relationship between the user's textual and image input, where the input image is treated as a form of visual feedback. We propose SHS, a quantum-inspired approach in which the textual-visual query is regarded as analogous to a physical system, which allows for modelling different system states and their dynamic changes based on observations (such as queries and relevance judgements). The model learns a multimodal representation of the input and the relationships between textual and image queries for retrieving images. Experiments on the MIT States and Fashion200k datasets demonstrate the effectiveness of finding particular images automatically when the user inputs are semantically expressive.
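
The abstract does not give the model's equations, so the following is only a toy sketch of the general quantum-probability idea it refers to, not the authors' method. It assumes that text and image inputs are already available as vectors in a small complex space, that the query is formed as a weighted superposition of the two, that relevance is scored with the Born rule, and that feedback updates the query by projecting it onto a subspace. All function names, the weighting scheme, and the three-dimensional vectors are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length so it can act as a state vector."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def inner(u, v):
    """Hermitian inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def superpose(text_state, image_state, w_text=0.5):
    """Hypothetical query composition: weighted superposition of two states."""
    w_image = 1.0 - w_text
    mixed = [
        math.sqrt(w_text) * t + math.sqrt(w_image) * i
        for t, i in zip(text_state, image_state)
    ]
    return normalize(mixed)


def relevance_probability(query_state, candidate_state):
    """Born rule: squared magnitude of the overlap between the two states."""
    return abs(inner(candidate_state, query_state)) ** 2


def update_on_feedback(query_state, feedback_basis):
    """Project the query onto the span of an orthonormal feedback basis."""
    projected = [0j] * len(query_state)
    for basis_vec in feedback_basis:
        amplitude = inner(basis_vec, query_state)
        projected = [p + amplitude * b for p, b in zip(projected, basis_vec)]
    return normalize(projected)


# Toy 3-dimensional example (assumed values, for illustration only).
text = normalize([1, 0, 0])
image = normalize([0, 1, 0])
query = superpose(text, image)

candidate_a = normalize([1, 1, 0])   # overlaps with both modalities
candidate_b = normalize([0, 0, 1])   # orthogonal to the query

p_a = relevance_probability(query, candidate_a)
p_b = relevance_probability(query, candidate_b)

# Feedback that keeps only the first axis changes the state, and with it
# the score of candidate_a.
updated = update_on_feedback(query, [text])
p_a_after = relevance_probability(updated, candidate_a)
```

In this sketch, a candidate aligned with the query state scores close to 1, an orthogonal candidate scores 0, and projecting the query onto a feedback subspace changes later scores. That is one simple way to read "dynamic changes based on observations"; the paper's actual representation and update rule may differ.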

Cited By

  • (2022) Effective features in content-based image retrieval from a combination of low-level features and deep Boltzmann machine. Multimedia Tools and Applications 82:24 (37959-37982). DOI: 10.1007/s11042-022-13670-w. Online publication date: 30-Aug-2022

Published In

ICTIR '21: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval
July 2021
334 pages
ISBN:9781450386111
DOI:10.1145/3471158

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image search
  2. information retrieval
  3. quantum theory

Qualifiers

  • Research-article

Funding Sources

  • MSCA European Union's H2020

Conference

ICTIR '21

Acceptance Rates

Overall Acceptance Rate 235 of 527 submissions, 45%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 03 Feb 2025
