Teach Machine How to Read: Reading Behavior Inspired Relevance Estimation

Authors:

Shaoping Ma

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 795 - 804

https://doi.org/10.1145/3331184.3331205

Published: 18 July 2019

Abstract

Retrieval models aim to estimate the relevance of a document to a given query. Although existing retrieval models have been highly successful both in deepening our understanding of information-seeking behavior and in building practical retrieval systems (e.g., Web search engines), they work quite differently from the way humans make relevance judgments. In this paper, we re-examine existing models and propose new ones based on findings about how humans read documents during relevance judgment. First, we summarize a number of reading heuristics from practical user behavior patterns and categorize them into implicit and explicit heuristics. By reviewing a variety of existing retrieval models, we find that most of them satisfy only some of these reading heuristics. To evaluate the effectiveness of each heuristic, we conduct an ablation study and find that most heuristics have a positive impact on retrieval performance. We further integrate all the effective heuristics into a new retrieval model named Reading Inspired Model (RIM). Specifically, implicit reading heuristics are incorporated into the model framework, while explicit reading heuristics are modeled as a Markov Decision Process and learned by reinforcement learning. Experimental results on a large-scale publicly available benchmark dataset and two test sets from the NTCIR WWW tasks show that RIM outperforms most existing models, which illustrates the effectiveness of the reading heuristics. We believe that this work contributes to constructing retrieval models with both higher retrieval performance and better explainability.
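
The abstract states only that explicit reading heuristics are cast as a Markov Decision Process and learned by reinforcement learning; it does not specify the states, actions, or reward. As a rough illustration of what such a formulation could look like, the toy sketch below assumes a reader that accumulates per-sentence evidence and decides after each sentence whether to stop, with a one-parameter logistic policy and a REINFORCE-style gradient estimate. The action set, the `sentence_scores` input, the reward, and every function name are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical action set; the abstract does not enumerate the model's actions.
STOP, CONTINUE = 0, 1


def stop_probability(theta, evidence):
    """Logistic policy: probability of stopping given accumulated evidence."""
    return 1.0 / (1.0 + math.exp(-theta * evidence))


def rollout(theta, sentence_scores, choose):
    """Read sentences in order until the policy stops or the document ends.

    `choose(p_stop)` maps a stop probability to STOP or CONTINUE. It is a
    parameter so that the episode can be made deterministic for illustration.
    Returns (trajectory, evidence), where trajectory is a list of
    (evidence_at_decision, action) pairs.
    """
    evidence = 0.0
    trajectory = []
    for score in sentence_scores:
        evidence += score
        action = choose(stop_probability(theta, evidence))
        trajectory.append((evidence, action))
        if action == STOP:
            break
    return trajectory, evidence


def reinforce_gradient(theta, trajectory, reward):
    """REINFORCE estimate: reward times the sum of d/dtheta log pi(action | state)."""
    grad = 0.0
    for evidence, action in trajectory:
        p_stop = stop_probability(theta, evidence)
        # For a logistic policy: (1 - p) * x if STOP was taken, -p * x if CONTINUE.
        grad += ((1.0 - p_stop) if action == STOP else -p_stop) * evidence
    return reward * grad


# Assumed inputs: three per-sentence match scores and a thresholded chooser.
scores = [0.2, 0.5, 0.1]
trajectory, evidence = rollout(
    1.0, scores, lambda p: STOP if p >= 0.65 else CONTINUE
)
# The reward is a placeholder, e.g. 1.0 when the final estimate agrees with a label.
gradient = reinforce_gradient(1.0, trajectory, reward=1.0)
print(trajectory, evidence, gradient)
```

In practice the chooser would sample from the policy rather than apply a threshold, and the gradient would be averaged over many episodes; the threshold is used here only so the episode is reproducible.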

Index Terms

1. Information systems
  1. Information retrieval

Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2019

1512 pages

ISBN:9781450361729

DOI:10.1145/3331184

General Chairs:
Benjamin Piwowarski (CNRS - Sorbonne Universite, France)
Max Chevalier (Universite de Toulouse, CNRS, France)
Eric Gaussier (Universite Grenoble Alpes, CNRS, France)

Program Chairs:
Yoelle Maarek (Amazon Research, Israel)
Jian-Yun Nie (University of Montreal, Canada)
Falk Scholer (RMIT University, Australia)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Key Research and Development Program of China
Natural Science Foundation of China

Conference

SIGIR '19

Sponsor:

SIGIR

SIGIR '19: The 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 21 - 25, 2019

Paris, France

Acceptance Rates

SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
