DOI: 10.1145/3331184.3331205
research-article

Teach Machine How to Read: Reading Behavior Inspired Relevance Estimation

Published: 18 July 2019

Abstract

Retrieval models aim to estimate the relevance of a document to a given query. Although existing retrieval models have been very successful both in deepening our understanding of information-seeking behavior and in building practical retrieval systems (e.g., Web search engines), they work quite differently from the way humans make relevance judgments. In this paper, we reexamine the existing models and propose new ones based on findings about how humans read documents during relevance judgment. First, we summarize a number of reading heuristics from practical user behavior patterns and categorize them into implicit and explicit heuristics. Reviewing a variety of existing retrieval models, we find that most of them satisfy only some of these reading heuristics. To evaluate the effectiveness of each heuristic, we conduct an ablation study and find that most heuristics have a positive impact on retrieval performance. We further integrate all the effective heuristics into a new retrieval model named Reading Inspired Model (RIM). Specifically, implicit reading heuristics are incorporated into the model framework, while explicit reading heuristics are modeled as a Markov Decision Process and learned by reinforcement learning. Experimental results on a large-scale publicly available benchmark dataset and two test sets from the NTCIR WWW tasks show that RIM outperforms most existing models, which illustrates the effectiveness of the reading heuristics. We believe that this work contributes to constructing retrieval models with both higher retrieval performance and better explainability.

Supplementary Material

MP4 File (cite4-17h00-d3.mp4)



    Published In

    SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2019
    1512 pages
    ISBN:9781450361729
    DOI:10.1145/3331184
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. reading behavior
    2. reinforcement learning
    3. retrieval model

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China
    • Natural Science Foundation of China

    Conference

    SIGIR '19
    Acceptance Rates

    SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 17
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 09 Nov 2024

    Citations

    Cited By

    • (2024) Evaluating Generative Ad Hoc Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1916-1929. DOI: 10.1145/3626772.3657849. Online publication date: 10-Jul-2024
    • (2023) A Passage-Level Reading Behavior Model for Mobile Search. Proceedings of the ACM Web Conference 2023, 3236-3246. DOI: 10.1145/3543507.3583343. Online publication date: 30-Apr-2023
    • (2023) LadRa-Net: Locally Aware Dynamic Reread Attention Net for Sentence Semantic Matching. IEEE Transactions on Neural Networks and Learning Systems 34:2, 853-866. DOI: 10.1109/TNNLS.2021.3103185. Online publication date: Feb-2023
    • (2022) Extractive Explanations for Interpretable Text Ranking. ACM Transactions on Information Systems 41:4, 1-31. DOI: 10.1145/3576924. Online publication date: 16-Dec-2022
    • (2022) Towards a Better Understanding of Human Reading Comprehension with Brain Signals. Proceedings of the ACM Web Conference 2022, 380-391. DOI: 10.1145/3485447.3511966. Online publication date: 25-Apr-2022
    • (2022) Axiomatically Regularized Pre-training for Ad hoc Search. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1524-1534. DOI: 10.1145/3477495.3531943. Online publication date: 6-Jul-2022
    • (2022) Leveraging Document-Level and Query-Level Passage Cumulative Gain for Document Ranking. Journal of Computer Science and Technology 37:4, 814-838. DOI: 10.1007/s11390-022-2031-y. Online publication date: 30-Jul-2022
    • (2022) DGA-Net: Dynamic Gaussian Attention Network for Sentence Semantic Matching. Artificial Intelligence, 203-214. DOI: 10.1007/978-3-030-93049-3_17. Online publication date: 1-Jan-2022
    • (2021) Topic-enhanced knowledge-aware retrieval model for diverse relevance estimation. Proceedings of the Web Conference 2021, 756-767. DOI: 10.1145/3442381.3449943. Online publication date: 19-Apr-2021
    • (2020) Weighting Passages Enhances Accuracy. ACM Transactions on Information Systems 39:2, 1-11. DOI: 10.1145/3428687. Online publication date: 17-Dec-2020
