Relevance estimation with multiple information sources on search engine result pages
Proceedings of the 27th ACM International Conference on Information and …, 2018•dl.acm.org
Relevance estimation is among the most important tasks in the ranking of search results because most search engines follow the Probability Ranking Principle. Current relevance estimation methodologies mainly concentrate on text matching between the query and Web documents, link analysis, and user behavior models. However, users judge the relevance of search results directly from Search Engine Result Pages (SERPs), which provide valuable signals for reranking. Modern search engines aggregate heterogeneous information items (such as images, news, and hyperlinks) into a single ranked list on SERPs. The aggregated search results differ in visual patterns, textual semantics, and presentation structures, and a better strategy should rely on all of these information sources to improve ranking performance. In this paper, we propose a novel framework named the Joint Relevance Estimation model (JRE), which learns visual patterns from screenshots of search results, explores presentation structures from the HTML source code, and also adopts the semantic information of the textual content. To evaluate the performance of the proposed model, we construct a large-scale practical Search Result Relevance (SRR) dataset, which consists of multiple information sources and 4-grade relevance scores for over 60,000 search results. Experimental results show that the proposed JRE model achieves better performance than state-of-the-art ranking solutions as well as the original ranking of commercial search engines.
ACM Digital Library