Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks

Authors: Aliaksei Severyn, Alessandro Moschitti

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 373 - 382

https://doi.org/10.1145/2766462.2767738

Published: 09 August 2015

Abstract

Learning a similarity function between pairs of objects is at the core of learning to rank approaches. In information retrieval tasks we typically deal with query-document pairs; in question answering, with question-answer pairs. However, before learning can take place, such pairs need to be mapped from the original space of symbolic words into a feature space that encodes various aspects of their relatedness, e.g. lexical, syntactic and semantic. Feature engineering is often laborious and may require external knowledge sources that are not always available or are difficult to obtain. Recently, deep learning approaches have gained a lot of attention from the research community and industry for their ability to automatically learn an optimal feature representation for a given task, while claiming state-of-the-art performance in many tasks in computer vision, speech recognition and natural language processing. In this paper, we present a convolutional neural network architecture for reranking pairs of short texts, where we learn the optimal representation of text pairs and a similarity function to relate them in a supervised way from the available training data. Our network takes only words as input, thus requiring minimal preprocessing. In particular, we consider the task of reranking short text pairs where the elements of each pair are sentences. We test our deep learning system on two popular retrieval tasks from TREC: Question Answering and Microblog Retrieval. Our model demonstrates strong performance on the first task, beating previous state-of-the-art systems by about 3% absolute in both MAP and MRR, and shows comparable results on tweet reranking, while enjoying the benefits of no manual feature engineering and no additional syntactic parsers.
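The abstract reports results in MAP and MRR. As a minimal sketch of how these two ranking metrics are commonly computed (not the authors' evaluation code), the snippet below assumes each query comes with a list of binary relevance labels in the order produced by the reranker, and that every relevant candidate appears somewhere in that list. The function names and the toy data are hypothetical.

```python
from typing import Callable, Sequence


def average_precision(labels: Sequence[int]) -> float:
    """Average precision for one query.

    `labels` holds 0/1 relevance judgements in ranked order (best first).
    Assumption: all relevant candidates are present in the list, so we
    normalise by the number of relevant items found in it.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels: Sequence[int]) -> float:
    """1 / rank of the first relevant candidate, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_metric(
    rankings: Sequence[Sequence[int]],
    metric: Callable[[Sequence[int]], float],
) -> float:
    """Average a per-query metric over all queries (gives MAP or MRR)."""
    return sum(metric(r) for r in rankings) / len(rankings)


# Toy data: three queries, candidates already sorted by model score.
rankings = [[1, 0, 1], [0, 1, 0, 1], [0, 0, 1]]
map_score = mean_metric(rankings, average_precision)
mrr_score = mean_metric(rankings, reciprocal_rank)
print(f"MAP={map_score:.4f} MRR={mrr_score:.4f}")
```

MAP rewards placing all relevant candidates near the top, whereas MRR looks only at the first relevant one, which is why the two are usually reported together for answer reranking.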

Published In

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval

August 2015

1198 pages

ISBN:9781450336215

DOI:10.1145/2766462

General Chair: Ricardo Baeza-Yates (Yahoo Labs, USA)

Program Chairs: Mounia Lalmas (Yahoo Labs, UK), Alistair Moffat (University of Melbourne, Australia), Berthier Ribeiro-Neto (Google, Brazil, and UFMG, Brazil)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Google Europe Doctoral Fellowship Award 2013
CogNet H2020-ICT-2014-2

Conference

SIGIR '15: The 38th International ACM SIGIR Conference on Research and Development in Information Retrieval

Sponsor: SIGIR

August 9 - 13, 2015

Santiago, Chile

Acceptance Rates

SIGIR '15 Paper Acceptance Rate 70 of 351 submissions, 20%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
