Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks

Published: 09 August 2015

Abstract

Learning a similarity function between pairs of objects is at the core of learning to rank approaches. In information retrieval tasks we typically deal with query-document pairs; in question answering, with question-answer pairs. However, before learning can take place, such pairs need to be mapped from the original space of symbolic words into some feature space encoding various aspects of their relatedness, e.g. lexical, syntactic and semantic. Feature engineering is often a laborious task and may require external knowledge sources that are not always available or are difficult to obtain. Recently, deep learning approaches have gained a lot of attention from the research community and industry for their ability to automatically learn an optimal feature representation for a given task, while claiming state-of-the-art performance in many tasks in computer vision, speech recognition and natural language processing. In this paper, we present a convolutional neural network architecture for reranking pairs of short texts, where we learn the optimal representation of text pairs and a similarity function to relate them in a supervised way from the available training data. Our network takes only words as input, thus requiring minimal preprocessing. In particular, we consider the task of reranking short text pairs where the elements of the pair are sentences. We test our deep learning system on two popular retrieval tasks from TREC: Question Answering and Microblog Retrieval. Our model demonstrates strong performance on the first task, beating previous state-of-the-art systems by about 3 absolute percentage points in both MAP and MRR, and shows comparable results on tweet reranking, while enjoying the benefits of no manual feature engineering and no additional syntactic parsers.
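The abstract reports results in MAP (mean average precision) and MRR (mean reciprocal rank). As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed for a reranking task. It is not taken from the paper: the function names are hypothetical, relevance is assumed to be binary, and each candidate list is assumed to contain every relevant item for its query (as is typical when reranking a fixed candidate pool).

```python
from typing import Callable, List, Sequence


def average_precision(labels: Sequence[int]) -> float:
    """Average precision for one query.

    `labels` holds 0/1 relevance judgments in ranked order (best first).
    Assumption: all relevant items for the query appear in `labels`.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels: Sequence[int]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mean_over_queries(
    rankings: List[Sequence[int]],
    metric: Callable[[Sequence[int]], float],
) -> float:
    """Average a per-query metric over all queries (gives MAP or MRR)."""
    return sum(metric(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two toy queries; each list is the relevance of candidates as ranked.
    toy = [[1, 0, 1], [0, 1]]
    print("MAP:", mean_over_queries(toy, average_precision))
    print("MRR:", mean_over_queries(toy, reciprocal_rank))
```

Under these definitions, an improvement of 3 absolute points corresponds to a difference of 0.03 in the metric value, for example from 0.70 to 0.73.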

Published In

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2015
1198 pages
ISBN:9781450336215
DOI:10.1145/2766462

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. convolutional neural networks
  2. learning to rank
  3. microblog search
  4. question answering

Qualifiers

  • Research-article

Funding Sources

  • Google Europe Doctoral Fellowship Award 2013
  • CogNet H2020-ICT-2014-2

Conference

SIGIR '15

Acceptance Rates

SIGIR '15 Paper Acceptance Rate 70 of 351 submissions, 20%;
Overall Acceptance Rate 792 of 3,983 submissions, 20%
