DOI: 10.1145/3293339.3293345
research-article

Question-Question Similarity in Online Forums

Published: 06 December 2018

Abstract

In this paper, we applied a deep learning framework to the task of finding duplicate questions. We implemented several models that follow the siamese architecture, using popular recurrent networks such as Long Short-Term Memory (LSTM) and bidirectional Long Short-Term Memory (biLSTM), to measure the semantic similarity between questions. We started with a basic model and extended it into three different models. Our models provide a refined, composite representation of the questions. Combining a Convolutional Neural Network (CNN) with the recurrent networks is a new approach to sentence representation. We also applied an attention mechanism to better capture the contextual meaning of the questions: we generated a representation of one question according to the context of the other question. As neural models are data driven, we trained our models extensively on question-question pairs drawn from a large-scale real-life dataset. We used a dataset of 400K labeled question pairs published by Quora, a well-known question-answer forum. We evaluated our models using accuracy, precision, recall, and F1 score. Our methods and experiments demonstrate significant improvements over the baseline systems and the state-of-the-art systems.
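
To make the siamese idea in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a deliberately simple shared encoder (a mean of hash-derived word vectors) standing in for the LSTM/biLSTM encoders the paper describes, and it adds a plain implementation of the four metrics named above. The names `DIM`, `word_vector`, `encode`, `similarity`, and `binary_metrics` are hypothetical and chosen only for this illustration.

```python
import hashlib
import math

DIM = 16  # hypothetical embedding size, chosen only for this sketch


def word_vector(word):
    """Deterministic pseudo-embedding for a word.

    Assumption: this stands in for trained word vectors; it carries no semantics.
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:DIM]]


def encode(question):
    """Shared encoder: the same function is applied to both questions.

    Assumption: a mean of word vectors replaces the recurrent encoder here.
    """
    words = question.lower().split()
    if not words:
        return [0.0] * DIM
    vectors = [word_vector(w) for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def similarity(q1, q2):
    """Siamese comparison: encode both inputs with the same encoder, then compare."""
    return cosine(encode(q1), encode(q2))


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = duplicate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a trained system, the score from `similarity` would be thresholded (or fed to a classifier) to label a pair as duplicate or not, and `binary_metrics` would then be computed over the held-out pairs. The point of the sketch is only the weight sharing: both questions pass through the same `encode` function before comparison.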



Published In

FIRE '18: Proceedings of the 10th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2018
68 pages
ISBN:9781450362085
DOI:10.1145/3293339
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • ISI: Information Sciences Institute

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada

Conference

FIRE'18
FIRE'18: Forum for Information Retrieval Evaluation
December 6 - 9, 2018
Gandhinagar, India

Acceptance Rates

Overall Acceptance Rate 19 of 64 submissions, 30%

Article Metrics

  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Revolutionizing Duplicate Question Detection: A Deep Learning Approach for Stack Overflow. IgMin Research 2:1 (001-005). DOI: 10.61927/igmin135. Online publication date: 9-Jan-2024
  • (2024) Enhancing User Experience on Q&A Platforms: Measuring Text Similarity Based on Hybrid CNN-LSTM Model for Efficient Duplicate Question Detection. IEEE Access 12 (34512-34526). DOI: 10.1109/ACCESS.2024.3358422. Online publication date: 2024
  • (2023) An answer recommendation framework for an online cancer community forum. Multimedia Tools and Applications 83:1 (173-199). DOI: 10.1007/s11042-023-15477-9. Online publication date: 15-May-2023
  • (2023) Identifying Duplicate Questions Leveraging Recurrent Neural Network. Proceedings of the Fourth International Conference on Trends in Computational and Cognitive Engineering (331-341). DOI: 10.1007/978-981-19-9483-8_28. Online publication date: 28-May-2023
  • (2022) Identifying Similar Questions in the Medical Domain Using a Fine-tuned Siamese-BERT Model. 2022 IEEE 19th India Council International Conference (INDICON) (1-6). DOI: 10.1109/INDICON56171.2022.10040144. Online publication date: 24-Nov-2022
  • (2022) Siamese Network with Transfer Learning for Similar Query Retrieval in Online Health Community Forums. Advances in Micro-Electronics, Embedded Systems and IoT (337-346). DOI: 10.1007/978-981-16-8550-7_32. Online publication date: 23-Apr-2022
  • (2020) Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity. Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence (1-8). DOI: 10.1145/3446132.3446403. Online publication date: 24-Dec-2020
  • (2020) Effective Transfer Learning for Identifying Similar Questions. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (3458-3465). DOI: 10.1145/3394486.3412861. Online publication date: 23-Aug-2020
  • (2020) Leveraging Statistic and Semantic Features for Similar Question Detection Using Fusion XGBoost. Database Systems for Advanced Applications. DASFAA 2020 International Workshops (106-120). DOI: 10.1007/978-3-030-59413-8_9. Online publication date: 22-Sep-2020
