
Finding similar questions in collaborative question answering archives: toward bootstrapping-based equivalent pattern learning

Published: 01 June 2012


Many questions submitted to Collaborative Question Answering (CQA) sites are similar to questions that have already been answered. We propose a precise approach to finding answers to such questions by automatically identifying “equivalent” questions that were submitted and answered in the past. Our method generates equivalent question patterns by grouping together questions that have previously received the same answers. The generated patterns serve as seeds: a new bootstrapping-based learning method matches them against more questions to extract a large number of additional equivalent patterns. The resulting patterns can be used to match a new question to an equivalent one that has already been answered, and thus to suggest potential answers automatically. We experimented with this approach on a large collection of more than 200,000 real questions drawn from the Yahoo! Answers archive, automatically acquiring over 16,991 groups of equivalent question patterns. These patterns allow our method to obtain over 57% recall and over 54% precision in suggesting answers to new questions, significantly improving over baseline methods.
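The bootstrapping idea summarized above can be illustrated with a small Python sketch. It is not the authors' algorithm: the `<X>` slot notation, the helper names, the toy questions, and the simplification that all learned patterns belong to a single equivalence group are assumptions made here for illustration. Questions that share an answer are grouped; a known pattern extracts a slot filler from one question, and any other question in the same group that mentions that filler yields a new pattern, which can in turn match further questions in the next round.

```python
import re
from collections import defaultdict

SLOT = "<X>"  # hypothetical placeholder for the variable part of a question


def normalize(question):
    """Lowercase a question and trim surrounding spaces and question marks."""
    return question.lower().strip(" ?")


def pattern_to_regex(pattern):
    """Turn a one-slot pattern such as 'how do i <X>' into a capturing regex."""
    prefix, suffix = pattern.split(SLOT)
    return re.compile("^" + re.escape(prefix) + "(.+)" + re.escape(suffix) + "$")


def group_by_answer(qa_pairs):
    """Group normalized questions by the identifier of the answer they received."""
    groups = defaultdict(set)
    for question, answer_id in qa_pairs:
        groups[answer_id].add(normalize(question))
    return groups


def bootstrap(seed_patterns, qa_pairs, max_rounds=5):
    """Grow a set of equivalent patterns from seeds (toy version, one group only)."""
    patterns = set(seed_patterns)
    groups = group_by_answer(qa_pairs)
    for _ in range(max_rounds):
        learned = set()
        for questions in groups.values():
            # Slot fillers extracted by patterns that are already known.
            fillers = set()
            for q in questions:
                for p in patterns:
                    m = pattern_to_regex(p).match(q)
                    if m:
                        fillers.add(m.group(1))
            # A question with the same answer that mentions a filler gives a new pattern.
            for q in questions:
                for f in fillers:
                    hit = re.search(r"(?<!\w)" + re.escape(f) + r"(?!\w)", q)
                    if hit:
                        candidate = q[:hit.start()] + SLOT + q[hit.end():]
                        if candidate != SLOT and candidate not in patterns:
                            learned.add(candidate)
        if not learned:
            break  # no new patterns in this round, so stop early
        patterns |= learned
    return patterns


def suggest_answer(question, patterns, qa_pairs):
    """Rewrite a new question with every pattern and look it up in the archive."""
    archive = {normalize(q): a for q, a in qa_pairs}
    q = normalize(question)
    for p in patterns:
        m = pattern_to_regex(p).match(q)
        if m:
            for other in patterns:
                rewritten = other.replace(SLOT, m.group(1))
                if rewritten in archive:
                    return archive[rewritten]
    return None


# Toy archive (invented data): each question is paired with an answer identifier.
qa_pairs = [
    ("How do I learn Python?", "a1"),
    ("What is the best way to learn Python?", "a1"),
    ("How do I bake bread?", "a2"),
    ("Any tips to bake bread?", "a2"),
    ("What is the best way to fix a bike?", "a3"),
    ("Where can I find help to fix a bike?", "a3"),
]
patterns = bootstrap({"how do i <X>"}, qa_pairs)
suggestion = suggest_answer("Any tips to learn Python?", patterns, qa_pairs)
```

In this toy run the pattern for the third answer group is only reachable through a pattern learned in the first round, which is the point of iterating. A realistic system would keep separate pattern groups, filter noisy candidates, and handle more than one slot.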









        Published In

        Information Retrieval  Volume 15, Issue 3-4
        Jun 2012
        233 pages


        Kluwer Academic Publishers

        United States

        Publication History

        Published: 01 June 2012
        Accepted: 23 January 2012
        Received: 01 April 2011

        Author Tags

        1. Collaborative question answering
        2. Equivalent pattern
        3. Bootstrapping
        4. Pattern extension


        • Research-article



        Cited By

        • (2018) Web Forum Retrieval and Text Analytics. Foundations and Trends in Information Retrieval, 12:1 (1-163). DOI: 10.1561/1500000062. Online publication date: 3-Jan-2018
        • (2017) Revealing Learner Interests through Topic Mining from Question-Answering Data. International Journal of Distance Education Technologies, 15:2 (18-32). DOI: 10.4018/IJDET.2017040102. Online publication date: 1-Apr-2017
        • (2016) User authority ranking models for community question answering. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 31:5 (2533-2542). DOI: 10.3233/JIFS-169094. Online publication date: 1-Jan-2016
        • (2016) QSem. Journal of Information Science, 42:5 (583-596). DOI: 10.1177/0165551515602457. Online publication date: 1-Oct-2016
        • (2015) Community-aware ranking algorithms for expert identification in question-answer forums. Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business (1-8). DOI: 10.1145/2809563.2809592. Online publication date: 21-Oct-2015
        • (2014) Tag-based expert recommendation in community question answering. Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (960-963). DOI: 10.5555/3191835.3192024. Online publication date: 17-Aug-2014
        • (2014) Exploring user expertise and descriptive ability in community question answering. Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (320-327). DOI: 10.5555/3191835.3191900. Online publication date: 17-Aug-2014
        • (2013) CQArank. Proceedings of the 22nd ACM international conference on Information & Knowledge Management (99-108). DOI: 10.1145/2505515.2505720. Online publication date: 27-Oct-2013
