The use of categorization information in language models for question retrieval
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009
Community Question Answering (CQA) has emerged as a popular type of service meeting a wide range of information needs. Such services enable users to ask and answer questions and to access existing question-answer pairs. CQA archives contain very large volumes of valuable user-generated content and have become important information resources on the Web. To make the body of knowledge accumulated in CQA archives accessible, effective and efficient question search is required. Question search in a CQA archive aims to retrieve historical questions that are relevant to new questions posed by users. This paper proposes a category-based framework for search in CQA archives. The framework embodies several new techniques that use language models to exploit the categories of questions in order to improve question-answer search. Experiments conducted on real data from Yahoo! Answers demonstrate that the proposed techniques are effective and efficient, and that they significantly outperform baseline methods.
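The abstract does not spell out the models themselves. As an illustration only, one common way to let a language model exploit question categories is to smooth each archived question's unigram model with the model of its category, and then with the whole collection. The Python sketch below assumes that reading. The function names, the interpolation weights `lam` and `beta`, and the toy archive are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ml_model(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def score(query, question, category_lm, collection_lm, lam=0.2, beta=0.2):
    """Log query likelihood under a question model that is smoothed first
    with its category model and then with the collection model.
    lam and beta are illustrative weights, not values from the paper."""
    question_lm = ml_model(question)
    log_p = 0.0
    for w in query:
        background = ((1 - beta) * category_lm.get(w, 0.0)
                      + beta * collection_lm.get(w, 0.0))
        p = (1 - lam) * question_lm.get(w, 0.0) + lam * background
        if p == 0.0:  # word unseen in the whole collection
            return float("-inf")
        log_p += math.log(p)
    return log_p


def rank(query, archive, lam=0.2, beta=0.2):
    """Return archive indices ordered from best to worst match."""
    collection_lm = ml_model([w for _, toks in archive for w in toks])
    by_category = {}
    for cat, toks in archive:
        by_category.setdefault(cat, []).extend(toks)
    category_lms = {cat: ml_model(toks) for cat, toks in by_category.items()}
    scores = [score(query, toks, category_lms[cat], collection_lm, lam, beta)
              for cat, toks in archive]
    return sorted(range(len(archive)), key=lambda i: scores[i], reverse=True)


# Toy archive of (category, tokenized historical question) pairs.
archive = [
    ("travel", "where to stay in paris".split()),
    ("travel", "cheap hotel near airport".split()),
    ("computers", "paris themed desktop wallpaper".split()),
    ("computers", "fix laptop screen flicker".split()),
]
query = "paris hotel".split()

with_category = rank(query, archive)               # category smoothing on
without_category = rank(query, archive, beta=1.0)  # collection smoothing only
```

In this toy setup the travel question about staying in Paris never mentions "hotel", but its category does, so category smoothing ranks it above the computers question that also mentions "paris". Setting `beta=1.0` drops the category model and reverses that order.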