Enforcing Topic Diversity in a Document Recommender for Conversations

Author

Habibi, Maryam and Popescu-Belis, Andrei

Conference

Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Year

2014

Figures & Tables

Figure 1: The four stages of our document recommendation approach (shown vertically: 1–4) and the four options considered in this paper (bottom line: SimM, Round-robin, DivM, and DivS).

Table 3: Example of retrieved Wikipedia pages from the four different methods tested in this paper. Results of diverse merging (DivM) appear to cover more topics relevant to the conversation fragment than other methods. The average ranking (DivM > Round-robin > SimM > DivS) is also observed in this example.

Table 1: Comparative scores of the recommended document lists from four methods: DivS, SimM, Round-robin, and DivM, evaluated by human judges over the ELEA Corpus. Subset A gathers fragments with fewer than or exactly five topics, while subset B gathers all the other fragments. The results imply the following ranking: DivM > Round-robin > SimM > DivS.

Table 2: Example of implicit queries built from the keyword list extracted from a sample fragment of the ELEA Corpus. Each query covers one of the main topics of the fragment and has a different weight.

Abstract
1 Introduction
2 Related Work
3 Framework of our Document Recommender System
4 Diverse Merging of the Results of Multiple Queries
- 4.1 Document and Query Representation
- 4.2 Diverse Merging Problem
- 4.3 Defining a Diverse Reward Function
- 4.4 Finding the Optimal Document List
5 Data, Settings and Evaluation Method
- 5.1 Conversational Corpus
- 5.2 Parameter Settings for Experimentation
- 5.3 Evaluation Protocol and Metrics
6 Experimental Results
- 6.1 Diverse Re-ranking vs. Similarity Merging
- 6.2 Comparison across Merging Techniques
- 6.3 Impact of the Topical Diversity of Fragments
- 6.4 Example of Document Results
7 Conclusion
Acknowledgments
References

References

Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, and Samuel Ieong. 2009. Diversifying search results. In Proceedings of the Second ACM International Conference on Web Search and Data Mining, pages 5–14. ACM.
Jagdev Bhogal, Andy Macfarlane, and Peter Smith. 2007. A review of ontology based query expansion. Information Processing and Management, 43:866–886.
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022.
Jonathan Boyd-Graber, Jordan Chang, Sean Gerrish, Chong Wang, and David Blei. 2009. Reading tea leaves: How humans interpret topic models. In Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS), pages 1–9.
Jay Budzik and Kristian J. Hammond. 2000. User interactions with everyday applications as context for just-in-time information access. In Proceedings of the 5th International Conference on Intelligent User Interfaces (IUI), pages 44–51. ACM.
Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 335–336. ACM.
Claudio Carpineto and Giovanni Romano. 2012. A survey of automatic query expansion in information retrieval. ACM Computing Surveys (CSUR), 44(1):1–56.
Ben Carterette and Praveen Chandar. 2009. Probabilistic models of ranking novel documents for faceted topic retrieval. In Proceedings of the 18th ACM conference on Information and knowledge management, pages 1287–1296.
Philip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiat, Danil Korchagin, Mike Lincoln, Vincent Wan, and Le Zhang. 2009. Real-time ASR from meetings. In Proceedings of Interspeech 2009 (10th Annual Conference of the International Speech Communication Association), pages 2119–2122.
Maryam Habibi and Andrei Popescu-Belis. 2012. Using crowdsourcing to compare document recommendation strategies for conversations. In Workshop on Recommendation Utility Evaluation: Beyond RMSE (RUE 2011), pages 15–20.
Maryam Habibi and Andrei Popescu-Belis. 2013. Diverse keyword extraction from conversations. In Proceedings of the ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), pages 651–657.
Maryam Habibi and Andrei Popescu-Belis. submitted. Keyword extraction and clustering for document recommendation in conversations. Manuscript submitted for publication.
Peter E. Hart and Jamey Graham. 1997. Query-free information retrieval. International Journal of Intelligent Systems Technologies and Applications, 12(5):32–37.
Matthew D. Hoffman, David M. Blei, and Francis Bach. 2010. Online learning for Latent Dirichlet Allocation. Proceedings of 24th Annual Conference on Neural Information Processing Systems, 23:856–864.
Gareth J.F. Jones and Peter J. Brown. 2004. Context-aware retrieval for ubiquitous computing environments. In Mobile and ubiquitous information access, pages 227–243. Springer.
Jingxuan Li, Lei Li, and Tao Li. 2012. Multi-document summarization via submodularity. Applied Intelligence, 37(3):420–430.
Hui Lin and Jeff Bilmes. 2011. A class of submodular functions for document summarization. In Proceedings of the ACL 2011 (49th Annual Meeting of the Association for Computational Linguistics), pages 510–520.
Andrew K. McCallum. 2002. MALLET: A machine learning for language toolkit. http://mallet.cs.umass.edu.
Mehryar Mohri, Pedro Moreno, and Eugene Weinstein. 2010. Discriminative topic segmentation of text and speech. In International Conference on Artificial Intelligence and Statistics, pages 533–540.
George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. 1978. An analysis of approximations for maximizing submodular set functions. Mathematical Programming Journal, 14(1):265–294.
Andrei Popescu-Belis, Erik Boertjes, Jonathan Kilgour, Peter Poller, Sandro Castronovo, Theresa Wilson, Alejandro Jaimes, and Jean Carletta. 2008. The AMIDA Automatic Content Linking Device: Just-in-time document retrieval in meetings. In Proceedings of MLMI 2008 (Machine Learning for Multimodal Interaction), LNCS 5237, pages 272–283.
Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen, and Philip N. Garner. 2011. A speech-based just-in-time retrieval system using semantic search. In Proceedings of 49th Annual Meeting of the ACL, pages 80–85.
Filip Radlinski and Susan Dumais. 2006. Improving personalized web search using result diversification. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 691–692. ACM.
Bradley J. Rhodes and Pattie Maes. 2000. Just-in-time information retrieval agents. IBM Systems Journal, 39(3.4):685–704.
Stephen E. Robertson. 1997. The probability ranking principle in IR. In Karen Sparck Jones and Peter Willett, editors, Readings in information retrieval, pages 281–286. Morgan Kaufmann Publishers Inc.
Dairazalia Sanchez-Cortes, Oya Aran, Marianne Schmid Mast, and Daniel Gatica-Perez. 2012. A nonverbal behavior approach to identify emergent leaders in small groups. IEEE Trans. on Multimedia, 14(3):816–832.
Rodrygo L.T. Santos, Craig Macdonald, and Iadh Ounis. 2010. Exploiting query reformulations for web search result diversification. In Proceedings of the 19th Int. Conf. on the World Wide Web, pages 881–890. ACM.
Saúl Vargas, Pablo Castells, and David Vallet. 2012. Explicit relevance models in intent-oriented information retrieval diversification. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pages 75–84. ACM.
Jun Wang and Jianhan Zhu. 2009. Portfolio theory of information retrieval. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 115–122. ACM.
Shengli Wu and Sally McClean. 2007. Result merging methods in distributed information retrieval with overlapping databases. Information Retrieval, 10(3):297–319.
Cheng Xiang Zhai, William W. Cohen, and John Lafferty. 2003. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 10–17. ACM.

Appendix A. Transcript of a Conversation Fragment from the ELEA Corpus

The following transcript of a conversation fragment (speakers noted A through C) was submitted to the document recommender system and is exemplified in Section 6.4. The corresponding implicit queries and recommendations are respectively shown in Tables 2 and 3.

A: okay I start.
B: how how do you want to proceed?
A: I guess --
C: yes what is the most important?
A: I guess fire light.
B: fire lighter?
A: fire, yes. I would say if we had something we can fire with -- I guess that the lighter is useful in getting some sparks.
B: hopefully.
A: so we can use either newspaper or -- something like that.
C: but again - first it is more important to have enough err clothes.
A: and for me, more important to know where to go. I would say that the compass.
C: I mean -- if you don’t have enough clothes so -- at one point you can ---
B: you can die.
C: yes you can -- you will die. so first issue, try to keep yourself alive and then you can ---
A: but -- but you already have some ---
B: basics. you everything. you have enormous which is and so is no shoes here.
C: okay that we have shoes so -- okay.
B: because seventy kilometers will take you how many days? err in the snow -- what do you think?
A: two or three.
B: it can be two or three days?
C: yes, but okay you cannot always have fire with you -- but you need always have clothes with you. I mean it is the only thing that protects you when you are walking.
B: oh yes. and erm you can make an igloo during the evening. not that cold. only about five degrees. so lighting a fire is not so important.
C: I guess fire is an extra. I mean it is important but err for me first it is important that when you keep walking you should be protected.
