DOI: 10.1145/2983323.2983683
research-article

Digesting Multilingual Reader Comments via Latent Discussion Topics with Commonality and Specificity

Published: 24 October 2016

Abstract

Many news websites from different regions of the world allow readers to comment on an event in their own languages. Digesting such an enormous number of comments in different languages is difficult. One elegant way to digest and organize these comments is to detect latent discussion topics while taking language attributes into account. Some discussion topics are common and shared between languages, whereas others are dominated by a particular language. To discover discussion topics that exhibit commonality or specificity in news reader comments written in different languages, we propose a new model called TDCS, based on graphical models, which can cope with the language gap and detect language-common and language-specific latent discussion topics simultaneously. Our TDCS model also exploits comment-oriented clues via a scalable Dirichlet Multinomial Regression method. To learn the model parameters, we develop an inference method that alternates between EM and Gibbs sampling. Experimental results show that our proposed TDCS model provides an effective way to digest multilingual news reader comments.
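
The abstract names Dirichlet Multinomial Regression (DMR) as the way comment-oriented clues enter the model, but gives no detail. As a rough illustration of the general DMR idea only, and not of the TDCS model itself, the sketch below computes a per-topic Dirichlet prior from a feature vector as alpha_k = exp(x · lambda_k), following the formulation of Mimno and McCallum. The function names, the two-feature example, and the weight values are all hypothetical.

```python
import math


def dmr_prior(features, weights):
    """Return per-topic Dirichlet parameters alpha_k = exp(x . lambda_k).

    features: the feature vector x of one comment (hypothetical features).
    weights:  one weight vector lambda_k per topic, same length as x.
    """
    alphas = []
    for topic_weights in weights:
        if len(topic_weights) != len(features):
            raise ValueError("feature and weight vectors must have equal length")
        score = sum(x * w for x, w in zip(features, topic_weights))
        alphas.append(math.exp(score))
    return alphas


def expected_topic_proportions(alphas):
    """Mean of a Dirichlet distribution with the given parameters."""
    total = sum(alphas)
    return [a / total for a in alphas]


# Hypothetical setup: a bias feature plus one indicator feature, two topics.
x = [1.0, 1.0]
lam = [
    [0.0, 0.0],            # topic 0 ignores both features
    [0.0, math.log(3.0)],  # topic 1 is favoured when the indicator is on
]
alphas = dmr_prior(x, lam)
props = expected_topic_proportions(alphas)
```

In this toy setup the indicator feature makes the prior weight for the second topic three times that of the first, so a comment carrying that feature is expected a priori to devote more of its content to the second topic. In a full DMR model the weight vectors are learned rather than fixed by hand.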

        Published In

        CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
        October 2016
        2566 pages
        ISBN:9781450340731
        DOI:10.1145/2983323
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. commonality and specificity
        2. latent discussion topics
        3. multilingual news reader comments

        Qualifiers

        • Research-article

        Conference

        CIKM'16: ACM Conference on Information and Knowledge Management
        October 24 - 28, 2016
        Indianapolis, Indiana, USA

        Acceptance Rates

        CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
        Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

        Article Metrics

        • Total Citations: 0
        • Total Downloads: 108
        • Downloads (Last 12 months): 1
        • Downloads (Last 6 weeks): 0
        Reflects downloads up to 25 Jan 2025
