DOI: 10.1109/ICDM.2008.140
Article

On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking

Published: 15 December 2008

Abstract

This paper presents the Online Topic Model (OLDA), a topic model that automatically captures the thematic patterns of text streams, identifies emerging topics, and tracks their changes over time. Our approach allows the topic modeling framework, specifically the Latent Dirichlet Allocation (LDA) model, to work in an online fashion: it incrementally builds an up-to-date model (a mixture of topics per document and a mixture of words per topic) whenever a new document, or a set of documents, arrives. We propose a solution based on the Empirical Bayes method. The idea is to incrementally update the current model using the information inferred from the new stream of data, with no need to access previous data. The dynamics of the proposed approach also provide an efficient means to track topics over time and to detect emerging topics in real time. Our method is evaluated both qualitatively and quantitatively on benchmark datasets. In our experiments, OLDA discovered interesting patterns by analyzing only a fraction of the data at a time. Our tests also demonstrate the ability of OLDA to align topics across epochs, which is how the evolution of the topics over time is captured. OLDA is also comparable to, and sometimes better than, the original LDA in predicting the likelihood of unseen documents.
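The incremental update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything specific here is an assumption: the collapsed Gibbs sampler, the function names `gibbs_lda` and `online_lda`, and the rule that sets the next topic-word prior to `base + weight * phi` (where `phi` is the topic-word distribution estimated from the current batch) are hypothetical choices, not the authors' method. The sketch only shows the general pattern: each batch is fitted on its own, and the previous batch influences the next one solely through the prior.

```python
import random


def gibbs_lda(docs, vocab_size, num_topics, beta, alpha=0.1, iters=30, seed=0):
    """Collapsed Gibbs sampling for LDA on a single batch of documents.

    docs is a list of documents, each a list of integer word ids.
    beta[k][w] is the Dirichlet prior weight of word w in topic k
    (it may be non-uniform). Returns topic-word counts for this batch only.
    """
    rng = random.Random(seed)
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                                # tokens per topic
    n_dk = [[0] * num_topics for _ in docs]               # doc-topic counts
    beta_sum = [sum(row) for row in beta]

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)
            z[d].append(k)
            n_kw[k][w] += 1
            n_k[k] += 1
            n_dk[d][k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic.
                k = z[d][i]
                n_kw[k][w] -= 1
                n_k[k] -= 1
                n_dk[d][k] -= 1
                weights = [
                    (n_dk[d][j] + alpha)
                    * (n_kw[j][w] + beta[j][w])
                    / (n_k[j] + beta_sum[j])
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_kw[k][w] += 1
                n_k[k] += 1
                n_dk[d][k] += 1
    return n_kw


def online_lda(stream, vocab_size, num_topics, base=0.01, weight=1.0):
    """Process batches in order; only the current batch is ever in memory.

    ASSUMPTION (not from the paper): the prior for the next batch is
    base + weight * phi, where phi is the current topic-word estimate.
    """
    beta = [[base] * vocab_size for _ in range(num_topics)]
    history = []
    for t, batch in enumerate(stream):
        n_kw = gibbs_lda(batch, vocab_size, num_topics, beta, seed=t)
        phi = []
        for k in range(num_topics):
            denom = sum(n_kw[k]) + sum(beta[k])
            phi.append([(n_kw[k][w] + beta[k][w]) / denom
                        for w in range(vocab_size)])
        history.append(phi)
        # Topic k of the next batch is seeded from topic k of this batch,
        # which keeps topic indices aligned across batches.
        beta = [[base + weight * p for p in row] for row in phi]
    return history, beta


# Toy stream: two batches over a 4-word vocabulary.
stream = [
    [[0, 0, 1, 1, 0], [2, 3, 3, 2, 2]],
    [[0, 1, 1, 0], [3, 3, 2, 3]],
]
history, final_prior = online_lda(stream, vocab_size=4, num_topics=2)
```

In this sketch `history[t][k]` is the estimated word distribution of topic `k` after batch `t`, so comparing `history[t][k]` with `history[t + 1][k]` is one simple way to inspect how a topic drifts between batches.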

Published In

ICDM '08: Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
December 2008
1145 pages
ISBN: 9780769535029

Publisher

IEEE Computer Society

United States

Cited By

  • (2023) Topic modeling methods for short texts. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 45:2 (1971-1990). DOI: 10.3233/JIFS-223834. Online publication date: 1-Jan-2023
  • (2023) Dirichlet-Survival Process: Scalable Inference of Topic-Dependent Diffusion Networks. Advances in Information Retrieval (562-570). DOI: 10.1007/978-3-031-28238-6_47. Online publication date: 2-Apr-2023
  • (2022) SIDEWAYS-2022 @ HT-2022: 7th International Workshop on Social Media World Sensors. Proceedings of the 7th International Workshop on Social Media World Sensors (1-4). DOI: 10.1145/3544795.3544844. Online publication date: 28-Jun-2022
  • (2022) SIDEWAYS-2022 @ HT-2022: 7th International Workshop on Social Media World Sensors. Proceedings of the 33rd ACM Conference on Hypertext and Social Media (265-268). DOI: 10.1145/3511095.3532573. Online publication date: 28-Jun-2022
  • (2021) Network Public Opinion Detection During the Coronavirus Pandemic: A Short-Text Relational Topic Model. ACM Transactions on Knowledge Discovery from Data 16:3 (1-27). DOI: 10.1145/3480246. Online publication date: 22-Oct-2021
  • (2021) Topic Modeling Using Latent Dirichlet allocation. ACM Computing Surveys 54:7 (1-35). DOI: 10.1145/3462478. Online publication date: 17-Sep-2021
  • (2019) Neural Variational Correlated Topic Modeling. The World Wide Web Conference (1142-1152). DOI: 10.1145/3308558.3313561. Online publication date: 13-May-2019
  • (2019) Understanding the mechanism of social tie in the propagation process of social network with communication channel. Frontiers of Computer Science: Selected Publications from Chinese Universities 13:6 (1296-1308). DOI: 10.1007/s11704-018-7453-x. Online publication date: 1-Dec-2019
  • (2019) Frontier knowledge discovery and visualization in cancer field based on KOS and LDA. Scientometrics 118:3 (979-1010). DOI: 10.1007/s11192-018-2989-y. Online publication date: 1-Mar-2019
  • (2019) Latent Dirichlet allocation (LDA) and topic modeling. Multimedia Tools and Applications 78:11 (15169-15211). DOI: 10.1007/s11042-018-6894-4. Online publication date: 1-Jun-2019
