
Probabilistic topic models

Published: 01 April 2012

Abstract

Surveying a suite of algorithms that offer a solution to managing large document archives.

References

[1]
Asuncion, A., Welling, M., Smyth, P., Teh, Y. On smoothing and inference for topic models. In Uncertainty in Artificial Intelligence (2009).
[2]
Bart, E., Welling, M., Perona, P. Unsupervised organization of image collections: Taxonomies and beyond. Trans. Pattern Recognit. Mach. Intell. 33, 11 (2010), 2301--2315.
[3]
Blei, D., Griffiths, T., Jordan, M. The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies. J. ACM 57, 2 (2010), 1--30.
[4]
Blei, D., Jordan, M. Modeling annotated data. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (2003), ACM Press, 127--134.
[5]
Blei, D., Lafferty, J. Dynamic topic models. In International Conference on Machine Learning (2006), ACM, New York, NY, USA, 113--120.
[6]
Blei, D., Lafferty, J. A correlated topic model of Science. Ann. Appl. Stat. 1, 1 (2007), 17--35.
[7]
Blei, D., McAuliffe, J. Supervised topic models. In Neural Information Processing Systems (2007).
[8]
Blei, D., Ng, A., Jordan, M. Latent Dirichlet allocation. J. Mach. Learn. Res. 3 (January 2003), 993--1022.
[9]
Box, G. Sampling and Bayes' inference in scientific modeling and robustness. J. Roy. Stat. Soc. 143, 4 (1980), 383--430.
[10]
Boyd-Graber, J., Blei, D. Syntactic topic models. In Neural Information Processing Systems (2009).
[11]
Buntine, W. Variational extensions to EM and multinomial PCA. In European Conference on Machine Learning (2002).
[12]
Buntine, W., Jakulin, A. Discrete component analysis. Subspace, Latent Structure and Feature Selection. C. Saunders, M. Grobelnik, S. Gunn, and J. Shawe-Taylor, Eds. Springer, 2006.
[13]
Chang, J., Blei, D. Hierarchical relational models for document networks. Ann. Appl. Stat. 4, 1 (2010).
[14]
Deerwester, S., Dumais, S., Landauer, T., Furnas, G., Harshman, R. Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci. 41, 6 (1990), 391--407.
[15]
Doyle, G., Elkan, C. Accounting for burstiness in topic models. In International Conference on Machine Learning (2009), ACM, 281--288.
[16]
Fei-Fei, L., Perona, P. A Bayesian hierarchical model for learning natural scene categories. In IEEE Computer Vision and Pattern Recognition (2005), 524--531.
[17]
Gerrish, S., Blei, D. A language-based approach to measuring scholarly impact. In International Conference on Machine Learning (2010).
[18]
Griffiths, T., Steyvers, M., Blei, D., Tenenbaum, J. Integrating topics and syntax. Advances in Neural Information Processing Systems 17. L. K. Saul, Y. Weiss, and L. Bottou, Eds. MIT Press, Cambridge, MA, 2005, 537--544.
[19]
Grimmer, J. A Bayesian hierarchical topic model for political texts: Measuring expressed agendas in senate press releases. Polit. Anal. 18, 1 (2010), 1.
[20]
Hoffman, M., Blei, D., Bach, F. On-line learning for latent Dirichlet allocation. In Neural Information Processing Systems (2010).
[21]
Hofmann, T. Probabilistic latent semantic analysis. In Uncertainty in Artificial Intelligence (UAI) (1999).
[22]
Jordan, M., Ghahramani, Z., Jaakkola, T., Saul, L. Introduction to variational methods for graphical models. Mach. Learn. 37 (1999), 183--233.
[23]
Li, J., Wang, C., Lim, Y., Blei, D., Fei-Fei, L. Building and using a semantivisual image hierarchy. In Computer Vision and Pattern Recognition (2010).
[24]
Li, W., McCallum, A. Pachinko allocation: DAG-structured mixture models of topic correlations. In International Conference on Machine Learning (2006), 577--584.
[25]
Mimno, D., McCallum, A. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In Uncertainty in Artificial Intelligence (2008).
[26]
Newman, D., Chemudugunta, C., Smyth, P. Statistical entity-topic models. In Knowledge Discovery and Data Mining (2006).
[27]
Pritchard, J., Stephens, M., Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155 (June 2000), 945--959.
[28]
Reisinger, J., Waters, A., Silverthorn, B., Mooney, R. Spherical topic models. In International Conference on Machine Learning (2010).
[29]
Rosen-Zvi, M., Griffiths, T., Steyvers, M., Smyth, P. The author-topic model for authors and documents. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (2004), AUAI Press, 487--494.
[30]
Rubin, D. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann. Stat. 12, 4 (1984), 1151--1172.
[31]
Sivic, J., Russell, B., Zisserman, A., Freeman, W., Efros, A. Unsupervised discovery of visual object class hierarchies. In Conference on Computer Vision and Pattern Recognition (2008).
[32]
Socher, R., Gershman, S., Perotte, A., Sederberg, P., Blei, D., Norman, K. A Bayesian analysis of dynamics in free recall. In Advances in Neural Information Processing Systems 22. Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, Eds. 2009.
[33]
Steyvers, M., Griffiths, T. Probabilistic topic models. Latent Semantic Analysis: A Road to Meaning. T. Landauer, D. McNamara, S. Dennis, and W. Kintsch, Eds. Lawrence Erlbaum, 2006.
[34]
Teh, Y., Jordan, M., Beal, M., Blei, D. Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101, 476 (2006), 1566--1581.
[35]
Wainwright, M., Jordan, M. Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1(1--2) (2008), 1--305.
[36]
Wallach, H. Topic modeling: Beyond bag of words. In Proceedings of the 23rd International Conference on Machine Learning (2006).
[37]
Wang, C., Blei, D. Decoupling sparsity and smoothness in the discrete hierarchical Dirichlet process. Advances in Neural Information Processing Systems 22. Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, Eds. 2009, 1982--1989.
[38]
Wang, C., Thiesson, B., Meek, C., Blei, D. Markov topic models. In Artificial Intelligence and Statistics (2009).

Published In

Communications of the ACM  Volume 55, Issue 4
April 2012
110 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/2133806

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Popular
  • Refereed
