DOI: 10.5555/3023476.3023525

Topic models conditioned on arbitrary features with Dirichlet-multinomial regression

Published: 09 July 2008

Abstract

Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
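The log-linear prior described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the Dirichlet parameter for document d and topic k is exp(x_d · λ_k), a formula the abstract itself does not state. The function name `dmr_alpha`, the feature layout, and the randomly drawn weights are all hypothetical.

```python
import numpy as np

def dmr_alpha(features, lambdas):
    """Map document features to Dirichlet parameters with a log-linear link.

    features: (D, F) array, one row of observed features per document.
    lambdas:  (K, F) array, one weight vector per topic (assumed form).
    Returns a (D, K) array with alpha[d, k] = exp(features[d] @ lambdas[k]).
    """
    return np.exp(features @ lambdas.T)

rng = np.random.default_rng(0)

# Hypothetical features for 3 documents: [bias, author indicator, scaled year].
features = np.array([
    [1.0, 1.0, 0.2],
    [1.0, 0.0, 0.9],
    [1.0, 1.0, 0.9],
])

# Illustrative weights for 4 topics; in practice these would be learned.
lambdas = rng.normal(0.0, 1.0, size=(4, 3))

# Each document gets its own Dirichlet prior over topics.
alpha = dmr_alpha(features, lambdas)

# Draw one document-topic distribution per document from its prior.
theta = np.vstack([rng.dirichlet(a) for a in alpha])
```

Because the exponential keeps every parameter positive, any real-valued feature (indicator, count, or date) can enter the prior without further constraints, which is what lets a single model condition on author, venue, references, or dates.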


    Published In

    UAI'08: Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence
    July 2008
    609 pages
    ISBN:0974903949

    Publisher

    AUAI Press

    Arlington, Virginia, United States

    Qualifiers

    • Article

    Cited By

    • (2023) A Review of Stability in Topic Modeling: Metrics for Assessing and Techniques for Improving Stability. ACM Computing Surveys 56:5 (1-32). DOI: 10.1145/3623269. Online publication date: 27-Nov-2023
    • (2022) Topic Modeling Techniques for Text Mining Over a Large-Scale Scientific and Biomedical Text Corpus. International Journal of Ambient Computing and Intelligence 13:1 (1-18). DOI: 10.4018/IJACI.293137. Online publication date: 29-Apr-2022
    • (2022) Transferable adversarial examples can efficiently fool topic models. Computers and Security 118:C. DOI: 10.1016/j.cose.2022.102749. Online publication date: 1-Jul-2022
    • (2020) Topic Modeling of Short Texts Using Anchor Words. Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics (210-219). DOI: 10.1145/3405962.3405968. Online publication date: 30-Jun-2020
    • (2019) A Survey of Multi-Label Topic Models. ACM SIGKDD Explorations Newsletter 21:2 (61-79). DOI: 10.1145/3373464.3373474. Online publication date: 26-Nov-2019
    • (2019) Modeling Spatio-Temporal App Usage for a Large User Population. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3:1 (1-23). DOI: 10.1145/3314414. Online publication date: 29-Mar-2019
    • (2019) Variational low rank multinomials for collaborative filtering with side-information. Proceedings of the 13th ACM Conference on Recommender Systems (340-347). DOI: 10.1145/3298689.3347036. Online publication date: 10-Sep-2019
    • (2019) Homogeneity-Based Transmissive Process to Model True and False News in Social Networks. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (348-356). DOI: 10.1145/3289600.3291009. Online publication date: 30-Jan-2019
    • (2019) Exploiting the value of class labels on high-dimensional feature spaces. Pattern Analysis & Applications 22:2 (299-309). DOI: 10.1007/s10044-017-0629-4. Online publication date: 1-May-2019
    • (2018) MultiCell. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2:3 (1-25). DOI: 10.1145/3264916. Online publication date: 18-Sep-2018
