DOI: 10.1145/2783258.2783264
research-article

Diversifying Restricted Boltzmann Machine for Document Modeling

Published: 10 August 2015
Abstract

    The Restricted Boltzmann Machine (RBM) has shown great effectiveness in document modeling. It uses hidden units to discover latent topics and can learn compact semantic representations of documents, which greatly facilitates document retrieval, clustering, and classification. The popularity (or frequency) of topics in text corpora usually follows a power-law distribution: a few dominant topics occur very frequently, while most topics (those in the long-tail region) have low probabilities. Because of this imbalance, RBM tends to learn multiple redundant hidden units that best represent the dominant topics while ignoring those in the long tail, which renders the learned representations redundant and uninformative. To solve this problem, we propose the Diversified RBM (DRBM), which diversifies the hidden units so that they cover not only the dominant topics but also those in the long-tail region. We define a diversity metric and use it as a regularizer to encourage the hidden units to be diverse. Since the diversity metric is hard to optimize directly, we instead optimize its lower bound and prove that maximizing the lower bound with projected gradient ascent increases the diversity metric. Experiments on document retrieval and clustering demonstrate that, with diversification, the document modeling power of DRBM is greatly improved.
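    The abstract does not spell out the paper's diversity metric, but the redundancy it penalizes can be illustrated with a common proxy: the mean pairwise cosine similarity among the hidden units' weight vectors (column vectors of the RBM weight matrix). A sketch, under that assumption — redundant units score near 1, diverse units near 0:

    ```python
    import numpy as np

    def diversity_penalty(W):
        """Mean pairwise cosine similarity among hidden units' weight vectors.

        W: (n_visible, n_hidden) RBM weight matrix; column j is hidden unit j.
        Lower values indicate more diverse (less redundant) hidden units.
        """
        # Normalize each hidden unit's weight vector to unit length.
        norms = np.linalg.norm(W, axis=0, keepdims=True)
        Wn = W / np.maximum(norms, 1e-12)
        # Gram matrix of cosine similarities between hidden units.
        G = Wn.T @ Wn
        h = W.shape[1]
        # Average the off-diagonal entries (exclude each unit's self-similarity).
        return (G.sum() - np.trace(G)) / (h * (h - 1))

    rng = np.random.default_rng(0)
    W_random = rng.standard_normal((100, 16))                       # roughly diverse units
    W_redundant = np.tile(rng.standard_normal((100, 1)), (1, 16))   # all units identical

    print(diversity_penalty(W_random))     # near 0
    print(diversity_penalty(W_redundant))  # 1.0 (maximally redundant)
    ```

    A quantity like this could be subtracted from the RBM training objective (weighted by a regularization coefficient) to discourage hidden units from collapsing onto the same dominant topics; the paper's actual regularizer and its optimized lower bound are defined in the full text.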

    Supplementary Material

    MP4 File (p1315.mp4)





      Published In

      KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2015
      2378 pages
      ISBN:9781450336642
      DOI:10.1145/2783258
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. diversified restricted boltzmann machine
      2. diversity
      3. document modeling
      4. power-law distribution
      5. topic modeling

      Qualifiers

      • Research-article

      Conference

      KDD '15

      Acceptance Rates

      KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



      Cited By

      • (2024) The importance of nodal plane orientation diversity for earthquake focal mechanism stress inversions. Geological Society, London, Special Publications, 546(1). DOI: 10.1144/SP546-2023-63. Online publication date: 5-Jan-2024.
      • (2024) Automatic approach for breast cancer detection based on deep belief network using histopathology images. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18949-8. Online publication date: 25-Mar-2024.
      • (2022) Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts. Sensors, 22(3), 852. DOI: 10.3390/s22030852. Online publication date: 23-Jan-2022.
      • (2022) Darwin's Theory of Censorship. Proceedings of the 21st Workshop on Privacy in the Electronic Society, 103-108. DOI: 10.1145/3559613.3563206. Online publication date: 7-Nov-2022.
      • (2022) Multi Task Mutual Learning for Joint Sentiment Classification and Topic Detection. IEEE Transactions on Knowledge and Data Engineering, 34(4), 1915-1927. DOI: 10.1109/TKDE.2020.2999489. Online publication date: 1-Apr-2022.
      • (2022) Spherical Zero-Shot Learning. IEEE Transactions on Circuits and Systems for Video Technology, 32(2), 634-645. DOI: 10.1109/TCSVT.2021.3067067. Online publication date: Feb-2022.
      • (2022) Eccentric regularization: minimizing hyperspherical energy without explicit projection. 2022 International Joint Conference on Neural Networks (IJCNN), 1-9. DOI: 10.1109/IJCNN55064.2022.9892944. Online publication date: 18-Jul-2022.
      • (2021) A Lightweight Knowledge Graph Embedding Framework for Efficient Inference and Storage. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 1909-1918. DOI: 10.1145/3459637.3482224. Online publication date: 26-Oct-2021.
      • (2021) A Survey of Automatic Text Summarization: Progress, Process and Challenges. IEEE Access, 9, 156043-156070. DOI: 10.1109/ACCESS.2021.3129786. Online publication date: 2021.
      • (2021) A Topic Coverage Approach to Evaluation of Topic Models. IEEE Access, 9, 123280-123312. DOI: 10.1109/ACCESS.2021.3109425. Online publication date: 2021.
