DOI: 10.1145/2736277.2741643
research-article

Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content

Published: 18 May 2015

Abstract

We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are learned jointly with distributed vector representations of word tokens, using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams: one language model captures the document sequences, and the other captures the word sequences within each document. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model that can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors that represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state of the art by a large margin.
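The two-level structure described in the abstract can be illustrated with a minimal sketch of how training examples might be drawn from a stream. This is not the authors' implementation: the function names, the toy stream, the window sizes, and the choice to attach the document identifier to each word-level example are all assumptions made for illustration, and the actual training objective is not given in this abstract.

```python
from typing import List, Tuple

# Hypothetical toy stream: an ordered sequence of (document id, word tokens).
Stream = List[Tuple[str, List[str]]]


def document_pairs(stream: Stream, window: int) -> List[Tuple[str, str]]:
    """Upper level (assumed): pair each document with its neighbours in the stream."""
    pairs = []
    for m, (doc_id, _) in enumerate(stream):
        lo = max(0, m - window)
        hi = min(len(stream), m + window + 1)
        for j in range(lo, hi):
            if j != m:
                pairs.append((doc_id, stream[j][0]))
    return pairs


def word_examples(stream: Stream, window: int) -> List[Tuple[str, List[str], str]]:
    """Lower level (assumed): each word is a target, with its surrounding
    words and the enclosing document id as context."""
    examples = []
    for doc_id, tokens in stream:
        for t, target in enumerate(tokens):
            context = tokens[max(0, t - window):t] + tokens[t + 1:t + window + 1]
            examples.append((doc_id, context, target))
    return examples


stream: Stream = [
    ("d1", ["cats", "purr"]),
    ("d2", ["dogs", "bark", "loudly"]),
    ("d3", ["birds", "sing"]),
]
doc_pairs = document_pairs(stream, window=1)
word_ex = word_examples(stream, window=1)
```

In a full model, both kinds of examples would feed a shared embedding table so that document and word vectors end up in a common space; the sketch stops at example generation because the abstract does not specify the loss.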


Published In

WWW '15: Proceedings of the 24th International Conference on World Wide Web
May 2015
1460 pages
ISBN: 9781450334693

Sponsors

• IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland


Author Tags

1. distributed representations
2. document embeddings
3. document modeling
4. machine learning
5. word embeddings


Acceptance Rates

WWW '15 Paper Acceptance Rate: 131 of 929 submissions, 14%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
