research-article

Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content

Authors:

Nemanja Djuric, Vladan Radosavljevic, Mihajlo Grbovic, Narayan Bhamidipati

WWW '15: Proceedings of the 24th International Conference on World Wide Web

Pages 248 - 255

https://doi.org/10.1145/2736277.2741643

Published: 18 May 2015

Abstract

We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are learned jointly with distributed vector representations of word tokens, using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams: one language model captures the document sequences, and the other captures the word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model that can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors to represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state of the art by a large margin.
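
To make the two-level idea in the abstract more concrete, the sketch below enumerates the training pairs that two context-prediction models might consume: one over the sequence of documents in a stream, and one over the words inside each document. This is only an illustration under stated assumptions. The skip-gram-style pairing, the document-to-word pairs used to tie both levels into one vector space, the function names `context_pairs` and `hierarchical_pairs`, the window sizes, and the toy stream are all hypothetical; the abstract does not specify the architecture at this level of detail.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Pair = Tuple[Hashable, Hashable]


def context_pairs(seq: Sequence[Hashable], window: int) -> List[Pair]:
    """Return (center, neighbor) pairs for items within `window` positions."""
    pairs: List[Pair] = []
    for i, center in enumerate(seq):
        lo = max(0, i - window)
        hi = min(len(seq), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, seq[j]))
    return pairs


def hierarchical_pairs(
    stream: Sequence[Tuple[str, Sequence[str]]],
    doc_window: int = 1,
    word_window: int = 1,
) -> Dict[str, List[Pair]]:
    """Build training pairs at two levels from a stream of (doc_id, tokens).

    Assumed (hypothetical) layout, not taken from the paper:
      - "doc_level":  neighboring documents in the stream predict each other
      - "word_level": neighboring words inside a document predict each other
      - "doc_word":   each document is paired with its own words, so that
                      documents and words can share one vector space
    """
    doc_ids = [doc_id for doc_id, _ in stream]
    doc_level = context_pairs(doc_ids, doc_window)

    word_level: List[Pair] = []
    doc_word: List[Pair] = []
    for doc_id, tokens in stream:
        word_level.extend(context_pairs(list(tokens), word_window))
        doc_word.extend((doc_id, token) for token in tokens)

    return {"doc_level": doc_level, "word_level": word_level, "doc_word": doc_word}


if __name__ == "__main__":
    # Toy stream: two documents read one after the other (made-up data).
    toy_stream = [("d1", ["a", "b", "c"]), ("d2", ["x", "y"])]
    for level, pairs in hierarchical_pairs(toy_stream).items():
        print(level, pairs)
```

In a full model, each list of pairs would feed a loss term (for example a softmax over the target given the input vector), and the terms would be summed so that document and word vectors are updated together. That training step is omitted here.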

Index Terms

Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content
1. Applied computing
  1. Document management and text processing
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

WWW '15: Proceedings of the 24th International Conference on World Wide Web

May 2015

1460 pages

ISBN: 9781450334693

General Chairs:
Aldo Gangemi (National Research Council, Italy & Paris 13 University-CNRS, France),
Stefano Leonardi (Sapienza University of Rome, Italy),
Alessandro Panconesi (Sapienza University of Rome, Italy)

Copyright © 2015. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Qualifiers

Research-article

Conference

WWW '15: 24th International World Wide Web Conference

Sponsor: IW3C2

May 18 - 22, 2015

Florence, Italy

Acceptance Rates

WWW '15 Paper Acceptance Rate: 131 of 929 submissions, 14%

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

Article Metrics

Total Citations: 50
Total Downloads: 594
Downloads (last 12 months): 15
Downloads (last 6 weeks): 0

Reflects downloads up to 18 Aug 2024
