DOI: 10.1145/3371158.3371195

A Hybrid Distributed Model for Learning Representation of Short Texts with Attribute Labels

Published: 15 January 2020

Abstract

Short text documents in real-world applications, such as incident tickets, bug tickets, and feedback texts, contain fixed-field entries in the form of attribute instances as well as free-text entries capturing their summaries. We propose an approach based on the Paragraph Vector model of Le and Mikolov to learn fixed-length feature representations from these short texts of varying lengths appended with attribute instances. Our method extends the existing approach by learning representations from the summaries of tickets as well as from their attribute contents captured in fixed-field entries. Further, we show that such representations of short texts yield better performance on several learning tasks compared to other popular representations.
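As the abstract describes it, the key preprocessing idea is to append a ticket's fixed-field attribute instances to its free-text summary before learning a Paragraph Vector representation. A minimal sketch of that combination step is shown below; the function name, field names, and `field=value` token encoding are illustrative assumptions, not details taken from the paper.

```python
def build_document(summary: str, attributes: dict) -> list:
    """Combine a ticket's free-text summary with its fixed-field attributes
    into one token sequence suitable for Paragraph Vector-style training."""
    tokens = summary.lower().split()
    # Encode each attribute instance as a distinguishable pseudo-word so the
    # model can learn from fixed-field entries alongside the free text.
    attr_tokens = [f"{field}={value}".lower()
                   for field, value in sorted(attributes.items())]
    return tokens + attr_tokens

doc = build_document(
    "Server unreachable after patch deployment",
    {"Severity": "High", "Component": "Network"},
)
# doc mixes summary words with attribute pseudo-words, e.g. "severity=high"
```

The resulting token sequence could then be fed, one per ticket, to any Paragraph Vector implementation (e.g. a PV-DM or PV-DBOW trainer) to obtain the fixed-length vectors the paper evaluates.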

References

[1]
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. http://tensorflow.org/ Software available from tensorflow.org.
[2]
Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A Neural Probabilistic Language Model. Journal of Machine Learning Research 3 (2003), 1137--1155.
[3]
Cedric De Boom, Steven Van Canneyt, Thomas Demeester, and Bart Dhoedt. 2016. Representation learning for very short texts using weighted word embedding aggregation. Pattern Recognition Letters 80 (2016), 150--156.
[4]
Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright. 2011. Robust Principal Component Analysis? J. ACM 58, 3 (June 2011), 11:1--11:37.
[5]
Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Brian Strope, and Ray Kurzweil. 2018. Universal Sentence Encoder for English. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP'18): System Demonstrations. Association for Computational Linguistics, 169--174.
[6]
Ronan Collobert and Jason Weston. 2008. A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. In Proceedings of the 25th International Conference on Machine Learning (ICML '08). ACM, 160--167.
[7]
Jeffrey L. Elman. 1990. Finding structure in time. Cognitive Science 14, 2 (1990), 179--211.
[8]
Eric Jones, Travis Oliphant, Pearu Peterson, et al. 2001--. SciPy: Open source scientific tools for Python.
[9]
Ryan Kiros, Richard S. Zemel, and Ruslan Salakhutdinov. 2014. A Multiplicative Model for Learning Distributed Text-Based Attribute Representations. In NIPS'14. 2348--2356.
[10]
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. 2015. Skip-thought Vectors. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'15). MIT Press, Cambridge, MA, USA, 3294--3302.
[11]
Shashi Kumar, Suman Roy, and Vishal Pathak. 2018. A Hybrid Distributed Model for Learning Representation of Short Texts. Technical Report. Optum Global Solutions India Pvt. Ltd. Available on request.
[12]
Ahmed Lamkanfi, Javier Pérez, and Serge Demeyer. 2013. The eclipse and mozilla defect tracking dataset: a genuine dataset for mining bug information. In Proceedings of the 10th Working Conference on Mining Software Repositories, MSR '13. IEEE Computer Society, 203--206.
[13]
Quoc V. Le and Tomas Mikolov. 2014. Distributed Representations of Sentences and Documents. In Proceedings of the 31st International Conference on Machine Learning, ICML'14. 1188--1196.
[14]
Thang Luong, Richard Socher, and Christopher D. Manning. 2013. Better Word Representations with Recursive Neural Networks for Morphology. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, CoNLL 2013, Sofia, Bulgaria, August 8-9, 2013. 104--113.
[15]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In 1st International Conference on Learning Representations, ICLR, Workshop Track Proceedings.
[16]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems 26 (NIPS'13). 3111--3119.
[17]
David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. 1988. Neurocomputing: Foundations of Research. A Bradford Book, Chapter Learning Representations by Back-propagating Errors, 696--699.
[18]
Tian Shi, Kyeongpil Kang, Jaegul Choo, and Chandan K. Reddy. 2018. Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations. In Proceedings of WWW '18. 1105--1114.
[19]
Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. 2011. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, Proceedings. 801--809.
[20]
Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, and Christopher D. Manning. 2011. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. In Proceedings of the 28th International Conference on Machine Learning, ICML'11. 129--136.
[21]
Konrad Zolna. 2017. Improving the performance of neural networks in regression tasks using drawering. In International Joint Conference on Neural Networks, IJCNN'17, Anchorage, AK, USA, May 14-19, 2017. 2533--2538.


Published In

CoDS COMAD 2020: Proceedings of the 7th ACM IKDD CoDS and 25th COMAD
January 2020
399 pages
ISBN:9781450377386
DOI:10.1145/3371158
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CoDS COMAD 2020: 7th ACM IKDD CoDS and 25th COMAD
January 5 - 7, 2020
Hyderabad, India

Acceptance Rates

CoDS COMAD 2020 paper acceptance rate: 78 of 275 submissions, 28%
Overall acceptance rate: 197 of 680 submissions, 29%

Article Metrics

  • 0 total citations
  • 98 total downloads (2 in the last 12 months, 0 in the last 6 weeks)

Reflects downloads up to 04 Oct 2024
