
Outline Extraction with Question-Specific Memory Cells

Published: 29 March 2020

Abstract

Outline extraction has been widely applied in online consultation to help experts quickly understand individual cases. Given a specific case described as unstructured plain text, outline extraction aims to summarize the case by answering a set of questions, which is in effect a new type of machine reading comprehension task. Inspired by the recently popular memory network, we propose a novel question-specific memory cell network (QSMCN) that extracts information related to multiple questions on the fly as it reads the text. QSMCN constructs a specific memory cell for each question, which is sequentially expanded in the style of a recurrent neural network. Each cell contains three question-specific vectors that first identify whether the current input is related to the corresponding question and then update the question-specific case representation. We add a penalization term to the loss function to make the extracted knowledge more reasonable and interpretable. To support this study, we construct a new outline extraction corpus, InjuryCase,1 composed of 3,995 real Chinese occupational injury cases. Experimental results show that our method achieves a significant improvement over existing approaches. We further apply the proposed framework to two multi-aspect extraction tasks and find that it also clearly outperforms state-of-the-art aspect extraction methods.
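The abstract outlines the core mechanism: one memory cell per question, a gate that first decides whether the current input token is relevant to that question, a gated update of the question-specific case representation, and a penalization term in the loss. Below is a minimal PyTorch sketch of that idea, assuming an EntNet-style gated update (in the spirit of Henaff et al.'s recurrent entity networks); the class name, the exact gating form, the three linear transforms, and the distinctness penalty are illustrative assumptions, not the paper's published equations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuestionMemoryCell(nn.Module):
    """Hypothetical sketch of a question-specific memory update."""

    def __init__(self, dim):
        super().__init__()
        # Three question-specific transforms, echoing the abstract's
        # "three specific vectors" per cell (exact roles are assumptions).
        self.U = nn.Linear(dim, dim, bias=False)  # transforms the old memory
        self.V = nn.Linear(dim, dim, bias=False)  # transforms the question
        self.W = nn.Linear(dim, dim, bias=False)  # transforms the input token

    def forward(self, x_t, q, m):
        # Gate: is the current input x_t related to question q, or to
        # what that question's memory m already holds?
        g = torch.sigmoid((x_t * q).sum(-1) + (x_t * m).sum(-1))
        # Candidate update of the question-specific case representation.
        m_cand = torch.tanh(self.U(m) + self.V(q) + self.W(x_t))
        # Gated write, then renormalize so the memory stays bounded.
        return F.normalize(m + g.unsqueeze(-1) * m_cand, dim=-1)


def distinctness_penalty(memories):
    # Hypothetical penalization term: push the per-question memories
    # apart so each cell keeps knowledge specific to its own question
    # (the paper's exact penalty is not reproduced here).
    M = F.normalize(memories, dim=-1)
    eye = torch.eye(M.size(0))
    return ((M @ M.t() - eye) ** 2).sum()


# Toy usage: 4 questions, a 6-token case text, embedding dim 8.
dim, n_q = 8, 4
cell = QuestionMemoryCell(dim)
questions = torch.randn(n_q, dim)   # one embedding per question
memories = questions.clone()        # memories initialized from the questions
tokens = torch.randn(6, dim)        # encoded tokens of the case text
for x_t in tokens:                  # read the case sequentially
    memories = cell(x_t, questions, memories)
loss_penalty = distinctness_penalty(memories)
```

In the full model, each final memory row would presumably feed a per-question answer decoder; the Frobenius-style penalty above is one plausible reading of the abstract's goal of keeping each cell's extracted knowledge specific to its own question.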


Cited By

  • (2021) Quantum Election Protocol Based on Quantum Public Key Cryptosystem. Security and Communication Networks (2021). https://doi.org/10.1155/2021/5551249. Online publication date: 1-Jan-2021.


Published In

ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 4
July 2020
291 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3391538
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 March 2020
Accepted: 01 December 2019
Revised: 01 September 2019
Received: 01 July 2018
Published in TALLIP Volume 19, Issue 4


Author Tags

  1. Outline extraction
  2. aspect extraction
  3. memory cell network

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • MoE-CMCC “Artificial Intelligence”
  • National Natural Science Foundation of China
  • State Grid Corporation of China
  • Project Research on Key Technologies of Knowledge Graph in Power System Fault Management

