
Outline Extraction with Question-Specific Memory Cells

Published: 29 March 2020

Abstract

Outline extraction has been widely applied in online consultation to help experts quickly understand individual cases. Given a specific case described as unstructured plain text, outline extraction aims to summarize the case by answering a set of questions, which is in effect a new type of machine reading comprehension task. Inspired by the recently popular memory network, we propose a novel question-specific memory cell network (QSMCN) that extracts information related to multiple questions on the fly as it reads the text. QSMCN constructs a specific memory cell for each question, which is sequentially expanded in the style of a recurrent neural network. Each cell contains three question-specific vectors that first identify whether the current input is related to the corresponding question and then update the question-specific case representation. We add a penalization term to the loss function to make the extracted knowledge more reasonable and interpretable. To support this study, we construct a new outline extraction corpus, InjuryCase,1 composed of 3,995 real Chinese occupational injury cases. Experimental results show that our method achieves a significant improvement over existing approaches. We further apply the proposed framework to two multi-aspect extraction tasks and find that it also clearly outperforms state-of-the-art aspect extraction methods.
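The abstract outlines the core mechanism: one memory cell per question, a gate that first decides whether the current input token is relevant to that question, a gated update of the question-specific case representation, and a penalization term in the loss. Below is a minimal PyTorch sketch of that idea, assuming an EntNet-style gated update (in the spirit of Henaff et al.'s recurrent entity networks); the class name, the exact gating form, the three linear transforms, and the distinctness penalty are illustrative assumptions, not the paper's published equations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuestionMemoryCell(nn.Module):
    """Hypothetical sketch of a question-specific memory update."""

    def __init__(self, dim):
        super().__init__()
        # Three question-specific transforms, echoing the abstract's
        # "three specific vectors" per cell (exact roles are assumptions).
        self.U = nn.Linear(dim, dim, bias=False)  # transforms the old memory
        self.V = nn.Linear(dim, dim, bias=False)  # transforms the question
        self.W = nn.Linear(dim, dim, bias=False)  # transforms the input token

    def forward(self, x_t, q, m):
        # Gate: is the current input x_t related to question q, or to
        # what that question's memory m already holds?
        g = torch.sigmoid((x_t * q).sum(-1) + (x_t * m).sum(-1))
        # Candidate update of the question-specific case representation.
        m_cand = torch.tanh(self.U(m) + self.V(q) + self.W(x_t))
        # Gated write, then renormalize so the memory stays bounded.
        return F.normalize(m + g.unsqueeze(-1) * m_cand, dim=-1)


def distinctness_penalty(memories):
    # Hypothetical penalization term: push the per-question memories
    # apart so each cell keeps knowledge specific to its own question
    # (the paper's exact penalty is not reproduced here).
    M = F.normalize(memories, dim=-1)
    eye = torch.eye(M.size(0))
    return ((M @ M.t() - eye) ** 2).sum()


# Toy usage: 4 questions, a 6-token case text, embedding dim 8.
dim, n_q = 8, 4
cell = QuestionMemoryCell(dim)
questions = torch.randn(n_q, dim)   # one embedding per question
memories = questions.clone()        # memories initialized from the questions
tokens = torch.randn(6, dim)        # encoded tokens of the case text
for x_t in tokens:                  # read the case sequentially
    memories = cell(x_t, questions, memories)
loss_penalty = distinctness_penalty(memories)
```

In the full model, each final memory row would presumably feed a per-question answer decoder; the Frobenius-style penalty above is one plausible reading of the abstract's goal of keeping each cell's extracted knowledge specific to its own question.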


Cited By

  • (2021) Quantum Election Protocol Based on Quantum Public Key Cryptosystem. Security and Communication Networks (2021). https://doi.org/10.1155/2021/5551249. Online publication date: 1-Jan-2021.


Published In

ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 4
July 2020
291 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3391538
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 March 2020
Accepted: 01 December 2019
Revised: 01 September 2019
Received: 01 July 2018
Published in TALLIP Volume 19, Issue 4


Author Tags

  1. Outline extraction
  2. aspect extraction
  3. memory cell network

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • MoE-CMCC “Artificial Intelligence”
  • National Natural Science Foundation of China
  • State Grid Corporation of China
  • Project Research on Key Technologies of Knowledge Graph in Power System Fault Management

