
S3-NET: SRU-Based Sentence and Self-Matching Networks for Machine Reading Comprehension

Published: 20 February 2020

Abstract

Machine reading comprehension question answering (MRC-QA) is the task of understanding the context of a given passage and finding the correct answer within it. Because a passage comprises several sentences, the input sequence can become long, which degrades performance. In this article, we propose S3-NET, which adds sentence-level encoding to address this problem. S3-NET is a deep learning model, built on the simple recurrent unit (SRU) architecture, that solves MRC-QA by applying a matching network to sentence-level encodings. In addition, S3-NET uses a self-matching network to compute attention weights over its own recurrent neural network sequence. We evaluate MRC-QA on the English SQuAD dataset and the Korean MindsMRC dataset. On SQuAD, the proposed S3-NET achieves 71.91% and 74.12% exact match and 81.02% and 82.34% F1 with single and ensemble models, respectively; on MindsMRC, it achieves 69.43% and 71.28% exact match and 81.53% and 82.77% F1 with single and ensemble models, respectively.
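The two components named in the abstract, the SRU encoder and the self-matching attention, can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: it assumes unnormalized dot-product attention scores (the paper's matching networks use learned attention parameters), a single SRU layer whose hidden size equals its input size, and randomly initialized weights. All function and variable names here are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_layer(X, W, Wf, bf, Wr, br):
    """One Simple Recurrent Unit layer (Lei & Zhang, 2017) over a sequence.
    X: (T, d). Only the cell state is sequential, so the three matrix
    products can be computed for all timesteps at once."""
    T, d = X.shape
    U = X @ W                            # candidate inputs, all steps at once
    F = sigmoid(X @ Wf + bf)             # forget gates
    R = sigmoid(X @ Wr + br)             # reset (highway) gates
    c = np.zeros(d)
    H = np.empty((T, d))
    for t in range(T):
        c = F[t] * c + (1.0 - F[t]) * U[t]              # gated cell update
        H[t] = R[t] * np.tanh(c) + (1.0 - R[t]) * X[t]  # highway output
    return H

def self_matching(H):
    """Self-matching attention: every position attends over the whole
    sequence (including itself); returns each state concatenated with
    its attended context, shape (T, 2d)."""
    scores = H @ H.T                           # (T, T) dot-product scores
    scores -= scores.max(axis=1, keepdims=True)
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)          # row-wise softmax
    C = A @ H                                  # context vectors
    return np.concatenate([H, C], axis=1)

# Toy usage with random weights
rng = np.random.default_rng(0)
T, d = 5, 4
X = rng.standard_normal((T, d))
W, Wf, Wr = (rng.standard_normal((d, d)) for _ in range(3))
H = sru_layer(X, W, Wf, np.zeros(d), Wr, np.zeros(d))
M = self_matching(H)        # (5, 8): each state paired with its context
```

In a full MRC model of this family, the concatenated representation would typically feed further recurrent layers and a pointer network that predicts the answer span's start and end positions.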




Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 3
May 2020
228 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3378675

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 February 2020
Accepted: 01 September 2019
Revised: 01 July 2019
Received: 01 August 2018
Published in TALLIP Volume 19, Issue 3


Author Tags

  1. Korean MRC-QA
  2. Machine reading comprehension
  3. hierarchical model
  4. question answering
  5. sentence and self-matching network
  6. simple recurrent unit

Qualifiers

  • Short-paper
  • Research
  • Refereed

Funding Sources

  • Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIT): Development of Knowledge Evolutionary WiseQA Platform Technology for Human Knowledge Augmented Services
  • Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIT): Artificial Intelligence Contact Center Solution


Cited By

  • (2023) Design of English pronunciation quality evaluation system based on the deep learning model. Applied Mathematics and Nonlinear Sciences 8, 2, 2805–2816. DOI: 10.2478/amns.2023.1.00460. Online publication date: 26-Jun-2023
  • (2022) Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training. Mathematics 10, 3, 310. DOI: 10.3390/math10030310. Online publication date: 19-Jan-2022
  • (2022) New Vietnamese Corpus for Machine Reading Comprehension of Health News Articles. ACM Transactions on Asian and Low-Resource Language Information Processing 21, 5, 1–28. DOI: 10.1145/3527631. Online publication date: 2-May-2022
  • (2022) VSCA: A Sentence Matching Model Incorporating Visual Perception. Cognitive Computation 15, 1, 323–336. DOI: 10.1007/s12559-022-10074-8. Online publication date: 8-Dec-2022
  • (2021) Deep bi-directional interaction network for sentence matching. Applied Intelligence 51, 7, 4305–4329. DOI: 10.1007/s10489-020-02156-7. Online publication date: 1-Jul-2021
  • (2021) Sprelog: Log-Based Anomaly Detection with Self-matching Networks and Pre-trained Models. Service-Oriented Computing, 736–743. DOI: 10.1007/978-3-030-91431-8_50. Online publication date: 22-Nov-2021
  • (2020) Enhancing Lexical-Based Approach With External Knowledge for Vietnamese Multiple-Choice Machine Reading Comprehension. IEEE Access 8, 201404–201417. DOI: 10.1109/ACCESS.2020.3035701. Online publication date: 2020
