Condition Aware and Revise Transformer for Question Answering

Authors:

Huanhuan Chen

WWW '20: Proceedings of The Web Conference 2020

Pages 2377 - 2387

https://doi.org/10.1145/3366423.3380301

Published: 20 April 2020

Abstract

The study of question answering has received increasing attention in recent years. This work focuses on providing an answer that is compatible with both the user's intent and the conditioning information corresponding to the question, such as delivery status and stock information in e-commerce. However, these conditions may be wrong or incomplete in real-world applications. Although existing question answering systems have considered external information, such as categorical attributes and triples in a knowledge base, they all assume that the external information is correct and complete. To alleviate the effect of defective condition values, this paper proposes the condition aware and revise Transformer (CAR-Transformer). CAR-Transformer (1) revises each condition value based on the whole conversation and the original condition values, and (2) encodes the revised conditions and uses the condition embeddings to select an answer. Experimental results on a real-world customer service dataset demonstrate that CAR-Transformer can still select an appropriate reply when the conditions corresponding to the question contain wrong or missing values, and that it substantially outperforms baseline models on automatic and human evaluations. The proposed CAR-Transformer can be extended to other NLP tasks that need to consider conditioning information.
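The two-step idea in the abstract (first revise possibly wrong or missing condition values using the conversation, then select an answer consistent with the revised conditions) can be illustrated with a small toy sketch. This is not the CAR-Transformer itself: it swaps the learned Transformer components for simple phrase counting, and the names `CUES`, `revise_conditions`, `select_answer`, the `delivery_status` condition, and the `prior_weight` parameter are all hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical cue phrases linking conversation text to condition values.
# A learned model would replace this hand-written table.
CUES: Dict[str, Dict[str, List[str]]] = {
    "delivery_status": {
        "shipped": ["tracking number", "out for delivery"],
        "not_shipped": ["not shipped", "still waiting"],
    },
}


def revise_conditions(
    conversation: str,
    conditions: Dict[str, Optional[str]],
    prior_weight: float = 1.0,
) -> Dict[str, Optional[str]]:
    """Step 1 (toy): revise each condition value from the conversation
    and the originally recorded value (None means the value is missing)."""
    text = conversation.lower()
    revised: Dict[str, Optional[str]] = {}
    for name, original in conditions.items():
        scores: Dict[str, float] = {}
        for value, cues in CUES.get(name, {}).items():
            evidence = sum(text.count(cue) for cue in cues)
            # The recorded value acts as a weak prior that evidence can override.
            prior = prior_weight if value == original else 0.0
            scores[value] = evidence + prior
        if scores and max(scores.values()) > 0:
            revised[name] = max(scores, key=scores.get)
        else:
            revised[name] = original  # nothing to go on: keep what was recorded
    return revised


def select_answer(
    conditions: Dict[str, Optional[str]],
    candidates: List[dict],
) -> str:
    """Step 2 (toy): pick the candidate whose required conditions best match."""
    def score(candidate: dict) -> int:
        required = candidate["requires"]
        return sum(1 if conditions.get(k) == v else -1 for k, v in required.items())

    return max(candidates, key=score)["text"]


if __name__ == "__main__":
    recorded = {"delivery_status": "shipped"}  # assume this record is wrong
    talk = "I am still waiting. The order has not shipped yet."
    candidates = [
        {"text": "Your parcel is on its way.",
         "requires": {"delivery_status": "shipped"}},
        {"text": "Your order has not left the warehouse yet.",
         "requires": {"delivery_status": "not_shipped"}},
    ]
    revised = revise_conditions(talk, recorded)
    print(revised)
    print(select_answer(revised, candidates))
```

In this sketch the recorded value only contributes a prior, so strong conversational evidence can overturn a wrong record, while a missing value is filled in only when the conversation gives some evidence for it.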

Index Terms

Condition Aware and Revise Transformer for Question Answering
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Question answering

Index terms have been assigned to the content through auto-classification.

Information

Published In

WWW '20: Proceedings of The Web Conference 2020

April 2020

3143 pages

ISBN:9781450370233

DOI:10.1145/3366423

Editors: Yennun Huang (Academia Sinica, Taiwan), Irwin King (The Chinese University of Hong Kong, Hong Kong), Tie-Yan Liu (Microsoft Research Asia, China), Maarten van Steen (University of Twente, Netherlands)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '20

Sponsor:

SIGWEB

WWW '20: The Web Conference 2020

April 20 - 24, 2020

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total citations: 18
Total downloads: 484
Downloads (last 12 months): 30
Downloads (last 6 weeks): 1

Reflects downloads up to 18 Aug 2024
Cited By

Lyu S, Sun L, Yi H, Liu Y, Chen H, Miao C. (2024). Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition. International Journal of Crowd Science 8:3 (140-148). Online publication date: Aug-2024.
https://doi.org/10.26599/IJCS.2023.9100014
Ban T, Wang X, Chen L, Wu X, Chen Q, Chen H. (2024). Quality Evaluation of Triples in Knowledge Graph by Incorporating Internal With External Consistency. IEEE Transactions on Neural Networks and Learning Systems 35:2 (1980-1992). Online publication date: Feb-2024.
https://doi.org/10.1109/TNNLS.2022.3186033
Lyu S, Zhou X, Wu X, Chen Q, Chen H. (2024). Self-Attention Over Tree for Relation Extraction With Data-Efficiency and Computational Efficiency. IEEE Transactions on Emerging Topics in Computational Intelligence 8:2 (1253-1263). Online publication date: Apr-2024.
https://doi.org/10.1109/TETCI.2023.3286268
Andreasen T, Bordogna G, Tré G, Kacprzyk J, Larsen H, Zadrożny S. (2024). The power and potentials of Flexible Query Answering Systems. Data & Knowledge Engineering 149:C. Online publication date: 1-Jan-2024.
https://dl.acm.org/doi/10.1016/j.datak.2023.102246
Zhao X, Chen L, Chen H. (2023). A Weighted Heterogeneous Graph-Based Dialog System. IEEE Transactions on Neural Networks and Learning Systems 34:8 (5212-5217). Online publication date: Aug-2023.
https://doi.org/10.1109/TNNLS.2021.3124640
Zhao X, Feng X, Chen H. (2023). A Background Knowledge Revising and Incorporating Dialogue Model. IEEE Transactions on Neural Networks and Learning Systems 34:8 (3874-3884). Online publication date: Aug-2023.
https://doi.org/10.1109/TNNLS.2021.3123128
Zhao X, Chen H, Xing Z, Miao C. (2023). Brain-Inspired Search Engine Assistant Based on Knowledge Graph. IEEE Transactions on Neural Networks and Learning Systems 34:8 (4386-4400). Online publication date: Aug-2023.
https://doi.org/10.1109/TNNLS.2021.3113026
Liu Q, Li X. (2023). 6Former: Transformer-Based IPv6 Address Generation. 2023 IEEE Symposium on Computers and Communications (ISCC) (1142-1148). Online publication date: 9-Jul-2023.
https://doi.org/10.1109/ISCC58397.2023.10218311
Wang X, Lyu S, Wang X, Wu X, Chen H. (2023). Temporal knowledge graph embedding via sparse transfer matrix. Information Sciences: an International Journal 623:C (56-69). Online publication date: 1-Apr-2023.
https://dl.acm.org/doi/10.1016/j.ins.2022.12.019
Thabet B, Zanichelli N, Zanichelli F. (2023). Q&A Generation for Flashcards Within a Transformer-Based Framework. Higher Education Learning Methodologies and Technologies Online (789-806). Online publication date: 1-May-2023.
https://doi.org/10.1007/978-3-031-29800-4_59
