Article
DOI: 10.1007/978-981-97-5569-1_6

Domain-specific Answer Sentence Selection with Terminology Augmentation and Cascade Attention

Published: 13 December 2024

Abstract

Online consulting services have become a popular and convenient channel for people to seek professional replies. Since the replies are often lengthy, answer highlighting is critical for users to identify the core answers. We therefore study the task of Answer Sentence Selection in specific Domains (ASSD), which is to select the core answer sentences in a reply for highlighting. Although pre-trained language models (PLMs) have made great progress in ASSD, there is still significant untapped potential in deeply understanding domain-specific texts, which are replete with specialized terminology. To this end, we propose a novel Terminology-augmented Cascade Attention (TACA) framework that incorporates domain-specific knowledge into PLMs to achieve better text understanding and thus accomplish the ASSD task more effectively. In this framework, we first design a terminology-augmented multi-channel semantic model to deeply mine the semantics of both questions and answers. Second, a cascade attention mechanism is proposed to fuse the multi-channel semantics and achieve fine-grained semantic matching between questions and answers. Extensive experiments on two datasets show that TACA significantly improves accuracy on the ASSD task.
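
The abstract describes TACA only at a high level; as a concrete reference point, below is a minimal, hypothetical PyTorch sketch of a two-stage ("cascade") attention scorer over terminology-augmented question and answer channels. The module names, shapes, fusion order, and mean pooling are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
# Hypothetical sketch only: the abstract does not specify TACA's architecture,
# so every design choice below (two attention stages, shared terminology
# attention, mean pooling, hidden size) is an assumption for illustration.
import torch
import torch.nn as nn

class CascadeAttentionScorer(nn.Module):
    def __init__(self, hidden: int = 768, heads: int = 8):
        super().__init__()
        # Stage 1: inject the terminology channel into the token channel.
        self.term_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Stage 2: fine-grained question-answer cross attention.
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, q_tok, a_tok, q_term, a_term):
        # q_tok, a_tok: PLM token embeddings, shape (batch, seq_len, hidden)
        # q_term, a_term: terminology-channel embeddings, (batch, term_len, hidden)
        q_aug, _ = self.term_attn(q_tok, q_term, q_term)   # terminology-augmented question
        a_aug, _ = self.term_attn(a_tok, a_term, a_term)   # terminology-augmented answer
        matched, _ = self.cross_attn(q_aug, a_aug, a_aug)  # question attends over answer
        pooled = matched.mean(dim=1)                       # (batch, hidden)
        return self.scorer(pooled).squeeze(-1)             # relevance score per pair


if __name__ == "__main__":
    model = CascadeAttentionScorer()
    q_tok, a_tok = torch.randn(2, 20, 768), torch.randn(2, 40, 768)
    q_term, a_term = torch.randn(2, 5, 768), torch.randn(2, 8, 768)
    print(model(q_tok, a_tok, q_term, a_term).shape)  # torch.Size([2])
```

The ordering here mirrors the abstract's description: the multi-channel (token and terminology) semantics are fused first, and fine-grained question-answer matching is then performed on top of the fused representations.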



Published In

Database Systems for Advanced Applications: 29th International Conference, DASFAA 2024, Gifu, Japan, July 2–5, 2024, Proceedings, Part V
Jul 2024
561 pages
ISBN: 978-981-97-5568-4
DOI: 10.1007/978-981-97-5569-1
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Sihem Amer-Yahia, H. V. Jagadish, Kejing Lu

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Answer Sentence Selection
  2. Domain-Specific
  3. Terminology Knowledge

