
Multi-level Contrastive Learning for Commonsense Question Answering

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14120)


Abstract

Recent studies have shown that integrating external knowledge greatly improves the performance of commonsense question answering. However, two problems remain poorly addressed: the semantic representation discrepancy between questions and external knowledge, and the weak discrimination between answer choices. To address these problems, we propose Multi-Level Contrastive Learning (MLCL) for commonsense question answering, which includes instance-level and class-level contrastive learning modules. The instance-level contrastive module aligns questions with the knowledge of the correct choice in semantic space, while the class-level contrastive module makes it easier to distinguish correct choices from wrong ones. The model achieves state-of-the-art results on the CommonsenseQA dataset and outperforms competitive approaches on OpenBookQA. In addition, extensive experiments verify the effectiveness of contrastive learning in multiple-choice commonsense question answering.
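The abstract does not give the paper's loss formulas, but the instance-level idea (pull a question's embedding toward the knowledge of the correct choice, push it away from the knowledge of wrong choices) can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is an assumption for illustration only, not the paper's actual formulation; the embedding vectors, the `temperature` value, and the function names are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def info_nce_loss(question, pos_knowledge, neg_knowledges, temperature=0.1):
    """Generic InfoNCE-style contrastive loss (illustrative, not MLCL's exact loss).

    question:       embedding of the question
    pos_knowledge:  embedding of the correct choice's external knowledge
    neg_knowledges: embeddings of the wrong choices' external knowledge
    The loss is small when the question is most similar to the positive.
    """
    pos = math.exp(cosine(question, pos_knowledge) / temperature)
    neg = sum(math.exp(cosine(question, n) / temperature) for n in neg_knowledges)
    return -math.log(pos / (pos + neg))

# Toy usage: the question embedding is close to the correct choice's knowledge.
q = [1.0, 0.5, -0.2]
correct = [0.9, 0.6, -0.1]
wrong = [[-1.0, 0.2, 0.8], [0.1, -0.9, 0.3]]
aligned_loss = info_nce_loss(q, correct, wrong)
```

Minimizing such a loss drives the kind of alignment the instance-level module targets; the class-level module would instead operate over choice representations grouped by correct/wrong labels, in the spirit of supervised contrastive learning.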



Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 62006243). We thank the reviewers for their helpful comments and suggestions.

Author information

Correspondence to Minghao Hu.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Fang, Q., et al. (2023). Multi-level Contrastive Learning for Commonsense Question Answering. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, A.M., Ma, W. (eds.) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science, vol. 14120. Springer, Cham. https://doi.org/10.1007/978-3-031-40292-0_26


  • DOI: https://doi.org/10.1007/978-3-031-40292-0_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40291-3

  • Online ISBN: 978-3-031-40292-0

  • eBook Packages: Computer Science (R0)
