Abstract
Currently available question answering (QA) datasets fall short in two respects: providing comprehensive answers that span several sentences, and posing questions that are deep or analytical in nature. Though each issue has been addressed individually, no existing dataset addresses both. To fill this gap, we introduce Deep QA (DQA), a dataset of 12,816 questions broadly classified into four question types. The generated dataset has been analyzed and compared with a standard QA dataset to show that it demands higher cognitive skills. To support this point further, state-of-the-art models pre-trained on remembering-type factoid QA datasets have been evaluated on the proposed dataset and are shown to perform poorly on the generated question types. Finally, a preliminary investigation using a graph neural network model probes the possibility of an alternative answer-generation technique for such a dataset of deeper questions.
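The paper itself ships no code, but as a rough, hypothetical illustration of the probe the abstract describes (querying a model fine-tuned on a remembering-type factoid dataset such as SQuAD with an analytical question whose reference answer spans several sentences), the Python sketch below uses a public SQuAD checkpoint from Hugging Face. The context, question, and reference answer are invented for illustration and are not drawn from DQA.

```python
# Minimal sketch (not from the paper): ask an extractive model fine-tuned
# on SQuAD, a factoid dataset, a deep "why" question whose reference answer
# spans several sentences. The checkpoint is a public SQuAD model; the
# context, question, and reference answer are hypothetical examples.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = (
    "Photosynthesis converts light energy into chemical energy. "
    "The light-dependent reactions produce ATP and NADPH, which the "
    "Calvin cycle then uses to fix carbon dioxide into sugars. Without "
    "the light-dependent stage, the Calvin cycle would stall for lack "
    "of reducing power."
)
question = "Why does the Calvin cycle depend on the light-dependent reactions?"
reference = (
    "The Calvin cycle consumes ATP and NADPH, both of which are produced "
    "only by the light-dependent reactions; without them it has neither "
    "the energy nor the reducing power needed to fix carbon dioxide."
)

# An extractive model returns one short contiguous span from the context.
prediction = qa(question=question, context=context)
print("Predicted span:", prediction["answer"])

# Crude overlap check: a span answer covers only a fraction of the tokens
# that a comprehensive, multi-sentence reference answer contains.
pred_tokens = set(prediction["answer"].lower().split())
ref_tokens = set(reference.lower().split())
recall = len(pred_tokens & ref_tokens) / len(ref_tokens)
print(f"Token recall against the reference answer: {recall:.2f}")
```

Because an extractive model can only return a contiguous span, its token recall against a multi-sentence reference answer stays low almost by construction, which is the mismatch between factoid-trained models and deep questions that the abstract highlights.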
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Anbarasu, H.S., Navalli, H.V., Vidapanakal, H., Gowd, K.M., Das, B. (2023). Deep QA: An Open-Domain Dataset of Deep Questions and Comprehensive Answers. In: Neri, F., Du, K.L., Varadarajan, V., San-Blas, A.A., Jiang, Z. (eds.) Computer and Communication Engineering. CCCE 2023. Communications in Computer and Information Science, vol. 1823. Springer, Cham. https://doi.org/10.1007/978-3-031-35299-7_16
Print ISBN: 978-3-031-35298-0
Online ISBN: 978-3-031-35299-7