research-article

CBench: towards better evaluation of question answering over knowledge graphs

Authors:

Abdelghny Orogat,

Ahmed El-Roby

Proceedings of the VLDB Endowment, Volume 14, Issue 8

Pages 1325 - 1337

https://doi.org/10.14778/3457390.3457398

Published: 01 April 2021

Abstract

Recently, there has been an increase in the number of knowledge graphs that can only be queried by experts. Describing questions using structured queries is not straightforward for non-expert users, who need sufficient knowledge about both the vocabulary and the structure of the queried knowledge graph, as well as the syntax of the structured query language used to express their information needs. The most popular approach introduced to overcome these challenges is to use natural language to query these knowledge graphs. Although several question answering benchmarks can be used to evaluate question answering systems over a number of popular knowledge graphs, choosing a benchmark that accurately assesses the quality of a question answering system is a challenging task.

In this paper, we introduce CBench, an extensible and more informative benchmarking suite for analyzing benchmarks and evaluating question answering systems. CBench can be used to analyze existing benchmarks with respect to several fine-grained linguistic, syntactic, and structural properties of the questions and queries in the benchmark. We show that existing benchmarks vary significantly with respect to these properties, which makes choosing a small subset of them unreliable for evaluating QA systems. Until further research improves the quality and comprehensiveness of benchmarks, CBench can be used to facilitate this evaluation using a set of popular benchmarks that can be augmented with other user-provided benchmarks. CBench not only evaluates a question answering system based on popular single-number metrics but also gives a detailed analysis of the linguistic, syntactic, and structural properties of answered and unanswered questions, helping developers of question answering systems understand where their system excels and where it struggles.
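To illustrate the kind of per-property breakdown the abstract describes, the following is a minimal sketch and not CBench's actual implementation or API. It assumes a hypothetical keyword list (`KEYWORDS`) as a stand-in for syntactic query properties, and hypothetical helpers `query_keywords` and `breakdown` that group answered and unanswered questions by the SPARQL keywords appearing in their gold queries.

```python
import re
from collections import defaultdict

# Hypothetical property set: a few SPARQL keywords used as a stand-in for
# "syntactic properties". The properties CBench actually uses are not listed here.
KEYWORDS = ("FILTER", "OPTIONAL", "UNION", "ORDER BY", "LIMIT", "COUNT")


def query_keywords(sparql):
    """Return the set of KEYWORDS that occur in a SPARQL query string."""
    found = set()
    for kw in KEYWORDS:
        # Allow arbitrary whitespace inside multi-word keywords such as ORDER BY.
        pattern = r"\b" + kw.replace(" ", r"\s+") + r"\b"
        if re.search(pattern, sparql, flags=re.IGNORECASE):
            found.add(kw)
    return found


def breakdown(results):
    """Count answered and total questions per keyword.

    `results` is an iterable of (gold_sparql_query, answered) pairs, where
    `answered` says whether the evaluated system answered the question.
    Queries with none of the keywords are grouped under "(none)".
    """
    stats = defaultdict(lambda: {"answered": 0, "total": 0})
    for sparql, answered in results:
        for kw in query_keywords(sparql) or {"(none)"}:
            stats[kw]["total"] += 1
            stats[kw]["answered"] += int(answered)
    return dict(stats)
```

A single accuracy number would hide that, for example, all unanswered questions involve `FILTER`; grouping by property makes such a pattern visible.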


Published In

Proceedings of the VLDB Endowment Volume 14, Issue 8

April 2021

200 pages

ISSN:2150-8097

Editors: Xin Luna Dong (Amazon), Felix Naumann (HPI, University of Potsdam)

Publisher

VLDB Endowment
