TraQuLA: Transparent Question Answering Over RDF Through Linguistic Analysis

Zimina, Elizaveta; Järvelin, Kalervo; Peltonen, Jaakko; Ranta, Aarne; Nummenmaa, Jyrki

doi:10.1007/978-3-031-62362-2_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14629)

Included in the following conference series:

International Conference on Web Engineering

Abstract

Answering complex questions over knowledge graphs has gained popularity recently. Systems based on large language models seem to achieve top performance. However, these models may generate content that looks reasonable but is incorrect. They also lack transparency, making it impossible to explain exactly why a particular answer was generated. To tackle these problems, we present TraQuLA (Transparent QUestion-answering through Linguistic Analysis), a rule-based system developed through linguistic analysis of datasets of complex questions over DBpedia and Wikidata. TraQuLA determines a question’s type and extracts candidates for its semantic components (named entities, properties and class names). For the extraction of properties, whose natural language verbalisations are the most diverse, we built an extensive database that matches DBpedia/Wikidata properties to natural language expressions while allowing for linguistic variation. TraQuLA generates semantic parses for the components and ranks them by each question’s structure and morphological features. The ranked parses are then analysed top-down according to their patterns, also taking linguistic aspects into account, until a solution is found and a SPARQL query is produced. TraQuLA outperforms the existing baseline systems on LC-QuAD 1.0 and competes with ChatGPT-based systems on LC-QuAD 2.0. For the LC-QuAD 1.0 test set, we developed an evaluation approach that accepts multiple ways of answering the questions (some of which the dataset ignores) and corrected some errors in the test set. TraQuLA contains no “black boxes” of neural networks or machine learning and makes its answer construction traceable. Users can therefore better rely on its answers and assess their correctness.
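
The property-matching step described in the abstract can be pictured with a small sketch. It is only an illustration of a lexicon lookup that tolerates word-form variation: the lexicon entries, the base-form table and the function name `candidate_properties` are assumptions made for this example and are not taken from the TraQuLA code base, whose database is far more extensive.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicon: base form of an expression -> candidate DBpedia properties.
PROPERTY_LEXICON: Dict[str, List[str]] = {
    "write": ["dbo:author", "dbo:writer"],
    "direct": ["dbo:director"],
    "bear": ["dbo:birthPlace", "dbo:birthDate"],
}

# Hypothetical table mapping word-form variants to a shared base form.
BASE_FORMS: Dict[str, str] = {
    "wrote": "write", "written": "write", "writes": "write", "writer": "write",
    "directed": "direct", "directs": "direct", "director": "direct",
    "born": "bear",
}


def candidate_properties(question: str) -> Dict[str, List[str]]:
    """Map each question token that hits the lexicon to its candidate properties."""
    tokens = re.findall(r"[a-z]+", question.lower())
    found: Dict[str, List[str]] = {}
    for token in tokens:
        # Normalise the token to its base form; unknown tokens stay unchanged.
        base = BASE_FORMS.get(token, token)
        if base in PROPERTY_LEXICON:
            found[token] = PROPERTY_LEXICON[base]
    return found


print(candidate_properties("Who directed the films written by Stephen King?"))
```

Every matching token keeps all of its candidate properties, so a later ranking step, such as the parse ranking the abstract mentions, would be the place to choose among them.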

Supported by the Academy of Finland

Notes

1. https://lod-cloud.net/
2. http://qald.aksw.org/
3. With the help of the Word Forms module: https://github.com/gutfeeling/word_forms
4. Detected by means of WordNet: https://wordnet.princeton.edu/
5. This and further sample questions are taken from the training splits. Grammar and spelling have been left unedited.
6. Hereinafter, SPARQL queries for What questions are given in shortened form; the full form begins with the following preamble:
PREFIX res: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?uri WHERE
7. https://github.com/lizazim/TraQuLA
8. We had to adjust WDAqua’s macro precision, since in [6] the reported precision was calculated for answered questions only, and for each unanswered question micro precision equaled 1. We also use TeBaQA’s F1 score instead of the reported QALD F-Measure.
9. http://mappings.dbpedia.org/
10. https://commoncrawl.org/
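
As a minimal sketch of the convention in note 6, a shortened query can be expanded into its full form by prepending the preamble. The preamble lines are copied from the note; the helper name `expand_query` and the sample graph pattern are hypothetical and serve only as illustration.

```python
# Preamble lines copied from note 6.
PREAMBLE = "\n".join([
    "PREFIX res: <http://dbpedia.org/resource/>",
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "PREFIX dbp: <http://dbpedia.org/property/>",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
    "SELECT DISTINCT ?uri WHERE",
])


def expand_query(short_form: str) -> str:
    """Prepend the note 6 preamble to a shortened graph pattern."""
    return f"{PREAMBLE} {short_form.strip()}"


# Hypothetical shortened query, not taken from the paper.
short_query = "{ ?uri dbo:author res:Stephen_King . }"
print(expand_query(short_query))
```

The expanded string is a complete SELECT query that could be sent to a DBpedia SPARQL endpoint.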

References

Berant, J., Chou, A., Frostig, R., Liang, P.: Semantic parsing on Freebase from question-answer pairs. In: Proceedings of EMNLP 2013, pp. 1533–1544 (2013)
Bordes, A., Usunier, N., Chopra, S., Weston, J.: Large-scale simple question answering with memory networks (2015)
Cai, Q., Yates, A.: Large-scale semantic parsing via schema matching and lexicon extension. In: Proceedings of ACL 2013, pp. 423–433 (2013)
Chung, H.W., et al.: Scaling instruction-finetuned language models (2022)
Cimiano, P., Lopez, V., Unger, C., Cabrio, E., Ngonga Ngomo, A.C., Walter, S.: Multilingual question answering over linked data (QALD-3): lab overview, pp. 321–332 (2013)
Diefenbach, D., Both, A., Singh, K., Maret, P.: Towards a question answering system over the semantic web. Semant. Web 1–16 (2018)
Dong, L., Wei, F., Zhou, M., Xu, K.: Question answering over Freebase with multi-column convolutional neural networks. In: Proceedings of ACL 2015 – IJCNLP 2015, pp. 260–269 (2015)
Dubey, M., Banerjee, D., Abdelkawi, A., Lehmann, J.: LC-QuAD 2.0: a large dataset for complex question answering over Wikidata and DBpedia. In: Proceedings of ISWC 2019, pp. 69–78 (2019)
Dubey, M., Banerjee, D., Chaudhuri, D., Lehmann, J.: EARL: joint entity and relation linking for question answering over knowledge graphs. In: Proceedings of ISWC 2018, pp. 108–126 (2018)
Ferragina, P., Scaiella, U.: Tagme: on-the-fly annotation of short text fragments (by Wikipedia entities). In: Proceedings of CIKM 2010, pp. 1625–1628 (2010)
Ferragina, P., Scaiella, U.: Fast and accurate annotation of short texts with Wikipedia pages. IEEE Softw. 29(1), 70–75 (2012)
Gabrilovich, E., Ringgaard, M., Subramanya, A.: FACC1: freebase annotation of ClueWeb corpora, version 1 (release date 2013-06-26, format version 1, correction level 0) (2013)
Golub, D., He, X.: Character-level question answering with attention. In: Proceedings of EMNLP 2016, pp. 1598–1607 (2016)
He, S., Zhang, Y., Liu, K., Zhao, J.: CASIA@V2: a MLN-based question answering system over linked data. In: Proceedings of QALD-4, pp. 1249–1259 (2014)
Hu, X., Shu, Y., Huang, X., Qu, Y.: EDG-based question decomposition for complex question answering over knowledge bases. In: Proceedings of ISWC 2021, pp. 128–145 (2021)
Kapanipathi, P., et al.: Leveraging Abstract Meaning Representation for knowledge base question answering. In: ACL-IJCNLP 2021, pp. 3884–3894. Association for Computational Linguistics (2021)
Kim, J., et al.: OKBQA: an open collaboration framework for development of natural language question-answering over knowledge bases. In: Proceedings of ISWC 2017 Posters & Demonstrations and Industry Tracks (2017)
Kocoń, J., et al.: ChatGPT: jack of all trades, master of none. Inf. Fusion 99, 101861 (2023)
Liang, Z., Peng, Z., Yang, X., Zhao, F., Liu, Y., McGuinness, D.L.: BERT-based semantic query graph extraction for knowledge graph question answering. In: Proceedings of ISWC 2021 Posters, Demos and Industry (2021)
Lukovnikov, D., Fischer, A., Lehmann, J., Auer, S.: Neural network-based question answering over knowledge graphs on word and character level. In: Proceedings of WWW 2017, pp. 1211–1220 (2017)
Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to rank query graphs for complex question answering over knowledge graphs. In: Proceedings of ISWC 2019, pp. 487–504 (2019)
Marx, E., Usbeck, R., Ngomo, A.C.N., Höffner, K., Lehmann, J., Auer, S.: Towards an open question answering architecture. In: Proceedings of SEM ’14, pp. 57–60 (2014)
Mazzeo, G., Zaniolo, C.: Answering controlled natural language questions on RDF knowledge bases. In: Proceedings of EDBT 2016, pp. 608–611 (2016)
Mendes, P., Jakob, M., García-Silva, A., Bizer, C.: DBpedia Spotlight: shedding light on the web of documents. In: Proceedings of I-SEMANTICS 2011, pp. 1–8 (2011)
Mohammed, S., Shi, P., Lin, J.: Strong baselines for simple question answering over knowledge graphs with and without neural networks. In: Proceedings of NAACL 2018, vol. 2 (Short Papers). pp. 291–296 (2018)
Nakashole, N., Weikum, G., Suchanek, F.: PATTY: a taxonomy of relational patterns with semantic types. In: Proceedings of EMNLP-CoNLL 2012, pp. 1135–1145 (2012)
Omar, R., Mangukiya, O., Kalnis, P., Mansour, E.: ChatGPT versus traditional question answering for knowledge graphs: current status and future directions towards knowledge graph chatbots (2023)
Park, S., Shim, H., Lee, G.G.: Isoft at QALD-4: semantic similarity-based question answering system over linked data. In: CLEF (Working Notes) 2014, pp. 1236–1248 (2014)
Pramanik, S., Alabi, J., Roy, R.S., Weikum, G.: UNIQORN: unified question answering over RDF knowledge graphs and natural language text (2023)
Sakor, A., et al.: Old is gold: linguistic driven approach for entity and relation linking of short text. In: Proceedings of NAACL 2019, vol. 1. pp. 2336–2346 (2019)
Singh, K., Both, A., Sethupat, A., Shekarpour, S.: Frankenstein: a platform enabling reuse of question answering components. In: Proceedings of ESWC 2018, pp. 624–638 (2018)
Tan, Y., et al.: Can ChatGPT replace traditional KBQA models? an in-depth analysis of the question answering performance of the GPT LLM family. In: ISWC 2023, pp. 348–367 (2023)
Trivedi, P., Maheshwari, G., Dubey, M., Lehmann, J.: LC-QuAD: a corpus for complex question answering over knowledge graphs. In: ISWC, pp. 210–218 (2017)
Unger, C., Cimiano, P., Lopez, V., Motta, E.: Preface. In: Proceedings of 1st Workshop on Question Answering Over Linked Data (QALD-1), pp. II–V (2011)
Unger, C., et al.: Question answering over linked data (QALD-4), vol. 1180, pp. 1172–1180 (2014)
Unger, C., Forascu, C., Lopez, V., Ngonga Ngomo, A.C., Cabrio, E., Cimiano, P., Walter, S.: Question answering over linked data (QALD-5). In: CLEF 2015 Working Notes (2015)
Unger, C., Ngonga Ngomo, A.C., Cabrio, E.: 6th open challenge on question answering over linked data (QALD-6). In: SemWebEval 2016: Semantic Web Challenges, vol. 641, pp. 171–177 (2016)
Usbeck, R., Gusmita, R.H., Ngomo, A.C.N., Saleem, M.: 9th challenge on question answering over linked data (QALD-9). In: Joint Proceedings of ISWC 2018 Workshops SemDeep-4 and NLIWOD-4 (2018)
Usbeck, R., Ngomo, A.C.N., Conrads, F., Röder, M., Napolitano, G.: 8th challenge on question answering over linked data (QALD-8). In: Joint Proceedings of ISWC 2018 Workshops SemDeep-4 and NLIWOD-4 (2018)
Usbeck, R., Ngomo, A.-C.N., Haarmann, B., Krithara, A., Röder, M., Napolitano, G.: 7th open challenge on question answering over linked data (QALD-7). In: Dragoni, M., Solanki, M., Blomqvist, E. (eds.) SemWebEval 2017. CCIS, vol. 769, pp. 59–69. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69146-6_6
Vakulenko, S., Garcia, J.D.F., Polleres, A., de Rijke, M., Cochez, M.: Message passing for complex question answering over knowledge graphs. In: Proceedings of CIKM 2019, pp. 1431–1440 (2019)
Vollmers, D., Jalota, R., Moussallem, D., Topiwala, H., Ngonga Ngomo, A.C., Usbeck, R.: Knowledge graph question answering using graph-pattern isomorphism. In: Proceedings of SEMANTiCS 2021, vol. 53, pp. 103–117 (2021)
Wang, S., Scells, H., Koopman, B., Zuccon, G.: Can ChatGPT write a good boolean query for systematic review literature search? ACM SIGIR (2023)
Yih, W.t., Chang, M.W., He, X., Gao, J.: Semantic parsing via staged query graph generation: question answering with knowledge base. In: Proceedings of ACL–IJCNLP 2015, pp. 1321–1331 (2015)
Yin, W., Yu, M., Xiang, B., Zhou, B., Schütze, H.: Simple question answering by attentive convolutional neural network. In: Proceedings of COLING 2016, pp. 1746–1756 (2016)
Zafar, H., Napolitano, G., Lehmann, J.: Formal query generation for question answering over knowledge bases. In: Proceedings of ESWC 2018, pp. 714–728 (2018)
Zimina, E., Nummenmaa, J., Järvelin, K., Peltonen, J., Stefanidis, K.: MuG-QA: multilingual grammatical question answering for RDF data. In: Proceedings of PIC 2018, pp. 57–61 (2018)

Author information

Authors and Affiliations

Tampere University, Tampere, Finland
Elizaveta Zimina, Kalervo Järvelin, Jaakko Peltonen & Jyrki Nummenmaa
University of Gothenburg, Gothenburg, Sweden
Aarne Ranta

Corresponding author

Correspondence to Jyrki Nummenmaa.

Editor information

Editors and Affiliations

Tampere University, Tampere, Finland
Kostas Stefanidis
Tampere University, Tampere, Finland
Kari Systä
Politecnico di Milano, Milano, Italy
Maristella Matera
Chemnitz University of Technology, Chemnitz, Germany
Sebastian Heil
University of Crete, Heraklion, Greece
Haridimos Kondylakis
University of Verona, Verona, Italy
Elisa Quintarelli

Cite this paper

Zimina, E., Järvelin, K., Peltonen, J., Ranta, A., Nummenmaa, J. (2024). TraQuLA: Transparent Question Answering Over RDF Through Linguistic Analysis. In: Stefanidis, K., Systä, K., Matera, M., Heil, S., Kondylakis, H., Quintarelli, E. (eds) Web Engineering. ICWE 2024. Lecture Notes in Computer Science, vol 14629. Springer, Cham. https://doi.org/10.1007/978-3-031-62362-2_2

DOI: https://doi.org/10.1007/978-3-031-62362-2_2
Published: 16 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62361-5
Online ISBN: 978-3-031-62362-2
eBook Packages: Computer Science (R0)
