Resource Track
CIKM '20, October 19–23, 2020, Virtual Event, Ireland
Falcon 2.0: An Entity and Relation Linking Tool over Wikidata
Ahmad Sakor
ahmad.sakor@tib.eu
L3S Research Center and TIB, University of Hannover
Hannover, Germany

Kuldeep Singh
kuldeep.singh1@cerence.com
Cerence GmbH and Zerotha Research
Aachen, Germany

Anery Patel
anery.patel@tib.eu
TIB, University of Hannover
Hannover, Germany

Maria-Esther Vidal
maria.vidal@tib.eu
L3S Research Center and TIB, University of Hannover
Hannover, Germany
ABSTRACT
The Natural Language Processing (NLP) community has significantly contributed to solutions for entity and relation recognition from natural language text, and possibly linking them to proper matches in Knowledge Graphs (KGs). Considering Wikidata as the background KG, there are still limited tools to link knowledge within the text to Wikidata. In this paper, we present Falcon 2.0, the first joint entity and relation linking tool over Wikidata. It receives a short natural language text in English and outputs a ranked list of entities and relations annotated with the proper candidates in Wikidata. The candidates are represented by their Internationalized Resource Identifiers (IRIs) in Wikidata. Falcon 2.0 resorts to an English-language linguistic model for the recognition task (e.g., N-Gram tiling and N-Gram splitting), and then to an optimization approach for the linking task. We have empirically studied the performance of Falcon 2.0 on Wikidata and concluded that it outperforms all the existing baselines. Falcon 2.0 is open source and can be reused by the community; all the required instructions for Falcon 2.0 are well documented in our GitHub repository1. We also demonstrate an online API, which can be used without any technical expertise. Falcon 2.0 and its background knowledge bases are available as resources at https://labs.tib.eu/falcon/falcon2/.
1
INTRODUCTION
Entity Linking (EL), also known as Named Entity Disambiguation (NED), is a well-studied research domain for aligning unstructured text to its structured mentions in various knowledge repositories (e.g., Wikipedia, DBpedia [1], Freebase [4], or Wikidata [28]). Entity
linking comprises two sub-tasks. The first task is Named Entity
Recognition (NER), in which an approach aims to identify entity
labels (or surface forms) in an input sentence. Entity disambiguation is the second sub-task of linking entity surface forms to semistructured knowledge repositories. With the growing popularity
of publicly available knowledge graphs (KGs), researchers have
developed several approaches and tools for EL task over KGs. Some
of these approaches implicitly perform NER and directly provide
mentions of entity surface forms in the sentences to the KG (often
referred to as end-to-end EL approaches) [7]. Other attempts (e.g., Yamada et al. [30], DCA [32]) consider recognized surface forms
of the entities as additional inputs besides the input sentence to
perform entity linking. Irrespective of the input format and underlying technologies, the majority of the existing attempts [22] in the
EL research are confined to well-structured KGs such as DBpedia or
Freebase2. These KGs rely on a well-defined process to extract information directly from Wikipedia infoboxes. They do not give users direct access to add or delete entities or alter the KG facts. Wikidata, on the other hand, allows users to edit Wikidata pages directly, add newer entities, and define new relations between objects. Wikidata is hugely popular as a crowdsourced collection of knowledge. Since its launch in 2012, over 1 billion edits have been made by users across the world3.
CCS CONCEPTS
· Information systems → Resource Description Framework
(RDF); Information extraction.
KEYWORDS
NLP, Entity Linking, Relation Linking, Background Knowledge,
English morphology, DBpedia, and Wikidata
ACM Reference Format:
Ahmad Sakor, Kuldeep Singh, Anery Patel, and Maria-Esther Vidal. 2020. Falcon 2.0: An Entity and Relation Linking Tool over Wikidata. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20), October 19–23, 2020, Virtual Event, Ireland. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3340531.3412777
1https://github.com/SDM-TIB/falcon2.0
This work is licensed under a Creative Commons Attribution International 4.0 License.
CIKM '20, October 19–23, 2020, Virtual Event, Ireland
© 2020 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-6859-9/20/10.
https://doi.org/10.1145/3340531.3412777
Motivation, Approach, and Contributions. We motivate our work by the fact that, despite the vast popularity of Wikidata, there are limited attempts to target entity and relation linking over Wikidata. For instance, there are over 20 entity linking tools for DBpedia [22, 26] available as APIs. To the best of our knowledge, there exists only one open-source API for Wikidata entity linking (i.e., OpenTapioca [7]). Furthermore, there is no tool over Wikidata for relation linking, i.e., linking predicate surface forms to their corresponding Wikidata mentions. In this paper, we focus on providing Falcon 2.0, a reusable resource API for joint
2 It is now deprecated and no further updates are possible.
3 https://www.wikidata.org/wiki/Wikidata:Statistics
entity and relation linking over Wikidata. In our previous work, we proposed Falcon [23], a rule-based yet effective approach for entity and relation linking on short text (questions in this case) over DBpedia. In general, the Falcon approach has two novel concepts: 1) a linguistic approach that relies on several English morphology principles, such as tokenization and N-gram tiling; 2) a local knowledge base that serves as a source of background knowledge (BK). This knowledge base is a collection of entities from DBpedia. We resort to the Falcon approach for developing Falcon 2.0. Our aim here is to study whether or not the Falcon approach is agnostic to the underlying KG; hence, we do not claim novelty in the underlying linguistic approach of Falcon 2.0. Further, we investigate the concerns related to robustness, emerging failures, and bottlenecks. We introduce Falcon 2.0 based on the methodology employed in the first version. Our tool is the first joint entity and relation linking tool for Wikidata. Our novel contributions briefly lie in two aspects:
2
RELATED WORK
Several surveys provide a detailed overview of the advancements of
the techniques employed in entity linking over KGs [2, 24]. Various
reading lists [16], online forums7 and Github repositories8 track
the progress in the domain of entity linking. Initial attempts in EL
considered Wikipedia as an underlying knowledge source. The research field has matured and the SOTA nearly matches human-level
performance [20]. With the advent of publicly available KGs such as
DBpedia, Yago, and Freebase, the focus has shifted towards developing EL over knowledge graphs. The developments in Deep Learning
have introduced a range of models that carry out both NER and
NED as a single end-to-end step [11, 17]. NCEL [5] learns both
local and global features from Wikipedia articles, hyperlinks, and
entity links to derive joint embeddings of words and entities. These
embeddings are used to train a deep Graph Convolutional Network
(GCN) that integrates all the features through a Multi-layer Perceptron. The output is passed through a Sub-Graph Convolution
Network, which finally resorts to a fully connected decoder. The decoder maps the output states to linked entities. The BI-LSTM+CRF model [15] formulates entity linking as a sequence learning task in which the entity mentions form a sequence whose length equals that of the sequence of output entities. Albeit precise, deep learning approaches demand high-quality training annotations, which are not extensively available for Wikidata entity linking [6, 19].
There is concrete evidence in the literature that the machine
learning-based models trained over generic datasets such as WikiDisamb30 [10], and CoNLL (YAGO) [14] do not perform well when
applied to short texts. Singh et al. [26] evaluated more than 20
entity linking tools over DBpedia for short text (e.g., questions) and
concluded that issues like capitalization of surface forms, implicit
entities, and multi-word entities affect the performance of EL tools
in a short input text. Sakor et al. [23] address specific challenges of short texts by applying a rule-based approach for EL over DBpedia. In addition to linking entities to DBpedia, Sakor et al. also provide DBpedia IRIs of the relations in a short text. EARL [3] is
another tool that proposes a traveling salesman algorithm-based
approach for joint entity and relation linking over DBpedia. To
the best of our knowledge, EARL and Falcon are the only available
tools that provide both entity and relation linking.
Entity linking over Wikidata is a relatively new domain. Cetoli
et al. [6] propose a neural network-based approach for linking entities to Wikidata. The authors also align an existing Wikipedia
corpus-based dataset to Wikidata. However, this work only targets
entity disambiguation and assumes that the entities are already recognized in the sentences. Arjun [19] is the latest work for Wikidata
entity linking. It uses an attention-based neural network for linking
Wikidata entity labels. OpenTapioca [7] is another attempt that
performs end-to-end entity linking over Wikidata; it is the closest
to our work even though OpenTapioca does not provide Wikidata
(1) Falcon 2.0: The first resource for joint entity and relation linking over Wikidata. Falcon 2.0 relies on fundamental principles of English morphology (tokenization and compounding) and links entity and relation surface forms in a short sentence to their Wikidata mentions. Falcon 2.0 is available as an online API and can be accessed at https://labs.tib.eu/falcon/falcon2/. Falcon 2.0 is also able to recognize entities in keyword inputs such as Barack Obama, where there is no relation. We empirically evaluate Falcon 2.0 on three datasets tailored for Wikidata. According to the observed results, Falcon 2.0 significantly outperforms all the existing baselines. For ease of use, we integrate the Falcon API4 into Falcon 2.0. This option is available in case DBpedia contains an equivalent entity (Wikidata is a superset of DBpedia). The Falcon 2.0 API already has over half a million hits from February 2020 to the time of paper acceptance, which shows its growing adoption (excluding self-access of the API while performing the evaluation).
(2) Falcon 2.0 Background KG: We created a new background KG for Falcon 2.0 from Wikidata. We extracted 48,042,867 Wikidata entities from its public dump and aligned these entities with the aliases present in Wikidata. For example, Barack Obama is a Wikidata entity, Wiki:Q76. We created a mapping between the label (Barack Obama) of Wiki:Q76 and its aliases, such as President Obama, Barack Hussein Obama, and Barry Obama, and stored it in the background knowledge base. We implemented a similar alignment for 15,645 properties/relations of Wikidata. The background knowledge base is an indexed graph and can be queried. The resource is also present at a persistent URI for further reuse6.
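As a minimal sketch of the morphological pipeline referenced above (tokenization and compounding), the following illustrates stopword-based tokenization followed by N-gram tiling on the paper's running example. The stopword list and the merge heuristic are simplified stand-ins for illustration, not the Falcon 2.0 implementation.

```python
# Simplified sketch of the recognition phase: drop stopwords, then
# tile adjacent surviving tokens into one surface form (e.g.,
# "operating" + "income" -> "operating income"). The stopword set
# below is a tiny illustrative subset.
STOPWORDS = {"what", "is", "the", "for", "of", "did", "a", "an", "in"}

def tokenize(text):
    """Return (position, word) pairs for non-stopword tokens."""
    words = [w.strip("?.,") for w in text.split()]
    return [(i, w) for i, w in enumerate(words)
            if w and w.lower() not in STOPWORDS]

def ngram_tiling(text):
    """Merge surviving tokens that were adjacent in the original text."""
    tiles = []
    for pos, word in tokenize(text):
        if tiles and pos == tiles[-1][0] + 1:
            tiles[-1] = (pos, tiles[-1][1] + " " + word)
        else:
            tiles.append((pos, word))
    return [surface for _, surface in tiles]
```

For instance, `ngram_tiling("What is the operating income for Qantas?")` yields `["operating income", "Qantas"]`, matching the example used throughout the paper.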
The rest of this paper is organized as follows: Section 2 reviews the state of the art, and Section 3 describes our two resources and the approach to build Falcon 2.0. Section 4 presents experiments to evaluate the performance of Falcon 2.0. Section 5 presents the importance and impact of this work for the research community. The availability and sustainability of the resources are explained in Section 6, and maintenance-related discussion is presented in Section 7. We close with the conclusion in Section 8.
4 https://labs.tib.eu/falcon/
5 https://www.wikidata.org/wiki/Q76
6 https://doi.org/10.6084/m9.figshare.11362883
7 http://nlpprogress.com/english/entity_linking.html
8 https://github.com/sebastianruder/NLP-progress/blob/master/english/entity_linking.md
IDs of relations in a sentence. OpenTapioca is also available as an API and is utilized as our baseline. S-Mart [33] is a tree-based structured learning framework based on multiple additive regression trees for linking entities in a tweet. The model was later adapted for linking entities in questions. VCG [27] is another attempt: a unifying network that models contexts of variable granularity to extract features for end-to-end entity linking. However, Falcon 2.0 is the first tool for joint entity and relation linking over Wikidata.
3
FALCON 2.0 - A RESOURCE
In this section, we describe Falcon 2.0 in detail. First, the architecture of Falcon 2.0 is depicted. Next, we discuss the BK used to match the surface forms in the text to the resources in a specific KG. In the paper's scope, we define "short text" as grammatically correct questions (up to 15 words).

3.1 Architecture
The Falcon 2.0 architecture is depicted in Figure 1. Falcon 2.0 receives a short input text and outputs a set of entities and relations extracted from the text; each entity and relation in the output is associated with a unique Internationalized Resource Identifier (IRI) in Wikidata. Falcon 2.0 resorts to BK and a catalog of rules for performing entity and relation linking. The BK combines Wikidata labels and their corresponding aliases. Additionally, it comprises alignments between nouns and entities in Wikidata. Alignments are stored in a text search engine, while the knowledge source is maintained in an RDF triple store accessible via a SPARQL endpoint. The rules that represent the English morphology are in a catalog; a forward-chaining inference process is performed on top of the catalog during the extraction and linking tasks. Falcon 2.0 also comprises several modules that identify and link entities and relations to Wikidata. These modules implement POS Tagging, Tokenization & Compounding, N-Gram Tiling, Candidate List Generation, Matching & Ranking, Query Classifier, and N-Gram Splitting, and are reused from the implementation of Falcon.

3.2 Background Knowledge
Wikidata contains over 52 million entities and 3.9 billion facts (in the form of subject-predicate-object triples). Since the Falcon 2.0 background knowledge only depends on labels, a significant portion of this extensive information is not useful for our approach. Hence, we only extract all the entity and relation labels to create a local background KG, also known as the "alias background knowledge base". For example, the entity United States of America9 in Wikidata has the natural language label "United States of America" and several other aliases (or known_as labels), such as "the United States of America, America, U.S.A., the U.S., United States, etc.". We extended our background KG with this information from Wikidata. Similarly, for relation labels, the background KG is enriched with known_as labels to provide synonyms and derived word forms. For example, the relation spouse10 in Wikidata has the label spouse, and the other known_as labels are husband, wife, married to, wedded to, partner, etc. This variety of synonyms for each relation empowers Falcon 2.0 to match the surface form in the text to a relation in Wikidata. Figure 2 illustrates the process of building the background knowledge.

3.3 Catalog of Rules
Falcon 2.0 is a rule-based approach. A catalog of rules is predefined to extract entities and relations from the text. The rules are based on English morphological principles and are borrowed from Sakor et al. [23]. For example, Falcon 2.0 excludes all verbs from the entity candidates list based on the rule verbs are not entities. Likewise, the N-Gram tiling module in the Falcon 2.0 architecture resorts to the rule: entities with only stopwords between them are one entity. Another such rule, When -> date, Where -> place, solves the ambiguity of matching the correct relation, in case the short text is a question, by looking at the question headword. For example, given the two questions When did Princess Diana die? and Where did Princess Diana die?, the relation died can be the death place or the death year. The question headword (When/Where) is the only insight to solve the ambiguity here. When the question word is where, Falcon 2.0 matches only relations that have a place as the range of the relation.

3.4 Recognition
The extraction phase in Falcon 2.0 consists of three modules: POS tagging, tokenization & compounding, and N-Gram tiling. The input of this phase is a natural language text. The output of the phase is the list of surface forms related to entities or relations.
Part-of-speech (POS) Tagging receives a natural language text as an input. It tags each word in the text with its related tag, e.g., noun, verb, and adverb. This module differentiates between nouns and verbs to enable the application of the morphological rules from the catalog. The output of the module is a list of (word, tag) pairs.
Tokenization & Compounding builds the token list by removing the stopwords from the input and splitting verbs from nouns. For example, if the input is What is the operating income for Qantas, the output of this module is a list of three tokens [operating, income, Qantas].
N-Gram Tiling combines tokens with only stopwords between them, relying on one of the rules from the catalog of rules. For example, if we consider the previous module's output as an input for the N-gram tiling module, the operating and income tokens will be combined into one token. The output of the module is a list of two tokens [operating income, Qantas].

3.5 Linking
This phase consists of four modules: candidate list generation, matching & ranking, relevant rule selection, and N-gram splitting.
Candidate List Generation receives the output of the recognition phase. The module queries the text search engine for each token. Then, tokens will have an associated candidate list of resources. For example, the retrieved candidate list of the token operating income is [(P3362, operating income), (P2139, income), (P3362, operating profit)], where the first element is the Wikidata predicate identifier and the second is the list of labels associated with the
9 https://www.wikidata.org/wiki/Q30
10 https://www.wikidata.org/wiki/Property:P26
Figure 1: The Falcon 2.0 Architecture. The boxes highlighted in grey are reused from Falcon [23]; they contain a linguistic pipeline for recognizing and linking entity and relation surface forms. The boxes in white are our addition to the Falcon pipeline to build a resource for Wikidata entity and relation linking; they constitute what we refer to as BK specific to Wikidata. The text search engine contains the alignment of Wikidata entity/relation labels along with the entity and relation aliases. It is used for generating potential candidates for entity and relation linking. The RDF triple store is a local copy of Wikidata triples containing all entities and predicates.
predicates that match the query "operating income".
Matching & Ranking ranks the candidate list received from the candidate list generation module and matches candidate entities and relations. Since, in any KG, the facts are represented as triples, the matching & ranking module creates triples consisting of the entities and relations from the candidate lists. Then, for each pair of entity and relation, the module checks whether the triple exists in the RDF triple store (Wikidata). The check is done by executing a simple ASK query over the RDF triple store. For each existing triple, the module increases the rank of the involved relations and entities. The output of the module is the ranked list of the candidates.
Relevant Rule Selection interacts with the matching & ranking module by suggesting increases to the ranks of some candidates, relying on the catalog of rules. One of the suggestions is considering the question headword to resolve the ambiguity between two relations based on the range of the relations in the KG.
N-Gram Splitting is used if none of the triples tested in the matching & ranking module exists in the triple store, i.e., the compounding done in the tokenization & compounding module combined two separate entities. The module splits the tokens from the right side and passes the tokens again to the candidate list generation module. Splitting the tokens from the right side resorts to one of the fundamentals of English morphology: compound words in English always have their headword towards the right side [29].
Text Search Engine stores all the alignments of the labels. A simple querying technique [12] is used as the text search engine over the background knowledge. It receives a token as an input and returns all the related resources with labels similar to the received token.
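The matching & ranking step can be sketched as follows. This is an illustrative sketch, not the Falcon 2.0 code: `triple_exists` stands in for the SPARQL ASK call against the local RDF triple store, and the candidate IDs in the usage note come from the Qantas example.

```python
from itertools import product

def ask_query(entity_id, relation_id):
    """Build the SPARQL ASK query that checks whether some triple
    connects a candidate entity and a candidate relation."""
    return "ASK { wd:%s wdt:%s ?object }" % (entity_id, relation_id)

def rank_candidates(entity_cands, relation_cands, triple_exists):
    """Increase the rank of every entity/relation involved in a triple
    that exists in the KG; return candidates sorted by rank."""
    rank = {c: 0 for c in entity_cands + relation_cands}
    for e, r in product(entity_cands, relation_cands):
        if triple_exists(e, r):  # abstracts the ASK query execution
            rank[e] += 1
            rank[r] += 1
    return sorted(rank, key=rank.get, reverse=True)
```

For the Qantas question, a stub KG containing the pair (Q32491, P3362) ranks Q32491 and P3362 above competing candidates such as Q17156256 and P2139.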
Figure 3: Falcon 2.0 API Web Interface.
Figure 2: Falcon 2.0 Background Knowledge is built by converting labels of entities and relations in Wikidata into pairs of alignments. It is part of the text search engine (cf. Figure 1).
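The alignment-pair construction shown in Figure 2 can be sketched as below. This is a minimal illustration: the input dicts are hand-made stand-ins for records parsed from the Wikidata dump, with aliases taken from the examples in the text and Figure 2.

```python
def build_alias_pairs(items):
    """Convert each Wikidata item (ID, label, aliases) into
    (ID, surface form) alignment pairs for the text search engine."""
    pairs = []
    for item in items:
        pairs.append((item["id"], item["label"]))
        for alias in item.get("aliases", []):
            pairs.append((item["id"], alias))
    return pairs

# Hand-made records mirroring the Figure 2 examples.
items = [
    {"id": "Q30", "label": "United States of America",
     "aliases": ["U.S.A.", "US", "United States"]},
    {"id": "P26", "label": "spouse",
     "aliases": ["wife", "married to", "marriage partner"]},
]
```

`build_alias_pairs(items)` yields pairs such as (Q30, US) and (P26, married to), which are then indexed by the text search engine.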
RDF Triple Store is a local copy of the Wikidata endpoint. It consists of all the RDF triples of Wikidata labeled in the English language. The RDF triple store is used to check the existence of the triples passed from the Matching & Ranking module. It keeps around 3.9 billion triples.

4
EXPERIMENTAL STUDY
We study three research questions: RQ1) What is the performance of Falcon 2.0 for entity linking over Wikidata? RQ2) What is the impact of Wikidata's specific background knowledge on the performance of a linguistic approach? RQ3) What is the performance of Falcon 2.0 for relation linking over Wikidata?
Metrics. We report the performance using the standard metrics of Precision, Recall, and F-measure. Precision is the fraction of relevant resources among the retrieved resources. Recall is the fraction of relevant resources that have been retrieved over the total amount of relevant resources. F-measure (or F-score) is the harmonic mean of precision and recall.
Datasets. We rely on three different question answering datasets, namely the SimpleQuestion dataset for Wikidata [8], WebQSP-WD [27], and LC-QuAD 2.0 [9]. The SimpleQuestion dataset contains 5,622 test questions which are answerable using Wikidata as the underlying KG. WebQSP-WD contains 1,639 test questions, and LC-QuAD 2.0 contains 6,046 test questions. SimpleQuestion and LC-QuAD 2.0 provide an annotated gold standard for entities and relations, whereas WebQSP-WD only provides an annotated gold standard for entities. Hence, we evaluated entity linking performance on three datasets and relation linking performance on two datasets. Also, SimpleQuestion and WebQSP-WD contain questions with a single entity and relation, whereas LC-QuAD 2.0 mostly contains complex questions (i.e., more than one entity and relation).
Experimental Details. Falcon 2.0 is extremely lightweight from an implementation point of view. A laptop with eight cores and 16GB RAM running Ubuntu 18.04 was used for implementing and evaluating Falcon 2.0. We deployed its web API on a server with 723GB RAM and 96 cores (Intel(R) Xeon(R) Platinum 8160 CPU at 2.10GHz) running Ubuntu 18.04. This publicly available API is used to calculate the standard evaluation metrics, namely Precision, Recall, and F-score.
Baselines. OpenTapioca [7] is available as a web API; it provides Wikidata URIs for entities. We run the OpenTapioca API on all three datasets.
Variable Context Granularity model (VCG) [27] is a unifying network that models contexts of variable granularity to extract features for mention detection and entity disambiguation. We were unable to reproduce VCG using the publicly available source code. Hence, we only report its performance on WebQSP-WD from the original paper [27], as we are unable to run the model on the other two datasets for entity linking. For completeness, we also report the other two baselines provided by the authors, namely the Heuristic Baseline and Simplified VCG.
S-Mart [33] was initially proposed to link entities in tweets and later adapted for question answering. The system is not open source, and we adopt its results from [27] for the WebQSP-WD dataset.
No Baseline for Relation Linking: To the best of our knowledge, there is no baseline for relation linking on Wikidata. One argument could be to run an existing DBpedia-based relation linking tool on Wikidata and compare it with our performance. We contest this solely because Wikidata is extremely noisy. For example, in "What is the longest National Highway in the world?" the entity surface form "National Highway" matches four different entities in Wikidata that share the same entity label (i.e., "National Highway"). In comparison, 2,055 other entities contain the full mention "National Highway" in their labels. However, in DBpedia, there exists only one unique label for "National Highway". Hence, any entity or relation linking tool tailored for DBpedia will face issues on Wikidata (cf. Table 3). Therefore, instead of reporting the bias and under-performance, we did not evaluate their performance, for a fair comparison. Hence, we report Falcon 2.0 relation linking performance only to establish new baselines on two datasets: SimpleQuestion and LC-QuAD 2.0.
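The Precision, Recall, and F-measure reported in the following tables can be computed per dataset as sketched below; this is a standard micro-averaged sketch over per-question ID sets, not the authors' evaluation script.

```python
def micro_prf(predictions, gold):
    """Micro-averaged Precision, Recall, and F-measure over
    per-question sets of predicted vs. gold Wikidata IDs."""
    tp = fp = fn = 0
    for pred, ref in zip(predictions, gold):
        pred, ref = set(pred), set(ref)
        tp += len(pred & ref)   # correctly linked IDs
        fp += len(pred - ref)   # spurious links
        fn += len(ref - pred)   # missed links
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```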
Table 1: Entity linking evaluation results on LC-QuAD 2.0 & SimpleQuestion datasets. Best values are in bold.

Approach          Dataset                            P     R     F
OpenTapioca [7]   LC-QuAD 2.0                        0.29  0.42  0.35
Falcon 2.0        LC-QuAD 2.0                        0.50  0.56  0.53
OpenTapioca [7]   SimpleQuestion                     0.01  0.02  0.01
Falcon 2.0        SimpleQuestion                     0.56  0.64  0.60
OpenTapioca [7]   SimpleQuestion Uppercase Entities  0.16  0.28  0.20
Falcon 2.0        SimpleQuestion Uppercase Entities  0.66  0.75  0.70
same linguistic driven approach. The jump in Falcon 2.0 performance comes from Wikidata’s specific local background knowledge,
which we created by expanding Wikidata entities and relations with
associated aliases. It also validates the novelty of Falcon 2.0 when
compared to Falcon for the Wikidata entity linking.
We observe an indifferent phenomenon in our performance for
three datasets, and the performance for Falcon 2.0 differs a lot per
dataset. For instance, on WebQSP-WD, our F-score is 0.82, whereas,
on LC-QuAD 2.0, the F-Score drops to 0.57. The first source of
error is the dataset(s) itself. In both the datasets (SimpleQuestion
and LC-QuAD 2.0), many questions are grammatically incorrect.
To validate our claim more robustly, we asked two native English
speakers to check the grammar of 200 random questions on LCQuAD 2.0. Annotators reported that 42 out of 200 questions are
grammatically incorrect. Many questions have erroneous spellings
of the entity names. For example, "Who is the country for head
of state of Mahmoud Abbas?" and "Tell me about position held
of Malcolm Fraser and elected in?" are two grammatically incorrect questions in LC-QuAD 2.0. Similarly, many questions in the
SimpleQuestion dataset are also grammatically incorrect. "where
was hank cochran birthed" is one such example in the SimpleQuestion dataset. Falcon 2.0 resorts to fundamental principles of the
English morphology and finds limitation in recognizing entities in
many grammatically incorrect questions.
We also recognize that the performance of Falcon 2.0 on sentences with minimal context is limited. For example, in the question "when did annie open?" from the WebQSP-WD dataset, the
sentential context is shallow. Also, more than one instance of "Annie" exists in Wikidata, such as Wiki:Q566892 (correct one) and
Wiki:Q181734. Falcon 2.0 wrongly predicts the entity in this case.
In another example, "which country is lamb from?", the correct
entity is Wiki:Q6481017 with label "lamb" in Wikidata. However,
Falcon 2.0 returns Wiki:13553878, which also has a label "lamb".
In such cases, additional knowledge graph context shall prove to be
useful. Approaches such as [32] introduced a concept of feeding "entity descriptions" as an additional context in an entity linking model
over Wikipedia. Suppose the extra context in the form of entity
description (1985 English drama film directed by Colin Gregg) for
the entity Wiki:13553878 is provided. In that case, a model may correctly predict the correct entity "lamb." Based on our observations,
we propose the following recommendations for the community to
improve the entity linking task over Wikidata:
Table 2: Entity linking evaluation results on the WEBQSP
test dataset. Best values are in bold.
Approach
P
R
F
S-MART [33]
Heuristic baseline [27]
Simplified VCG [27]
VCG [27]
OpenTapioca [7]
Falcon 2.0
0.66
0.30
0.84
0.83
0.01
0.80
0.77
0.61
0.62
0.65
0.02
0.84
0.72
0.40
0.71
0.73
0.02
0.82
4.1 Experimental Results
Experimental Results 1. In the first experiment, described in Table 1, we compare the entity linking performance of Falcon 2.0 on the SimpleQuestion and LC-QuAD 2.0 datasets. We first evaluate the performance on the SimpleQuestion dataset. Surprisingly, we observe that for the OpenTapioca baseline, the values of Precision, Recall, and F-score are approximately 0.0. We analyzed the source of errors and found that out of 5,622 questions, only 246 have entity labels in uppercase letters; OpenTapioca fails to recognize and link entity mentions written in lowercase letters. Case sensitivity is a common issue for entity linking tools over short text, as reported by Singh et al. [25, 26] in a detailed analysis. Of these 246 questions, only 70 are answered correctly by OpenTapioca. Given that OpenTapioca is limited in linking lowercase entity surface forms, we evaluated Falcon 2.0 and OpenTapioca on the 246 questions of SimpleQuestion to provide a fair evaluation for the baseline (reported as "SimpleQuestion uppercase entities" in Table 1). OpenTapioca reports an F-score of 0.20 on this subset of SimpleQuestion; Falcon 2.0 reports an F-score of 0.70 on the same subset (cf. Table 1). For LC-QuAD 2.0, OpenTapioca reports an F-score of 0.35 against Falcon 2.0 with an F-score of 0.53 (Table 1).
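The Precision, Recall, and F-score values reported in these experiments follow the standard definitions, computed per question and averaged over the benchmark. A minimal sketch (a simplified stand-in for the paper's evaluation setup; the gold/predicted sets are hypothetical):

```python
def prf(gold: set, predicted: set):
    """Precision, recall, and F1 for one question's gold vs. predicted entity IRIs."""
    if not gold or not predicted:
        return 0.0, 0.0, 0.0
    tp = len(gold & predicted)          # correctly linked entities
    p = tp / len(predicted)
    r = tp / len(gold)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def macro_prf(pairs):
    """Macro-average P, R, F over (gold, predicted) pairs, one per question."""
    scores = [prf(g, s) for g, s in pairs]
    n = len(scores)
    return tuple(sum(col) / n for col in zip(*scores))

# Example: two questions, one linked perfectly, one only partially.
pairs = [
    ({"Q6481017"}, {"Q6481017"}),        # correct link
    ({"Q181734", "Q146"}, {"Q181734"}),  # one of two gold entities found
]
p, r, f = macro_prf(pairs)  # p = 1.0, r = 0.75, f ≈ 0.83
```

Whether scores are macro- or micro-averaged, and how empty predictions are penalized, can shift the reported numbers, which is why comparisons across papers require a shared evaluation protocol such as GERBIL [22].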
• Wikidata inherits the challenges of vandalism and noisy entities from its crowd-authored content [13]. We expect the research community to come up with more robust short-text datasets for Wikidata entity linking that are free of spelling and grammatical errors.
• Rule-based approaches come with their limitations when the sentential context is minimal. However, such methods are beneficial when no training data is available. We recommend a two-step process to target questions with minimal sentential context: 1) work towards a clean and large Wikidata dataset for entity linking over short text, which will allow more robust machine learning approaches to evolve; 2) use entity descriptions from knowledge graphs to improve the linking process (as in [32]).
Experimental Results 2. We report the performance of Falcon 2.0 on the WebQSP-WD dataset in Table 2. Falcon 2.0 clearly outperforms all other baselines with the highest F-score, 0.82. OpenTapioca demonstrates low performance on this dataset as well. Experimental Results 1 and 2 answer our first research question (RQ1).
Ablation Study for Entity Linking and Recommendations. For the second research question (RQ2), we evaluate the impact of Wikidata-specific background knowledge on entity linking performance. We evaluated Falcon against Falcon 2.0 on the WebQSP-WD dataset, mapping the DBpedia IRIs predicted by Falcon to the corresponding Wikidata IDs using owl:sameAs links. We can see in Table 3 that Falcon 2.0 significantly outperforms Falcon despite both tools sharing the same linguistic rules.
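The DBpedia-to-Wikidata alignment via owl:sameAs can be sketched offline; the N-Triples snippet and helper below are illustrative assumptions, not the actual pipeline used in the paper (in practice the links would come from DBpedia dumps or a SPARQL endpoint).

```python
import re

# owl:sameAs predicate IRI used in DBpedia's external-links data.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Illustrative N-Triples stand-in for the real DBpedia alignment data.
triples = """\
<http://dbpedia.org/resource/Germany> <http://www.w3.org/2002/07/owl#sameAs> <http://www.wikidata.org/entity/Q183> .
<http://dbpedia.org/resource/Berlin> <http://www.w3.org/2002/07/owl#sameAs> <http://www.wikidata.org/entity/Q64> .
"""

def dbpedia_to_wikidata(ntriples: str) -> dict:
    """Extract a {DBpedia IRI: Wikidata ID} map from owl:sameAs N-Triples."""
    mapping = {}
    pattern = re.compile(
        r"<([^>]+)> <" + re.escape(SAME_AS)
        + r"> <http://www\.wikidata\.org/entity/(Q\d+)> \."
    )
    for line in ntriples.splitlines():
        m = pattern.match(line.strip())
        if m and "dbpedia.org/resource" in m.group(1):
            mapping[m.group(1)] = m.group(2)
    return mapping

ids = dbpedia_to_wikidata(triples)
# ids["http://dbpedia.org/resource/Germany"] == "Q183"
```

With such a map, Falcon's DBpedia predictions can be rewritten as Wikidata IDs and scored against the same gold standard as Falcon 2.0, making the ablation comparison direct.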
Table 3: Entity Linking Performance of Falcon vs Falcon 2.0 on WEBQSP-WD. Best values are in bold.

| Approach    | P        | R        | F        |
|-------------|----------|----------|----------|
| Falcon [23] | 0.47     | 0.45     | 0.46     |
| Falcon 2.0  | **0.80** | **0.84** | **0.82** |

Table 4: Relation linking evaluation results on the LC-QuAD 2.0 & SimpleQuestion datasets.

| Approach   | Dataset        | P    | R    | F    |
|------------|----------------|------|------|------|
| Falcon 2.0 | LC-QuAD 2.0    | 0.44 | 0.37 | 0.40 |
| Falcon 2.0 | SimpleQuestion | 0.35 | 0.44 | 0.39 |

Experimental Results 3: In the third experiment (for RQ3), we evaluate the relation linking performance of Falcon 2.0. We are not aware of any other model for relation linking over Wikidata. Table 4 summarizes the relation linking performance. With this, we establish new baselines over two datasets for relation linking on Wikidata.
Ablation Study for Relation Linking and Recommendations. Falcon reported an F-score of 0.43 on LC-QuAD over DBpedia in [23], whereas Falcon 2.0 reports a comparable relation linking F-score of 0.40 on LC-QuAD 2.0 for Wikidata (cf. Table 4). Wrong identification of entities does affect relation linking performance, and it is the major source of error for relation linking in our case. Table 5 summarizes a sample case study for relation linking on five LC-QuAD 2.0 questions. We observe that the relations present in the questions are highly uncommon and nonstandard, a peculiar property of Wikidata. Falcon 2.0 finds limitations in linking such relations. We recommend the following:
• Wikidata poses a new challenge to relation linking approaches: user-created nonstandard relations such as those in Table 5. A rule-based approach like ours faces a clear limitation in linking such relations. Linking user-created relations in crowd-authored Wikidata is an open question for the research community.

Table 5: Sample questions from the LC-QuAD 2.0 dataset. The table shows five sample questions and the associated gold standard relations. These sentences do not include standard sentential relations in the English language; since Wikidata is largely authored by the crowd, such uncommon relations are frequent. Falcon 2.0 finds limitations in linking such relations, and most results are empty.

| Question                                                          | Gold Standard IDs | Gold Standard Labels                   | Predicted IDs | Predicted Labels  |
|-------------------------------------------------------------------|-------------------|----------------------------------------|---------------|-------------------|
| Which is the global-warming potential of dichlorodifluoromethane? | P2565             | global warming potential               | []            | _                 |
| What is the AMCA Radiocommunications Licence ID for Qantas?       | P2472             | ACMA Radiocommunications Client Number | P275          | copyright license |
| What is ITIS TSN for Sphyraena?                                   | P815              | ITIS TSN                               | []            | _                 |
| What is the ARICNS for Fomalhaut?                                 | P999              | ARICNS                                 | []            | _                 |
| Which is CIQUAL 2017 ID for cheddar?                              | P4696             | CIQUAL2017 ID                          | []            | _                 |

5 ADOPTION AND REUSABILITY
Falcon 2.0 is open source. The source code is available in our public GitHub repository (https://github.com/SDM-TIB/Falcon2.0) for reusability and reproducibility. Falcon 2.0 is easily accessible via a simple cURL request or via our web interface; detailed instructions are provided on our GitHub. It is currently available for the English language. However, no assumption in the approach or in the construction of the background knowledge base restricts its adaptation or extension to other languages. The background knowledge of Falcon 2.0 is available for the community and can easily be reused to generate candidates for entity linking [31] or in question answering approaches such as [35]. The background knowledge consists of 48,042,867 alignments for Wikidata entities and 15,645 alignments for Wikidata predicates. The MIT License allows for the free distribution and reuse of Falcon 2.0. We hope the research community and industry practitioners will use the Falcon 2.0 resources for various purposes, such as linking entities and relations to Wikidata, annotating unstructured text, and developing resources for low-resource languages.

6 IMPACT
In August 2019, Wikidata became the first Wikimedia project to cross one billion edits, and it has over 20,000 active editors11. A large part of the information extraction community has extensively centered its research around DBpedia and Wikidata, targeting different research problems such as KG completion, question answering, entity linking, and data quality assessment [18, 21, 34]. Furthermore, entity and relation linking tasks have been studied well beyond information extraction research, especially in NLP and the Semantic Web. Despite Wikidata being hugely popular, there are limited resources for reusing and aligning unstructured text to Wikidata mentions, and when it comes to short text, the performance of existing baselines is limited. We believe the availability of Falcon 2.0 as a web API, along with open-source access to its code, will provide researchers an easy and reusable way to annotate unstructured text against Wikidata. We also believe that a rule-based approach such as ours, which does not require any training data, is beneficial for low-resource languages (considering that Wikidata is multilingual12).

7 MAINTENANCE AND SUSTAINABILITY
Falcon 2.0 is a publicly available resource offered by the Scientific Data Management (SDM) group at TIB, Hannover13. TIB is one of the largest libraries for science and technology in the world14. It actively promotes open access to scientific artifacts, e.g., research data, scientific literature, non-textual material, and software. Similar to other publicly maintained repositories of SDM, Falcon 2.0 will be preserved and regularly updated to fix bugs and include new features15. The Falcon 2.0 API will be sustained on the TIB servers to allow for unrestricted free access.

8 CONCLUSION AND FUTURE WORK
We presented the resource Falcon 2.0, a rule-based entity and relation linking tool able to recognize entities and relations in a short text and link them to existing knowledge graphs, i.e., DBpedia and Wikidata. Although there are various approaches for entity and relation linking to DBpedia, Falcon 2.0 is one of the few tools targeting Wikidata. Thus, given the number of generic and domain-specific facts that compose Wikidata, Falcon 2.0 has the potential to impact researchers and practitioners who resort to NLP tools for transforming semi-structured data into structured facts. Falcon 2.0 is open source, and the API is publicly accessible and maintained on the servers of TIB Labs. Falcon 2.0 has been empirically evaluated on three benchmarks, and the outcomes suggest that it outperforms the state of the art. Albeit promising, the experimental results can still be improved. In the future, we plan to continue researching novel techniques that enable adjusting the catalog of rules and alignments to changes in Wikidata. We further plan to mitigate errors caused by the rule-based approach using machine learning, aiming towards a hybrid approach.

11 https://www.wikidata.org/wiki/Wikidata:Statistics
12 https://www.wikidata.org/wiki/Help:Wikimedia_language_codes/lists/all
13 https://www.tib.eu/en/research-development/scientific-data-management/
14 https://www.tib.eu/en/tib/profile/
15 https://github.com/SDM-TIB
ACKNOWLEDGMENTS
This work has received funding from the EU H2020 Project No. 727658 (IASIS).

REFERENCES
[1] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary G. Ives. 2007. DBpedia: A Nucleus for a Web of Open Data. In ISWC. 722–735.
[2] Krisztian Balog. 2018. Entity-oriented search. Springer Open.
[3] Debayan Banerjee, Mohnish Dubey, Debanjan Chaudhuri, and Jens Lehmann. [n.d.]. Joint Entity and Relation Linking using EARL.
[4] Kurt D. Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In ACM SIGMOD. 1247–1250.
[5] Yixin Cao, Lei Hou, Juanzi Li, and Zhiyuan Liu. 2018. Neural Collective Entity Linking. arXiv:1811.08603. http://arxiv.org/abs/1811.08603
[6] Alberto Cetoli, Stefano Bragaglia, Andrew D O'Harney, Marc Sloan, and Mohammad Akbari. 2019. A Neural Approach to Entity Linking on Wikidata. In European Conference on Information Retrieval. Springer, 78–86.
[7] Antonin Delpeuch. 2019. OpenTapioca: Lightweight Entity Linking for Wikidata. arXiv preprint arXiv:1904.09131 (2019).
[8] Dennis Diefenbach, Thomas Tanon, Kamal Singh, and Pierre Maret. 2017. Question answering benchmarks for Wikidata.
[9] Mohnish Dubey, Debayan Banerjee, Abdelrahman Abdelkawi, and Jens Lehmann. 2019. LC-QuAD 2.0: A large dataset for complex question answering over Wikidata and DBpedia. In International Semantic Web Conference. Springer, 69–78.
[10] Paolo Ferragina and Ugo Scaiella. 2010. TAGME: on-the-fly annotation of short text fragments (by wikipedia entities). In Proceedings of the 19th ACM Conference on Information and Knowledge Management, CIKM 2010, Toronto, Ontario, Canada, October 26-30, 2010. 1625–1628.
[11] Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep Joint Entity Disambiguation with Local Neural Attention. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017. 2619–2629.
[12] Clinton Gormley and Zachary Tong. 2015. Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine. O'Reilly Media, Inc.
[13] Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. 2016. Vandalism detection in Wikidata. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 327–336.
[14] Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust Disambiguation of Named Entities in Text. In EMNLP 2011. 782–792.
[15] Emrah Inan and Oguz Dikenelli. 2018. A Sequence Learning Method for Domain-Specific Entity Linking. In Proceedings of the Seventh Named Entities Workshop (Melbourne, Australia). Association for Computational Linguistics, 14–21. http://aclweb.org/anthology/W18-2403
[16] Heng Ji. 2019. Entity Discovery and Linking and Wikification Reading List. http://nlp.cs.rpi.edu/kbp/2014/elreading.html
[17] Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-End Neural Entity Linking. In Proceedings of the 22nd Conference on Computational Natural Language Learning. 519–529.
[18] Changsung Moon, Paul Jones, and Nagiza F Samatova. 2017. Learning entity type embeddings for knowledge graph completion. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 2215–2218.
[19] Isaiah Onando Mulang, Kuldeep Singh, Akhilesh Vyas, Saeedeh Shekarpour, Ahmad Sakor, Maria Esther Vidal, Sören Auer, and Jens Lehmann. 2020. Encoding Knowledge Graph Entity Aliases in Attentive Neural Networks for Wikidata Entity Linking. In WISE (to appear) (2020).
[20] Jonathan Raphael Raiman and Olivier Michel Raiman. 2018. DeepType: multilingual entity linking by neural type system evolution. In Thirty-Second AAAI Conference on Artificial Intelligence.
[21] Ridho Reinanda, Edgar Meij, and Maarten de Rijke. 2016. Document Filtering for Long-tail Entities. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, CIKM 2016, Indianapolis, IN, USA, October 24-28, 2016. ACM, 771–780. https://doi.org/10.1145/2983323.2983728
[22] Michael Röder, Ricardo Usbeck, and Axel-Cyrille Ngonga Ngomo. 2018. GERBIL – benchmarking named entity recognition and linking consistently. Semantic Web 9, 5 (2018), 605–625.
[23] Ahmad Sakor, Isaiah Onando Mulang, Kuldeep Singh, Saeedeh Shekarpour, Maria Esther Vidal, Jens Lehmann, and Sören Auer. 2019. Old is gold: linguistic driven approach for entity and relation linking of short text. In Proceedings of the 2019 NAACL HLT (Long Papers). 2336–2346.
[24] W. Shen, J. Wang, and J. Han. 2015. Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions. IEEE Transactions on Knowledge and Data Engineering 27, 2 (2015), 443–460.
[25] Kuldeep Singh, Ioanna Lytra, Arun Sethupat Radhakrishna, Saeedeh Shekarpour, Maria-Esther Vidal, and Jens Lehmann. 2018. No One is Perfect: Analysing the Performance of Question Answering Components over the DBpedia Knowledge Graph. arXiv:1809.10044 (2018).
[26] Kuldeep Singh, Arun Sethupat Radhakrishna, Andreas Both, Saeedeh Shekarpour, Ioanna Lytra, Ricardo Usbeck, Akhilesh Vyas, Akmal Khikmatullaev, Dharmen Punjani, Christoph Lange, Maria-Esther Vidal, Jens Lehmann, and Sören Auer. 2018. Why Reinvent the Wheel: Let's Build Question Answering Systems Together. In Web Conference. 1247–1256.
[27] Daniil Sorokin and Iryna Gurevych. 2018. Mixing Context Granularities for Improved Entity Linking on Question Answering Data across Entity Categories. In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics. 65–75.
[28] Denny Vrandecic. 2012. Wikidata: a new platform for collaborative data collection. In Proceedings of the 21st World Wide Web Conference, WWW 2012, Lyon, France, April 16-20, 2012 (Companion Volume). ACM, 1063–1064. https://doi.org/10.1145/2187980.2188242
[29] Edwin Williams. 1981. On the notions "Lexically related" and "Head of a word". Linguistic Inquiry 12, 2 (1981), 245–274.
[30] Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda, and Yoshiyasu Takefuji. 2016. Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation. In CoNLL 2016, Yoav Goldberg and Stefan Riezler (Eds.). ACL, 250–259.
[31] Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda, and Yoshiyasu Takefuji. 2016. Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation. CoRR abs/1601.01343 (2016).
[32] Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, and Xiang Ren. 2019. Learning Dynamic Context Augmentation for Global Entity Linking. In EMNLP-IJCNLP 2019, Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan (Eds.). 271–281.
[33] Yi Yang and Ming-Wei Chang. 2015. S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking. In ACL-IJCNLP (Volume 1: Long Papers). 504–513.
[34] Zi Yang, Elmer Garduño, Yan Fang, Avner Maiberg, Collin McCormack, and Eric Nyberg. 2013. Building optimal information systems automatically: configuration space exploration for biomedical information systems. In 22nd ACM CIKM'13, San Francisco, USA. ACM, 1421–1430.
[35] Xinbo Zhang and Lei Zou. 2018. IMPROVE-QA: An Interactive Mechanism for RDF Question/Answering Systems. In Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018. 1753–1756. https://doi.org/10.1145/3183713.3193555