DOI: 10.1145/3178876.3186074
research-article
Free access

Leveraging Fine-Grained Wikipedia Categories for Entity Search

Published: 10 April 2018

    Abstract

    Ad-hoc entity search, which retrieves a ranked list of relevant entities in response to a natural-language question, has been widely studied. Category matching of entities, especially matching against fine-grained entity types/categories, has been shown to be critical to entity search performance. However, the potential of fine-grained Wikipedia entity categories has not been well exploited by existing studies. Based on the observation of how people describe entities of a specific type, we propose a headword-and-modifier model to deeply interpret both queries and fine-grained entity types/categories. Probabilistic generative models are designed to estimate the relevance of headwords and modifiers as a pattern-based matching problem, taking the Wikipedia type taxonomy as an important input to address the ad-hoc representations of concepts/entities in queries. Extensive experimental results on three widely used test sets (INEX-XER 2009, SemSearch-LS, and TREC-Entity) show that our method significantly improves entity search performance over state-of-the-art methods.
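
    The headword-and-modifier idea can be illustrated with a toy sketch. This is not the paper's method: the abstract describes probabilistic generative models that use the Wikipedia type taxonomy, whereas the code below assumes a simple preposition-based heuristic and exact word overlap. The names `split_head_modifiers`, `match_score`, and `PREPOSITIONS` are hypothetical and introduced only for illustration.

```python
import re

# Assumption: a small, hand-picked preposition list (not from the paper).
PREPOSITIONS = {"of", "in", "from", "by", "at", "for", "on"}


def split_head_modifiers(phrase):
    """Heuristically split a type phrase into (headword, modifiers).

    The headword is taken to be the last word before the first
    preposition, or the last word overall if there is no preposition.
    All remaining non-preposition words are treated as modifiers.
    """
    tokens = re.findall(r"[A-Za-z0-9'-]+", phrase)
    if not tokens:
        return None, []
    # Index of the first preposition that is not the first token.
    cut = next(
        (i for i, t in enumerate(tokens) if i > 0 and t.lower() in PREPOSITIONS),
        len(tokens),
    )
    head_np, rest = tokens[:cut], tokens[cut + 1:]
    head = head_np[-1].lower()
    modifiers = [t.lower() for t in head_np[:-1]]
    modifiers += [t.lower() for t in rest if t.lower() not in PREPOSITIONS]
    return head, modifiers


def match_score(query, category):
    """Toy score: 0 if headwords differ, else the fraction of query
    modifiers that also appear among the category's modifiers."""
    q_head, q_mods = split_head_modifiers(query)
    c_head, c_mods = split_head_modifiers(category)
    if q_head is None or q_head != c_head:
        return 0.0
    if not q_mods:
        return 1.0
    shared = set(q_mods) & set(c_mods)
    return len(shared) / len(set(q_mods))


if __name__ == "__main__":
    print(split_head_modifiers("American film directors"))
    print(match_score("film directors from America", "American film directors"))
```

    In this sketch, "film directors from America" and "American film directors" share a headword but only one of two modifiers, because "America" and "American" differ on the surface. Exact matching of this kind cannot bridge such variation, which is the sort of ad-hoc phrasing the abstract says the taxonomy-aware models are meant to address.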

      Published In

      WWW '18: Proceedings of the 2018 World Wide Web Conference
      April 2018
      2000 pages
      ISBN: 9781450356398
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      Publisher

      International World Wide Web Conferences Steering Committee

      Republic and Canton of Geneva, Switzerland

      Author Tags

      1. category matching
      2. entity search
      3. language model

      Qualifiers

      • Research-article

      Funding Sources

      • Outstanding Innovative Talents Cultivation Funded Programs 2017 of Renmin University of China
      • State Scholarship Fund from China Scholarship Council
      • a gift from Huawei
      • National Science Foundation of China
      • State Visiting Scholar Funds from the China Scholarship Council

      Conference

      WWW '18
      Sponsor:
      • IW3C2
      WWW '18: The Web Conference 2018
      April 23 - 27, 2018
      Lyon, France

      Acceptance Rates

      WWW '18 Paper Acceptance Rate 170 of 1,155 submissions, 15%;
      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Article Metrics

      • Downloads (Last 12 months): 80
      • Downloads (Last 6 weeks): 11
      Reflects downloads up to 11 Aug 2024

      Cited By

      • (2023) Category-Highlighting Transformer Network for Question Retrieval. Database Systems for Advanced Applications. 10.1007/978-3-031-30675-4_33 (457-467). Online publication date: 15-Apr-2023
      • (2022) Fine-Grained Entity Typing with a Type Taxonomy: a Systematic Review. IEEE Transactions on Knowledge and Data Engineering. 10.1109/TKDE.2022.3148980 (1-1). Online publication date: 2022
      • (2022) Dealing With Hierarchical Types and Label Noise in Fine-Grained Entity Typing. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 10.1109/TASLP.2022.3155281, vol. 30 (1305-1318). Online publication date: 2022
      • (2021) Transfer learning for fine-grained entity typing. Knowledge and Information Systems. 10.1007/s10115-021-01549-5. Online publication date: 13-Feb-2021
      • (2020) Generating Categories for Sets of Entities. Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 10.1145/3340531.3412019 (1833-1842). Online publication date: 19-Oct-2020
      • (2020) Making Explainable Friend Recommendations Based on Concept Similarity Measurements via a Knowledge Graph. IEEE Access. 10.1109/ACCESS.2020.3014670, vol. 8 (146027-146038). Online publication date: 2020
      • (2020) Entity set expansion in knowledge graph: a heterogeneous information network perspective. Frontiers of Computer Science. 10.1007/s11704-020-9240-8, 15:1. Online publication date: 29-Sep-2020
      • (2019) Lightweight Lexical and Semantic Evidence for Detecting Classes Among Wikipedia Articles. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. 10.1145/3289600.3291020 (78-86). Online publication date: 30-Jan-2019
