DOI: 10.1145/1871437.1871736
poster

Efficient wikipedia-based semantic interpreter by exploiting top-k processing

Published: 26 October 2010

Abstract

Proper representation of the meaning of texts is crucial to enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept space derived from Wikipedia has received growing attention recently, owing to Wikipedia's comprehensiveness and expertise. This concept-based representation can capture semantic relatedness between texts that cannot be deduced with the bag-of-words model. A key obstacle to using Wikipedia as a semantic interpreter, however, is that the sheer number of concepts derived from Wikipedia makes it hard to map texts into the concept space efficiently. In this paper, we develop an efficient algorithm that represents the meaning of a text using the concepts that best match it. In particular, our approach first computes the approximate top-k concepts that are most relevant to the given text. We then leverage these concepts to represent the meaning of the given text. The experimental results show that the proposed technique provides significant gains in execution time over current solutions to the problem.
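
To make the idea of keeping only the best-matching concepts concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a hypothetical toy inverted index (`INVERTED_INDEX`) from terms to weighted concepts and computes an exact top-k by scoring every matching concept, whereas the paper computes an approximate top-k to avoid that cost. The function name `top_k_concepts`, the concept names, and all weights are illustrative assumptions.

```python
import heapq
from collections import Counter, defaultdict

# Hypothetical toy inverted index: term -> {concept: weight}.
# In practice the weights would come from a Wikipedia-derived
# term-by-concept matrix; these values are made up for illustration.
INVERTED_INDEX = {
    "jaguar": {"Jaguar (animal)": 0.9, "Jaguar Cars": 0.8},
    "engine": {"Jaguar Cars": 0.6, "Internal combustion engine": 0.9},
    "forest": {"Jaguar (animal)": 0.5, "Rainforest": 0.9},
}


def top_k_concepts(text, k, index=INVERTED_INDEX):
    """Return the k highest-scoring (concept, score) pairs for `text`.

    Exact baseline: every concept that shares a term with the text is
    scored, and only the k best are kept.
    """
    term_counts = Counter(text.lower().split())
    scores = defaultdict(float)
    for term, count in term_counts.items():
        for concept, weight in index.get(term, {}).items():
            # Score = sum over terms of (term frequency * concept weight).
            scores[concept] += count * weight
    # Highest score first; ties are broken by concept name for determinism.
    return heapq.nsmallest(k, scores.items(), key=lambda cs: (-cs[1], cs[0]))


if __name__ == "__main__":
    for concept, score in top_k_concepts("jaguar engine", 2):
        print(f"{concept}: {score:.2f}")
```

The resulting short list of (concept, score) pairs then stands in for the full concept vector of the text. An approximate method, as described in the abstract, would aim to produce a similar list without scoring every candidate concept.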

      Published In

      CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
      October 2010
      2036 pages
      ISBN:9781450300995
      DOI:10.1145/1871437

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. concept
      2. semantic interpretation
      3. wikipedia

      Qualifiers

      • Poster

      Conference

      CIKM '10

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
