DOI: 10.1145/1277741.1277856
Article

ESTER: efficient search on text, entities, and relations

Published: 23 July 2007

Abstract

We present ESTER, a modular and highly efficient system for combined full-text and ontology search. ESTER builds on a query engine that supports two basic operations: prefix search and join. Both of these can be implemented very efficiently with a compact index, yet in combination provide powerful querying capabilities. We show how ESTER can answer basic SPARQL graph-pattern queries on the ontology by reducing them to a small number of these two basic operations. ESTER further supports a natural blend of such semantic queries with ordinary full-text queries. Moreover, the prefix search operation allows for a fully interactive and proactive user interface, which after every keystroke suggests to the user possible semantic interpretations of his or her query, and speculatively executes the most likely of these interpretations. As a proof of concept, we applied ESTER to the English Wikipedia, which contains about 3 million documents, combined with the recent YAGO ontology, which contains about 2.5 million facts. For a variety of complex queries, ESTER achieves worst-case query processing times of a fraction of a second, on a single machine, with an index size of about 4 GB.
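The two basic operations named in the abstract can be illustrated with a toy sketch. This is not ESTER's index or its exact operator definitions: the `POSTINGS` data, the `entity:` word prefix, and the function names `prefix_search` and `join` are all assumptions made for illustration. The sketch only shows how a prefix lookup over sorted (word, document) pairs and a join on document ids might combine to blend a full-text condition with an entity condition.

```python
from bisect import bisect_left

# Hypothetical toy postings: sorted (word, doc_id) pairs.
# Entity annotations are modeled as artificial words with an "entity:" prefix.
POSTINGS = sorted([
    ("entity:einstein", 1),
    ("entity:einstein", 3),
    ("entity:euler", 2),
    ("physicist", 1),
    ("physics", 3),
    ("mathematician", 2),
])


def prefix_search(postings, prefix, docs=None):
    """Return all (word, doc_id) pairs whose word starts with `prefix`,
    optionally restricted to the document ids in `docs`."""
    # Words sharing a prefix are contiguous in sorted order, so a binary
    # search finds the start of the range and a scan finds its end.
    start = bisect_left(postings, (prefix,))
    hits = []
    for word, doc in postings[start:]:
        if not word.startswith(prefix):
            break
        if docs is None or doc in docs:
            hits.append((word, doc))
    return hits


def join(left, right):
    """Keep the pairs from `left` whose doc_id also occurs in `right`."""
    right_docs = {doc for _, doc in right}
    return [(word, doc) for word, doc in left if doc in right_docs]


# Which entities occur in documents containing a word starting with "phys"?
text_hits = prefix_search(POSTINGS, "phys")
entity_hits = prefix_search(POSTINGS, "entity:")
result = join(entity_hits, text_hits)
```

Because the same prefix lookup serves both partially typed words and the artificial entity words, an interface built this way could recompute suggestions after each keystroke; how ESTER actually does so is described in the paper itself, not in this sketch.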


Published In

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN: 9781595935977
DOI: 10.1145/1277741
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Wikipedia
  2. interactive
  3. ontologies
  4. proactive
  5. semantic search

Qualifiers

  • Article

Conference

SIGIR07
SIGIR07: The 30th Annual International SIGIR Conference
July 23 - 27, 2007
Amsterdam, The Netherlands

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
