DOI: 10.5555/1287369.1287427
Article

Discover: keyword search in relational databases

Published: 20 August 2002

Abstract

DISCOVER operates on relational databases and facilitates information discovery by letting users issue keyword queries without any knowledge of the database schema or of SQL. DISCOVER returns qualified joining networks of tuples, that is, sets of tuples that are associated because they join on their primary and foreign keys and that collectively contain all the keywords of the query. DISCOVER proceeds in two steps. First, the Candidate Network Generator generates all candidate networks of relations, that is, join expressions that generate the joining networks of tuples. Then, the Plan Generator builds plans for the efficient evaluation of the set of candidate networks, exploiting opportunities to reuse common subexpressions of the candidate networks.
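To make the notion of a joining network of tuples concrete, here is a minimal brute-force sketch in Python. It is not DISCOVER's algorithm: it skips candidate network generation entirely and simply enumerates small sets of tuples. The toy tuples, the link set, and the function names are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): tuple id -> searchable text.
TUPLES = {
    "c1": "Smith customer",
    "o1": "order 2002",
    "o2": "order 2001",
    "p1": "Miller TV",
    "p2": "VCR",
}
# Assumed primary-key/foreign-key links between tuples, treated as undirected.
LINKS = {("c1", "o1"), ("c1", "o2"), ("o1", "p1"), ("o2", "p2")}


def contains(tuple_id, keyword):
    """True if the tuple's text contains the keyword as a whole word."""
    return keyword.lower() in TUPLES[tuple_id].lower().split()


def connected(ids):
    """True if the given tuples form one connected group under LINKS."""
    ids = set(ids)
    start = next(iter(ids))
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for a, b in LINKS:
            for x, y in ((a, b), (b, a)):
                if x == cur and y in ids and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == ids


def joining_networks(keywords, max_size):
    """Enumerate connected tuple sets that together contain every keyword.

    Sets are visited by increasing size, and a set is skipped when a proper
    subset was already reported, so only minimal answers are returned.
    """
    results = []
    for size in range(1, max_size + 1):
        for ids in combinations(sorted(TUPLES), size):
            if not connected(ids):
                continue
            if not all(any(contains(t, k) for t in ids) for k in keywords):
                continue
            if any(set(r) < set(ids) for r in results):
                continue
            results.append(ids)
    return results


print(joining_networks(["Smith", "Miller"], max_size=3))
```

In this toy instance the only answer links the customer tuple to the product tuple through an order tuple. The exhaustive enumeration is exponential in the number of tuples, which is why a system like DISCOVER works at the schema level with candidate networks first.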
We prove that DISCOVER finds, without redundancy, all relevant candidate networks, whose size can be data bound, by exploiting the structure of the schema. We prove that selecting the optimal execution plan (the best way to reuse common subexpressions) is NP-complete. We provide a greedy algorithm and show that it produces plans with near-optimal execution time cost. Our experiments also provide hints on tuning the greedy algorithm.
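The idea of greedily choosing common subexpressions to reuse can be sketched as follows. This is a simplified illustration, not the paper's algorithm: candidate networks are modeled as sets of single join edges, only single joins are considered for reuse, and the scoring function frequency^a / log(size)^b, along with the tunable constants `a` and `b`, is an assumption made here for the example. All names and size estimates are hypothetical.

```python
import math
from collections import Counter


def greedy_shared_joins(networks, est_size, a=1.0, b=1.0):
    """Rank joins shared by at least two candidate networks for reuse.

    networks: iterable of frozensets of join edges (pairs of relation names).
    est_size: assumed estimated result size per join edge; must be > 1 so
              that the logarithm is positive.
    a, b:     assumed tuning constants that weight frequency against size.
    """
    # Count in how many candidate networks each join edge appears.
    freq = Counter(edge for net in networks for edge in net)
    # Only joins used by two or more networks are worth materializing once.
    shared = [edge for edge, n in freq.items() if n >= 2]

    def score(edge):
        return freq[edge] ** a / math.log(est_size[edge]) ** b

    # Highest score first: frequent, small intermediate results win.
    return sorted(shared, key=score, reverse=True)


# Hypothetical candidate networks over relations A..E.
networks = [
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B"), ("B", "D")}),
    frozenset({("B", "C"), ("C", "E")}),
    frozenset({("A", "B")}),
]
est_size = {("A", "B"): 1000, ("B", "C"): 10}
print(greedy_shared_joins(networks, est_size))
```

Here the join of B and C is ranked ahead of the more frequent join of A and B because its estimated result is much smaller. Raising `a` relative to `b` would shift the preference toward frequency, which is the kind of tuning the abstract alludes to.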


Published In

VLDB '02: Proceedings of the 28th international conference on Very Large Data Bases
August 2002
1110 pages

Publisher

VLDB Endowment

