DOI: 10.1145/1807167.1807251
research-article

Structured annotations of web queries

Published: 06 June 2010

    Abstract

    Queries asked on web search engines often target structured data, such as commercial products, movie showtimes, or airline schedules. However, surfacing relevant results from such data is a highly challenging problem because of the unstructured language of web queries and the demanding scalability and speed requirements of web search. In this paper, we discover latent structured semantics in web queries and produce Structured Annotations for them. We define an annotation as a mapping of a query to a table of structured data and to attributes of that table. Given a collection of structured tables, we present a fast and scalable tagging mechanism for obtaining all possible annotations of a query over these tables. However, we observe that for a given query only a few of these annotations are sensible given the user's needs. We thus propose a principled probabilistic scoring mechanism, using a generative model, for assessing the likelihood of a structured annotation, and we define a dynamic threshold for filtering out misinterpreted query annotations. Our techniques are completely unsupervised, obviating the need for costly manual labeling effort. We evaluated our techniques using real-world queries and data and present promising experimental results.
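To make the idea of an annotation concrete, the following is a minimal brute-force sketch of enumerating candidate annotations of a query over a few tables. It is not the paper's tagging mechanism, which the abstract describes only as fast and scalable. The table contents, the `TABLES` dictionary, and the `segmentations` and `annotate` functions are all hypothetical names chosen for illustration, and the probabilistic scoring and thresholding steps are not modeled here.

```python
from itertools import product

# Hypothetical toy data: table name -> attribute -> set of known values.
TABLES = {
    "TVs": {
        "brand": {"samsung", "lg"},
        "diagonal": {"50 inch", "42 inch"},
        "type": {"lcd", "plasma"},
    },
    "Monitors": {
        "brand": {"samsung", "dell"},
        "diagonal": {"24 inch"},
    },
}


def segmentations(tokens):
    """Yield every split of a token list into contiguous phrases."""
    if not tokens:
        yield []
        return
    for i in range(1, len(tokens) + 1):
        head = " ".join(tokens[:i])
        for rest in segmentations(tokens[i:]):
            yield [head] + rest


def annotate(query, tables=TABLES):
    """Return all candidate annotations as (table, [(phrase, attribute)]).

    An attribute of None marks a free token that matched nothing.
    """
    tokens = query.lower().split()
    results = []
    for table, attrs in tables.items():
        for phrases in segmentations(tokens):
            options = []
            for phrase in phrases:
                # Attributes whose value set contains this exact phrase.
                labels = [a for a, values in attrs.items() if phrase in values]
                # Only single words may be left unmapped (free tokens).
                if " " not in phrase:
                    labels.append(None)
                options.append(labels)
            for labels in product(*options):
                used = [a for a in labels if a is not None]
                # Keep annotations that map at least one phrase and use
                # each attribute at most once.
                if used and len(used) == len(set(used)):
                    results.append((table, list(zip(phrases, labels))))
    return results


if __name__ == "__main__":
    for table, mapping in annotate("50 inch lg lcd"):
        print(table, mapping)
```

Even on this tiny example, a single query yields several candidate annotations, most of which map only part of the query. This is the situation in which a scoring and filtering step, such as the one the abstract proposes, becomes necessary.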

      Published In

      SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
      June 2010
      1286 pages
      ISBN:9781450300322
      DOI:10.1145/1807167
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. keyword search
      2. structured data
      3. web

      Qualifiers

      • Research-article

      Conference

      SIGMOD/PODS '10: International Conference on Management of Data
      June 6 - 10, 2010
      Indianapolis, Indiana, USA

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%

      Cited By

      • (2022) Type Linking for Query Understanding and Semantic Search. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3931-3940. DOI: 10.1145/3534678.3539067. Online publication date: 14-Aug-2022
      • (2022) Evaluating the Use of Synthetic Queries for Pre-training a Semantic Query Tagger. Advances in Information Retrieval, 39-46. DOI: 10.1007/978-3-030-99739-7_5. Online publication date: 5-Apr-2022
      • (2021) Semantic Query Labeling Through Synthetic Query Generation. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2278-2282. DOI: 10.1145/3404835.3463071. Online publication date: 11-Jul-2021
      • (2020) Query Reformulation in E-Commerce Search. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1319-1328. DOI: 10.1145/3397271.3401065. Online publication date: 25-Jul-2020
      • (2018) Understanding Information Needs. Entity-Oriented Search, 225-267. DOI: 10.1007/978-3-319-93935-3_7. Online publication date: 3-Oct-2018
      • (2018) Introduction. Entity-Oriented Search, 1-23. DOI: 10.1007/978-3-319-93935-3_1. Online publication date: 3-Oct-2018
      • (2017) Cluster Based Prediction of Keyword Query Over Databases. Computer Communication, Networking and Internet Security, 253-260. DOI: 10.1007/978-981-10-3226-4_25. Online publication date: 4-May-2017
      • (2016) Data Driven Discovery of Attribute Dictionaries. Transactions on Computational Collective Intelligence XXI - Volume 9630, 69-96. DOI: 10.5555/3090176.3090180. Online publication date: 1-Jan-2016
      • (2016) Using the Crowd to Improve Search Result Ranking and the Search Experience. ACM Transactions on Intelligent Systems and Technology 7:4, 1-24. DOI: 10.1145/2897368. Online publication date: 12-Jul-2016
      • (2016) TechLand: Assisting Technology Landscape Inquiries with Insights from Stack Overflow. 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), 356-366. DOI: 10.1109/ICSME.2016.17. Online publication date: Oct-2016
