Supporting keyword search in product database: a probabilistic approach

Published: 01 September 2013

Abstract

The ability to let users search for products conveniently in a product database is critical to the success of e-commerce. Although structured query languages (e.g., SQL) can be used to access the product database effectively, they are very difficult for end users to learn and use. In this paper, we study how to optimize search over structured product entities (represented by specifications) with keyword queries such as "cheap gaming laptop". One major difficulty in this problem is the vocabulary gap between the specifications of products in the database and the keywords people use in search queries. To solve the problem, we propose a novel probabilistic entity retrieval model based on query generation, in which entities are ranked for a given keyword query by the likelihood that a user who likes an entity would pose the query. Different ways of estimating the model parameters lead to different variants of the ranking function. We start with simple estimates based on the specifications of entities, and then leverage user reviews and product search logs to improve the estimation. Multiple estimation algorithms are developed based on Maximum Likelihood and Maximum a Posteriori estimators. We evaluate the proposed product entity retrieval models on two newly created product search test collections. The results show that the proposed model significantly outperforms existing retrieval models, benefiting from the modeling of attribute-level relevance. Despite the focus on product retrieval, the proposed modeling method is general and opens up many new opportunities in analyzing structured entity data together with unstructured text data. We show that the proposed probabilistic model can be easily adapted for many interesting applications, including facet generation and review annotation.
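
The query-generation idea can be illustrated with a minimal sketch. It is not the paper's model: it assumes each entity is a flat dictionary of specifications, treats the specification text as a bag of words, and scores a query by a Dirichlet-smoothed unigram likelihood (a simple MAP-style estimate). The function names, the toy entities, and the smoothing parameter `mu` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this sketch)."""
    return text.lower().split()


def spec_tokens(spec):
    """Flatten an attribute -> value specification into a bag of words."""
    tokens = []
    for attribute, value in spec.items():
        tokens.extend(tokenize(attribute))
        tokens.extend(tokenize(str(value)))
    return tokens


def rank_entities(query, entities, mu=10.0):
    """Rank entities by log p(query | entity) under a smoothed unigram model."""
    entity_counts = {name: Counter(spec_tokens(spec))
                     for name, spec in entities.items()}

    # Background word distribution pooled over all entity specifications.
    collection = Counter()
    for counts in entity_counts.values():
        collection.update(counts)
    collection_size = sum(collection.values())
    vocab_size = len(collection)

    def background(word):
        # Add-one smoothing keeps unseen query words at nonzero probability.
        return (collection[word] + 1) / (collection_size + vocab_size + 1)

    scores = {}
    for name, counts in entity_counts.items():
        length = sum(counts.values())
        log_likelihood = 0.0
        for word in tokenize(query):
            # Dirichlet-smoothed estimate of p(word | entity).
            p = (counts[word] + mu * background(word)) / (length + mu)
            log_likelihood += math.log(p)
        scores[name] = log_likelihood
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy database with two entities.
entities = {
    "laptop_a": {"category": "gaming laptop", "gpu": "dedicated", "price": "899"},
    "laptop_b": {"category": "business laptop", "gpu": "integrated", "price": "1299"},
}
ranking = rank_entities("gaming laptop", entities)
cheap_scores = dict(rank_entities("cheap", entities))
```

Here `ranking` places `laptop_a` first because "gaming" occurs in its specification. The word "cheap" appears in neither specification, so `cheap_scores` holds the same value for both entities. This is the vocabulary gap described above, which estimates based on specifications alone cannot bridge.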

Published In

Proceedings of the VLDB Endowment, Volume 6, Issue 14
September 2013
384 pages

Publisher

VLDB Endowment
