DOI: 10.5555/2892753.2892773
Article

Learning concept embeddings for query expansion by quantum entropy minimization

Published: 27 July 2014

Abstract

In web search, user queries are typically formulated with only a few terms, so term-matching retrieval functions can fail to retrieve relevant documents. Given a user query, query expansion (QE) consists of selecting related terms that could increase the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper, we propose a novel method for learning, in a supervised way, semantic representations for words and phrases. By embedding queries and documents in special matrices, our model has greater representational power than existing approaches that adopt a vector representation. We show that our model produces high-quality query expansion terms. Our expansions improve IR measures beyond those obtained with expansions from current word-embedding models and well-established traditional QE methods.
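
To make the idea of matrix-based query expansion concrete, here is a minimal sketch. It is not the paper's model. It assumes the query is represented as a trace-one matrix built by averaging rank-one projectors of its term vectors, and that a candidate term is scored by the quadratic form of its unit vector with that matrix. The embeddings, the function names, and the scoring rule are hypothetical choices made only for this example.

```python
import math

# Toy, hand-made 3-dimensional term vectors (hypothetical, for illustration only).
EMBEDDINGS = {
    "car": [1.0, 0.1, 0.0],
    "rental": [0.8, 0.5, 0.0],
    "vehicle": [0.9, 0.2, 0.1],
    "hire": [0.7, 0.6, 0.0],
    "banana": [0.0, 0.1, 1.0],
}


def unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def query_matrix(term_vectors):
    """Average the rank-one projectors u u^T of the query terms.

    The result is symmetric, positive semidefinite, and has trace 1.
    """
    dim = len(term_vectors[0])
    matrix = [[0.0] * dim for _ in range(dim)]
    for v in term_vectors:
        u = unit(v)
        for i in range(dim):
            for j in range(dim):
                matrix[i][j] += u[i] * u[j] / len(term_vectors)
    return matrix


def score(matrix, v):
    """Quadratic form u^T M u for the unit vector u of a candidate term."""
    u = unit(v)
    dim = len(u)
    return sum(u[i] * matrix[i][j] * u[j] for i in range(dim) for j in range(dim))


def expand(query_terms, embeddings, k=2):
    """Return the k non-query terms that score highest against the query matrix."""
    rho = query_matrix([embeddings[t] for t in query_terms])
    candidates = [t for t in embeddings if t not in query_terms]
    ranked = sorted(candidates, key=lambda t: score(rho, embeddings[t]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    print(expand(["car", "rental"], EMBEDDINGS, k=2))
```

In this toy setup the terms that point in roughly the same direction as the query terms score highest, while an unrelated term such as "banana" scores close to zero. A learned model would replace the hand-made vectors with representations trained on supervised data.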


Published In

AAAI'14: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence
July 2014
3155 pages

Publisher

AAAI Press


Cited By

  • (2019) Multi-view Embedding-based Synonyms for Email Search. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 575-584. DOI: 10.1145/3331184.3331250. Online publication date: 18-Jul-2019
  • (2018) End-to-end quantum-like language models with application to question answering. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, 5666-5673. DOI: 10.5555/3504035.3504730. Online publication date: 2-Feb-2018
  • (2018) A Quantum Many-body Wave Function Inspired Language Modeling Approach. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 1303-1312. DOI: 10.1145/3269206.3271723. Online publication date: 17-Oct-2018
  • (2018) A Prospect-Guided global query expansion strategy using word embeddings. Information Processing and Management: an International Journal 54:1, 1-13. DOI: 10.1016/j.ipm.2017.09.001. Online publication date: 1-Jan-2018
  • (2017) Bridging the gap. Journal of Biomedical Informatics 75:C, 122-127. DOI: 10.1016/j.jbi.2017.09.014. Online publication date: 1-Nov-2017
  • (2016) Query Expansion Using Word Embeddings. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 1929-1932. DOI: 10.1145/2983323.2983876. Online publication date: 24-Oct-2016
  • (2016) Word Vector Compositionality based Relevance Feedback using Kernel Density Estimation. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 1281-1290. DOI: 10.1145/2983323.2983750. Online publication date: 24-Oct-2016
  • (2016) Embedding-based Query Language Models. Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval, 147-156. DOI: 10.1145/2970398.2970405. Online publication date: 12-Sep-2016
  • (2016) Estimating Embedding Vectors for Queries. Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval, 123-132. DOI: 10.1145/2970398.2970403. Online publication date: 12-Sep-2016
