DOI: 10.1145/956863.956866
Article

Query expansion using associated queries

Published: 03 November 2003

Abstract

Hundreds of millions of users each day use web search engines to meet their information needs. Advances in web search effectiveness are therefore perhaps the most significant public outcomes of IR research. Query expansion is one method for improving the effectiveness of ranked retrieval: additional terms are added to a query. In previous approaches to query expansion, the additional terms are selected from highly ranked documents returned from an initial retrieval run. We propose a new method of obtaining expansion terms, based on selecting terms from past user queries that are associated with documents in the collection. Our scheme is effective for query expansion for web retrieval: our results show relative improvements of 26% to 29% over unexpanded full-text retrieval, and of 18% to 20% over an optimised, conventional expansion approach.
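
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the ranking function, how queries become associated with documents, or how expansion terms are scored. Everything below (the term-overlap ranking, the frequency-based term selection, the names `rank_documents`, `expand_query`, `DOCUMENTS`, `ASSOCIATIONS`, and the parameters `top_docs` and `num_terms`) is a hypothetical stand-in chosen only to show the shape of the approach: run an initial retrieval, then draw expansion terms from past queries associated with the top-ranked documents rather than from the documents' own text.

```python
from collections import Counter

# Toy collection and toy query associations (both invented for illustration).
DOCUMENTS = {
    "d1": "jaguar big cat habitat in south america",
    "d2": "jaguar car dealership prices",
    "d3": "python programming tutorial",
}
ASSOCIATIONS = {  # past user queries assumed to be associated with each document
    "d1": ["jaguar animal", "jaguar habitat animal"],
    "d2": ["jaguar car price"],
    "d3": ["learn python"],
}

def rank_documents(query_terms, documents):
    """Assumed ranking: count how many distinct query terms a document contains."""
    scores = {}
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        scores[doc_id] = sum(1 for term in set(query_terms) if term in words)
    # Highest score first; ties broken by document id so the order is deterministic.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [d for d in ranked if scores[d] > 0]

def expand_query(query, documents, associations, top_docs=2, num_terms=2):
    """Add terms taken from past queries associated with the top-ranked documents."""
    query_terms = query.lower().split()
    top = rank_documents(query_terms, documents)[:top_docs]
    counts = Counter()
    for doc_id in top:
        for past_query in associations.get(doc_id, []):
            for term in past_query.lower().split():
                if term not in query_terms:  # only count genuinely new terms
                    counts[term] += 1
    # Assumed selection rule: most frequent candidates first, ties alphabetical.
    candidates = sorted(counts, key=lambda t: (-counts[t], t))
    return query_terms + candidates[:num_terms]

expanded = expand_query("jaguar habitat", DOCUMENTS, ASSOCIATIONS, top_docs=1)
print(expanded)  # ['jaguar', 'habitat', 'animal']
```

With `top_docs=1` only the queries associated with `d1` contribute, so the single new term is `animal`. Raising `top_docs` to 2 also admits terms from the car-related document, which shows why the number of documents consulted is a parameter worth tuning in any scheme of this kind.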

Published In

CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
November 2003
592 pages
ISBN:1581137230
DOI:10.1145/956863
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. query association
  2. query expansion
  3. web search

Qualifiers

  • Article

Conference

CIKM03

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
