DOI:10.1145/1571941.1572015
research-article

Effective query expansion for federated search

Published: 19 July 2009

Abstract

While query expansion techniques have been shown to improve retrieval performance in a centralized setting, they have not been well studied in a federated setting. In this paper, we consider how query expansion may be adapted to federated environments and propose several new methods in which focused expansions are used selectively to produce specific queries for each source (or set of sources). On a number of different testbeds, we show that focused query expansion can significantly outperform the previously proposed global expansion method and, contrary to earlier work, that query expansion can improve performance over standard federated retrieval.
These findings motivate further research examining the different methods for query expansion, and other forms of system and user interaction, in order to continue improving the performance of interactive federated search systems.
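
The contrast between global and focused expansion can be illustrated with a small sketch. This is not the method evaluated in the paper: it assumes each source is represented by a handful of sampled documents, and it swaps in a toy term-frequency form of pseudo-relevance feedback. The function names, the sample documents, and the parameters `k` and `n_terms` are all hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def expansion_terms(query, docs, k=2, n_terms=2):
    """Pick the most frequent non-query terms from the top-k matching docs."""
    q_terms = tokenize(query)

    def score(doc):
        tokens = tokenize(doc)
        return sum(tokens.count(t) for t in q_terms)

    # Toy feedback set: documents that match the query, best matches first.
    matching = [d for d in docs if score(d) > 0]
    feedback = sorted(matching, key=score, reverse=True)[:k]
    counts = Counter(
        w for d in feedback for w in tokenize(d) if w not in q_terms
    )
    return [w for w, _ in counts.most_common(n_terms)]


def global_expansion(query, samples):
    """One expanded query, built from the pooled samples, sent to every source."""
    pooled = [d for docs in samples.values() for d in docs]
    expanded = tokenize(query) + expansion_terms(query, pooled)
    return {source: expanded for source in samples}


def focused_expansion(query, samples):
    """A separate expanded query per source, built from that source's sample only."""
    return {
        source: tokenize(query) + expansion_terms(query, docs)
        for source, docs in samples.items()
    }


# Hypothetical sampled documents for two sources.
samples = {
    "sports": ["jaguar team wins football game", "football team jaguar season"],
    "cars": ["jaguar engine car review", "car engine jaguar price"],
}

global_queries = global_expansion("jaguar", samples)
focused_queries = focused_expansion("jaguar", samples)
print(global_queries)
print(focused_queries)
```

In this toy setting the global variant sends the same expanded query to both sources, while the focused variant adapts the added terms to the vocabulary of each source's sample.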

Published In

SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
896 pages
ISBN:9781605584836
DOI:10.1145/1571941
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. distributed information retrieval
  2. query expansion

Qualifiers

  • Research-article

Conference

SIGIR '09
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2024) Query Expansion Using Proposed Location-Based Algorithm for Hindi–English CLIR: Analyzing Three Test Collections. International Journal of Pattern Recognition and Artificial Intelligence, 38:05. DOI:10.1142/S0218001424590018. Online publication date: 11-May-2024
  • (2024) Understanding the impact of query expansion on federated search. Multimedia Tools and Applications, 83:4 (10393-10407). DOI:10.1007/s11042-023-15831-x. Online publication date: 1-Jan-2024
  • (2019) Bridging Text Visualization and Mining: A Task-Driven Survey. IEEE Transactions on Visualization and Computer Graphics, 25:7 (2482-2504). DOI:10.1109/TVCG.2018.2834341. Online publication date: 1-Jul-2019
  • (2018) A Vertical PRF Architecture for Microblog Search. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval (107-114). DOI:10.1145/3234944.3234960. Online publication date: 10-Sep-2018
  • (2018) Strength Pareto fitness assignment for pseudo-relevance feedback. Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:1 (163-176). DOI:10.1007/s11704-016-5560-0. Online publication date: 1-Feb-2018
  • (2015) Evaluation of Pseudo Relevance Feedback Techniques for Cross Vertical Aggregated Search. Proceedings of the 6th International Conference on Experimental IR Meets Multilinguality, Multimodality, and Interaction - Volume 9283 (91-102). DOI:10.1007/978-3-319-24027-5_8. Online publication date: 8-Sep-2015
  • (2014) A quantitative evaluation of query expansion in domain specific information retrieval. Proceedings of the American Society for Information Science and Technology, 50:1 (1-7). DOI:10.1002/meet.14505001046. Online publication date: 8-May-2014
  • (2013) A quantitative evaluation of query expansion in domain specific information retrieval. Proceedings of the 76th ASIS&T Annual Meeting: Beyond the Cloud: Rethinking Information Boundaries (1-7). DOI:10.5555/2655780.2655844. Online publication date: 1-Nov-2013
  • (2012) Utilizing inter-document similarities in federated search. Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (1169-1170). DOI:10.1145/2348283.2348523. Online publication date: 12-Aug-2012
  • (2012) A Survey of Automatic Query Expansion in Information Retrieval. ACM Computing Surveys, 44:1 (1-50). DOI:10.1145/2071389.2071390. Online publication date: 1-Jan-2012
