Improving query expansion using WordNet

Published: 01 December 2014

Abstract

    This study proposes a new way of using WordNet for query expansion (QE). We choose candidate expansion terms from a set of pseudo-relevant documents; however, the usefulness of these terms is measured based on their definitions provided in a hand-crafted lexical resource such as WordNet. Experiments with a number of standard TREC collections show that this method outperforms existing WordNet-based methods. It also compares favorably with established QE methods such as KLD and RM3. Leveraging earlier work in which a combination of QE methods was found to outperform each individual method as well as other well-known QE methods, we next propose a combination-based QE method that takes into account three different aspects of a candidate expansion term's usefulness: (a) its distribution in the pseudo-relevant documents and in the target corpus, (b) its statistical association with query terms, and (c) its semantic relation with the query, as determined by the overlap between the WordNet definitions of the term and query terms. This combination of diverse sources of information appears to work well on a number of test collections, viz., the TREC123, TREC5, TREC678, TREC robust (new), and TREC910 collections, and yields significant improvements over competing methods on most of these collections.
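
The three aspects listed in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy glosses stand in for WordNet definitions, the KLD-style weight, the Jaccard-based association and overlap measures, and the weighted sum are simple stand-ins, and the function names (`kld_score`, `association`, `gloss_overlap`, `combined_score`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy glosses standing in for WordNet definitions (assumption).
TOY_GLOSSES = {
    "car": "a motor vehicle with four wheels",
    "automobile": "a motor vehicle with four wheels used for transport",
    "banana": "an elongated yellow fruit",
}
STOPWORDS = {"a", "an", "the", "of", "for", "with", "used", "in", "and"}


def gloss_tokens(term, glosses):
    """Content words of a term's definition; empty set if the term has no gloss."""
    return {w for w in glosses.get(term, "").lower().split() if w not in STOPWORDS}


def gloss_overlap(candidate, query_terms, glosses):
    """Aspect (c): mean Jaccard overlap between the candidate's gloss and each query term's gloss."""
    cand = gloss_tokens(candidate, glosses)
    scores = []
    for q in query_terms:
        q_tokens = gloss_tokens(q, glosses)
        union = cand | q_tokens
        scores.append(len(cand & q_tokens) / len(union) if union else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def kld_score(term, feedback_docs, corpus_counts):
    """Aspect (a): KLD-style weight p_fb * log(p_fb / p_corpus)."""
    fb_counts = Counter(t for doc in feedback_docs for t in doc)
    fb_total = sum(fb_counts.values())
    corpus_total = sum(corpus_counts.values())
    if fb_total == 0 or corpus_total == 0:
        return 0.0
    p_fb = fb_counts[term] / fb_total
    p_corpus = corpus_counts.get(term, 0) / corpus_total
    if p_fb == 0 or p_corpus == 0:
        return 0.0
    return p_fb * math.log(p_fb / p_corpus)


def association(candidate, query_terms, feedback_docs):
    """Aspect (b): mean Jaccard coefficient of the document sets of candidate and query term."""
    cand_docs = {i for i, doc in enumerate(feedback_docs) if candidate in doc}
    scores = []
    for q in query_terms:
        q_docs = {i for i, doc in enumerate(feedback_docs) if q in doc}
        union = cand_docs | q_docs
        scores.append(len(cand_docs & q_docs) / len(union) if union else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def combined_score(candidate, query_terms, feedback_docs, corpus_counts,
                   glosses, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three signals; the weights are an arbitrary assumption."""
    w_dist, w_assoc, w_sem = weights
    return (w_dist * kld_score(candidate, feedback_docs, corpus_counts)
            + w_assoc * association(candidate, query_terms, feedback_docs)
            + w_sem * gloss_overlap(candidate, query_terms, glosses))


# Toy data: three pseudo-relevant documents and made-up corpus term counts.
feedback_docs = [["car", "automobile", "engine"], ["car", "automobile"], ["car", "banana"]]
corpus_counts = Counter({"car": 50, "automobile": 10, "banana": 40, "engine": 20})
ranked = sorted(["automobile", "banana"],
                key=lambda t: combined_score(t, ["car"], feedback_docs, corpus_counts, TOY_GLOSSES),
                reverse=True)
```

On this toy data, `ranked` places "automobile" ahead of "banana" because it is favored by all three signals. A real system would draw glosses from WordNet and would need to put the three scores on a comparable scale before combining them.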

    Published In

    Journal of the Association for Information Science and Technology, Volume 65, Issue 12
    December 2014
    214 pages
    ISSN:2330-1635
    EISSN:2330-1643
    Publisher

    John Wiley & Sons, Inc.

    United States

    Author Tags

    1. query expansion
    2. semantic analysis

    Qualifiers

    • Article

    Cited By

    • (2024) Evaluation of semantic relations impact in query expansion-based retrieval systems. Knowledge-Based Systems, 283:C. DOI: 10.1016/j.knosys.2023.111183. Online publication date: 11-Jan-2024.
    • (2024) Understanding the impact of query expansion on federated search. Multimedia Tools and Applications, 83:4, pp. 10393-10407. DOI: 10.1007/s11042-023-15831-x. Online publication date: 1-Jan-2024.
    • (2023) Word-embedding-based query expansion. Journal of Information Science, 49:5, pp. 1168-1186. DOI: 10.1177/01655515211040659. Online publication date: 1-Oct-2023.
    • (2021) MS MARCO Chameleons: Challenging the MS MARCO Leaderboard with Extremely Obstinate Queries. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 4426-4435. DOI: 10.1145/3459637.3482011. Online publication date: 26-Oct-2021.
    • (2021) Pseudo-relevance feedback based query expansion using boosting algorithm. Artificial Intelligence Review, 54:8, pp. 6101-6124. DOI: 10.1007/s10462-021-09972-4. Online publication date: 1-Dec-2021.
    • (2021) Using Query Expansion in Manifold Ranking for Query-Oriented Multi-document Summarization. Chinese Computational Linguistics, pp. 97-111. DOI: 10.1007/978-3-030-84186-7_7. Online publication date: 13-Aug-2021.
    • (2021) An Extensible Toolkit of Query Refinement Methods and Gold Standard Dataset Generation. Advances in Information Retrieval, pp. 498-503. DOI: 10.1007/978-3-030-72240-1_54. Online publication date: 28-Mar-2021.
    • (2020) ReQue. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 3165-3172. DOI: 10.1145/3340531.3412775. Online publication date: 19-Oct-2020.
    • (2019) Spatial search strategies for open government data. Proceedings of the 13th Workshop on Geographic Information Retrieval, pp. 1-10. DOI: 10.1145/3371140.3371142. Online publication date: 28-Nov-2019.
    • (2019) Term Selection for Query Expansion in Medical Cross-Lingual Information Retrieval. Advances in Information Retrieval, pp. 507-522. DOI: 10.1007/978-3-030-15712-8_33. Online publication date: 14-Apr-2019.
