
Coverage, relevance, and ranking: The impact of query operators on Web search engine results

Published: 01 October 2003

    Abstract

    Research has reported that about 10% of Web searchers utilize advanced query operators, with the other 90% using extremely simple queries. It is often assumed that the use of query operators, such as Boolean operators and phrase searching, improves the effectiveness of Web searching. We test this assumption by examining the effects of query operators on the performance of three major Web search engines. We selected one hundred queries from the transaction log of a Web search service. Each of these original queries contained query operators such as AND, OR, MUST APPEAR (+), or PHRASE (" "). We then removed the operators from these one hundred advanced queries. We submitted both the original and modified queries to three major Web search engines; a total of 600 queries were submitted and 5,748 documents evaluated. We compared the results from the original queries with the operators to the results from the modified queries without the operators. We examined the results for changes in coverage, relative precision, and ranking of relevant documents. The use of most query operators had no significant effect on coverage, relative precision, or ranking, although the effect varied depending on the search engine. We discuss implications for the effectiveness of searching techniques as currently taught, for future information retrieval system design, and for future research.
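
The query-modification step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names `strip_operators` and `submission_count` are hypothetical, and treating only uppercase `AND`/`OR`, a leading `+`, and double quotes as operators is an assumption based on the four operators the abstract names.

```python
import re

# Operators named in the abstract: AND, OR, MUST APPEAR (+), PHRASE (" ").
# Assumption of this sketch: only uppercase AND/OR count as Boolean operators.
_BOOLEAN = re.compile(r"\b(?:AND|OR)\b")


def strip_operators(query: str) -> str:
    """Return the query with AND, OR, term-initial '+', and double quotes removed."""
    q = query.replace('"', " ")       # drop PHRASE quotes
    q = re.sub(r"(?<!\S)\+", "", q)   # drop MUST APPEAR '+' only at the start of a term
    q = _BOOLEAN.sub(" ", q)          # drop Boolean AND / OR
    return " ".join(q.split())        # collapse leftover whitespace


def submission_count(n_queries: int, n_engines: int) -> int:
    """Each query is submitted twice (original and modified) to every engine."""
    return n_queries * 2 * n_engines


original = '"information retrieval" AND +web OR search'
modified = strip_operators(original)  # "information retrieval web search"
total = submission_count(100, 3)      # 600, matching the abstract
```

The `+` is removed only when it begins a term, so a query such as `C++ AND java` keeps its `C++` token under this sketch.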


            Reviews

            Srini Ramaswamy

            This interesting paper examines the effects of query operators on the performance of three widely used search engines: Microsoft Network (MSN), Google, and America Online (AOL). It compares their search capabilities and associated success factors, with and without query operators. The results indicate that all three search engines, although they use different search algorithms, share the same flaw with respect to coverage and relative precision: they do not use the available Boolean query operators effectively to discriminate among the Web search results presented to the end user, even though each search engine states that using Boolean query operators will provide improved results. The paper also notes that the majority of Web searchers make minimal use of these advanced querying features. The authors reason that this may be due to ineffective user interfaces for advanced search features, and suggest that more research on advanced query operators may be needed to design good ranking algorithms for displaying search results. In summary, the paper experimentally verifies what many have long suspected about user preferences and about how search engines have not correspondingly evolved in sophistication. The paper shows that this is due to the engines' narrow goal of satisfying the large population of casual Web searchers. It is quite evident that advanced searchers have always been handicapped by both the scope and the effectiveness of the query operators that the various search engines provide.

            Online Computing Reviews Service

            Published In

            ACM Transactions on Information Systems, Volume 21, Issue 4
            October 2003
            179 pages
            ISSN:1046-8188
            EISSN:1558-2868
            DOI:10.1145/944012
            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Author Tags

            1. Boolean operators
            2. Relative precision
            3. Web results
            4. coverage
            5. query operators
            6. ranking
            7. search engines

            Qualifiers

            • Article
