How are we searching the World Wide Web? A comparison of nine search engine transaction logs

Published: 01 January 2006

Abstract

For many people, the Web, and especially the major Web search engines, are essential tools for locating online information. This paper reports results from research that examines characteristics of, and changes in, Web searching across nine studies of five Web search engines based in the US and Europe. We compare the interactions between users and Web search engines from the perspectives of session length, query length, query complexity, and content viewed. The results show that (1) users are viewing fewer result pages, (2) searchers on US-based Web search engines use more query operators than searchers on European-based search engines, (3) there are statistically significant differences in the use of Boolean operators and in result pages viewed, and (4) one cannot necessarily apply results from studies of one particular Web search engine to another Web search engine. The widespread use of Web search engines, the employment of simple queries, and the decreased viewing of result pages may have resulted from algorithmic enhancements by Web search engine companies. We discuss the implications of the findings for the development of Web search engines and the design of online content.
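To make the four perspectives named in the abstract concrete, the following is a minimal sketch of how such measures might be computed from a transaction log. It is not the authors' method: the record layout (user, query, result page), the sample data, the function name `summarize`, and the simplification that each user forms exactly one session are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical log records: (user_id, query, result_page_number).
# Real search engine logs differ in layout; this format is an assumption.
LOG = [
    ("u1", "cheap flights", 1),
    ("u1", "cheap flights", 2),
    ("u1", "cheap AND flights", 1),
    ("u2", "weather", 1),
]

# Assumed operator set; actual engines support different operators.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def summarize(log):
    """Return simple averages for session length, query length,
    Boolean operator use, and result pages viewed."""
    queries_per_user = defaultdict(list)  # distinct queries per user, in order
    pages_per_query = defaultdict(set)    # result pages seen per (user, query)
    for user, query, page in log:
        if query not in queries_per_user[user]:
            queries_per_user[user].append(query)
        pages_per_query[(user, query)].add(page)

    all_queries = [q for qs in queries_per_user.values() for q in qs]
    n = len(all_queries)
    return {
        # Queries per session, treating each user as one session (assumption).
        "mean_session_length": n / len(queries_per_user),
        # Whitespace-separated terms per query, operators included.
        "mean_query_length": sum(len(q.split()) for q in all_queries) / n,
        # Share of queries containing at least one Boolean operator.
        "boolean_query_share": sum(
            any(term in BOOLEAN_OPERATORS for term in q.split())
            for q in all_queries
        ) / n,
        # Distinct result pages viewed per query.
        "mean_result_pages_viewed": sum(
            len(pages) for pages in pages_per_query.values()
        ) / n,
    }


if __name__ == "__main__":
    for name, value in summarize(LOG).items():
        print(f"{name}: {value:.2f}")
```

Computing the same summary separately for each search engine's log would give the kind of per-engine figures that a cross-engine comparison needs. Session boundaries in real logs usually have to be inferred, which this sketch does not attempt.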

Published In

Information Processing and Management: an International Journal, Volume 42, Issue 1
Special issue: Formal methods for information retrieval
January 2006
422 pages

Publisher

Pergamon Press, Inc.

United States

Author Tags

  1. transaction log analysis
  2. web search engines
  3. web searching

Qualifiers

  • Article
