Web searching on the Vivisimo search engine

Published: 01 December 2006

Abstract

The application of clustering to Web search engine technology is a novel approach that offers structure to the information deluge often faced by Web searchers. Clustering methods have been well studied in research labs; however, real user searching with clustering systems in operational Web environments is not well understood. This article reports on results from a transaction log analysis of Vivisimo.com, a Web meta-search engine that dynamically clusters users' search results. The transaction log analysis was conducted on 2 weeks' worth of data collected from March 28 to April 4 and April 25 to May 2, 2004, representing 100% of site traffic during these periods and 2,029,734 queries overall. The results show that the highest percentage of queries contained two terms. The highest percentage of search sessions contained one query and lasted less than 1 minute. Almost half of user interactions with clusters consisted of displaying a cluster's result set, and a small percentage of interactions showed cluster tree expansion. Findings show that 11.1% of search sessions were multitasking searches, and there is a broad variety of search topics in multitasking search sessions. Other searching interactions and statistics on repeat users of the search engine are reported. These results provide insights into search characteristics with a cluster-based Web search engine and extend research into Web searching trends. © 2006 Wiley Periodicals, Inc.
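
As a rough illustration of the kind of statistics a transaction log analysis reports (terms per query, queries per session, session duration), the following minimal sketch may help. It is not the authors' method: the record layout, the function names, and the toy rows are all assumptions made for this example, and treating each user id as a single session is a simplification.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (user_id, ISO timestamp, query text).
# These rows are invented for illustration; they are not Vivisimo data.
TOY_LOG = [
    ("u1", "2004-03-28T10:00:00", "jaguar car"),
    ("u1", "2004-03-28T10:00:40", "jaguar price"),
    ("u2", "2004-03-28T11:15:00", "weather"),
    ("u3", "2004-03-28T12:00:00", "cluster search engine"),
]


def terms_per_query(log):
    """Count how many queries contain 1, 2, 3, ... whitespace-separated terms."""
    return Counter(len(query.split()) for _, _, query in log)


def summarize_sessions(log):
    """Group records by user id and return {user: (query_count, duration_seconds)}.

    Assumption: one session per user id. A real study would usually also
    split sessions on inactivity gaps, which this sketch does not do.
    """
    times = defaultdict(list)
    for user, stamp, _ in log:
        times[user].append(datetime.fromisoformat(stamp))
    return {
        user: (len(stamps), (max(stamps) - min(stamps)).total_seconds())
        for user, stamps in times.items()
    }


def share(counter, key):
    """Fraction of all counted items that fall under `key`."""
    total = sum(counter.values())
    return counter[key] / total if total else 0.0


term_counts = terms_per_query(TOY_LOG)
sessions = summarize_sessions(TOY_LOG)
queries_per_session = Counter(count for count, _ in sessions.values())
```

Here `share(term_counts, 2)` gives the fraction of two-term queries in the toy log, and `queries_per_session` tallies how many sessions contained one query, two queries, and so on. The same counting pattern would extend to cluster interactions if the log recorded an action type per row.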

Published In

Journal of the American Society for Information Science and Technology, Volume 57, Issue 14
December 2006
122 pages

Publisher

John Wiley & Sons, Inc.

United States

Qualifiers

  • Article

Cited By

  • (2017) A protocol-driven approach to automatically finding authoritative answers to consumer health questions in online resources. Journal of the Association for Information Science and Technology, 68(7), 1724-1736. doi:10.1002/asi.23806. Online publication date: 1-Jul-2017
  • (2017) A longitudinal study of user queries and browsing requests in a case-based reasoning retrieval system. Journal of the Association for Information Science and Technology, 68(5), 1124-1136. doi:10.1002/asi.23738. Online publication date: 1-May-2017
  • (2016) Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities. Journal of the Association for Information Science and Technology, 67(1), 106-133. doi:10.1002/asi.23374. Online publication date: 1-Jan-2016
  • (2014) Website Community Mining from Query Logs with Two-Phase Clustering. Proceedings of the 15th International Conference on Computational Linguistics and Intelligent Text Processing - Volume 8404, 201-212. doi:10.1007/978-3-642-54903-8_17. Online publication date: 6-Apr-2014
  • (2012) Evaluating subtopic retrieval methods. Information Processing and Management: an International Journal, 48(2), 358-373. doi:10.1016/j.ipm.2011.08.004. Online publication date: 1-Mar-2012
  • (2011) Toward a web search model: Integrating multitasking, cognitive coordination, and cognitive shifts. Journal of the American Society for Information Science and Technology, 62(8), 1446-1472. doi:10.1002/asi.21551. Online publication date: 1-Aug-2011
  • (2010) An approach to semantic information retrieval based on natural language query understanding. Proceedings of the 10th international conference on Current trends in web engineering, 211-222. doi:10.5555/1927229.1927251. Online publication date: 5-Jul-2010
  • (2010) Mining Query Logs. Foundations and Trends in Information Retrieval, 4(1-2), 1-174. doi:10.1561/1500000013. Online publication date: 1-Jan-2010
  • (2010) Enhancing search in a geospatial multimedia annotation system. Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services, 617-624. doi:10.1145/1967486.1967582. Online publication date: 8-Nov-2010
  • (2010) Query-oriented clustering. Proceedings of the 2010 ACM Symposium on Applied Computing, 1789-1795. doi:10.1145/1774088.1774467. Online publication date: 22-Mar-2010

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media