Research article, DOI: 10.1145/3201064.3201092

Query for Architecture, Click through Military: Comparing the Roles of Search and Navigation on Wikipedia

Published: 15 May 2018

Abstract

As one of the richest sources of encyclopedic information on the Web, Wikipedia generates an enormous amount of traffic. In this paper, we study large-scale article access data of the English Wikipedia in order to compare articles with respect to the two main paradigms of information seeking: search by formulating a query, and navigation by following hyperlinks. To this end, we propose and employ two main metrics to characterize articles: (i) searchshare, the relative amount of views an article received by search, and (ii) resistance, the ability of an article to relay traffic to other Wikipedia articles. We demonstrate how articles in distinct topical categories differ substantially in terms of these properties. For example, architecture-related articles are often accessed through search and are simultaneously a "dead end" for traffic, whereas historical articles about military events are mainly navigated. We further link traffic differences to varying network, content, and editing activity features. Lastly, we measure the impact of the article properties by modeling access behavior on articles with a gradient boosting approach. The results of this paper constitute a step towards understanding human information seeking behavior on the Web.
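
To make the two metrics concrete, the following is a minimal Python sketch. It is not the authors' implementation. The abstract does not state the formulas, so the sketch assumes that searchshare is the fraction of an article's views that arrive from search, and that resistance is the fraction of incoming views that are not passed on to other articles (so a "dead end" scores close to 1). The `ArticleTraffic` record, its field names, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArticleTraffic:
    """Hypothetical per-article counts; not taken from the paper's data."""
    title: str
    views_from_search: int  # views arriving via a search engine
    views_from_links: int   # views arriving via links on other articles
    clicks_out: int         # clicks leaving this article to other articles


def searchshare(a: ArticleTraffic) -> float:
    """Assumed definition: share of all incoming views that came from search."""
    total = a.views_from_search + a.views_from_links
    return a.views_from_search / total if total else 0.0


def resistance(a: ArticleTraffic) -> float:
    """Assumed definition: share of incoming views NOT relayed onward."""
    total = a.views_from_search + a.views_from_links
    if total == 0:
        return 1.0  # no traffic in, so nothing can be relayed
    relayed = min(a.clicks_out, total)  # cap so the result stays in [0, 1]
    return 1.0 - relayed / total


# Made-up examples mirroring the contrast described in the abstract.
building = ArticleTraffic("Hypothetical building", 900, 100, 50)
battle = ArticleTraffic("Hypothetical battle", 200, 800, 600)

for article in (building, battle):
    print(article.title, round(searchshare(article), 2), round(resistance(article), 2))
```

Under these assumptions, the search-heavy article with few outgoing clicks has both a high searchshare and a high resistance, while the mostly navigated article is low on both.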


      Published In

      WebSci '18: Proceedings of the 10th ACM Conference on Web Science
      May 2018, 399 pages
      ISBN: 9781450355636
      DOI: 10.1145/3201064

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. log analysis
      2. navigation behavior
      3. search behavior
      4. wikipedia

      Qualifiers

      • Research-article

      Conference

      WebSci '18: 10th ACM Conference on Web Science
      May 27 - 30, 2018
      Amsterdam, Netherlands

      Acceptance Rates

      WebSci '18 Paper Acceptance Rate 30 of 113 submissions, 27%;
      Overall Acceptance Rate 245 of 933 submissions, 26%

      Cited By

      • (2023) Understanding Search Behavior Bias in Wikipedia. Advances in Bias and Fairness in Information Retrieval, 134-146. DOI: 10.1007/978-3-031-37249-0_11. Online publication date: 15-Jul-2023
      • (2022) Going Down the Rabbit Hole: Characterizing the Long Tail of Wikipedia Reading Sessions. Companion Proceedings of the Web Conference 2022, 1324-1330. DOI: 10.1145/3487553.3524930. Online publication date: 25-Apr-2022
      • (2022) ISRE-Framework: nonlinear and multimodal exploration of image search result spaces. Multimedia Tools and Applications 81:19, 27275-27308. DOI: 10.1007/s11042-022-12561-4. Online publication date: 25-Mar-2022
      • (2021) A Personalized Search Query Generating Method for Safety-Enhanced Vehicle-to-People Networks. IEEE Transactions on Vehicular Technology 70:6, 5296-5307. DOI: 10.1109/TVT.2021.3075626. Online publication date: Jun-2021
      • (2020) Eyeing CRISPR on Wikipedia: Using Eye Tracking to Assess What Lay Audiences Look for to Learn about CRISPR and Genetic Engineering. Environmental Communication, 1-18. DOI: 10.1080/17524032.2020.1723668. Online publication date: 11-Mar-2020
      • (2019) On the right track! Analysing and Predicting Navigation Success in Wikipedia. Proceedings of the 30th ACM Conference on Hypertext and Social Media, 143-152. DOI: 10.1145/3342220.3343650. Online publication date: 12-Sep-2019
