DOI: 10.1145/2641580.2641612
tutorial

Information Evolution in Wikipedia

Published: 27 August 2014

Abstract

The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means for capturing the evolution of entities, it is possible to provide entity-based accessibility to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in the literature as events. Finally, we present results from our extensive analysis of a dataset consisting of 31,998 Wikipedia pages describing politicians, and observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information to study the evolution of entities and provide promising ground for further research.
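
The abstract does not say how edit activity is turned into evidence of events, so the following is only a rough sketch of one possible reading, not the authors' method. It assumes that an unusually high number of revisions on a single day hints at a candidate event. The function names, the mean-plus-k-standard-deviations threshold, and the sample timestamps are all hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def daily_edit_counts(timestamps):
    """Count revisions per calendar day from ISO 8601 timestamp strings.

    Days without any revision do not appear in the result.
    """
    days = (datetime.fromisoformat(ts).date() for ts in timestamps)
    return Counter(days)


def burst_days(counts, k=2.0):
    """Return the days whose edit count exceeds mean + k * standard deviation.

    The threshold rule and the value of k are assumptions for illustration.
    """
    values = list(counts.values())
    if len(values) < 2:
        return []
    threshold = mean(values) + k * pstdev(values)
    return sorted(day for day, n in counts.items() if n > threshold)


# Hypothetical revision timestamps for a single page (not real data).
revisions = (
    ["2014-01-0%dT10:00:00" % d for d in range(1, 10)]   # one edit on each of nine days
    + ["2014-01-10T%02d:00:00" % h for h in range(12)]   # twelve edits on one day
)
counts = daily_edit_counts(revisions)
bursts = burst_days(counts)
print([day.isoformat() for day in bursts])
```

A real analysis would also need to separate genuine content changes from vandalism and reverts, which this sketch ignores.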

Published In

OpenSym '14: Proceedings of The International Symposium on Open Collaboration
August 2014
302 pages
ISBN:9781450330169
DOI:10.1145/2641580

Sponsors

In-Cooperation

  • TJEF: The John Ernest Foundation

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Entity Evolution
  2. Events
  3. Temporal Information
  4. Wikipedia

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

OpenSym '14

Acceptance Rates

OpenSym '14 Paper Acceptance Rate 29 of 64 submissions, 45%;
Overall Acceptance Rate 108 of 195 submissions, 55%

Cited By

  • (2021) Event Detection in Wikipedia Edit History Improved by Documents Web Based Automatic Assessment. Big Data and Cognitive Computing 5:3 (34). DOI: 10.3390/bdcc5030034. Online publication date: 4-Aug-2021
  • (2021) Tracing the Factoids: the Anatomy of Information Re-organization in Wikipedia Articles. Companion Proceedings of the Web Conference 2021 (572-579). DOI: 10.1145/3442442.3452342. Online publication date: 19-Apr-2021
  • (2021) Structured Object Matching across Web Page Revisions. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (1284-1295). DOI: 10.1109/ICDE51399.2021.00115. Online publication date: Apr-2021
  • (2017) JustEvents: A Crowdsourced Corpus for Event Validation with Strict Temporal Constraints. Advances in Information Retrieval (484-492). DOI: 10.1007/978-3-319-56608-5_38. Online publication date: 8-Apr-2017
  • (2015) The implications of Wikipedia for contemporary science education. Proceedings of the 3rd International Conference on Technological Ecosystems for Enhancing Multiculturality (403-410). DOI: 10.1145/2808580.2808641. Online publication date: 7-Oct-2015
  • (2014) WikipEvent. Proceedings of the 2014 International Conference on Posters & Demonstrations Track - Volume 1272 (125-128). DOI: 10.5555/2878453.2878485. Online publication date: 21-Oct-2014
  • (2014) WikipEvent: Leveraging Wikipedia Edit History for Event Detection. Web Information Systems Engineering – WISE 2014 (90-108). DOI: 10.1007/978-3-319-11746-1_7. Online publication date: 2014
