research-article
Open access

Measuring Global Disease with Wikipedia: Success, Failure, and a Research Agenda

Published: 25 February 2017

Abstract

Effective disease monitoring provides a foundation for effective public health systems. This has historically been accomplished with patient contact and bureaucratic aggregation, which tends to be slow and expensive. Recent internet-based approaches promise to be real-time and cheap, with few parameters. However, the question of when and how these approaches work remains open. We addressed this question using Wikipedia access logs and category links. Our experiments, replicable and extensible using our open source code and data, test the effect of semantic article filtering, amount of training data, forecast horizon, and model staleness by comparing across 6 diseases and 4 countries using thousands of individual models. We found that our minimal-configuration, language-agnostic article selection process based on semantic relatedness is effective for improving predictions, and that our approach is relatively insensitive to the amount and age of training data. We also found, in contrast to prior work, very little forecasting value, and we argue that this is consistent with theoretical considerations about the nature of forecasting. These mixed results lead us to propose that the currently observational field of internet-based disease surveillance must pivot to include theoretical models of information flow as well as controlled experiments based on simulations of disease.
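The difference between a nowcast and a forecast at a given horizon can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name a model, so a closed-form ridge regression is swapped in; the traffic and incidence series are synthetic; and the function names (`fit_ridge`, `align_for_horizon`, `evaluate`) are hypothetical. It does not reproduce the paper's experiments or results.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def align_for_horizon(X, y, horizon):
    """Pair traffic at week t with incidence at week t + horizon."""
    if horizon == 0:
        return X, y
    return X[:-horizon], y[horizon:]

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic weekly data (assumption): a seasonal incidence curve and three
# hypothetical article traffic series, two tracking incidence and one pure noise.
rng = np.random.default_rng(0)
weeks = 156
incidence = (100 + 50 * np.sin(2 * np.pi * np.arange(weeks) / 52)
             + rng.normal(0, 5, weeks))
traffic = np.column_stack([
    2.0 * incidence + rng.normal(0, 10, weeks),
    0.5 * incidence + rng.normal(0, 10, weeks),
    rng.normal(500, 50, weeks),
])

def evaluate(horizon, n_train=104):
    """Train on the first n_train aligned weeks, score on the remainder."""
    X, y = align_for_horizon(traffic, incidence, horizon)
    w, b = fit_ridge(X[:n_train], y[:n_train])
    return r_squared(y[n_train:], X[n_train:] @ w + b)

nowcast_r2 = evaluate(horizon=0)   # traffic and incidence from the same week
forecast_r2 = evaluate(horizon=4)  # traffic used to predict 4 weeks ahead
```

Varying `horizon` and `n_train` in `evaluate` corresponds loosely to two of the factors the abstract lists (forecast horizon and amount of training data). Because the synthetic incidence is purely seasonal, the scores here say nothing about real disease data.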

Published In

CSCW '17: Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing
February 2017
2556 pages
ISBN: 9781450343350
DOI: 10.1145/2998181
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. disease
  2. epidemiology
  3. forecasting
  4. modeling
  5. wikipedia

Qualifiers

  • Research-article

Conference

CSCW '17: Computer Supported Cooperative Work and Social Computing
February 25 - March 1, 2017
Portland, Oregon, USA

Acceptance Rates

CSCW '17 Paper Acceptance Rate 183 of 530 submissions, 35%;
Overall Acceptance Rate 2,235 of 8,521 submissions, 26%

Article Metrics

  • Downloads (last 12 months): 319
  • Downloads (last 6 weeks): 27
Reflects downloads up to 06 Feb 2025

Cited By

  • (2024) Evaluating disease surveillance strategies for early outbreak detection in contact networks with varying community structure. Social Networks 79 (122-132). DOI: 10.1016/j.socnet.2024.06.003. Online publication date: Oct-2024
  • (2022) Wikipedia searches and the epidemiology of infectious diseases. Data & Knowledge Engineering 142:C. DOI: 10.1016/j.datak.2022.102093. Online publication date: 1-Nov-2022
  • (2022) The SAINT observatory subsystem: an open-source intelligence tool for uncovering cybersecurity threats. International Journal of Information Security 21:5 (1091-1106). DOI: 10.1007/s10207-022-00599-2. Online publication date: 3-Aug-2022
  • (2021) A general method for estimating the prevalence of influenza-like-symptoms with Wikipedia data. PLOS ONE 16:8 (e0256858). DOI: 10.1371/journal.pone.0256858. Online publication date: 31-Aug-2021
  • (2020) Comparison of Social Media, Syndromic Surveillance, and Microbiologic Acute Respiratory Infection Data: Observational Study. JMIR Public Health and Surveillance 6:2 (e14986). DOI: 10.2196/14986. Online publication date: 24-Apr-2020
  • (2020) Surveilling Influenza Incidence With Centers for Disease Control and Prevention Web Traffic Data: Demonstration Using a Novel Dataset. Journal of Medical Internet Research 22:7 (e14337). DOI: 10.2196/14337. Online publication date: 3-Jul-2020
  • (2020) The Role of Wikipedia in providing information on coronavirus to Societies during the COVID-19 Pandemic. Middle Black Sea Journal of Health Science 6:3 (316-324). DOI: 10.19127/mbsjohs.781930. Online publication date: 31-Dec-2020
  • (2020) Situating Wikipedia as a health information resource in various contexts: A scoping review. PLOS ONE 15:2 (e0228786). DOI: 10.1371/journal.pone.0228786. Online publication date: 18-Feb-2020
  • (2020) The impact of news exposure on collective attention in the United States during the 2016 Zika epidemic. PLOS Computational Biology 16:3 (e1007633). DOI: 10.1371/journal.pcbi.1007633. Online publication date: 12-Mar-2020
  • (2020) Social Media- and Internet-Based Disease Surveillance for Public Health. Annual Review of Public Health 41:1 (101-118). DOI: 10.1146/annurev-publhealth-040119-094402. Online publication date: 2-Apr-2020
