DOI: 10.1145/3559613.3563206
short-paper

Darwin's Theory of Censorship: Analysing the Evolution of Censored Topics with Dynamic Topic Models

Published: 07 November 2022

Abstract

We present a statistical analysis of changes in the Internet censorship policy of the government of India from 2016 to 2020. Using longitudinal observations of censorship collected by the ICLab censorship measurement project, together with historical records of web page contents collected by the Internet Archive, we find that machine classification techniques can detect censors' reactions to events without prior knowledge of what those events are. However, gaps in ICLab's observations can cause the classifier to fail to detect censored topics, and gaps in the Internet Archive's records can cause it to misidentify them.

Published In

WPES'22: Proceedings of the 21st Workshop on Privacy in the Electronic Society
November 2022
227 pages
ISBN: 9781450398732
DOI: 10.1145/3559613

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. censorship
2. internet measurement
3. topic modelling

Qualifiers

• Short-paper

Conference

CCS '22

Acceptance Rates

Overall Acceptance Rate 106 of 355 submissions, 30%
