Building a Personalized Model for Social Media Textual Content Censorship

Published: 11 November 2022 Publication History

Abstract

Social media users often suffer from content over-disclosure. Most existing studies address this problem by recommending appropriate audiences when users share content. However, an audience management strategy cannot filter sensitive information out of a post, and it narrows the scope of content permeation. In contrast, this paper approaches the problem from the content perspective and aims to design a content censorship model that helps users evaluate the publicity of a post and find the sensitive information in it. Users can then revise the content to both protect sensitive information and achieve broader content permeation. To this end, we first built a dataset to explore the factors related to the public level of a post and to its sensitive information. Based on the findings, we built a novel personalized multi-task content censorship model using several state-of-the-art deep learning techniques such as Seq2Seq and Co-training. We also implemented a prototype, a browser plugin-based content censorship tool, using Weibo as the research site. We evaluated the model and its prototype through automatic and human evaluations. The automatic evaluation suggests that our model outperforms the baseline methods on several metrics, including precision, recall, and F1-score. The human evaluation also indicates that our model and prototype play an important role in helping users identify sensitive information. Based on these results, we propose several insights for the future design of social media content censorship systems.

    Published In

    Proceedings of the ACM on Human-Computer Interaction  Volume 6, Issue CSCW2
    CSCW
    November 2022
    8205 pages
    EISSN:2573-0142
    DOI:10.1145/3571154
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Weibo
    2. content censorship
    3. personalized model
    4. social media

    Qualifiers

    • Research-article
