DOI: 10.1145/3209978.3209995
Research article

A Personal Privacy Preserving Framework: I Let You Know Who Can See What

Published: 27 June 2018

Abstract

The boom in social networks has given rise to a large volume of user-generated content (UGC), most of which is free and publicly available. Many previous studies have validated that users' personal aspects can be extracted from UGC to facilitate personalized applications. Despite its value, UGC can expose users to high privacy risks, a problem that remains largely unexplored. Privacy is defined as an individual's ability to control what information is disclosed, to whom, when, and under what circumstances. Because both people and information play significant roles, privacy has been elaborated as a boundary regulation process, in which individuals regulate their interactions with others by altering how open they are to them. In this paper, we aim to reduce users' privacy risks on social networks by answering the question of Who Can See What. Towards this goal, we present a novel scheme comprising descriptive, predictive, and prescriptive components. In particular, we first collect a set of posts and extract a group of privacy-oriented features to describe them. We then propose a novel taxonomy-guided multi-task learning model to predict which personal aspects the posts reveal. Lastly, we construct standard guidelines from a user study with 400 users to regularize users' actions and prevent privacy leakage. Extensive experiments on a real-world dataset verify the effectiveness of our scheme.
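The descriptive, predictive, and prescriptive stages outlined in the abstract can be pictured with a toy sketch. This is not the authors' method: a plain keyword matcher stands in for the taxonomy-guided multi-task learning model, and every aspect name, cue word, audience, and guideline below is a hypothetical placeholder chosen for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical cue words standing in for the privacy-oriented features.
ASPECT_CUES: Dict[str, Set[str]] = {
    "health": {"hospital", "surgery", "diagnosed"},
    "location": {"airport", "home", "vacation"},
    "occupation": {"boss", "promotion", "fired"},
}

# Hypothetical guidelines: which audiences may see each personal aspect.
GUIDELINES: Dict[str, Set[str]] = {
    "health": {"family"},
    "location": {"family", "friends"},
    "occupation": {"family", "friends", "colleagues"},
}

ALL_AUDIENCES: Set[str] = {"family", "friends", "colleagues", "public"}


def extract_features(post: str) -> Set[str]:
    """Descriptive stage (toy): lowercase tokens with edge punctuation removed."""
    return {token.strip(".,!?").lower() for token in post.split()}


def predict_aspects(features: Set[str]) -> List[str]:
    """Predictive stage (toy): an aspect is flagged if any cue word appears.

    A learned model would replace this keyword lookup.
    """
    return sorted(a for a, cues in ASPECT_CUES.items() if features & cues)


def allowed_audiences(aspects: List[str]) -> Set[str]:
    """Prescriptive stage (toy): intersect the audiences permitted per aspect."""
    allowed = set(ALL_AUDIENCES)
    for aspect in aspects:
        allowed &= GUIDELINES[aspect]
    return allowed


def who_can_see(post: str) -> Tuple[List[str], Set[str]]:
    """Run the three toy stages and return (revealed aspects, allowed audiences)."""
    aspects = predict_aspects(extract_features(post))
    return aspects, allowed_audiences(aspects)


if __name__ == "__main__":
    print(who_can_see("Got diagnosed today, heading home from the hospital."))
```

In this sketch a post that reveals several aspects is restricted to the audiences that every one of those aspects permits, while a post that reveals nothing stays visible to all audiences. Intersecting the guidelines is one possible design choice, not something the abstract specifies.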


Published In

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018
1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. boundary regulation
  2. privacy preserving
  3. social media

Qualifiers

  • Research-article

Funding Sources

  • National Basic Research Program of China (973 Program)
  • The Project of Thousand Youth Talents 2016
  • National Natural Science Foundation of China

Conference

SIGIR '18

Acceptance Rates

SIGIR '18 paper acceptance rate: 86 of 409 submissions (21%)
Overall acceptance rate: 792 of 3,983 submissions (20%)
