
A Personal Privacy Preserving Framework: I Let You Know Who Can See What

Authors:

Wei Liu

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Pages 295 - 304

https://doi.org/10.1145/3209978.3209995

Published: 27 June 2018

Abstract

The boom in social networks has given rise to a large volume of user-generated content (UGC), most of which is free and publicly available. As many previous studies have validated, a wide range of users' personal aspects can be extracted from UGC to facilitate personalized applications. Despite its value, UGC can place users at high privacy risk, a problem that thus far remains largely unexplored. Privacy is defined as the individual's ability to control what information is disclosed, to whom, when, and under what circumstances. As people and information both play significant roles, privacy has been elaborated as a boundary regulation process, in which individuals regulate their interaction with others by altering their degree of openness to them. In this paper, we aim to reduce users' privacy risks on social networks by answering the question of Who Can See What. Towards this goal, we present a novel scheme comprising descriptive, predictive, and prescriptive components. In particular, we first collect a set of posts and extract a group of privacy-oriented features to describe them. We then propose a novel taxonomy-guided multi-task learning model to predict which personal aspects are revealed by the posts. Lastly, we construct standard guidelines through a user study with 400 users to regularize users' actions and prevent privacy leakage. Extensive experiments on a real-world dataset verify our scheme.

Index Terms

A Personal Privacy Preserving Framework: I Let You Know Who Can See What
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections

Published In

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

June 2018

1509 pages

ISBN: 9781450356572

DOI: 10.1145/3209978

General Chairs:
Kevyn Collins-Thompson, University of Michigan, United States
Qiaozhu Mei, University of Michigan, United States

Program Chairs:
Brian Davison, Lehigh University, United States
Yiqun Liu, Tsinghua University, China
Emine Yilmaz, University College London, United Kingdom

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Basic Research Program of China (973 Program)
The Project of Thousand Youth Talents 2016
National Natural Science Foundation of China

Conference

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Sponsor: SIGIR

July 8 - 12, 2018

Ann Arbor, MI, USA

Acceptance Rates

SIGIR '18 Paper Acceptance Rate 86 of 409 submissions, 21%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
