DOI: 10.1145/3397271.3401260
short-paper

Towards Differentially Private Text Representations

Published: 25 July 2020

Abstract

Most deep learning frameworks require users to pool their local data or model updates on a trusted server in order to train or maintain a global model. Assuming a trusted server that has access to user information is ill-suited to many applications. To tackle this problem, we develop a new deep learning framework for the untrusted-server setting, comprising three modules: (1) an embedding module, (2) a randomization module, and (3) a classifier module. For the randomization module, we propose a novel local differentially private (LDP) protocol that reduces the impact of the privacy parameter ε on accuracy and provides enhanced flexibility in choosing randomization probabilities for LDP. Analysis and experiments show that our framework delivers performance comparable to, or even better than, that of the non-private framework and existing LDP protocols, demonstrating the advantages of our LDP protocol.
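
As a rough illustration of what a randomization step under LDP can look like, the sketch below applies Warner-style binary randomized response to a binarized representation on the user's side, before anything is sent to the server. It is not the protocol proposed in this paper, whose details the abstract does not give. The function names (`binarize`, `keep_probability`, `randomize_bits`), the sign-threshold binarization, and the example values are all assumptions made for illustration.

```python
import math
import random


def binarize(vector, threshold=0.0):
    """Hypothetical stand-in for an embedding module's output step:
    map each real-valued feature to a bit by thresholding."""
    return [1 if x > threshold else 0 for x in vector]


def keep_probability(epsilon):
    """Probability of reporting a bit truthfully under binary
    randomized response with per-bit privacy parameter epsilon."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomize_bits(bits, epsilon, rng):
    """Report each bit truthfully with probability keep_probability(epsilon)
    and flipped otherwise. Each bit on its own satisfies epsilon-LDP; by
    sequential composition a vector of r bits consumes at most r * epsilon."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


if __name__ == "__main__":
    rng = random.Random(0)               # seeded only to make the demo repeatable
    embedding = [0.3, -1.2, 0.0, 2.0]    # made-up feature values
    bits = binarize(embedding)
    private_bits = randomize_bits(bits, epsilon=1.0, rng=rng)
    # Only private_bits would leave the user's device; an untrusted
    # server would train its classifier on such randomized vectors.
    print(bits, private_bits)
```

With ε = 0 every bit is reported truthfully with probability 1/2, so the output carries no information about the input. As ε grows the keep probability approaches 1 and the output approaches the true bits. This is the accuracy dependence on ε that the abstract refers to.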

Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN: 9781450380164
DOI: 10.1145/3397271
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. natural language processing
  2. neural representations
  3. privacy-preserving

Qualifiers

  • Short-paper

Conference

SIGIR '20
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
