short-paper

Towards Differentially Private Text Representations

Authors:

Tong Xiao

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 1813 - 1816

https://doi.org/10.1145/3397271.3401260

Published: 25 July 2020

Abstract

Most deep learning frameworks require users to pool their local data or model updates on a trusted server in order to train or maintain a global model. The assumption of a trusted server that has access to user information is ill-suited to many applications. To tackle this problem, we develop a new deep learning framework for the untrusted-server setting, which comprises three modules: (1) an embedding module, (2) a randomization module, and (3) a classifier module. For the randomization module, we propose a novel local differentially private (LDP) protocol that reduces the impact of the privacy parameter ε on accuracy and provides enhanced flexibility in choosing randomization probabilities for LDP. Analysis and experiments show that our framework delivers performance comparable to, or even better than, that of the non-private framework and existing LDP protocols, demonstrating the advantages of our LDP protocol.
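
As a point of reference for the randomization module, the following minimal Python sketch shows classic binary randomized response, a standard LDP baseline. It is not the protocol proposed in this paper, whose details are not given in the abstract. The function names (`keep_probability`, `randomize_bits`, `estimate_frequency`) are hypothetical, and the sketch assumes the user's representation has already been converted to a list of 0/1 bits. Each bit is reported truthfully with probability p = e^ε / (e^ε + 1), so each bit on its own satisfies ε-LDP; under basic composition, a vector of r bits randomized this way consumes a total budget of rε.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting a bit truthfully: p = e^eps / (e^eps + 1)."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomize_bits(bits, epsilon, rng=None):
    """Run on the user's device: keep each bit with probability p, else flip it.

    Assumption: `bits` is a list of 0/1 integers and `epsilon` is the
    per-bit privacy budget (not the budget for the whole vector).
    """
    rng = rng or random.Random()
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_frequency(reports, epsilon):
    """Run on the untrusted server: unbiased estimate of the true fraction of 1s.

    E[observed] = f * p + (1 - f) * (1 - p), which is solved for f below.
    Requires epsilon > 0 so that 2p - 1 is non-zero.
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1] * 3000 + [0] * 7000  # 30% of the bits are 1
    noisy = randomize_bits(true_bits, epsilon=1.0, rng=rng)
    print("estimated fraction of 1s:", estimate_frequency(noisy, epsilon=1.0))
```

A smaller ε pushes p toward 1/2, which strengthens privacy but increases the variance of the server-side estimate; this is the accuracy cost of ε that the abstract says the proposed protocol aims to reduce.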

Supplementary Material

MP4 File (3397271.3401260.mp4), 33.72 MB

Index Terms

Towards Differentially Private Text Representations
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections
  2. Software and application security
    1. Domain-specific security and privacy architectures

Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2020

2548 pages

ISBN:9781450380164

DOI:10.1145/3397271

General Chairs: Jimmy Huang (York University, Canada), Yi Chang (Jilin University, China), Xueqi Cheng (Chinese Academy of Sciences, China)

Program Chairs: Jaap Kamps (University of Amsterdam, Netherlands), Vanessa Murdock (Amazon, U.S.A.), Ji-Rong Wen (Renmin University of China, China), Yiqun Liu (Tsinghua University, China)

Copyright © 2020 ACM.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '20

Sponsor:

SIGIR

SIGIR '20: The 43rd International ACM SIGIR conference on research and development in Information Retrieval

July 25 - 30, 2020

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
