DOI: 10.1145/3459637.3482281
Research article | Open access

Natural Language Understanding with Privacy-Preserving BERT

Published: 30 October 2021

Abstract

Privacy preservation remains a key challenge in data mining and Natural Language Understanding (NLU). Previous research shows that the input text or even text embeddings can leak private information. This concern motivates our research on effective privacy preservation approaches for pretrained Language Models (LMs). We investigate the privacy and utility implications of applying dχ-privacy, a variant of Local Differential Privacy, to BERT fine-tuning in NLU applications. More importantly, we further propose privacy-adaptive LM pretraining methods and show that our approach can boost the utility of BERT dramatically while retaining the same level of privacy protection. We also quantify the level of privacy preservation and provide guidance on privacy configuration. Our experiments and findings lay the groundwork for future explorations of privacy-preserving NLU with pretrained LMs.
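The dχ-privacy mechanism the abstract refers to is typically realized at the token level: add noise with density proportional to exp(−ε‖z‖) to a token's embedding, then map the noisy vector back to the nearest vocabulary token. The sketch below illustrates that general mechanism as described in the local-DP text literature, not this paper's exact implementation; the function names and the toy embeddings are purely illustrative.

```python
# Minimal sketch of d_chi-privacy text perturbation (general mechanism from
# the local-DP text literature; not necessarily this paper's exact method).
import numpy as np

def sample_noise(d, epsilon, rng):
    """Sample z in R^d with density proportional to exp(-epsilon * ||z||):
    a uniform direction scaled by a Gamma(d, 1/epsilon) magnitude."""
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    magnitude = rng.gamma(shape=d, scale=1.0 / epsilon)
    return magnitude * direction

def privatize_token(vec, vocab_embeddings, epsilon, rng):
    """Perturb one token embedding, then snap to the nearest vocabulary
    vector; the index of that vector is the privatized replacement token."""
    noisy = vec + sample_noise(vec.shape[0], epsilon, rng)
    dists = np.linalg.norm(vocab_embeddings - noisy, axis=1)
    return int(np.argmin(dists))

# Toy demo: 5 random "vocabulary" embeddings in 8 dimensions.
rng = np.random.default_rng(0)
vocab = rng.normal(size=(5, 8))
new_idx = privatize_token(vocab[2], vocab, epsilon=10.0, rng=rng)
print(new_idx)  # some index in [0, 5); larger epsilon keeps the original more often
```

Smaller ε injects larger noise (stronger privacy, since any token becomes a plausible origin of the output), while larger ε tends to return the original token, which is the privacy/utility trade-off the paper studies.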




Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. language model pretraining
  2. local privacy constraints
  3. natural language understanding

Qualifiers

  • Research-article

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 649
  • Downloads (last 6 weeks): 64

Reflects downloads up to 02 Sep 2024

Cited By

  • (2024) PPTIF: Privacy-Preserving Transformer Inference Framework for Language Translation. IEEE Access, 12, 48881–48897. DOI: 10.1109/ACCESS.2024.3384268
  • (2024) Resource Provider Determination in Cloud Environments. Advances in Information and Communication, 646–653. DOI: 10.1007/978-3-031-53963-3_45 (17 March 2024)
  • (2024) Privacy Preservation of Large Language Models in the Metaverse Era: Research Frontiers, Categorical Comparisons, and Future Directions. International Journal of Network Management. DOI: 10.1002/nem.2292 (29 July 2024)
  • (2023) Privformer: Privacy-preserving Transformer with MPC. 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), 392–410. DOI: 10.1109/EuroSP57164.2023.00031 (July 2023)
  • (2023) Summary and Outlook. Foundation Models for Natural Language Processing, 383–419. DOI: 10.1007/978-3-031-23190-2_8 (27 February 2023)
