Abstract
Machine learning models often overfit to their training data instead of learning general patterns the way humans do. This allows an attacker with access to the model to infer whether a specific record was part of the training data, or to recover private attributes of that data. We argue that this vulnerability makes current machine learning models indirect stores of the personal data used to train them, and that the corresponding data protection regulations must therefore apply to the models as well. In this position paper, we analyze how the “right to be forgotten” provided by the European Union General Data Protection Regulation can be implemented on current machine learning models, and which techniques can be used to build future models that can forget. This document also serves as a call to action for researchers and policy-makers to identify other technologies that can be used for this purpose.
Supported by Symantec Corporation.
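To make the claimed leakage concrete, the sketch below (assembled for this page, not code from the paper) mounts a simple loss-threshold membership inference test against a deliberately overfit classifier: the attacker queries the model, computes its loss on a candidate record, and flags records with unusually low loss as likely training members, since an overfit model fits its own training records far better than unseen ones. The synthetic dataset, the MLP victim model, and the 10th-percentile threshold are all assumptions made for this illustration.

# Minimal membership-inference sketch (illustrative only; not from the paper).
# An overfit classifier fits its training records far better than unseen ones,
# so unusually low loss on a record is evidence it was a training member.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           class_sep=0.8, random_state=0)
X_train, y_train = X[:1000], y[:1000]   # records the victim model trained on
X_out, y_out = X[1000:], y[1000:]       # records the victim model never saw

# Deliberately over-parameterized victim model, trained until it overfits.
victim = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=2000,
                       random_state=0)
victim.fit(X_train, y_train)

def per_record_loss(model, X, y):
    """Cross-entropy loss of the model on each individual record."""
    p_true = model.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(p_true, 1e-12, 1.0))

# The attacker calibrates a threshold on records known NOT to be members,
# then flags any record whose loss falls below it as a training member.
threshold = np.percentile(per_record_loss(victim, X_out, y_out), 10)
member_rate = (per_record_loss(victim, X_train, y_train) < threshold).mean()
nonmember_rate = (per_record_loss(victim, X_out, y_out) < threshold).mean()
print(f"flagged as members: {member_rate:.0%} of true members vs "
      f"{nonmember_rate:.0%} of non-members")

The gap between the two printed rates is the membership signal: it is large when the model overfits and shrinks under defenses such as regularization or differentially private training, which is precisely the kind of leakage the paper argues a right to be forgotten for models must address.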
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Shintre, S., Roundy, K.A., Dhaliwal, J. (2019). Making Machine Learning Forget. In: Naldi, M., Italiano, G., Rannenberg, K., Medina, M., Bourka, A. (eds.) Privacy Technologies and Policy. APF 2019. Lecture Notes in Computer Science, vol. 11498. Springer, Cham. https://doi.org/10.1007/978-3-030-21752-5_6
DOI: https://doi.org/10.1007/978-3-030-21752-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21751-8
Online ISBN: 978-3-030-21752-5
eBook Packages: Computer Science, Computer Science (R0)