Abstract
Deep Neural Networks (DNNs) have been widely applied to diverse real-life applications and dominate most of them. Given the hardware cost of running DNNs and the large amounts of labeled training data needed to sustain their performance, machine-learning-as-a-service (MLaaS) has emerged. However, malicious attackers can exploit black-box access to launch model stealing attacks, posing a serious security threat to the interests of the model owner. To address this problem, defensive methods, mainly categorized as truncation-based and perturbation-based, have been designed to reduce the stealing efficiency or to increase the attack cost, i.e., to require more queries. Fully defending against model stealing attacks nevertheless remains a challenge. In this paper, we propose DAS-AST, a novel defense algorithm based on an adaptive softmax transformation that introduces posterior probability perturbation. We evaluate the proposed defense against several state-of-the-art attack strategies and compare its performance with other defense methods. The experimental results show that our defense is effective across a wide range of challenging datasets and performs better than the existing defenses. More specifically, it degrades the average accuracy of the stolen model by at least 30% without affecting the accuracy of the target DNN model on its original tasks.
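The concrete transformation is specified in the full text rather than in this abstract. Purely as an illustration of the perturbation-based idea sketched above, the following minimal Python example perturbs a classifier's posterior with a confidence-dependent softmax temperature; the function names, the adaptivity rule, and the parameter values are our own assumptions, not the authors' algorithm. Because softmax with any positive temperature is monotone in the logits, the predicted label, and hence top-1 accuracy on the original task, is unchanged, while the probability values returned through the query interface are distorted.

```python
import numpy as np

def temperature_softmax(logits, temperature):
    """Softmax with a temperature; larger values flatten the distribution
    but never change which class has the highest probability."""
    z = logits / temperature
    z = z - z.max()               # subtract max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def defended_posterior(logits, t_low=1.0, t_high=8.0):
    """Hypothetical confidence-adaptive perturbation (not the paper's exact
    rule): the more confident the clean prediction, the harder we flatten
    it, since confident posteriors leak the most signal to a model thief.
    The argmax, and hence top-1 accuracy, is preserved by construction."""
    clean = temperature_softmax(logits, 1.0)
    temperature = t_low + (t_high - t_low) * clean.max()
    return temperature_softmax(logits, temperature)

logits = np.array([4.0, 1.5, -0.5, 0.2])
print("clean    :", temperature_softmax(logits, 1.0))
print("defended :", defended_posterior(logits))
```

An attacker training a surrogate on the defended posteriors receives much weaker soft-label supervision, while a benign user reading off the top-1 label sees no change, which matches the trade-off the abstract claims.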
Notes
- 1. 'Amazon machine learning,' https://aws.amazon.com/aml/.
- 2. 'Azure machine learning,' https://azure.microsoft.com/en-us/overview/machine-learning.
- 3. MNIST can be downloaded at: http://yann.lecun.com/exdb/mnist/.
- 4. Fashion-MNIST can be downloaded at: https://www.worldlink.com.cn/en/osdir/fashion-mnist.html.
- 5. CIFAR-10 can be downloaded at: https://www.cs.toronto.edu/~kriz/cifar.html.
Acknowledgments
This research was supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY19F020025, the Major Special Funding for Science and Technology Innovation 2025 in Ningbo under Grant No. 2018B10063, and the National Key Research and Development Program of China under Grant No. 2018AAA0100800.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, J., Wu, C., Shen, S., Zhang, X., Chen, J. (2021). DAS-AST: Defending Against Model Stealing Attacks Based on Adaptive Softmax Transformation. In: Wu, Y., Yung, M. (eds.) Information Security and Cryptology. Inscrypt 2020. Lecture Notes in Computer Science, vol. 12612. Springer, Cham. https://doi.org/10.1007/978-3-030-71852-7_2
Print ISBN: 978-3-030-71851-0
Online ISBN: 978-3-030-71852-7