Abstract
Deep Neural Networks (DNNs) have been widely applied to diverse real-life applications and dominate most of them. Given the hardware cost of running DNNs and the large amounts of labeled training data needed to sustain their performance, machine-learning-as-a-service (MLaaS) has emerged. However, malicious attackers can exploit black-box access to launch model stealing attacks, posing a serious security threat to the interests of the model owner. To address this problem, defensive methods, mainly categorized as truncation-based and perturbation-based, have been designed to reduce the stealing efficiency or to increase the attack cost, i.e., to require more queries. Fully defending against model stealing attacks nevertheless remains a challenge. In this paper, we propose DAS-AST, a novel defense algorithm based on an adaptive softmax transformation that introduces posterior probability perturbation. We evaluate the proposed defense against several state-of-the-art attack strategies and compare its performance with other defense methods. The experimental results show that our defense is effective across a wide range of challenging datasets and performs better than the existing defenses. More specifically, it degrades the average accuracy of the stolen model by at least 30% without affecting the accuracy of the target DNN model on its original tasks.
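The concrete transformation is specified in the full text rather than in this abstract. Purely as an illustration of the perturbation-based idea sketched above, the following minimal Python example perturbs a classifier's posterior with a confidence-dependent softmax temperature; the function names, the adaptivity rule, and the parameter values are our own assumptions, not the authors' algorithm. Because softmax with any positive temperature is monotone in the logits, the predicted label, and hence top-1 accuracy on the original task, is unchanged, while the probability values returned through the query interface are distorted.

```python
import numpy as np

def temperature_softmax(logits, temperature):
    """Softmax with a temperature; larger values flatten the distribution
    but never change which class has the highest probability."""
    z = logits / temperature
    z = z - z.max()               # subtract max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def defended_posterior(logits, t_low=1.0, t_high=8.0):
    """Hypothetical confidence-adaptive perturbation (not the paper's exact
    rule): the more confident the clean prediction, the harder we flatten
    it, since confident posteriors leak the most signal to a model thief.
    The argmax, and hence top-1 accuracy, is preserved by construction."""
    clean = temperature_softmax(logits, 1.0)
    temperature = t_low + (t_high - t_low) * clean.max()
    return temperature_softmax(logits, temperature)

logits = np.array([4.0, 1.5, -0.5, 0.2])
print("clean    :", temperature_softmax(logits, 1.0))
print("defended :", defended_posterior(logits))
```

An attacker training a surrogate on the defended posteriors receives much weaker soft-label supervision, while a benign user reading off the top-1 label sees no change, which matches the trade-off the abstract claims.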
Notes
- 1. 'Amazon machine learning,' https://aws.amazon.com/aml/.
- 2. 'Azure machine learning,' https://azure.microsoft.com/en-us/overview/machine-learning.
- 3. MNIST can be downloaded at: http://yann.lecun.com/exdb/mnist/.
- 4. Fashion-MNIST can be downloaded at: https://www.worldlink.com.cn/en/osdir/fashion-mnist.html.
- 5. CIFAR-10 can be downloaded at: https://www.cs.toronto.edu/~kriz/cifar.html.
Acknowledgments
This research was supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY19F020025, the Major Special Funding for Science and Technology Innovation 2025 in Ningbo under Grant No. 2018B10063, and the National Key Research and Development Program of China under Grant No. 2018AAA0100800.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, J., Wu, C., Shen, S., Zhang, X., Chen, J. (2021). DAS-AST: Defending Against Model Stealing Attacks Based on Adaptive Softmax Transformation. In: Wu, Y., Yung, M. (eds.) Information Security and Cryptology. Inscrypt 2020. Lecture Notes in Computer Science, vol. 12612. Springer, Cham. https://doi.org/10.1007/978-3-030-71852-7_2
Print ISBN: 978-3-030-71851-0
Online ISBN: 978-3-030-71852-7