SecDeep: Secure and Performant On-device Deep Learning Inference Framework for Mobile and IoT Devices

Authors:

Mani Srivastava

IoTDI '21: Proceedings of the International Conference on Internet-of-Things Design and Implementation

Pages 67 - 79

https://doi.org/10.1145/3450268.3453524

Published: 18 May 2021

Abstract

There is an increasing emphasis on securing deep learning (DL) inference pipelines for mobile and IoT applications that handle privacy-sensitive data. Prior work has shown that privacy-sensitive data can be secured throughout deep learning inference on cloud-offloaded models through trusted execution environments (TEEs) such as Intel SGX. However, prior solutions do not address the fundamental challenges of securing resource-intensive inference tasks on low-power, low-memory devices (e.g., mobile and IoT devices) while achieving high performance. To tackle these challenges, we propose SecDeep, a low-power DL inference framework demonstrating that both security and performance of deep learning inference on edge devices are well within our reach. Leveraging TEEs with limited resources, SecDeep guarantees full confidentiality for input and intermediate data, as well as the integrity of the deep learning model and framework. By enabling and securing neural accelerators, SecDeep is the first of its kind to provide trusted and performant DL model inferencing on IoT and mobile devices. We implement and validate SecDeep by interfacing the ARM NN DL framework with ARM TrustZone. Our evaluation shows that, by leveraging edge-available accelerators, SecDeep securely runs inference tasks 16× to 172× faster than approaches without acceleration.

Index Terms

SecDeep: Secure and Performant On-device Deep Learning Inference Framework for Mobile and IoT Devices
1. Security and privacy
  1. Software and application security
    1. Software security engineering
  2. Systems security
    1. Operating systems security
      1. Mobile platform security
      2. Trusted computing

Published In

IoTDI '21: Proceedings of the International Conference on Internet-of-Things Design and Implementation

May 2021

288 pages

ISBN:9781450383547

DOI:10.1145/3450268

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

IoTDI '21

Sponsor:

SIGBED

IoTDI '21: International Conference on Internet-of-Things Design and Implementation

May 18 - 21, 2021

Charlottesville, VA, USA
