DOI: 10.1145/3131672.3131675

DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework

Published: 06 November 2017

Abstract

Recent advances in deep learning motivate the use of deep neural networks in sensing applications, but their excessive resource needs on constrained embedded devices remain an important impediment. A recently explored solution space lies in compressing (approximating or simplifying) deep neural networks in some manner before use on the device. We propose a new compression solution, called DeepIoT, that makes two key contributions in that space. First, unlike current solutions geared for compressing specific types of neural networks, DeepIoT presents a unified approach that compresses all commonly used deep learning structures for sensing applications, including fully-connected, convolutional, and recurrent neural networks, as well as their combinations. Second, unlike solutions that either sparsify weight matrices or assume linear structure within weight matrices, DeepIoT compresses neural network structures into smaller dense matrices by finding the minimum number of non-redundant hidden elements, such as filters and dimensions required by each layer, while keeping the performance of sensing applications the same. Importantly, it does so using an approach that obtains a global view of parameter redundancies, which is shown to produce superior compression. The compressed model generated by DeepIoT can directly use existing deep learning libraries that run on embedded and mobile systems without further modifications. We conduct experiments with five different sensing-related tasks on Intel Edison devices. DeepIoT outperforms all compared baseline algorithms with respect to execution time and energy consumption by a significant margin. It reduces the size of deep neural networks by 90% to 98.9%. It is thus able to shorten execution time by 71.4% to 94.5%, and decrease energy consumption by 72.2% to 95.7%. These improvements are achieved without loss of accuracy. The results underscore the potential of DeepIoT for advancing the exploitation of deep neural networks on resource-constrained embedded devices.
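
The idea of compressing a network "into smaller dense matrices" by removing hidden elements can be illustrated with a minimal sketch. It is not the DeepIoT algorithm: the abstract does not describe how the compressor-critic framework picks which elements to drop, so the keep-mask below is chosen by hand and is purely hypothetical, as are all function names, layer sizes, and the omission of bias terms. The sketch only shows that dropping a hidden unit removes one row of the first weight matrix and the matching column of the second, leaving smaller dense matrices that ordinary dense matrix code can evaluate.

```python
import random

def matvec(W, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def forward(W1, W2, x, mask=None):
    """Two-layer fully-connected net (no biases); mask zeroes hidden units."""
    h = relu(matvec(W1, x))
    if mask is not None:
        h = [a if keep else 0.0 for a, keep in zip(h, mask)]
    return matvec(W2, h)

def compress(W1, W2, mask):
    """Drop masked-out hidden units: rows of W1 and matching columns of W2."""
    kept = [i for i, keep in enumerate(mask) if keep]
    W1_small = [W1[i] for i in kept]
    W2_small = [[row[i] for i in kept] for row in W2]
    return W1_small, W2_small

def n_params(*mats):
    """Count the weights stored in a set of dense matrices."""
    return sum(len(M) * len(M[0]) for M in mats)

random.seed(0)
n_in, n_hidden, n_out = 8, 20, 3  # hypothetical layer sizes
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
x = [random.uniform(-1, 1) for _ in range(n_in)]

# Hypothetical keep-mask chosen by hand: keep 2 of the 20 hidden units.
mask = [i in (3, 11) for i in range(n_hidden)]

W1_small, W2_small = compress(W1, W2, mask)
y_masked = forward(W1, W2, x, mask)        # full matrices, units zeroed out
y_small = forward(W1_small, W2_small, x)   # smaller dense matrices
reduction = 1 - n_params(W1_small, W2_small) / n_params(W1, W2)

print("parameters:", n_params(W1, W2), "->", n_params(W1_small, W2_small))
print(f"size reduction: {reduction:.1%}")
```

The smaller network reproduces the masked network's output, not the original unmasked one. Keeping accuracy unchanged after removal is the part the paper attributes to its own method, and this sketch makes no attempt at it.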

        Published In

        SenSys '17: Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems
        November 2017
        490 pages
        ISBN: 9781450354592
        DOI: 10.1145/3131672
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Deep Learning
        2. Internet of Things
        3. Mobile Computing
        4. Model Compression
        5. Structure Compression

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

        • NSF
        • Army Research Laboratory

        Acceptance Rates

        Overall Acceptance Rate 198 of 990 submissions, 20%

        Article Metrics

        • Downloads (Last 12 months): 289
        • Downloads (Last 6 weeks): 37
        Reflects downloads up to 25 Feb 2025

        Cited By

        • (2025) Real-Time Semantic Segmentation via Spatial-Detail Guided Context Propagation. IEEE Transactions on Neural Networks and Learning Systems 36:3 (4042-4053). DOI: 10.1109/TNNLS.2022.3154443. Online publication date: Mar-2025
        • (2025) Joint Optimization of Device Placement and Model Partitioning for Cooperative DNN Inference in Heterogeneous Edge Computing. IEEE Transactions on Mobile Computing 24:1 (210-226). DOI: 10.1109/TMC.2024.3457793. Online publication date: Jan-2025
        • (2025) Discretized-Isolation Forest: Memory- and Compute-Efficient Unsupervised Anomaly Detection for Resource-Constrained Internet of Things Edge Devices. IEEE Internet of Things Journal 12:2 (1699-1717). DOI: 10.1109/JIOT.2024.3468950. Online publication date: 15-Jan-2025
        • (2024) A Flexible, Large-Scale Sensing Array with Low-Power In-Sensor Intelligence. Research 7. DOI: 10.34133/research.0497. Online publication date: 13-Nov-2024
        • (2024) Combining Machine Learning and Edge Computing: Opportunities, Challenges, Platforms, Frameworks, and Use Cases. Electronics 13:3 (640). DOI: 10.3390/electronics13030640. Online publication date: 3-Feb-2024
        • (2024) A Systematic Evaluation of Recurrent Neural Network Models for Edge Intelligence and Human Activity Recognition Applications. Algorithms 17:3 (104). DOI: 10.3390/a17030104. Online publication date: 28-Feb-2024
        • (2024) SpotOn: Adversarially Robust Keyword Spotting on Resource-Constrained IoT Platforms. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (684-699). DOI: 10.1145/3634737.3637668. Online publication date: 1-Jul-2024
        • (2024) PArtNNer: Platform-Agnostic Adaptive Edge-Cloud DNN Partitioning for Minimizing End-to-End Latency. ACM Transactions on Embedded Computing Systems 23:1 (1-38). DOI: 10.1145/3630266. Online publication date: 10-Jan-2024
        • (2024) Designing and Training of Lightweight Neural Networks on Edge Devices using Early Halting in Knowledge Distillation. IEEE Transactions on Mobile Computing (1-12). DOI: 10.1109/TMC.2023.3297026. Online publication date: 2024
        • (2024) Beyond Federated Learning: Survival-Critical Machine Learning. 2024 IEEE/ACM Symposium on Edge Computing (SEC) (483-489). DOI: 10.1109/SEC62691.2024.00056. Online publication date: 4-Dec-2024
