DOI: 10.1145/3356250.3360030

Neuro.ZERO: a zero-energy neural network accelerator for embedded sensing and inference systems

Published: 10 November 2019

Abstract

We introduce Neuro.ZERO---a co-processor architecture consisting of a main microcontroller (MCU) that executes a scaled-down version of a deep neural network (DNN) inference task, and an accelerator microcontroller that is powered by harvested energy and follows the intermittent computing paradigm [76]. The goal of the accelerator is to enhance the inference performance of the DNN running on the main microcontroller. Neuro.ZERO opportunistically accelerates the run-time performance of a DNN via one of its four acceleration modes: extended inference, expedited inference, ensemble inference, and latent training. To enable these modes, we propose two sets of algorithms: (1) energy- and intermittence-aware DNN inference and training algorithms, and (2) a fast, high-precision adaptive fixed-point arithmetic that beats existing floating-point arithmetic in speed and existing fixed-point arithmetic in precision, achieving the best of both. To evaluate Neuro.ZERO, we implement low-power image and audio recognition applications and demonstrate that their inference speed increases by 1.6× and 1.7×, respectively, and their inference accuracy increases by 10% and 16%, respectively, compared to battery-powered single-MCU systems.
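
The adaptive fixed-point arithmetic mentioned in the abstract replaces a fixed radix point with one chosen per layer from that layer's observed dynamic range, so fractional precision is not wasted on unused integer bits. Below is a minimal C sketch of that general idea, assuming a 16-bit signed Q-format with a per-layer fractional width; all names here (afp_t, afp_choose_f, afp_mul) are illustrative and are not the paper's actual API.

```c
/* Minimal sketch of per-layer adaptive fixed-point arithmetic.
 * Assumption: 16-bit signed Q-format whose fractional width f is chosen
 * from the layer's dynamic range. Illustrative only, not Neuro.ZERO's API. */
#include <stdint.h>
#include <math.h>

typedef struct {
    int16_t v;  /* raw fixed-point value                  */
    uint8_t f;  /* fractional bits (radix-point position) */
} afp_t;

/* Largest fractional width that still fits |x| <= max_abs in 16 bits
 * (1 sign bit + integer bits + f fractional bits = 16). */
static uint8_t afp_choose_f(float max_abs)
{
    int int_bits = (int)ceilf(log2f(max_abs + 1.0f));
    int f = 15 - int_bits;
    return (uint8_t)(f < 0 ? 0 : f);
}

static afp_t afp_from_float(float x, uint8_t f)
{
    afp_t a = { (int16_t)lrintf(x * (float)(1 << f)), f };
    return a;
}

static float afp_to_float(afp_t a)
{
    return (float)a.v / (float)(1 << a.f);
}

/* Multiply in a 32-bit intermediate (format Q(a.f + b.f)), then shift
 * down to the caller's output format; assumes f_out <= a.f + b.f. */
static afp_t afp_mul(afp_t a, afp_t b, uint8_t f_out)
{
    int32_t wide = (int32_t)a.v * (int32_t)b.v;
    afp_t r = { (int16_t)(wide >> (a.f + b.f - f_out)), f_out };
    return r;
}

int main(void)
{
    /* A layer whose activations never exceed ~3.2 gets f = 12 (Q3.12)
     * under this conservative rule: more fractional precision than a
     * one-size-fits-all Q8.8 format would give it. */
    uint8_t f = afp_choose_f(3.2f);
    afp_t w = afp_from_float(0.37f, f);
    afp_t x = afp_from_float(1.25f, f);
    float y = afp_to_float(afp_mul(w, x, f));
    (void)y;
    return 0;
}
```

On an FPU-less MCU such as the MSP430 family targeted by this paper, an integer multiply-and-shift sequence like this runs much faster than emulated floating point, which is the speed/precision gap the paper's arithmetic is designed to close.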

References

[1]
National Highway Traffic Safety Administration. 2013. Traffic Safety Facts. https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/812124. (2013).
[2]
SF Anderson, JG Earle, RE Goldschmidt, and DM Powers. 1967. The IBM System/360 Model 91: floating-point execution unit. IBM Journal of Research and Development 11, 1 (1967), 34--53.
[3]
Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. 2015. Fixed point optimization of deep convolutional neural networks for object recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 1131--1135.
[4]
Apple. 2017. Neural Engine. https://www.apple.com/iphone-xs/a12-bionic/. (2017).
[5]
Arm. 2013. big.LITTLE technology. https://www.arm.com/files/pdf/bigLITTLETechnologytheFutueofMobile.pdf. (2013).
[6]
Nii O Attoh-Okine. 1999. Analysis of learning rate and momentum term in backpropagation neural network algorithm trained to predict pavement performance. Advances in Engineering Software 30, 4 (1999), 291--302.
[7]
David Audet, Leandro Collares De Oliveira, Neil MacMillan, Dimitri Marinakis, and Kui Wu. 2011. Scheduling recurring tasks in energy harvesting sensors. In 2011 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 277--282.
[8]
Abdulrahman Baknina and Sennur Ulukus. 2017. Online scheduling for energy harvesting channels with processing costs. IEEE Transactions on Green Communications and Networking 1, 3 (2017), 281--293.
[9]
Yoshua Bengio. 2012. Practical recommendations for gradient-based training of deep architectures. In Neural networks: Tricks of the trade. Springer, 437--478.
[10]
Sourav Bhattacharya and Nicholas D Lane. 2016. Sparsification and separation of deep learning layers for constrained resource inference on wearables. In Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM. ACM, 176--189.
[11]
Dudley Allen Buck. 1952. Ferroelectrics for Digital Information Storage and Switching. Technical Report. Massachusetts Institute of Technology, Digital Computer Laboratory, Cambridge, MA.
[12]
Michael Buettner, Ben Greenstein, and David Wetherall. 2011. Dewdrop: an energy-aware runtime for computational RFID. In Proc. USENIX NSDI. 197--210.
[13]
Erik Cambria and Bebo White. 2014. Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine 9, 2 (2014), 48--57.
[14]
Lukas Cavigelli, Michele Magno, and Luca Benini. 2015. Accelerating real-time embedded scene labeling with convolutional networks. In Proceedings of the 52nd Annual Design Automation Conference. ACM, 108.
[15]
Jagmohan Chauhan, Suranga Seneviratne, Yining Hu, Archan Misra, Aruna Seneviratne, and Youngki Lee. 2018. Breathing-Based Authentication on Resource-Constrained IoT Devices using Recurrent Neural Networks. Computer 51, 5 (2018), 60--67.
[16]
Liang-Bi Chen, Wan-Jung Chang, Jian-Ping Su, Ji-Yi Ciou, Yi-Jhan Ciou, Cheng-Chin Kuo, and Katherine Shu-Min Li. 2016. A wearable-glasses-based drowsiness-fatigue-detection system for improving road safety. In 2016 IEEE 5th Global Conference on Consumer Electronics. IEEE, 1--2.
[17]
Tianshi Chen, Zidong Du, Ninghui Sun, Jia Wang, Chengyong Wu, Yunji Chen, and Olivier Temam. 2014. Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM Sigplan Notices 49, 4 (2014), 269--284.
[18]
Maryline Chetto and Hussein El Ghor. 2019. Scheduling and power management in energy harvesting computing systems with real-time constraints. Journal of Systems Architecture (2019).
[19]
Alexei Colin and Brandon Lucia. 2016. Chain: tasks and channels for reliable intermittent programs. ACM SIGPLAN Notices 51, 10 (2016), 514--530.
[20]
Alexei Colin, Emily Ruppel, and Brandon Lucia. 2018. A Reconfigurable Energy Storage Architecture for Energy-harvesting Devices. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 767--781.
[21]
Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2014. Training deep neural networks with low precision multiplications. arXiv preprint arXiv:1412.7024 (2014).
[22]
Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2015. Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in neural information processing systems. 3123--3131.
[23]
George Cybenko. 1989. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems 2, 4 (1989), 303--314.
[24]
Cypress. 2017. CY15B104Q. http://www.cypress.com/file/209146/download. (2017).
[25]
Nicolaas Govert de Bruijn. 1975. Acknowledgement of priority to C. Flye Sainte-Marie on the counting of circular arrangements of 2n zeros and ones that show each n-letter word exactly once. Department of Mathematics, Technological University.
[26]
Li Deng. 2014. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014).
[27]
Li Deng, Geoffrey Hinton, and Brian Kingsbury. 2013. New types of deep neural network learning for speech recognition and related applications: An overview. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 8599--8603.
[28]
Sorin Draghici. 2002. On the capabilities of neural networks using limited precision weights. Neural networks 15, 3 (2002), 395--414.
[29]
Aysegul Dundar, Jonghoon Jin, Berin Martini, and Eugenio Culurciello. 2017. Embedded streaming deep neural networks accelerator with applications. IEEE transactions on neural networks and learning systems 28, 7 (2017), 1572--1583.
[30]
Clément Farabet, Berin Martini, Benoit Corda, Polina Akselrod, Eugenio Culurciello, and Yann LeCun. 2011. Neuflow: A runtime reconfigurable dataflow processor for vision. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on. IEEE, 109--116.
[31]
National Science Foundation. 2019. Real-Time Machine Learning (RTML). https://www.nsf.gov/pubs/2019/nsf19566/nsf19566.htm?WT.mcid=USNSF25&WT.mcev=click. (2019).
[32]
Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics. 249--256.
[33]
Graham Gobieski, Nathan Beckmann, and Brandon Lucia. 2018. Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems. arXiv preprint arXiv:1810.07751 (2018).
[34]
Graham Gobieski, Nathan Beckmann, and Brandon Lucia. 2018. Intermittent Deep Neural Network Inference. SysML (2018).
[35]
Graham Gobieski, Nathan Beckmann, and Brandon Lucia. 2019. Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS).
[36]
Vinayak Gokhale, Jonghoon Jin, Aysegul Dundar, Berin Martini, and Eugenio Culurciello. 2014. A 240 g-ops/s mobile coprocessor for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 682--687.
[37]
Yunchao Gong, Liu Liu, Ming Yang, and Lubomir Bourdev. 2014. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115 (2014).
[38]
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. Vol. 1. MIT Press, Cambridge, MA.
[39]
Google. 2018. Google Clips. https://store.google.com/us/product/googleclips?hl=en-US. (2018).
[40]
Stefanie Günther, Lars Ruthotto, Jacob B Schroder, EC Cyr, and Nicolas R Gauger. 2018. Layer-parallel training of deep residual neural networks. arXiv preprint arXiv:1812.04352 (2018).
[41]
Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In International Conference on Machine Learning. 1737--1746.
[42]
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A Horowitz, and William J Dally. 2016. EIE: efficient inference engine on compressed deep neural network. In Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on. IEEE, 243--254.
[43]
Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015).
[44]
Boris Hanin. 2017. Universal function approximation by deep neural nets with bounded width and relu activations. arXiv preprint arXiv:1708.02691 (2017).
[45]
Lars Kai Hansen and Peter Salamon. 1990. Neural network ensembles. IEEE Transactions on Pattern Analysis & Machine Intelligence 10 (1990), 993--1001.
[46]
Andrew Hard, Kanishka Rao, Rajiv Mathews, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. 2018. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018).
[47]
Douglas M Hawkins. 2004. The problem of overfitting. Journal of chemical information and computer sciences 44, 1 (2004), 1--12.
[48]
Jibo He, William Choi, Yan Yang, Junshi Lu, Xiaohui Wu, and Kaiping Peng. 2017. Detection of driver drowsiness using wearable devices: A feasibility study of the proximity sensor. Applied ergonomics 65 (2017), 473--480.
[49]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
[50]
Josiah Hester, Lanny Sitanayah, and Jacob Sorber. 2015. Tragedy of the coulombs: Federating energy storage for tiny, intermittently-powered sensors. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems. ACM, 5--16.
[51]
Josiah Hester, Kevin Storer, and Jacob Sorber. 2017. Timely Execution on Intermittently Powered Batteryless Sensors. In Proceedings of the 15th ACM Conference on Embedded Networked Sensor Systems (SenSys). ACM.
[52]
Embedded Intelligence Lab (UNC Chapel Hill). 2019. Neuro.ZERO open source project. https://github.com/learning1234embed/Neuro.ZERO. (2019).
[53]
Sepp Hochreiter, Yoshua Bengio, Paolo Frasconi, and Jürgen Schmidhuber. 2001. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. (2001).
[54]
Jordan L Holi and J-N Hwang. 1993. Finite precision error analysis of neural network hardware implementations. IEEE Trans. Comput. 3 (1993), 281--290.
[55]
Kurt Hornik. 1991. Approximation capabilities of multilayer feedforward networks. Neural networks 4, 2 (1991), 251--257.
[56]
Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2017. Quantized neural networks: Training neural networks with low precision weights and activations. The Journal of Machine Learning Research 18, 1 (2017), 6869--6898.
[57]
Kyuyeon Hwang and Wonyong Sung. 2014. Fixed-point feedforward deep neural network design using weights +1, 0, and −1. In Signal Processing Systems (SiPS), 2014 IEEE Workshop on. IEEE, 1--6.
[58]
Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016).
[59]
Shubham Kamdar and Neha Kamdar. 2015. big.LITTLE architecture: Heterogeneous multicore processing. International Journal of Computer Applications 119, 1 (2015).
[60]
Jack Kiefer and Jacob Wolfowitz. 1952. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics 23, 3 (1952), 462--466.
[61]
Jonghong Kim, Kyuyeon Hwang, and Wonyong Sung. 2014. X1000 real-time phoneme recognition VLSI using feed-forward deep neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 7510--7514.
[62]
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[63]
Alex Krizhevsky and Geoffrey Hinton. 2009. Learning multiple layers of features from tiny images. Technical Report. Citeseer.
[64]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105.
[65]
Anders Krogh and Jesper Vedelsby. 1995. Neural network ensembles, cross validation, and active learning. In Advances in neural information processing systems. 231--238.
[66]
Nicholas D Lane, Sourav Bhattacharya, Akhil Mathur, Petko Georgiev, Claudio Forlivesi, and Fahim Kawsar. 2017. Squeezing deep learning into mobile and embedded devices. IEEE Pervasive Computing 3 (2017), 82--88.
[67]
Steve Lawrence, C Lee Giles, and Ah Chung Tsoi. 1997. Lessons in neural network training: Overfitting may be harder than expected. In AAAI/IAAI. Citeseer, 540--545.
[68]
Steve Lawrence, C Lee Giles, and Ah Chung Tsoi. 1998. What size neural network gives optimal generalization? Convergence properties of backpropagation. Technical Report.
[69]
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (2015), 436.
[70]
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278--2324.
[71]
Dong-Hyun Lee. 2013. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Workshop on Challenges in Representation Learning, ICML, Vol. 3. 2.
[72]
Mu Li, Tong Zhang, Yuqiang Chen, and Alexander J Smola. 2014. Efficient mini-batch training for stochastic optimization. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 661--670.
[73]
Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, and Yoshua Bengio. 2015. Neural networks with few multiplications. arXiv preprint arXiv:1510.03009 (2015).
[74]
Hong Lu, AJ Bernheim Brush, Bodhi Priyantha, Amy K Karlson, and Jie Liu. 2011. Speakersense: Energy efficient unobtrusive speaker identification on mobile phones. In International conference on pervasive computing. Springer, 188--205.
[75]
Zhou Lu, Hongming Pu, Feicheng Wang, Zhiqiang Hu, and Liwei Wang. 2017. The expressive power of neural networks: A view from the width. In Advances in Neural Information Processing Systems. 6231--6239.
[76]
Brandon Lucia, Vignesh Balaji, Alexei Colin, Kiwan Maeng, and Emily Ruppel. 2017. Intermittent Computing: Challenges and Opportunities. In LIPIcs-Leibniz International Proceedings in Informatics, Vol. 71. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
[77]
Brandon Lucia and Benjamin Ransford. 2015. A simpler, safer programming and execution model for intermittent systems. ACM SIGPLAN Notices 50, 6 (2015), 575--585.
[78]
Yubo Luo and Shahriar Nirjon. 2019. SpotON: Just-in-Time Active Event Detection on Energy Autonomous Sensing Systems. In Proceedings of the 25th IEEE RealTime and Embedded Technology and Applications Symposium (RTAS WIP Session). IEEE, Montreal, Canada.
[79]
Kiwan Maeng, Alexei Colin, and Brandon Lucia. 2017. Alpaca: intermittent execution without checkpoints. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017), 96.
[80]
Franco Manessi, Alessandro Rozza, Simone Bianco, Paolo Napoletano, and Raimondo Schettini. 2018. Automated pruning for deep neural network compression. In 2018 24th International Conference on Pattern Recognition (ICPR). IEEE, 657--664.
[81]
Dominic Masters and Carlo Luschi. 2018. Revisiting small batch training for deep neural networks. arXiv preprint arXiv:1804.07612 (2018).
[82]
Paul A Merolla, John V Arthur, Rodrigo Alvarez-Icaza, Andrew S Cassidy, Jun Sawada, Filipp Akopyan, Bryan L Jackson, Nabil Imam, Chen Guo, and Yutaka Nakamura. 2014. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345, 6197 (2014), 668--673.
[83]
Milad Mohammadi and Subhasis Das. 2016. SNN: stacked neural networks. arXiv preprint arXiv:1605.08512 (2016).
[84]
Clemens Moser, Davide Brunelli, Lothar Thiele, and Luca Benini. 2007. Real-time scheduling for energy harvesting sensor nodes. Real-Time Systems 37, 3 (2007), 233--260.
[85]
Saman Naderiparizi, Pengyu Zhang, Matthai Philipose, Bodhi Priyantha, Jie Liu, and Deepak Ganesan. 2017. Glimpse: A programmable early-discard camera architecture for continuous mobile vision. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 292--305.
[86]
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. 2011. Reading digits in natural images with unsupervised feature learning. (2011).
[87]
Shahriar Nirjon. 2018. Lifelong Learning on Harvested Energy. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 500--501.
[88]
Erick L Oberstar. 2007. Fixed-point representation & fractional math. Oberstar Consulting (2007), 9.
[89]
Sinno Jialin Pan and Qiang Yang. 2009. A survey on transfer learning. IEEE Transactions on knowledge and data engineering 22, 10 (2009), 1345--1359.
[90]
Mohammad Peikari, Sherine Salama, Sharon Nofech-Mozes, and Anne L Martel. 2018. A cluster-then-label semi-supervised learning approach for pathology image classification. Scientific reports 8, 1 (2018), 7193.
[91]
Powercast. 2016. Powercast p2110b. http://www.powercastco.com/wp-content/uploads/2016/12/P2110B-Datasheet-Rev-3.pdf. (2016).
[92]
Powercast. 2016. Powercaster transmitter. http://www.powercastco.com/wp-content/uploads/2016/11/User-Manual-TX-915-01-Rev-A-4.pdf. (2016).
[93]
Jiantao Qiu, Jie Wang, Song Yao, Kaiyuan Guo, Boxun Li, Erjin Zhou, Jincheng Yu, Tianqi Tang, Ningyi Xu, and Sen Song. 2016. Going deeper with embedded fpga platform for convolutional neural network. In Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 26--35.
[94]
Qualcomm. 2017. Snapdragon 845 Mobile Platform. https://www.qualcomm.com/media/documents/files/snapdragon-845-mobile-platform-product-brief.pdf. (2017).
[95]
Qualcomm. 2018. Qualcomm Snapdragon 820E Processor (APQ8096SGE). https://developer.qualcomm.com/download/sd820e/qualcomm-snapdragon-820e-processor-apq8096sge-device-specification.pdf. (2018).
[96]
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. Xnor-net: Imagenet classification using binary convolutional neural networks. In European Conference on Computer Vision. Springer, 525--542.
[97]
Herbert Robbins and Sutton Monro. 1985. A stochastic approximation method. In Herbert Robbins Selected Papers. Springer, 102--109.
[98]
Mathieu Rouaud. 2012. Probabilités, statistiques et analyses multicritères [Probability, Statistics and Multicriteria Analyses]. (2012).
[99]
David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. 1988. Learning representations by back-propagating errors. Cognitive modeling 5, 3 (1988), 1.
[100]
Tara Sainath and Carolina Parada. 2015. Convolutional neural networks for small-footprint keyword spotting. (2015).
[101]
Jürgen Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural networks 61 (2015), 85--117.
[102]
Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition. 815--823.
[103]
Khurram Shahzad and Bengt Oelmann. 2014. Investigating Energy Consumption of an SRAM-based FPGA for Duty-Cycle Applications. In International Conference on Parallel Computing-ParCo 2013, 10-13 Sept, Munich. 548--559.
[104]
Rajiv Ranjan Singh. 2007. Preventing Road Accidents with Wearable Biosensors and Innovative Architectural Design. In 2nd ISSS National Conference on MEMS, Pilani, India. 1--8.
[105]
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929--1958.
[106]
Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. 2011. The German Traffic Sign Recognition Benchmark: A multi-class classification competition. In IEEE International Joint Conference on Neural Networks. 1453--1460.
[107]
Texas Instruments. 2018. MSP430FR5994. http://www.ti.com/product/MSP430FR5994. (2018).
[108]
Lisa Torrey and Jude Shavlik. 2010. Transfer learning. In Handbook of research on machine learning applications and trends: algorithms, methods, and techniques. IGI Global, 242--264.
[109]
Varuna Tyagi and Anju Mishra. 2014. A survey on ensemble combination schemes of neural network. International Journal of Computer Applications 95, 16 (2014).
[110]
James Victor Uspensky. 1937. Introduction to mathematical probability. (1937).
[111]
Vincent Vanhoucke, Andrew Senior, and Mark Z Mao. 2011. Improving the speed of neural networks on CPUs. In Proc. Deep Learning and Unsupervised Feature Learning NIPS Workshop, Vol. 1. Citeseer, 4.
[112]
Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, and Xuehai Zhou. 2016. DLAU: A scalable deep learning accelerator unit on FPGA. arXiv preprint arXiv:1605.06894 (2016).
[113]
Lipo Wang, Hou Chai Quek, Keng Hoe Tee, Nina Zhou, and Chunru Wan. 2005. Optimal size of a feedforward neural network: How much does it matter? In Joint International Conference on Autonomic and Autonomous Systems and International Conference on Networking and Services (ICAS/ICNS '05). IEEE, 69--69.
[114]
Yue Wang, Tan Nguyen, Yang Zhao, Zhangyang Wang, Yingyan Lin, and Richard Baraniuk. 2018. EnergyNet: Energy-Efficient Dynamic Inference. (2018).
[115]
Pete Warden. 2018. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. arXiv preprint arXiv:1804.03209 (2018). https://arxiv.org/abs/1804.03209
[116]
Alan R Weiss. 2002. Dhrystone benchmark: History, analysis, scores and recommendations. (2002).
[117]
Paul J Werbos. 1990. Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 10 (1990), 1550--1560.
[118]
Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv preprint arXiv:1708.07747 (2017).
[119]
Xilinx. 2011. Spartan-6 Family Overview. https://www.xilinx.com/support/documentation/datasheets/ds160.pdf. (2011).
[120]
Shuochao Yao, Yiran Zhao, Aston Zhang, Shaohan Hu, Huajie Shao, Chao Zhang, Lu Su, and Tarek Abdelzaher. 2018. Deep Learning for the Internet of Things. Computer 51, 5 (2018), 32--41.
[121]
Tom Young, Devamanyu Hazarika, Soujanya Poria, and Erik Cambria. 2018. Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine 13, 3 (2018), 55--75.
[122]
Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, and Jason Cong. 2015. Optimizing fpga-based accelerator design for deep convolutional neural networks. In Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 161--170.
[123]
Aojun Zhou, Anbang Yao, Kuan Wang, and Yurong Chen. 2018. Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9426--9435.
[124]
Xiaojin Zhu and Andrew B Goldberg. 2009. Introduction to semi-supervised learning. Synthesis lectures on artificial intelligence and machine learning 3, 1 (2009), 1--130.

Published In

SenSys '19: Proceedings of the 17th Conference on Embedded Networked Sensor Systems
November 2019
472 pages
ISBN:9781450369503
DOI:10.1145/3356250
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. accelerator
  2. batteryless
  3. deep neural networks
  4. zero energy

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 174 of 867 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 146
  • Downloads (last 6 weeks): 24
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Fast-Inf: Ultra-Fast Embedded Intelligence on the Batteryless Edge. Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 239-252. https://doi.org/10.1145/3666025.3699335. Online publication date: 4-Nov-2024.
  • (2024) CRAM-Based Acceleration for Intermittent Computing of Parallelizable Tasks. IEEE Transactions on Emerging Topics in Computing 12(1), 48-59. https://doi.org/10.1109/TETC.2023.3293426. Online publication date: Jan-2024.
  • (2024) Unsupervised Joint Domain Adaptation for Decoding Brain Cognitive States From tfMRI Images. IEEE Journal of Biomedical and Health Informatics 28(3), 1494-1503. https://doi.org/10.1109/JBHI.2023.3348130. Online publication date: Mar-2024.
  • (2024) LACT. Journal of Systems Architecture: the EUROMICRO Journal 153(C). https://doi.org/10.1016/j.sysarc.2024.103213. Online publication date: 1-Aug-2024.
  • (2023) DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems. ACM Transactions on Embedded Computing Systems 22(5s), 1-23. https://doi.org/10.1145/3609121. Online publication date: 9-Sep-2023.
  • (2023) Fine-grained Hardware Acceleration for Efficient Batteryless Intermittent Inference on the Edge. ACM Transactions on Embedded Computing Systems 22(5), 1-19. https://doi.org/10.1145/3608475. Online publication date: 10-Jul-2023.
  • (2023) SoundSieve: Seconds-Long Audio Event Recognition on Intermittently-Powered Systems. Proceedings of the 21st Annual International Conference on Mobile Systems, Applications and Services, 28-41. https://doi.org/10.1145/3581791.3596859. Online publication date: 18-Jun-2023.
  • (2023) BOBBER: A Prototyping Platform for Batteryless Intermittent Accelerators. Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 221-228. https://doi.org/10.1145/3543622.3573046. Online publication date: 12-Feb-2023.
  • (2022) Protean. Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, 207-221. https://doi.org/10.1145/3560905.3568561. Online publication date: 6-Nov-2022.
  • (2022) Adaptive Intelligence for Batteryless Sensors Using Software-Accelerated Tsetlin Machines. Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, 236-249. https://doi.org/10.1145/3560905.3568512. Online publication date: 6-Nov-2022.
