Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks

Authors:

Dong-Wan Choi

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data

Pages 2244 - 2252

https://doi.org/10.1145/3448016.3457326

Published: 18 June 2021

Abstract

Despite the great success of deep learning, training and delivering a practically serviceable model remains a highly time-consuming process. Furthermore, the resulting model is usually too generic and heavyweight, and hence usually needs another expensive model compression phase to fit on a resource-limited device such as an embedded system. Inspired by the observation that the machine learning task a mobile user actually requests is often much simpler than what a massive generic model supports, this paper proposes a framework, called Pool of Experts (PoE), that instantly builds a lightweight, task-specific model without any training process. To provide a realtime model querying service, PoE first extracts a pool of primitive components, called experts, from a well-trained and sufficiently generic network using a novel conditional knowledge distillation method, and then performs train-free knowledge consolidation to quickly combine the necessary experts into a lightweight network for the target task. Thanks to this train-free property, our thorough empirical study shows that PoE can build a fairly accurate yet compact model in real time, whereas other training-based methods take a few minutes per query to reach a similar level of accuracy.
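The query-time step described above can be illustrated with a minimal sketch. The abstract does not specify how experts are represented or how consolidation works, so everything here is an assumption made for illustration: the `Expert` class, the `consolidate` function, the toy weights, and the choice to combine experts by concatenating their logits over shared features are all hypothetical and are not the paper's actual method. The only property the sketch tries to capture is that answering a query involves selecting and combining pre-extracted experts, with no training step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = List[float]


@dataclass
class Expert:
    """Hypothetical expert: a small linear head over shared features."""
    classes: List[str]     # class labels this expert covers
    weights: List[Vector]  # one weight row per class

    def logits(self, features: Vector) -> Vector:
        # Dot product of each weight row with the feature vector.
        return [sum(w * x for w, x in zip(row, features)) for row in self.weights]


def consolidate(pool: Dict[str, Expert], query: Sequence[str]) -> Callable[[Vector], str]:
    """Combine the requested experts into one classifier without training.

    Assumption: consolidation is modeled as concatenating expert logits.
    """
    chosen = [pool[name] for name in query]  # raises KeyError for unknown experts
    labels = [c for e in chosen for c in e.classes]

    def model(features: Vector) -> str:
        scores = [s for e in chosen for s in e.logits(features)]
        best = max(range(len(scores)), key=scores.__getitem__)
        return labels[best]

    return model


# Toy pool with made-up weights; a real pool would come from distillation.
pool = {
    "pets": Expert(["cat", "dog"], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "vehicles": Expert(["car", "truck"], [[0.0, 0.0, 1.0], [0.0, 0.5, 0.5]]),
}

model = consolidate(pool, ["pets", "vehicles"])
print(model([0.1, 0.2, 0.9]))  # prints "car"
```

Because `consolidate` only looks up experts and closes over them, the cost of serving a query in this sketch is independent of any training budget, which is the property the abstract attributes to PoE.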

Index Terms

Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information systems applications
    1. Data mining
    2. Mobile information processing systems

Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data

June 2021

2969 pages

ISBN:9781450383431

DOI:10.1145/3448016

General Chairs: Guoliang Li (Tsinghua University, China), Zhanhuai Li (Northwestern Polytechnical University, China)

Program Chairs: Stratos Idreos (Harvard University, USA), Divesh Srivastava (AT&T, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGMOD/PODS '21: International Conference on Management of Data

June 20 - 25, 2021

Virtual Event, China

Sponsor: SIGMOD

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
