From Server-Based to Client-Based Machine Learning: A Comprehensive Survey

Published: 02 January 2021

    Abstract
    In recent years, mobile devices have advanced rapidly, gaining stronger computation capability and larger storage space, so that some computation-intensive machine learning tasks can now run directly on them. To exploit the resources available on mobile devices and to preserve personal privacy, the concept of client-based machine learning has been proposed: it leverages the users’ local hardware and local data to solve machine learning sub-problems on mobile devices, and it uploads only the computation results, rather than the original data, for the optimization of the global model. Such an architecture not only relieves the computation and storage burdens on servers but also protects the users’ sensitive information. A further benefit is reduced bandwidth consumption, because many kinds of local data can participate in training without ever being uploaded. In this article, we provide a literature review of the progressive development of machine learning from server based to client based. We revisit a number of widely used server-based and client-based machine learning methods and applications, and we extensively discuss the challenges and future directions in this area. We believe that this survey gives a clear overview of client-based machine learning and provides guidelines for applying it in practice.
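
    To make this architecture concrete, the sketch below simulates one round-based instantiation of client-based training in the spirit of FedAvg (McMahan et al.). It is a minimal illustration under our own assumptions: the function names, the least-squares task, and the size-weighted averaging rule are chosen for brevity and are not prescribed by the survey. Each client refines the global model on its private data and uploads only the resulting weights; the server aggregates them and never sees the raw data.

        # Minimal client-based training sketch (FedAvg-style; illustrative only).
        import numpy as np

        def local_update(weights, X, y, lr=0.1, epochs=5):
            # Client side: refine the global weights on private data with
            # plain gradient descent on a least-squares objective.
            w = weights.copy()
            for _ in range(epochs):
                grad = X.T @ (X @ w - y) / len(y)
                w -= lr * grad
            return w  # only this result leaves the device

        def federated_round(weights, clients):
            # Server side: combine the uploaded results, weighting each
            # client by its local sample count (the FedAvg averaging rule).
            results = [(local_update(weights, X, y), len(y)) for X, y in clients]
            total = sum(n for _, n in results)
            return sum(w * (n / total) for w, n in results)

        # Toy usage: three clients hold private linear-regression data.
        rng = np.random.default_rng(0)
        true_w = np.array([2.0, -1.0])
        clients = []
        for _ in range(3):
            X = rng.normal(size=(50, 2))
            clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))
        w = np.zeros(2)
        for _ in range(20):  # 20 communication rounds
            w = federated_round(w, clients)
        print(w)  # converges toward true_w; no raw data leaves a client

    Note that only the weight vectors cross the network in this sketch: bandwidth scales with model size and round count rather than with the size of the local datasets, which is exactly the trade-off the abstract describes.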




    Published In

    ACM Computing Surveys, Volume 54, Issue 1
    January 2022, 844 pages
    ISSN: 0360-0300
    EISSN: 1557-7341
    DOI: 10.1145/3446641
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 January 2021
    Accepted: 01 September 2020
    Revised: 01 July 2020
    Received: 01 January 2020
    Published in CSUR Volume 54, Issue 1


    Author Tags

    1. Mobile intelligence
    2. decentralized training
    3. distributed system
    4. federated learning
    5. machine learning

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Key R&D Program of China
    • National Natural Science Foundation of China
    • Joint Scientific Research Foundation of the State Education Ministry
    • Alibaba Innovation Research Program


    Cited By

    • (2024) Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3613904.3642628. Online publication date: 11-May-2024.
    • (2024) Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642109. Online publication date: 11-May-2024.
    • (2023) Offloading Machine Learning to Programmable Data Planes: A Systematic Survey. ACM Computing Surveys 56, 1, 1-34. DOI: 10.1145/3605153. Online publication date: 26-Aug-2023.
    • (2023) Scheduling Algorithms for Federated Learning With Minimal Energy Consumption. IEEE Transactions on Parallel and Distributed Systems 34, 4, 1215-1226. DOI: 10.1109/TPDS.2023.3240833. Online publication date: 1-Apr-2023.
    • (2023) Collaborative Machine Learning: Schemes, Robustness, and Privacy. IEEE Transactions on Neural Networks and Learning Systems 34, 12, 9625-9642. DOI: 10.1109/TNNLS.2022.3169347. Online publication date: Dec-2023.
    • (2023) Edge Intelligence in Intelligent Transportation Systems: A Survey. IEEE Transactions on Intelligent Transportation Systems 24, 9, 8919-8944. DOI: 10.1109/TITS.2023.3275741. Online publication date: 1-Sep-2023.
    • (2023) Medicine Recommendation for Pharmacists at Drug Stores based on Data Analysis of Conditional Knowledge. 2023 8th International Conference on Business and Industrial Research (ICBIR), 1199-1204. DOI: 10.1109/ICBIR57571.2023.10147477. Online publication date: 18-May-2023.
    • (2023) FedCPSO: Federated Learning with Combined Particle Swarm Optimization. 2023 China Automation Congress (CAC), 3817-3822. DOI: 10.1109/CAC59555.2023.10451632. Online publication date: 17-Nov-2023.
    • (2023) TongueMobile: automated tongue segmentation and diagnosis on smartphones. Neural Computing and Applications 35, 28, 21259-21274. DOI: 10.1007/s00521-023-08902-5. Online publication date: 4-Aug-2023.
    • (2022) Cyberattacks Defense in Digital Music Streaming Platforms by Mobile Distributed Machine Learning. Computational Intelligence and Neuroscience 2022. DOI: 10.1155/2022/1701266. Online publication date: 1-Jan-2022.
