Complexity vs. performance: empirical analysis of machine learning as a service

Published: 01 November 2017
    Abstract

    Machine learning classifiers are basic research tools used in numerous types of network analysis and modeling. To reduce the need for domain expertise and the cost of running local ML classifiers, network researchers can instead rely on centralized Machine Learning as a Service (MLaaS) platforms.
    In this paper, we evaluate the effectiveness of MLaaS systems ranging from fully-automated, turnkey systems to fully-customizable systems, and find that greater user control brings greater risk: good decisions produce even higher performance, while poor decisions incur harsher performance penalties. We also find that server-side optimizations help fully-automated systems outperform competitors' default settings, but these systems still lag far behind well-tuned MLaaS systems, which compare favorably to standalone ML libraries. Finally, we find that classifier choice is the dominant factor in model performance, and that users can approximate the performance of the optimal classifier by experimenting with a small random subset of classifiers. While network researchers should approach MLaaS systems with caution, they can achieve results comparable to standalone classifiers if they have sufficient insight into key decisions such as classifier and feature selection.
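    The abstract's "small random subset" finding can be illustrated with a minimal sketch: sample a few classifiers at random from a candidate pool, cross-validate each, and keep the best. This is an assumption-laden illustration, not the paper's actual experimental setup; the classifier pool, dataset, and subset size below are all hypothetical choices.

```python
# Illustrative sketch (not the paper's methodology): approximate the best
# classifier by cross-validating only a small random subset of candidates.
import random

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real measurement dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hypothetical pool of candidate classifiers.
pool = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(random_state=0),
    GradientBoostingClassifier(random_state=0),
    GaussianNB(),
    KNeighborsClassifier(),
    DecisionTreeClassifier(random_state=0),
]

random.seed(0)
subset = random.sample(pool, k=3)  # evaluate only a small random subset

# Mean 5-fold cross-validation accuracy per sampled classifier.
scores = {type(clf).__name__: cross_val_score(clf, X, y, cv=5).mean()
          for clf in subset}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

    The design point mirrors the abstract's claim: evaluating k of n candidates is much cheaper than an exhaustive sweep, yet the best of the sampled subset often lands close to the globally best choice.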


    Cited By

    • (2024) Network Intrusion Detection System with Machine Learning as a Service. Journal of Information Systems Applied Research 17:3, 4-15. DOI: 10.62273/EWQL5023
    • (2024) Model ChangeLists: Characterizing Updates to ML Models. In Proc. of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2432-2453. DOI: 10.1145/3630106.3659047
    • (2024) Enhancing safety of construction workers in Korea: an integrated text mining and machine learning framework for predicting accident types. International Journal of Injury Control and Safety Promotion 31:2, 203-215. DOI: 10.1080/17457300.2023.2300424


      Published In

      IMC '17: Proceedings of the 2017 Internet Measurement Conference
      November 2017
      509 pages
      ISBN:9781450351188
      DOI:10.1145/3131365

      In-Cooperation

      • USENIX Association

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. cloud computing
      2. machine learning

      Qualifiers

      • Research-article

      Conference

      IMC '17: Internet Measurement Conference
      November 1 - 3, 2017
      London, United Kingdom

      Acceptance Rates

      Overall Acceptance Rate 277 of 1,083 submissions, 26%

