DOI: 10.1145/3510003.3510052
Research article
Open access

Manas: mining software repositories to assist AutoML

Published: 05 July 2022

Abstract

Today, deep learning is widely used for building software, but finding an appropriate convolutional neural network (CNN) model for a given task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras, aims to solve this problem by treating it as a search problem: starting from a default CNN model, mutations of that model are used to explore the space of CNN architectures and find one that works best for the problem at hand. This line of work has had significant success in producing high-accuracy CNN models. There are two problems, however. First, NAS can be very costly, often taking several hours to complete. Second, the CNN models produced by NAS can be so complex that they are harder to understand and costlier to train. We propose a novel approach to NAS in which, instead of starting from a default CNN model, the initial model is selected from a repository of models mined from GitHub. The intuition is that developers who have solved a similar problem may have built a better starting point than the default model. We also analyze common layer patterns of CNN models in the wild to understand the changes developers make to improve their models, and our approach uses these commonly occurring changes as mutation operators in NAS. We have extended Auto-Keras to implement our approach. Our evaluation on 8 top-voted problems from Kaggle, covering tasks including image classification and image regression, shows that, given the same search time and without loss of accuracy, Manas produces models with 42.9% to 99.6% fewer parameters than Auto-Keras' models. Benchmarked on a GPU, Manas' models train 30.3% to 641.6% faster than Auto-Keras' models.
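To make the search strategy sketched in the abstract concrete, below is a minimal, hypothetical illustration of a Manas-style NAS loop: it seeds the search with an architecture drawn from a mined repository rather than a default model, and mutates it with operators modeled on commonly observed layer changes. This is not the authors' implementation; the model representation, the mined repository, the mutation operators, and the evaluate() function are all illustrative stand-ins (a real system builds and trains Keras models, as Manas does by extending Auto-Keras).

import random

# Assumed mined "repository" of starting architectures, keyed by task.
# Models are represented abstractly as lists of layer names for illustration.
MINED_MODELS = {
    "image_classification": [
        ["Conv2D", "BatchNormalization", "ReLU", "MaxPooling2D", "Flatten", "Dense"],
        ["Conv2D", "ReLU", "Conv2D", "ReLU", "MaxPooling2D", "Dropout", "Flatten", "Dense"],
    ],
}

# Mutation operators modeled on commonly mined layer changes (illustrative only).
def add_dropout(model):
    new = list(model)
    new.insert(random.randrange(1, len(new)), "Dropout")
    return new

def add_batch_norm(model):
    new = list(model)
    new.insert(random.randrange(1, len(new)), "BatchNormalization")
    return new

def remove_layer(model):
    new = list(model)
    if len(new) > 3:  # never remove the input-side or output-side layer
        new.pop(random.randrange(1, len(new) - 1))
    return new

MUTATIONS = [add_dropout, add_batch_norm, remove_layer]

def evaluate(model):
    """Stand-in for training the candidate and returning validation accuracy."""
    return random.random()  # a real system would train and validate the model here

def manas_style_search(task, budget=20):
    # Seed the search with a mined model instead of a default architecture.
    best = random.choice(MINED_MODELS[task])
    best_score = evaluate(best)
    for _ in range(budget):
        candidate = random.choice(MUTATIONS)(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

if __name__ == "__main__":
    model, acc = manas_style_search("image_classification")
    print(acc, model)

The two ideas from the paper that the sketch tries to capture are the seeding step (a mined model replaces the default starting point) and the mutation set (derived from changes developers commonly make), while everything else is simplified.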

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN:9781450392211
DOI:10.1145/3510003
This work is licensed under a Creative Commons Attribution International 4.0 License.

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. AutoML
  2. MSR
  3. deep learning
  4. mining software repositories

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%


