DOI: 10.1145/3666015.3666018
research-article
Open access

Insights on Implementing a Metrics Baseline for Post-Deployment AI Container Monitoring

Published: 04 September 2024

Abstract

Post-deployment monitoring (PDM) occurs in the late stages of a DevSecOps (DSO) pipeline. It plays a critical role in DSO by providing feedback loops on system performance, which drive the changes needed for long-term system and application sustainment. Containers are the de facto deployed artifacts in DSO for diverse systems and applications, including AI models. Sustaining a containerized AI model over the long term requires appropriate metrics for maintaining optimal container and model computing performance and correct model inference. There is currently no agreed-upon set of metrics that should always be present when monitoring a deployed containerized AI model, and both the literature and practice can benefit from a standard baseline of metrics for long-term monitoring of containerized AI models that focuses on computing and inference. In this paper, we propose a candidate baseline of metrics for consideration as a standard across PDM for any containerized AI model. We present a proof-of-concept (PoC) that implements this baseline for the continuous monitoring of an operationally deployed containerized AI model. The baseline represents the minimal metrics required for any deployed, actively operating containerized model to ensure successful long-term monitoring and to support optimal operation and performance. The metrics focus on container operation, model operation, and model inference. The paper also details the raw data required for the metrics, and the PoC demonstrates the container engineering needed to acquire them. The baseline is a mix of dynamic metrics, customized for each problem class (e.g., object detection, regression) and data modality, and static metrics that should be present for any containerized model. The paper further shows that a containerized AI model can be engineered to produce these metrics, and describes how a standardized baseline of metrics can aid in reducing power consumption in the global digital enterprise.
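As a rough illustration of the static/dynamic split described in the abstract, the following minimal sketch shows one way a monitoring report could combine a fixed set of container and model metrics with metrics selected by problem class. Every name here (`StaticMetrics`, `DYNAMIC_METRICS`, `build_report`, the field names, and the problem-class keys) is a hypothetical placeholder chosen for illustration; the abstract does not list the paper's actual metrics, so this should not be read as the authors' baseline or implementation.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Dict, Sequence


@dataclass
class StaticMetrics:
    """Metrics assumed present for any containerized model (hypothetical fields)."""
    cpu_percent: float
    memory_bytes: int
    uptime_seconds: float
    inference_latency_ms: float


def precision(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Binary precision: true positives / predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Binary recall: true positives / actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if not y_true:
        return 0.0
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


MetricFn = Callable[[Sequence[float], Sequence[float]], float]

# Dynamic metrics keyed by problem class (assumed mapping, for illustration only).
DYNAMIC_METRICS: Dict[str, Dict[str, MetricFn]] = {
    "binary_classification": {"precision": precision, "recall": recall},
    "regression": {"mae": mean_absolute_error},
}


def build_report(problem_class: str, static: StaticMetrics,
                 y_true: Sequence[float], y_pred: Sequence[float]) -> dict:
    """Combine the static metrics with the dynamic metrics for one problem class."""
    dynamic = {name: fn(y_true, y_pred)
               for name, fn in DYNAMIC_METRICS[problem_class].items()}
    return {"static": asdict(static), "dynamic": dynamic}


if __name__ == "__main__":
    static = StaticMetrics(cpu_percent=12.5, memory_bytes=512_000_000,
                           uptime_seconds=3600.0, inference_latency_ms=42.0)
    print(build_report("binary_classification", static, [1, 0, 1, 1], [1, 0, 0, 1]))
```

Keeping the dynamic metrics in a lookup keyed by problem class means the static part of the report stays identical across models, while the inference part changes with the task.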


Published In

ICSSP '24: Proceedings of the 2024 International Conference on Software and Systems Processes
September 2024
106 pages
ISBN: 9798400709913
DOI: 10.1145/3666015
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. AI
  2. Artificial Intelligence
  3. Container
  4. Continuous Deployment
  5. Deployment
  6. DevOps
  7. DevSecOps
  8. ML
  9. Machine Learning
  10. Metrics
  11. Performance

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICSSP '24
