The A-Z of Data: Introduction to MLOps
The A-Z of Data
MLOps, Natural Language Processing,
Computer Vision, Time-Series Forecasting
17 August – Introduction to MLOps
25 August – Monitoring Machine Learning Models in Production
31 August – From research to product with Hydrosphere
8 September – Kubeflow
DVC / use case webinar and expert panel discussion
The A-Z of Data:
Introduction to MLOps
About Me
Dmitry Spodarets
● Head of R&D & ML competency at VITech
● Founder and chief editor of Data Phoenix
● Active participant in the ODS.ai community
Agenda
● What is MLOps?
● Principles and Practices
● ML processes and tools
CRISP-DM
Core phases of an ML solution:
● Experimental phase
● QA phase
● Production phase
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf
The goal of MLOps is to reduce technical friction to get the model from an idea
into production in the shortest possible time with as little risk as possible.
MLOps is not a single tool or platform
MLOps is about agreeing to do ML the right way and then supporting it.
A few shared principles will take you a long way…
ML should be collaborative
A few shared principles will take you a long way…
ML should be reproducible
A few shared principles will take you a long way…
ML should be continuous
A few shared principles will take you a long way…
ML should be tested & monitored
Continuous X
MLOps is an ML engineering culture that includes the following practices:
● Continuous Integration (CI) extends testing and validating code and
components with testing and validating data and models (a data-validation
sketch follows this list).
● Continuous Delivery (CD) concerns the delivery of an ML training pipeline
that automatically deploys another service: the ML model prediction service.
● Continuous Training (CT) is a property unique to ML systems: it
automatically retrains ML models for re-deployment.
● Continuous Monitoring (CM) concerns monitoring production data and
model performance metrics, which are tied to business metrics.
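As an illustration of "testing and validating data" in CI, here is a minimal
pytest-style data check that a CI job could run next to ordinary unit tests.
The file path, column names, and thresholds are hypothetical, not taken from
the slides.

```python
# Hypothetical data-validation check, runnable with pytest as part of CI.
import pandas as pd

EXPECTED_COLUMNS = {"user_id", "age", "label"}


def validate_training_data(path: str = "data/train.csv") -> None:
    df = pd.read_csv(path)
    # Schema check: every expected column must be present.
    missing = EXPECTED_COLUMNS - set(df.columns)
    assert not missing, f"missing columns: {missing}"
    # Basic quality checks: non-empty data, no null labels, sane value ranges.
    assert len(df) > 0, "training set is empty"
    assert df["label"].notna().all(), "label column contains nulls"
    assert df["age"].between(0, 120).all(), "age values out of expected range"


def test_training_data_is_valid():
    validate_training_data()
```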
And tooling will help implement your process
ML should be collaborative
Shared Infrastructure
And tooling will help implement your process
ML should be reproducible
Versioning for Code, Data and Metadata
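One way to implement this is DVC (covered later in the webinar schedule). The
sketch below reads a dataset version pinned to a Git revision through DVC's
Python API; the repo URL, file path, and tag are placeholders.

```python
# A minimal sketch of reading a DVC-versioned dataset pinned to a Git revision.
# The repo URL, file path, and tag are hypothetical placeholders.
import dvc.api

with dvc.api.open(
    "data/train.csv",                                   # DVC-tracked file in the repo
    repo="https://github.com/example-org/ml-project",   # hypothetical repo
    rev="v1.2.0",                                       # Git tag/commit = exact data version
) as f:
    print(f.readline())
```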
And tooling will help implement your process
ML should be continuous
Machine Learning Pipelines
And tooling will help implement your process
ML should be tested & monitored
Model Deployment and Monitoring
Machine Learning Process
Research → Prepare data → Build & train model → Deploy model → Predictions
Machine Learning Process
Stages in the full process (from the diagram): research & discovery; data extraction & collection; data storage; data labeling; data validation; data preparation; feature engineering / feature storage; model training; model evaluation; model validation; model optimization; model storage; model serving; predictions; monitoring.
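Automating these stages as a pipeline is the heart of the MLOps levels below.
As a deliberately minimal, framework-free sketch, the following chains a few of
the stages on a toy scikit-learn dataset; a real pipeline would also cover data
validation, feature storage, a model registry, and monitoring.

```python
# A minimal sketch of an ML pipeline chaining some of the stages listed above.
# The dataset and model are toy choices for illustration only.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def extract_data():
    # Data extraction & collection (a toy dataset stands in for a real source).
    return load_iris(return_X_y=True)


def prepare_data(X, y):
    # Data preparation: hold out an evaluation split.
    return train_test_split(X, y, test_size=0.2, random_state=42)


def train_model(X_train, y_train):
    # Model training.
    return LogisticRegression(max_iter=1000).fit(X_train, y_train)


def evaluate_model(model, X_test, y_test):
    # Model evaluation: in a real pipeline this metric would gate promotion.
    return accuracy_score(y_test, model.predict(X_test))


def run_pipeline():
    X, y = extract_data()
    X_train, X_test, y_train, y_test = prepare_data(X, y)
    model = train_model(X_train, y_train)
    print(f"accuracy={evaluate_model(model, X_test, y_test):.3f}")
    return model


if __name__ == "__main__":
    run_pipeline()
```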
MLOps levels
● Level 0 - Manual process.
● Level 1 - ML pipeline automation.
● Level 2 - CI/CD pipeline automation.
MLOps level 0: Manual process
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
MLOps level 1: ML pipeline automation
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
MLOps level 2: CI/CD pipeline automation
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
MLOps Stack
● LF AI & Data landscape - https://landscape.lfai.foundation/
● THE 2020 DATA & AI LANDSCAPE - https://mattturck.com/data2020/
MLOps Stack
● All-in-one tools
● CI/CD
● Data & Model registry, Tracking Experiments.
● Model serving
● Model monitoring
Jupyter Notebooks
Notebooks have become fundamental to data science.
Problems with notebooks:
● Hard to version
● Very hard to test
● Out-of-order execution artifacts
● Hard to run long or distributed tasks
Netflix bases all ML workflows on Jupyter Notebooks
https://netflixtechblog.com/notebook-innovation-591ee3221233
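One common way to make notebooks testable and schedulable is to execute them
headlessly with parameters, for example with papermill. The notebook paths and
parameter names below are hypothetical; the input notebook needs a cell tagged
"parameters".

```python
# A minimal sketch of running a notebook headlessly with papermill, which makes
# notebooks scriptable enough to test and schedule. Paths and parameters are
# hypothetical.
import papermill as pm

pm.execute_notebook(
    "notebooks/train.ipynb",          # hypothetical input notebook
    "artifacts/train_output.ipynb",   # executed copy, kept as a run artifact
    parameters={"learning_rate": 0.01, "epochs": 5},
)
```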
MLOps Stack: All-in-one tools
MLOps Stack: CI/CD
MLOps Stack: Data & Model registry, tracking experiments
DIY: Store in S3 or push to Git or …
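Beyond the DIY route, a tracking tool can record parameters, metrics, and
artifacts per run. The sketch below uses MLflow as one example (the slide lists
tools only as logos, so the choice of MLflow here is an assumption); the
experiment name, parameters, and metric are illustrative.

```python
# A minimal experiment-tracking sketch using MLflow as an example tool.
# Experiment name, parameters, metric, and artifact path are hypothetical.
import mlflow

mlflow.set_experiment("churn-model")

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("val_accuracy", 0.93)
    mlflow.log_artifact("model.pkl")   # any local file can be attached to the run
```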
MLOps Stack: additional dev stack
● Framework - Catalyst (PyTorch)
● Augmentations - Albumentations
● Code style - black+isort
● Type descriptions - mypy
● Testing - pytest+hypothesis
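As a small illustration of the pytest + hypothesis item above, here is a
property-based test; the normalize() helper is hypothetical and exists only to
show the pattern.

```python
# A property-based test in the pytest + hypothesis style.
from hypothesis import given, strategies as st


def normalize(values):
    # Hypothetical preprocessing helper: scale a list of floats into [0, 1].
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


@given(st.lists(st.floats(min_value=-1e6, max_value=1e6), min_size=1))
def test_normalize_stays_in_unit_interval(values):
    for v in normalize(values):
        assert 0.0 <= v <= 1.0
```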
MLOps Stack: Model serving
DIY: implement a model server (Flask app), Dockerize it, expose an API (HTTP, gRPC, batch, streaming)...
Frameworks & tools: TensorFlow Serving, TorchServe, NVIDIA Triton Inference Server...
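A minimal sketch of the DIY route above: wrap a trained model in a small Flask
app and expose an HTTP prediction endpoint. The model file name and request
format are assumptions; a production service would add input validation,
batching, logging, and health checks.

```python
# Minimal DIY model server: load a pickled model and serve predictions over HTTP.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:   # hypothetical pickled scikit-learn-style model
    model = pickle.load(f)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                  # expects {"instances": [[...], ...]}
    predictions = model.predict(payload["instances"])
    return jsonify({"predictions": predictions.tolist()})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```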
Model Serving Patterns
https://ml-ops.org/
MLOps Stack: Model serving
NVIDIA Triton Inference Server
Amazon SageMaker Edge Manager
MLOps Stack: Model monitoring
DIY: Amazon CloudWatch
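A minimal sketch of the DIY monitoring route: pushing a custom model metric to
Amazon CloudWatch with boto3. The namespace, metric name, and dimension are
hypothetical; in practice this would run inside the prediction service or a
scheduled evaluation job.

```python
# Publish a custom model metric to Amazon CloudWatch.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_data(
    Namespace="MLOps/ChurnModel",   # hypothetical namespace
    MetricData=[
        {
            "MetricName": "PredictionLatencyMs",
            "Dimensions": [{"Name": "ModelVersion", "Value": "v3"}],
            "Value": 42.0,
            "Unit": "Milliseconds",
        }
    ],
)
```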
Example of ML solution architecture
Questions?
Dmitry Spodarets
d.spodarets@dataphoenix.info
https://dataphoenix.info
https://vitechteam.com
