During the MLOps CodeBreakfast, we give an introduction to MLOps and then go into more detail on how to implement and deploy a machine learning pipeline on both Azure and AWS.
2. Who are we?
Sander van Donkelaar
Machine Learning Guru
Jordi Smit
Machine Learning Engineer
3. Use case: Fancy Fashion
• Sustainable fashion start-up with an app that helps people sell and share second-hand clothing
• A key part of the app is an ML model that automatically analyses uploaded images and assigns labels to fashion articles
• The ML model is in the PoC phase, where it can classify images into preset categories (e.g. bag, sneaker, dress)
4. Development has not been easy so far…
Laptops with insufficient compute
Packages that cannot be installed on specific operating systems
So many experiment logs
5. What do we need?
On-demand compute on a cluster
Consistent and easy-to-use environments
Experiment tracking in a single location
6. Azure ML - Components
Compute
• Personal development VMs
• Clusters for long jobs
Environment
• Consistent and reusable Python environments
Experiments
• Tracking data science experiments
• Capturing metrics, performance, input datasets, etc.
7. Azure ML – Components (and more)
Datastores
• Links to your databases and blob/file containers
• Datasets link to a table or file(s) on Datastores
Pipelines
• Pipelines for reproducibly training models at large scale in the cloud
Models
• Registry for serialized (trained) models
• Allows you to track models over time
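The core idea behind a pipeline is that steps with explicit dependencies run in a reproducible order. That idea can be sketched without any Azure dependency. The step names, the toy data, and the `run_pipeline` helper below are hypothetical illustrations, not part of Azure ML or of the workshop code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def prepare_data(_inputs):
    # Hypothetical stand-in for loading and splitting a dataset.
    return {"train": [1, 2, 3, 4], "test": [5, 6]}


def train_model(inputs):
    # Toy "model": the mean of the training values.
    train = inputs["prepare_data"]["train"]
    return {"mean": sum(train) / len(train)}


def evaluate_model(inputs):
    # Mean absolute error of the toy model on the test split.
    mean = inputs["train_model"]["mean"]
    test = inputs["prepare_data"]["test"]
    return {"mae": sum(abs(x - mean) for x in test) / len(test)}


# Each step is declared with the names of the steps it depends on.
STEPS = {
    "evaluate_model": (evaluate_model, ["prepare_data", "train_model"]),
    "train_model": (train_model, ["prepare_data"]),
    "prepare_data": (prepare_data, []),
}


def run_pipeline(steps):
    # Derive the execution order from the declared dependencies.
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    order = list(TopologicalSorter(graph).static_order())
    outputs = {}
    for name in order:
        func, deps = steps[name]
        # A step only sees the outputs of the steps it depends on.
        outputs[name] = func({dep: outputs[dep] for dep in deps})
    return order, outputs


order, outputs = run_pipeline(STEPS)
print(order)
print(outputs["evaluate_model"])
```

Because the order is derived from the dependencies rather than from the order of declaration, the steps can be listed in any order and still run the same way.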
8. Exercise 0: Getting started with Azure ML
1. Opening the Azure ML workspace
• Open the Azure Portal and log in with your Microsoft Account
• Open the Azure ML workspace for this training: TODO
2. Logging in to your Azure ML VM
• Open Visual Studio Code on your laptop and install the Azure ML extension
• Open the Azure ML extension side bar and navigate to the workspace
• Under Compute > Compute Instances, find your VM, click on VS Code
• Wait for Visual Studio Code to connect to your VM!
• Try to run notebook 0 to see if everything is working.
10. Exercise 2: Submit a training job
Open the second notebook
1. Register a dataset from the datastore
2. Take a look at the code. Can you understand what it does?
3. Fix the code so that it submits your training job to Azure ML
4. Open the Azure portal: did your training job succeed?
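The following is a minimal sketch of steps 1 and 3, not the notebook's actual solution. It assumes the Azure ML Python SDK v1 (`azureml-core`) and a workspace `config.json` in the working directory. The default values for the datastore path, dataset name, compute cluster, conda file, experiment name, and the `--data-path` argument are all hypothetical. The workshop notebook may use a different SDK version or structure.

```python
def submit_training_job(
    data_path="fashion/**",           # hypothetical path on the datastore
    dataset_name="fashion-images",    # hypothetical dataset name
    compute_name="cpu-cluster",       # hypothetical compute cluster name
    conda_file="environment.yml",     # hypothetical conda specification file
    experiment_name="fancy-fashion",  # hypothetical experiment name
):
    # Imported here so this file can be loaded without azureml-core installed.
    from azureml.core import (
        Dataset,
        Environment,
        Experiment,
        ScriptRunConfig,
        Workspace,
    )

    # Reads the workspace details from a local config.json.
    ws = Workspace.from_config()
    datastore = ws.get_default_datastore()

    # Step 1: register a file dataset that points at files on the datastore.
    dataset = Dataset.File.from_files(path=(datastore, data_path))
    dataset = dataset.register(
        workspace=ws, name=dataset_name, create_new_version=True
    )

    # Reusable Python environment built from a conda specification.
    env = Environment.from_conda_specification(
        name="fashion-env", file_path=conda_file
    )

    # Step 3: describe the job and submit it to the compute cluster.
    config = ScriptRunConfig(
        source_directory=".",
        script="train.py",
        # "--data-path" is an assumed argument name for train.py.
        arguments=["--data-path", dataset.as_mount()],
        compute_target=compute_name,
        environment=env,
    )
    run = Experiment(workspace=ws, name=experiment_name).submit(config)

    # Blocks until the job finishes and streams its logs.
    run.wait_for_completion(show_output=True)
    return run.get_status()
```

Calling `submit_training_job()` returns the final run status, which should match what the Azure portal shows for step 4.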
11. Exercise 3: Playtime
Choose what you want to work on
• Add experiment tracking using MLflow
  • Log hyperparameters
  • Log metrics
  • Log figures
• Visualize the data set and log it in the run.
• Create a confusion matrix using a third-party library by adding it to your environment.
• Refactor train.py into a pipeline
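For the MLflow tracking option, a minimal sketch of logging hyperparameters, metrics, and a figure could look like the following. It assumes `mlflow` is installed in the environment. The function names, the `loss` metric, and the `confusion_matrix.png` artifact name are hypothetical and not taken from the workshop's `train.py`.

```python
def flatten_params(params, prefix=""):
    """Flatten nested hyperparameters into dotted keys, e.g. 'optimizer.name'.

    MLflow stores params as flat key/value pairs, so this gives each leaf
    setting its own entry.
    """
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_training_run(params, epoch_losses, figure=None):
    # Imported here so flatten_params can be used without mlflow installed.
    import mlflow

    with mlflow.start_run():
        # Hyperparameters: one entry per flattened key.
        mlflow.log_params(flatten_params(params))

        # Metrics: one value per epoch, with the epoch as the step.
        for epoch, loss in enumerate(epoch_losses):
            mlflow.log_metric("loss", loss, step=epoch)

        # Figures: for example a matplotlib figure of a confusion matrix.
        if figure is not None:
            mlflow.log_figure(figure, "confusion_matrix.png")


example = flatten_params({"lr": 0.01, "optimizer": {"name": "adam", "beta": 0.9}})
print(example)
```

Inside a training script, `log_training_run(params, losses, figure=fig)` would be called once training has finished, or the individual `mlflow.log_*` calls could be moved into the training loop.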
12. Feature leader board (✅)
Columns: Team, MLflow logging, Data visualization, Third-party library, Pipeline, Execution order pipeline
Teams: 1 to 10