Hands-on Workshop: Kubeflow Pipeline
Requirements for this workshop
MOHAMED SABRI
Mentor and data science leader
Mohamed Sabri: Operationalize machine learning with Kubeflow
Our approach: Strategize – Shape – Spread
Spread – Operationalize (MLOps)
Our philosophy
This is no traditional consulting where we burn your cash.
We deliver concrete value with full transparency.
Our approach in MLOps
Analyze
Design
Coach
Implement
Deliverables
The following deliverables are to be expected during and after our mandate:
• The implementation of a viable MLOps environment
• MLOps training sessions
• A High-Level and Low-Level Design Document
• A report with recommendations and a roadmap
How to define an efficient MLOps architecture?
• What level of MLOps expertise do we have, or are we willing to hire?
• What type of ML inference are we looking for?
• How many machine learning projects do we have lined up in the short/mid/long term?
• Automation vs. resource scalability?
• Which type of vendors is the company working with? Open source vs. enterprise?
• Is the data science team following the state of the art when it comes to source code and versioning?
Architecture and design
Components of the target environment (from the architecture diagram):
• Microservices design environment
• Machine learning experimentation
• Performance monitoring
• Model serving
• Retraining pipelines
• Model registry
• Development environment
• Dashboard for monitoring
• Data analysis
• Trigger retraining
• Pipelines deployment
• Experiment tracking
• Source code & versioning
• Automatic detection of new models
• Framework for microservices
• Push new model for deployment
• Automated pipelines
Challenges
• Data scientists need more education about code submission and production-ready code.
• The customer is looking to scale the environment for all the organization’s machine learning projects.
• No clear performance metrics have been defined by the customer to evaluate model performance in production.
• Helping the customer identify the right resources internally to maintain the environment.
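Several of the diagram's components (experiment tracking, model registry, automatic detection of new models) map onto a tracking server. A minimal sketch, assuming MLflow as the tracking and registry backend (the deck shows logos only, so the tool choice, tracking URI, and experiment/model names here are illustrative assumptions):

```python
# Minimal sketch: experiment tracking plus model registry, assuming MLflow.
# The tracking URI, experiment name, and registered model name are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://mlflow.example.internal:5000")  # assumed endpoint
mlflow.set_experiment("churn-classifier")

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("accuracy", acc)
    # Registering the model is what makes "automatic detection of new models"
    # possible: downstream automation watches the registry for new versions.
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="churn-classifier")
```

A CI job or a registry webhook can then pick up each newly registered version and push it for deployment, closing the loop drawn in the diagram.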
Microservices design environment
Challenges
• Reaching low latency (at most 15 ms) to allow fast reactivity after model inference.
• Handling a large volume of data points per second (between 1 and 10 million per second).
• Scaling the streaming environment based on data volume with no buffering.
• Automatically updating the machine learning model if required.
Components of the streaming environment (from the architecture diagram):
• Model registry
• Domain events
• Update Docker image
• Connectors
• Machine learning experimentation
• Push new model for deployment
• Logs & KPI storage
• Real-time monitoring
• Stream processing
• Data analysis
• Development environment
Some technologies and tools
Categories shown (with both commercial and open-source options in each):
• End-to-end platform
• Continuous delivery platform
• Microservice deployment
• Automation and pipelines
• Experimentation tracking and versioning
MLOps 101
For you, what is MLOps? Why is it necessary?
MLOps is not just about deployment
MLOps is like DevOps but for ML
• Continuous integration (CI): testing and validating not only code and components, but also data, data schemas, and models.
• Continuous delivery (CD): a system (an ML training pipeline) that automatically deploys another service (the model prediction service).
• Continuous training (CT): automatically retraining and serving the models.
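To make the CI point concrete: in ML, the test suite validates data and models, not just code. A minimal sketch with pytest and pandas (the dataset path, column schema, and 0.80 threshold are illustrative, and `train_and_score` is a hypothetical project helper):

```python
# Minimal sketch of ML-flavoured CI checks with pytest.
# Dataset path, schema, and metric threshold are illustrative assumptions.
import pandas as pd
import pytest

EXPECTED_SCHEMA = {"age": "int64", "income": "float64", "label": "int64"}

@pytest.fixture
def df():
    return pd.read_csv("data/train.csv")  # assumed dataset location

def test_schema(df):
    # CI validates the data schema, not only the code.
    for column, dtype in EXPECTED_SCHEMA.items():
        assert column in df.columns
        assert str(df[column].dtype) == dtype

def test_label_values(df):
    # Data validation: labels must stay in the expected domain.
    assert df["label"].isin([0, 1]).all()

def test_model_quality():
    # CI also gates on model quality before the CD step ships anything.
    from my_project.train import train_and_score  # hypothetical helper
    assert train_and_score() >= 0.80
```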
Kubernetes
Kubeflow
Our use case
From the notebook to production
Our architecture
• Automated training pipeline: data extraction → data pre-processing → building classifier → trigger deployment
• Model registry (persistent volume)
• ML Engine: data pre-processing → inference model → integration with app
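The training half of this diagram maps directly onto a Kubeflow pipeline. A condensed sketch in kfp v2 syntax (the workshop may have used the v1 SDK; the component bodies are placeholders standing in for the workshop's real code):

```python
# Condensed sketch of the use-case pipeline with the Kubeflow Pipelines SDK
# (kfp v2 syntax). Step bodies are placeholders for the workshop's code.
from kfp import compiler, dsl
from kfp.dsl import Dataset, Input, Model, Output

@dsl.component(packages_to_install=["pandas"])
def extract_data(raw: Output[Dataset]):
    import pandas as pd
    pd.DataFrame({"x": [1, 2, 3], "y": [0, 1, 0]}).to_csv(raw.path, index=False)

@dsl.component(packages_to_install=["pandas"])
def preprocess(raw: Input[Dataset], features: Output[Dataset]):
    import pandas as pd
    pd.read_csv(raw.path).dropna().to_csv(features.path, index=False)

@dsl.component(packages_to_install=["pandas", "scikit-learn", "joblib"])
def build_classifier(features: Input[Dataset], model: Output[Model]):
    import joblib
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    df = pd.read_csv(features.path)
    clf = LogisticRegression().fit(df[["x"]], df["y"])
    # The pipeline's artifact store stands in for the registry volume here.
    joblib.dump(clf, model.path)

@dsl.component
def trigger_deployment(model: Input[Model]):
    # Placeholder: notify the ML engine to pull the new model.
    print(f"New model at {model.path}; triggering deployment.")

@dsl.pipeline(name="from-notebook-to-production")
def training_pipeline():
    raw = extract_data()
    feats = preprocess(raw=raw.outputs["raw"])
    clf = build_classifier(features=feats.outputs["features"])
    trigger_deployment(model=clf.outputs["model"])

if __name__ == "__main__":
    compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```

The compiled YAML is what gets uploaded to the Kubeflow Pipelines UI or submitted via the client, turning the notebook workflow into the automated training pipeline shown above.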
Q&A