Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo
Cristiano Rocha and Bart Buter
Building a scalable &
open source ML
platform
• Introduction to MLOps
• Use case: ING ML batch platform
• Exploration to production
• Considerations
Agenda
Sander van Donkelaar
Machine Learning Engineer
Xebia Data
About us…
Cristiano Rocha
Principal AI & Data engineer
Xebia Data
Bart Buter
Batch Execution Platform P.O. &
Analytics Platform Adoption Lead
ING
The machine learning lifecycle
The machine learning lifecycle
The machine learning lifecycle
The machine learning lifecycle
But machine learning in production is hard…
• Deploying a model requires
significant efforts in the IT
infrastructure.
• Difficult for other applications to
leverage produced insights.
• Data scientist != MLOps Engineer
• Deploying a model requires a different
skillset than building it.
• No CI/CD
• Lack of collaboration
• Lack of best-practices
• No standardization
Productionalizing ML: market observations
People
Technology
Processes
• Flexible
• Scalable
• Innovative
• Cross-functional
• Self-reliant
• Collaboration
• Standardized
• Automated
• Self-service
• Repeatable
MLOps: the ideal world
People
Technology
Processes
MLOps:
business
value for ING
Auditability
• Increased efficiency &
reusability.
• Reduced time to value
• Rapid onboarding of
new use-cases.
• Automation
Speed Quality
• Incorporate software
engineering best
practices
• Consistency
• Reliability
• Transparancy
• Model- and data
lineage
• Reproducibility
Security
Teams should be able to deploy train- and
deploy their own models, without relying on
support from ops teams
Solution should fully leverage open-source
and cloud native tooling (Kubernetes, Mlflow,
Airflow etc.)
It should be possible to dynamically
distribute- and scale workloads to distribute
resources efficiently
ING: some extra requirements for MLOps
Open-Source / (Private) Cloud Self-service
Scalable
Solution should service multiple customers.
Permissions should be determined on the
basis of roles- and group memberships.
Multi-tenancy
ING Analytics Platforms
Exploration
Execution
Private cloud Public cloud
API
Streaming
Exploration &
Model Development
Batch
Public Cloud Exploration
Public Cloud Execution
Platform
Maturity
Time
Sandbox
environment
Standardize
deployment
and
onboarding
processeses
Implement quality
checks, introduce
data- and model
lineage. Introduce
automatic retraining.
Self-service ML
Standardize- and centralize
monitoring of ML models.
Automated ML solution
onboarding
Experiment
Consistent
Reliable
Scalable &
Optimized
MLOps maturity levels
ML model lifecycle
Open source software
Con’s
• (Usually) not enterprise-ready as it requires
modifications or extensions to meet organazational’s
requirements
• Higher overhead (e.g., deployment on more complex
scenarios)
• Lack of vendor support for criticas issues (e.g. SLA’s)
Pro’s
• Modular: solves a specific purpose
• Community: embedded best practices and more
support for questions and issues
• Flexibility through customization
• Quicker updates and vulnerability fixes
• Reduced vendor lock-in risks
• Lower vendor costs
• Reduces integration costs
Open source tools & frameworks
1/2
Kubernete
s
System for orchestrating and
automating deployment,
scaling and management of
containerized applications.
MLFlo
w
Platform for managing
end-to-end machine learning
lifecycle (including experiment
tracking, model management,
deployment and registry).
Airflo
w
Platform for creating, scheduling
and monitoring workflow. When
such workflows are defined as
code, they become more
maintainable, versionable,
testable, and collaborative.
Open source tools & frameworks
2/2
Spar
k
Engine for executing data
engineering, data science and
machine learning workloads in
a distributed manner for fast
analytic queries against data of
any size.
NGIN
X
HTTP and reverse proxy
server. As a reverse proxy, it
sits in front of servers and
forward client request to such
servers. Typically, it is added to
help increasing security and
reliability.
Open Policy
Agent
General-purpose engine to
enforce policies in a
decoupled manner. OPA
provides a declarative
language that lets you specify
policy as code and simple APIs
to offload policy
decision-making from an
application logic.
ML model lifecycle
ML Batch Exploration
Architecture
ML Batch Execution
Architecture
Role-based access control on MLFlow OSS
Open Policy Agent policies description (policies.rego)
Golden path template
Template for training a model
Template for promoting models to Production
Template for promoting models to Production
Automatically set model metadata
• Use Golden-path repositories to
create examples that incorporate
best practices.
• Solution must be flexible enough to cover
a wide variety of use cases.
• But this should not come at the cost of
complexity!
• Modular: More difficult, requires a wide
variety of domain knowledge.
• End2End: Easier, but it can give problems
in the long term.
Considerations
Flexibility vs. Complexity
Self-Service vs.
Best-Practices
Modular vs. End2End

More Related Content

Similar to Building a Scalable and reliable open source ML Platform with MLFlow

Deploying ML models in the enterprise
Deploying ML models in the enterpriseDeploying ML models in the enterprise
Deploying ML models in the enterprise
doppenhe
 
Mohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with KubeflowMohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with Kubeflow
Lviv Startup Club
 
Mohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with KubeflowMohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with Kubeflow
Edunomica
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
Nick Pentreath
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
Sri Ambati
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
Revolution Analytics
 
Shaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M ResumeShaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
DataScienceConferenc1
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
Data Science Milan
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
DATAVERSITY
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
Databricks
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
Inside Analysis
 
The differing ways to monitor and instrument
The differing ways to monitor and instrumentThe differing ways to monitor and instrument
The differing ways to monitor and instrument
Jonah Kowall
 
Technology insights: Decision Science Platform
Technology insights: Decision Science PlatformTechnology insights: Decision Science Platform
Technology insights: Decision Science Platform
Decision Science Community
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
BigML, Inc
 
Presentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - englishPresentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - english
Jose Luis Sanchez del Coso
 
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoTUtilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
Pôle Systematic Paris-Region
 
IncQuery Suite demo for INCOSE 2022IW
IncQuery Suite demo for INCOSE 2022IWIncQuery Suite demo for INCOSE 2022IW
IncQuery Suite demo for INCOSE 2022IW
IncQuery Labs
 
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
Lviv Startup Club
 
Modern Architecture in the Cloud of 2018 (IT Camp 2018)
Modern Architecture in the Cloud of 2018 (IT Camp 2018)Modern Architecture in the Cloud of 2018 (IT Camp 2018)
Modern Architecture in the Cloud of 2018 (IT Camp 2018)
Marius Zaharia
 

Similar to Building a Scalable and reliable open source ML Platform with MLFlow (20)

Deploying ML models in the enterprise
Deploying ML models in the enterpriseDeploying ML models in the enterprise
Deploying ML models in the enterprise
 
Mohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with KubeflowMohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with Kubeflow
 
Mohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with KubeflowMohamed Sabri: Operationalize machine learning with Kubeflow
Mohamed Sabri: Operationalize machine learning with Kubeflow
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
 
Shaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M ResumeShaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M Resume
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
 
The differing ways to monitor and instrument
The differing ways to monitor and instrumentThe differing ways to monitor and instrument
The differing ways to monitor and instrument
 
Technology insights: Decision Science Platform
Technology insights: Decision Science PlatformTechnology insights: Decision Science Platform
Technology insights: Decision Science Platform
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
 
Presentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - englishPresentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - english
 
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoTUtilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
Utilisation de la plateforme virtuelle QEMU/SystemC pour l'IoT
 
IncQuery Suite demo for INCOSE 2022IW
IncQuery Suite demo for INCOSE 2022IWIncQuery Suite demo for INCOSE 2022IW
IncQuery Suite demo for INCOSE 2022IW
 
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
Mykola Mykytenko: MLOps: your way from nonsense to valuable effect (approache...
 
Modern Architecture in the Cloud of 2018 (IT Camp 2018)
Modern Architecture in the Cloud of 2018 (IT Camp 2018)Modern Architecture in the Cloud of 2018 (IT Camp 2018)
Modern Architecture in the Cloud of 2018 (IT Camp 2018)
 

More from GoDataDriven

Streamlining Data Science Workflows with a Feature Catalog
Streamlining Data Science Workflows with a Feature CatalogStreamlining Data Science Workflows with a Feature Catalog
Streamlining Data Science Workflows with a Feature Catalog
GoDataDriven
 
Visualizing Big Data in a Small Screen
Visualizing Big Data in a Small ScreenVisualizing Big Data in a Small Screen
Visualizing Big Data in a Small Screen
GoDataDriven
 
Training Taster: Leading the way to become a data-driven organization
Training Taster: Leading the way to become a data-driven organizationTraining Taster: Leading the way to become a data-driven organization
Training Taster: Leading the way to become a data-driven organization
GoDataDriven
 
My Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics EngineerMy Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics Engineer
GoDataDriven
 
dbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchezdbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchez
GoDataDriven
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data Platform
GoDataDriven
 
How to create a Devcontainer for your Python project
How to create a Devcontainer for your Python projectHow to create a Devcontainer for your Python project
How to create a Devcontainer for your Python project
GoDataDriven
 
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
GoDataDriven
 
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
GoDataDriven
 
MLOps CodeBreakfast on AWS - GoDataFest 2022
MLOps CodeBreakfast on AWS - GoDataFest 2022MLOps CodeBreakfast on AWS - GoDataFest 2022
MLOps CodeBreakfast on AWS - GoDataFest 2022
GoDataDriven
 
MLOps CodeBreakfast on Azure - GoDataFest 2022
MLOps CodeBreakfast on Azure - GoDataFest 2022MLOps CodeBreakfast on Azure - GoDataFest 2022
MLOps CodeBreakfast on Azure - GoDataFest 2022
GoDataDriven
 
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
GoDataDriven
 
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
GoDataDriven
 
AWS Well-Architected Webinar Security - Ben de Haan
AWS Well-Architected Webinar Security - Ben de HaanAWS Well-Architected Webinar Security - Ben de Haan
AWS Well-Architected Webinar Security - Ben de Haan
GoDataDriven
 
The 7 Habits of Effective Data Driven Companies
The 7 Habits of Effective Data Driven CompaniesThe 7 Habits of Effective Data Driven Companies
The 7 Habits of Effective Data Driven Companies
GoDataDriven
 
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
GoDataDriven
 
Artificial intelligence in actions: delivering a new experience to Formula 1 ...
Artificial intelligence in actions: delivering a new experience to Formula 1 ...Artificial intelligence in actions: delivering a new experience to Formula 1 ...
Artificial intelligence in actions: delivering a new experience to Formula 1 ...
GoDataDriven
 
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't HofSmart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
GoDataDriven
 
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
GoDataDriven
 
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
GoDataDriven
 

More from GoDataDriven (20)

Streamlining Data Science Workflows with a Feature Catalog
Streamlining Data Science Workflows with a Feature CatalogStreamlining Data Science Workflows with a Feature Catalog
Streamlining Data Science Workflows with a Feature Catalog
 
Visualizing Big Data in a Small Screen
Visualizing Big Data in a Small ScreenVisualizing Big Data in a Small Screen
Visualizing Big Data in a Small Screen
 
Training Taster: Leading the way to become a data-driven organization
Training Taster: Leading the way to become a data-driven organizationTraining Taster: Leading the way to become a data-driven organization
Training Taster: Leading the way to become a data-driven organization
 
My Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics EngineerMy Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics Engineer
 
dbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchezdbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchez
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data Platform
 
How to create a Devcontainer for your Python project
How to create a Devcontainer for your Python projectHow to create a Devcontainer for your Python project
How to create a Devcontainer for your Python project
 
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
Using Graph Neural Networks To Embrace The Dependency In Your Data by Usman Z...
 
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
Common Issues With Time Series by Vadim Nelidov - GoDataFest 2022
 
MLOps CodeBreakfast on AWS - GoDataFest 2022
MLOps CodeBreakfast on AWS - GoDataFest 2022MLOps CodeBreakfast on AWS - GoDataFest 2022
MLOps CodeBreakfast on AWS - GoDataFest 2022
 
MLOps CodeBreakfast on Azure - GoDataFest 2022
MLOps CodeBreakfast on Azure - GoDataFest 2022MLOps CodeBreakfast on Azure - GoDataFest 2022
MLOps CodeBreakfast on Azure - GoDataFest 2022
 
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
Tableau vs. Power BI by Juan Manuel Perafan - GoDataFest 2022
 
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
Deploying a Modern Data Stack by Lasse Benninga - GoDataFest 2022
 
AWS Well-Architected Webinar Security - Ben de Haan
AWS Well-Architected Webinar Security - Ben de HaanAWS Well-Architected Webinar Security - Ben de Haan
AWS Well-Architected Webinar Security - Ben de Haan
 
The 7 Habits of Effective Data Driven Companies
The 7 Habits of Effective Data Driven CompaniesThe 7 Habits of Effective Data Driven Companies
The 7 Habits of Effective Data Driven Companies
 
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
DevOps for Data Science on Azure - Marcel de Vries (Xpirit) and Niels Zeilema...
 
Artificial intelligence in actions: delivering a new experience to Formula 1 ...
Artificial intelligence in actions: delivering a new experience to Formula 1 ...Artificial intelligence in actions: delivering a new experience to Formula 1 ...
Artificial intelligence in actions: delivering a new experience to Formula 1 ...
 
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't HofSmart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
Smart application on Azure at Vattenfall - Rens Weijers & Peter van 't Hof
 
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
 
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
The world runs on AI - Tony Krijnen (Microsoft) at GoDataFest 2019
 

Recently uploaded

❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
#kalyanmatkaresult #dpboss #kalyanmatka #satta #matka #sattamatka
 
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
javier ramirez
 
Applications of Data Science in Various Industries
Applications of Data Science in Various IndustriesApplications of Data Science in Various Industries
Applications of Data Science in Various Industries
IABAC
 
AIRLINE_SATISFACTION_Data Science Solution on Azure
AIRLINE_SATISFACTION_Data Science Solution on AzureAIRLINE_SATISFACTION_Data Science Solution on Azure
AIRLINE_SATISFACTION_Data Science Solution on Azure
SanelaNikodinoska1
 
NEW THYROID DISEASES CLASSIFICATION USING ML.docx
NEW THYROID DISEASES CLASSIFICATION USING ML.docxNEW THYROID DISEASES CLASSIFICATION USING ML.docx
NEW THYROID DISEASES CLASSIFICATION USING ML.docx
dharugayu13475
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
#kalyanmatkaresult #dpboss #kalyanmatka #satta #matka #sattamatka
 
How We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeachHow We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeach
javier ramirez
 
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
SARITA PANDEY
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
#kalyanmatkaresult #dpboss #kalyanmatka #satta #matka #sattamatka
 
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdfOrange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
RealDarrah
 
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
Amazon Web Services Korea
 
Seamlessly Pay Online, Pay In Stores or Send Money
Seamlessly Pay Online, Pay In Stores or Send MoneySeamlessly Pay Online, Pay In Stores or Send Money
Seamlessly Pay Online, Pay In Stores or Send Money
gargtinna79
 
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeDaryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
nehadubay1
 
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
manjukaushik328
 
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
Disha Mukharji
 
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model SafeKarol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
bookmybebe1
 
[D3T1S02] Aurora Limitless Database Introduction
[D3T1S02] Aurora Limitless Database Introduction[D3T1S02] Aurora Limitless Database Introduction
[D3T1S02] Aurora Limitless Database Introduction
Amazon Web Services Korea
 
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
ekehyz
 
Simon Fraser University degree offer diploma Transcript
Simon Fraser University  degree offer diploma TranscriptSimon Fraser University  degree offer diploma Transcript
Simon Fraser University degree offer diploma Transcript
taqyea
 
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
Donghwan Lee
 

Recently uploaded (20)

❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
 
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
 
Applications of Data Science in Various Industries
Applications of Data Science in Various IndustriesApplications of Data Science in Various Industries
Applications of Data Science in Various Industries
 
AIRLINE_SATISFACTION_Data Science Solution on Azure
AIRLINE_SATISFACTION_Data Science Solution on AzureAIRLINE_SATISFACTION_Data Science Solution on Azure
AIRLINE_SATISFACTION_Data Science Solution on Azure
 
NEW THYROID DISEASES CLASSIFICATION USING ML.docx
NEW THYROID DISEASES CLASSIFICATION USING ML.docxNEW THYROID DISEASES CLASSIFICATION USING ML.docx
NEW THYROID DISEASES CLASSIFICATION USING ML.docx
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
 
How We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeachHow We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeach
 
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
 
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdfOrange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
Orange Yellow Gradient Aesthetic Y2K Creative Portfolio Presentation -3.pdf
 
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
[D3T1S04] Aurora PostgreSQL performance monitoring and troubleshooting by use...
 
Seamlessly Pay Online, Pay In Stores or Send Money
Seamlessly Pay Online, Pay In Stores or Send MoneySeamlessly Pay Online, Pay In Stores or Send Money
Seamlessly Pay Online, Pay In Stores or Send Money
 
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeDaryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Daryaganj @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
 
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
 
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
 
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model SafeKarol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
Karol Bagh @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Jya Khan Top Model Safe
 
[D3T1S02] Aurora Limitless Database Introduction
[D3T1S02] Aurora Limitless Database Introduction[D3T1S02] Aurora Limitless Database Introduction
[D3T1S02] Aurora Limitless Database Introduction
 
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
一比一原版(爱大文凭证书)美国爱荷华大学毕业证如何办理
 
Simon Fraser University degree offer diploma Transcript
Simon Fraser University  degree offer diploma TranscriptSimon Fraser University  degree offer diploma Transcript
Simon Fraser University degree offer diploma Transcript
 
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
[D2T2S04] SageMaker를 활용한 Generative AI Foundation Model Training and Tuning
 

Building a Scalable and reliable open source ML Platform with MLFlow

  • 1. Cristiano Rocha and Bart Buter Building a scalable & open source ML platform
  • 2. • Introduction to MLOps • Use case: ING ML batch platform • Exploration to production • Considerations Agenda
  • 3. Sander van Donkelaar Machine Learning Engineer Xebia Data About us… Cristiano Rocha Principal AI & Data engineer Xebia Data Bart Buter Batch Execution Platform P.O. & Analytics Platform Adoption Lead ING
  • 8. But machine learning in production is hard…
  • 9. • Deploying a model requires significant efforts in the IT infrastructure. • Difficult for other applications to leverage produced insights. • Data scientist != MLOps Engineer • Deploying a model requires a different skillset than building it. • No CI/CD • Lack of collaboration • Lack of best-practices • No standardization Productionalizing ML: market observations People Technology Processes
  • 10. • Flexible • Scalable • Innovative • Cross-functional • Self-reliant • Collaboration • Standardized • Automated • Self-service • Repeatable MLOps: the ideal world People Technology Processes
  • 11. MLOps: business value for ING Auditability • Increased efficiency & reusability. • Reduced time to value • Rapid onboarding of new use-cases. • Automation Speed Quality • Incorporate software engineering best practices • Consistency • Reliability • Transparancy • Model- and data lineage • Reproducibility Security
  • 12. Teams should be able to deploy train- and deploy their own models, without relying on support from ops teams Solution should fully leverage open-source and cloud native tooling (Kubernetes, Mlflow, Airflow etc.) It should be possible to dynamically distribute- and scale workloads to distribute resources efficiently ING: some extra requirements for MLOps Open-Source / (Private) Cloud Self-service Scalable Solution should service multiple customers. Permissions should be determined on the basis of roles- and group memberships. Multi-tenancy
  • 13. ING Analytics Platforms Exploration Execution Private cloud Public cloud API Streaming Exploration & Model Development Batch Public Cloud Exploration Public Cloud Execution
  • 14. Platform Maturity Time Sandbox environment Standardize deployment and onboarding processeses Implement quality checks, introduce data- and model lineage. Introduce automatic retraining. Self-service ML Standardize- and centralize monitoring of ML models. Automated ML solution onboarding Experiment Consistent Reliable Scalable & Optimized MLOps maturity levels
  • 16. Open source software Con’s • (Usually) not enterprise-ready as it requires modifications or extensions to meet organazational’s requirements • Higher overhead (e.g., deployment on more complex scenarios) • Lack of vendor support for criticas issues (e.g. SLA’s) Pro’s • Modular: solves a specific purpose • Community: embedded best practices and more support for questions and issues • Flexibility through customization • Quicker updates and vulnerability fixes • Reduced vendor lock-in risks • Lower vendor costs • Reduces integration costs
  • 17. Open source tools & frameworks 1/2 Kubernete s System for orchestrating and automating deployment, scaling and management of containerized applications. MLFlo w Platform for managing end-to-end machine learning lifecycle (including experiment tracking, model management, deployment and registry). Airflo w Platform for creating, scheduling and monitoring workflow. When such workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.
  • 18. Open source tools & frameworks 2/2 Spar k Engine for executing data engineering, data science and machine learning workloads in a distributed manner for fast analytic queries against data of any size. NGIN X HTTP and reverse proxy server. As a reverse proxy, it sits in front of servers and forward client request to such servers. Typically, it is added to help increasing security and reliability. Open Policy Agent General-purpose engine to enforce policies in a decoupled manner. OPA provides a declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from an application logic.
  • 22. Role-based access control on MLFlow OSS Open Policy Agent policies description (policies.rego)
  • 23. Golden path template Template for training a model
  • 24. Template for promoting models to Production
  • 25. Template for promoting models to Production
  • 27. • Use Golden-path repositories to create examples that incorporate best practices. • Solution must be flexible enough to cover a wide variety of use cases. • But this should not come at the cost of complexity! • Modular: More difficult, requires a wide variety of domain knowledge. • End2End: Easier, but it can give problems in the long term. Considerations Flexibility vs. Complexity Self-Service vs. Best-Practices Modular vs. End2End