MLOps – Applying DevOps to Competitive Advantage

Presented by: William McKnight
President, McKnight Consulting Group
linkedin.com/in/wmcknight
www.mcknightcg.com
(214) 514-1444

8th December, 2022
Put AI Into Action And Boost
Productivity with MLOps
Abhilash Mula
Senior Manager, Product Management

© Informatica. Proprietary and Confidential.
New World of Cloud AI & Analytics
Situation: Unprecedented volume/type of data, on multiple clouds, leveraged by
multiple user profiles, with exploding AI/ML usage
• New Users: 500 million business data users
• Explosion in Data Volume: 64.2 zettabytes of data per year
• Machine Learning/AI: 1 billion workers assisted by AI/ML
• Data in the Multi Cloud, Hybrid: 80% of organizations store data in multi-hybrid
• New Data Types (mobile, social, IoT): 46 billion connected devices

Data Management Challenges Are Derailing AI &
Analytics Initiatives
• Cost Overruns: 75% of organizations using cloud data management will encounter budget overruns resulting in their questioning the value of using cloud services
• Resource Constraints: 96% of IT and engineering decision-makers say no-code/low-code will be a priority because of the lack of software engineers
• Complexity: 72% of organizations are still struggling to operationalize within their enterprise
Source: 1 – Aptum cloud impact study | 2 – Advanced Global Research, May 28, 2020 | 3 – VentureBeat.

AI/ML Projects Rarely Make It into Production
Only 1% of AI/ML projects are successful
*Source: Databricks research 2018

MLOps Streamlines the Development,
Operationalization, and Execution of AI/ML Models
MLOps covers all the key phases of AI/ML
• Prepare Data: Understanding the objectives and requirements of the project and preparing the data needed for the model.
• Build Model: Build and assess various models based on a variety of different modeling techniques.
• Deploy, Consume and Monitor: Operationalize and monitor the models to deliver business value and performance.

MLOps is a Team Sport
Cross-functional collaboration is key
• Business Expert
• Data Scientist
• Data Engineer
• Data Steward
• Data Analyst
• Citizen Integrator

One-click, serverless deployment of any AI/ML model
Only with Informatica, data scientists and ML engineers can operationalize AI/ML
models @ scale with ModelServe
• Simple, easy-to-use, wizard-driven approach for data
scientists and ML engineers to deploy and operationalize
any AI/ML models at scale
• Provide flexibility for data scientists and ML engineers to
build their AI/ML models in any framework and consume
them in any application
• Enable data scientists to accelerate AI/ML initiatives with
high-quality, trusted, and governed data
• Improve the productivity of data science teams by
streamlining and automating the process of building,
deploying, and monitoring machine learning models
• Enhance model performance with timely delivery of trusted
data using integrated DataOps

Call to Action
Sign up for Informatica ModelServe Public Preview
Download the MLOps White Paper to Put AI Into Action

William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to Pfizer, Scotiabank, Fidelity, TD Ameritrade, Teva
Pharmaceuticals, Verizon, and many other Global 1000
companies
• Hundreds of articles, blogs and white papers in publication
• Focused on delivering business value and solving business
problems utilizing proven, streamlined approaches to
information management
• Former Database Engineer, Fortune 50 Information
Technology executive and Ernst & Young Entrepreneur of the Year
Finalist
• Owner/consultant: Research, Data Strategy and
Implementation consulting firm

McKnight Consulting Group Offerings
Strategy
• Trusted Advisor
• Action Plans
• Roadmaps
• Tool Selections
• Program Management
Training
• Classes
• Workshops
Implementation
• Data/Data Warehousing/Business Intelligence/Analytics
• Big Data
• Master Data Management
• Governance/Quality

McKnight Consulting Group Client Portfolio

Use Cases for ML
Sector | Flow optimization | Modeling and analytics | Predictive insights | Threat and risk analysis
Public Sector | Traffic flow management | Smart city planning | Autonomous routing | Situational awareness
Oil and Gas | Pipeline modelling | Drilling patterns and asset utilization | Intelligent planning | Safety assurance
Manufacturing | Supply chain optimization | Production optimization | Predictive maintenance | Fault identification
Retail | Supply chain optimization | Customer experience | Segmentation analysis and forecasting | Fraud and theft identification
Healthcare | Patient care pathway optimization | Disease research and drug creation | Early diagnosis of conditions | Patient safety
Technology | Operational efficiency | Log analysis | Capacity planning | Cybersecurity and zero-day detection

Drivers to MLOps
• Senior management does not always see ML as strategic, and it can be
difficult to measure and manage the value of ML projects.
• ML initiatives can work in isolation from each other, resulting in
difficulties aligning workflows between ML and other teams.
• To be effective, ML training requires large quantities of high-quality data,
which creates significant overheads across data access, preparation, and
ongoing management.
• ML/data science work requires a large amount of trial and error, making
it hard to plan the time required to complete a project.

What is MLOps?
• MLOps is a practice for collaboration
between data science and operations
to manage the production machine
learning (ML) lifecycle.
• As an amalgamation of “machine
learning” and “operations,” MLOps
applies DevOps principles to ML
delivery, enabling the delivery of ML-
based innovation at scale to result
in:
– Faster time to market of ML-
based solutions
– More rapid rate of
experimentation, driving
innovation
– Assurance of quality,
trustworthiness, and ethical AI

From ML to MLOps
• Many companies have built strong ML capabilities
• Few businesses have been successful in putting the majority of their
ML models into production, leaving a sizable amount of value
untapped.
• Machine learning operations, also known as MLOps, is a set of
standards, tools, and frameworks used to scale ML to reach
its full potential.
• MLOps concentrates on the entire life cycle of ML model design,
implementation, testing, monitoring, and management. Its three main
objectives are as follows:
– To create a highly repeatable procedure for the entire life cycle of a model, from
feature exploration to model deployment in production.
– To shield data scientists and analysts from the complexity of the
infrastructure so they can concentrate on their models and plans.
– To scale with the number of models and modeling complexity
without requiring a horde of engineers.

MLOps Operations
• For modern enterprises, use of ML goes to the heart of
digital transformation, enabling organizations to harness
the power of their data and deliver new and
differentiated services to their customers. Achieving this
goal is predicated on three pillars:
• Development of such models requires an iterative
approach so the domain can be better understood,
and the models improved over time, as new
learnings are achieved from data and inference.
• Automated tools and repositories need to store
and keep track of models, code, data lineage, and a
target environment for deployment of ML-enabled
applications at speed without undermining
governance.
• Developers and data scientists need to work
collaboratively to ensure ML initiatives are aligned
with broader software delivery and, more broadly
still, IT-business alignment.

Why not DevOps?
• Connect data and services. DevOps success depends
on how well platforms of data and existing/new
services can be integrated, adapting to changing
circumstances.
• Automate deployment. Automation needs to be
considered in the context of the above, to ensure
constant, consistent delivery of business value.
• Operate and orchestrate resources. A commoditized,
flexible platform is table stakes: as platform efficiency
increases, so does DevOps effectiveness.

The goal is to assure the delivery of value to the business,
its customers and other stakeholders.

Terminology
• Pipeline. Each development iteration of an ML-based application will
follow a planned and automated series of steps. The pipeline itself
can be put under configuration control, such that the steps can be
repeated.
• Dataset store/Datasets. MLOps relies on an easily accessible and
scalable source of data, both during training and inference. While
data may come from several places, it will be prepared, cleaned and
accessed as a single resource.
• Repository. A common, version-controlled storage resource (e.g. Git,
Artifactory, Azure Artifacts) for data, model and configuration
schemas, managing dependencies between models, libraries and
other resources.
• Registry. A logical picture of all elements required to support a given
ML model, across its development and operational pipeline.

Terminology
• Workspace. Model and application developers conduct their activities
within individual workspaces, accessible graphically or via code (e.g.
written in Python), with access control over data sets, models and
insights.
• Target. A deployment environment for ML models and code
(packaged, for example, as containers/microservices). It is often
cloud-based, but can include on-premises and edge-based environments.
• Experiment. Outputs of a given iteration or run need to be stored so
they can be assessed, compared and monitored for audit purposes.
• Model. Packaged output of an experiment which can be used to
predict values or built on top of (via transfer learning).
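• Endpoint. In general networking terms, an internet-capable computer
hardware device on a TCP/IP network; in MLOps, the network-addressable
interface through which a deployed model receives scoring requests.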
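The relationship between experiments, models and a registry can be sketched in a few lines of Python. This is an in-memory illustration only: the class and method names (`Experiment`, `ModelVersion`, `Registry`, `best_run`, `register`) and the sample accuracy figures are hypothetical and do not correspond to any specific MLOps product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Experiment:
    """Stored outputs of one training run, kept for comparison and audit."""
    run_id: str
    params: Dict[str, float]
    accuracy: float


@dataclass(frozen=True)
class ModelVersion:
    """Packaged output of an experiment that can be deployed to a target."""
    name: str
    version: int
    source_run: str


@dataclass
class Registry:
    """Logical record of experiments and the model versions built from them."""
    experiments: List[Experiment] = field(default_factory=list)
    models: List[ModelVersion] = field(default_factory=list)

    def log(self, experiment: Experiment) -> None:
        self.experiments.append(experiment)

    def best_run(self) -> Optional[Experiment]:
        # Highest accuracy wins; None when nothing has been logged yet.
        return max(self.experiments, key=lambda e: e.accuracy, default=None)

    def register(self, name: str, experiment: Experiment) -> ModelVersion:
        # Version numbers increase per model name.
        version = 1 + sum(1 for m in self.models if m.name == name)
        model = ModelVersion(name, version, experiment.run_id)
        self.models.append(model)
        return model


registry = Registry()
registry.log(Experiment("run-1", {"learning_rate": 0.1}, accuracy=0.81))
registry.log(Experiment("run-2", {"learning_rate": 0.01}, accuracy=0.86))
best = registry.best_run()
churn_model = registry.register("churn", best)
print(churn_model)
```

Keeping the originating run on each model version is what lets a deployed model be traced back to the experiment that produced it.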

Applying MLOps in Practice
• Configure Target – Set up the compute targets on which models will be trained.
• Prepare Data – Set up how data is ingested, prepared and used.
• Train Model – Develop ML training scripts and submit them to the compute target
• Containerize the Service – After a satisfactory run is found, register the persisted model in a
model registry.
• Validate Results – Application integration test of the service deployed on dev/test target.
• Deploy Model – If the model is satisfactory, deploy it into the target environment
• Monitor Model – Monitor the deployed model to evaluate its inferencing performance and
accuracy
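The sequence above can be sketched as an ordered pipeline with a gate before deployment. Everything here is an assumption made for illustration: the function names, the toy one-parameter threshold model, the `0.75` accuracy bar, and the in-memory stand-ins for a model registry and a deployment target. A real pipeline would call a platform's SDK at each step, and ongoing monitoring is omitted.

```python
from typing import Dict, List, Tuple

Row = Tuple[float, int]  # (feature value, label)


def prepare_data(raw: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into training and validation sets (even/odd positions)."""
    return raw[0::2], raw[1::2]


def train_model(train: List[Row]) -> float:
    """'Train' a one-parameter model: a threshold midway between class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold: float, rows: List[Row]) -> float:
    """Fraction of rows whose label matches 'feature above threshold'."""
    hits = sum(1 for x, y in rows if (x > threshold) == bool(y))
    return hits / len(rows)


def run_pipeline(raw: List[Row], min_accuracy: float = 0.75) -> Dict[str, object]:
    registry: List[float] = []      # stand-in for a model registry
    target: Dict[str, float] = {}   # stand-in for a deployment target

    train, valid = prepare_data(raw)
    threshold = train_model(train)
    registry.append(threshold)              # register the persisted model
    score = accuracy(threshold, valid)      # validate results
    if score >= min_accuracy:               # deploy only if satisfactory
        target["demo-model"] = threshold
    return {"accuracy": score, "deployed": "demo-model" in target}


data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
result = run_pipeline(data)
print(result)
```

The validation gate is the point of the sketch: a model that misses the bar is still registered for comparison but never reaches the target.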

For iterative pipelines to continue to deliver
results, we need
• Reproducibility – as with software configuration management and continuous
integration, ML pipelines and steps, together with their data sources and models,
libraries and SDKs, need to be stored and maintained such that they can be repeated
exactly as previously.
• Reusability – to fit with principles of continuous delivery, the pipeline needs to be
able to package and deliver models and code into production, both to training and
target environments.
• Manageability – the ability to apply governance, linking changes to models and code
to development activities (for example through sprints) and enabling managers to
measure and oversee both progress and value delivery.
• Automation – as with DevOps, continuous integration and delivery require
automation to assure rapid and repeatable pipelines, particularly when these are
augmented by governance and testing (which can otherwise create a bottleneck).
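One way to make the reproducibility requirement concrete is to fingerprint everything a run depends on, so that two runs can be compared for equality. The sketch below assumes a hypothetical run description; the field names (`pipeline_steps`, `data_version`, `library_versions`, `random_seed`) and their values are illustrative and not part of any particular tool.

```python
import hashlib
import json
from typing import Any, Dict


def run_fingerprint(run: Dict[str, Any]) -> str:
    """Return a stable SHA-256 digest of a run description.

    Keys are sorted so that the digest does not depend on dict ordering.
    """
    canonical = json.dumps(run, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


run_a = {
    "pipeline_steps": ["prepare", "train", "validate", "deploy"],
    "data_version": "crm-2022-12-01",          # hypothetical dataset tag
    "library_versions": {"numpy": "1.23.5"},   # hypothetical pinned library
    "random_seed": 42,
}
# Same content, different key order: the fingerprint should not change.
run_b = dict(reversed(list(run_a.items())))
# A changed data version should give a different fingerprint.
run_c = {**run_a, "data_version": "crm-2022-12-08"}

print(run_fingerprint(run_a) == run_fingerprint(run_b))
print(run_fingerprint(run_a) == run_fingerprint(run_c))
```

Storing the fingerprint alongside each registered model gives a quick check that a rerun used the same steps, data and libraries.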

MLOps scenario: Customer Churn
• Prepare Environment: Create and configure data stores, in this
case CRM data
• Normalize, transform and otherwise prepare datasets for
training and inference
• Point algorithms and code to the data
• Enforce transparency (e.g. through audit trails) to build
confidence in results
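The preparation and transparency steps above might look like the following sketch. The CRM field names (`tenure_months`, `monthly_spend`), the sample records, and the choice of min-max scaling are assumptions made for illustration.

```python
from typing import Dict, List

Record = Dict[str, float]


def min_max_normalize(records: List[Record], column: str, audit: List[str]) -> List[Record]:
    """Rescale one column to [0, 1] and append an audit-trail entry."""
    values = [r[column] for r in records]
    low, high = min(values), max(values)
    span = high - low
    # A constant column carries no signal; map it to 0.0 to avoid dividing by zero.
    scaled = [
        {**r, column: (r[column] - low) / span if span else 0.0}
        for r in records
    ]
    audit.append(
        f"min_max_normalize column={column} min={low} max={high} rows={len(records)}"
    )
    return scaled


crm = [
    {"tenure_months": 2.0, "monthly_spend": 80.0},
    {"tenure_months": 26.0, "monthly_spend": 40.0},
    {"tenure_months": 50.0, "monthly_spend": 60.0},
]
audit_trail: List[str] = []
prepared = crm
for col in ("tenure_months", "monthly_spend"):
    prepared = min_max_normalize(prepared, col, audit_trail)

print(prepared)
print(audit_trail)
```

Each transformation returns new records and leaves the input untouched, so the audit trail describes exactly how the training data was derived from the source.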

Create Pipelines for Training and Inference

Monitor Results for Applicability and Effectiveness
of Insights

Azure Machine Learning (example)

Azure Solution Architecture (example)
• With security controls in place, a user can provision a workspace
with private link, customer-managed keys, and role-based access control
(RBAC) using the AML Python SDK, CLI, or UX. ARM templates can be
used for automation.
• Compute instance is used as a managed workstation by data
scientists and is used to build models. IT Admin can create a compute
instance behind a VNet if there are restrictions in place to not use a
public IP.
• Compute Cluster is used as a training compute to train ML models. IT
Admin (not shown) can create a compute cluster behind a VNet or
enable a private link if there are restrictions in place to not use a public
IP.
• Once a model is created it can be deployed on an AKS cluster. A private
AKS cluster with no public IP can be attached to the AML workspace
and an internal load balancer can be used so that the deployed
scoring endpoint is not visible outside of the virtual network. All the
scoring requests to the deployed model are made over TLS/SSL.
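For illustration, a client call to such a scoring endpoint could be prepared as below using only the Python standard library. The URL, the bearer token, and the `{"data": [...]}` payload shape are placeholders, not values from the architecture described here. The request is built but deliberately not sent.

```python
import json
import urllib.request
from typing import List


def build_scoring_request(url: str, token: str, rows: List[List[float]]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request to a scoring endpoint."""
    if not url.startswith("https://"):
        # Refuse plain HTTP so that scoring traffic is always encrypted.
        raise ValueError("scoring endpoint must use HTTPS")
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_scoring_request(
    "https://scoring.internal.example/score",  # placeholder internal address
    "PLACEHOLDER_TOKEN",                       # placeholder credential
    [[0.0, 1.0], [0.5, 0.5]],
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

Because the endpoint sits behind an internal load balancer, a call like this would only succeed from inside the virtual network.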

MLOps Features
• Ease of Setup and Use
– Create ML Managed Endpoints
– Create Compute Resources
– Manage Compute Resources
• MLOps Workflow
– Model Orchestration
– Data Orchestration

MLOps Features
• Security
– Network
– User
– Data
• Governance
– Monitoring
– Control
• Automation
– Experiments
– Workflow
– Code and App Orchestration
– Event-Driven

MLOps Features
• Experiment Management
• Scheduling
• Accuracy Management
• Retraining
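Accuracy management and retraining can be connected by a simple rule: retrain when recent accuracy falls below the level observed at deployment by more than a tolerance. The sketch below shows one such rule under assumed values; the function name `needs_retraining`, the `0.05` tolerance, and the minimum window of 20 samples are hypothetical.

```python
from typing import Sequence


def needs_retraining(
    baseline_accuracy: float,
    recent_outcomes: Sequence[bool],
    tolerance: float = 0.05,
    min_samples: int = 20,
) -> bool:
    """Return True when recent accuracy has dropped more than `tolerance`
    below the baseline. `recent_outcomes` holds True for each correct
    prediction in the monitoring window."""
    if len(recent_outcomes) < min_samples:
        return False  # not enough evidence yet
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return (baseline_accuracy - recent_accuracy) > tolerance


healthy = [True] * 18 + [False] * 2     # 90% correct
degraded = [True] * 15 + [False] * 5    # 75% correct
print(needs_retraining(0.90, healthy))
print(needs_retraining(0.90, degraded))
```

A scheduler could evaluate this rule on each monitoring window and start the training pipeline when it returns True.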

MLOps Features
• Model Explainability
• A/B Model Testing
• Granular Data Preparation

Midsize Organization MLOps Costs
Platform | Category | Type | Price Per Time | Time Units Per Year | Subtotal | Units | Amount
ML1 | Compute | E8 v3 | $0.504 | 8,760 | $4,415 | 16 | $70,641
ML1 | Service | included | $0.000 | 8,760 | $0 | 16 | $0
ML2 | Model Training | Per node per hour | $19.32 | 8,760 | $169,243 | 0.2 | $33,849
ML2 | Batch prediction | Per node per hour | $1.160 | 8,760 | $10,162 | 16 | $162,586
ML3 | Compute | ml.r5.2xlarge | $0.504 | 8,760 | $4,415 | 16 | $70,641
ML3 | Service | ml.r5.2xlarge | $0.101 | 8,760 | $885 | 16 | $14,156
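The amounts in the table can be checked with a short calculation, assuming each amount is the price per hour multiplied by the hours per year and by the number of units. Prices and unit counts are taken from the table; the variable names and the per-platform totals are illustrative additions.

```python
HOURS_PER_YEAR = 8_760  # 24 hours x 365 days

# (platform, category, price per hour in USD, units)
rows = [
    ("ML1", "Compute", 0.504, 16),
    ("ML1", "Service", 0.0, 16),
    ("ML2", "Model training", 19.32, 0.2),
    ("ML2", "Batch prediction", 1.160, 16),
    ("ML3", "Compute", 0.504, 16),
    ("ML3", "Service", 0.101, 16),
]

totals = {}
for platform, category, price, units in rows:
    # Annual amount = hourly price x hours in a year x number of units.
    amount = price * HOURS_PER_YEAR * units
    totals[platform] = totals.get(platform, 0.0) + amount
    print(f"{platform} {category:<17} ${amount:>10,.0f}")

for platform, total in totals.items():
    print(f"{platform} total ${total:,.0f}")
```

Summing per platform makes the three options directly comparable on an annual basis.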

Maturity Levels
Level 1: Just gaining an understanding of using machine learning. No data scientists hired. Early data models built
without much success. There is a belief that whatever DevOps processes are in place will handle ML.
Level 2: The data architecture serves most data that would be necessary for ML. A cloud commitment and direction are
present, providing scale for ML. A first data scientist is hired and prototyping is done. A full ML lifecycle is
accomplished with manual processes. MLOps is still an afterthought.
Level 3: This company is actively looking to deliver the benefits of ML across the company. There is recognition of ML at
the executive level. However, early processes in use resemble DevOps and will not scale. The company begins
forking its DevOps for ML.
Level 4: There is a company-wide embrace of ML. Benefits have been produced and realized. There are ample
data scientists, and the data architecture has matured so that more ML benefits can be realized.
Although there still isn’t full consistency in processes, the company has embraced MLOps and is rapidly
adopting it.
Level 5: The business has fundamentally changed due to ML and it could not have done so without MLOps. ML is
applied to initiatives wherever possible. MLOps is nurtured as much as ML and includes model sharing,
reusability and reproducibility, model diagnostics and a strong path to production. Governance has become
central to ML strategy, ensuring outcomes that are explainable and transparent.

In Conclusion
• ML Uptake is Strong
• An MLOps workspace is a cloud-based
development environment that enables you to
collaboratively develop, test and deploy
machine learning models
• Develop iterative pipelines to continue to
deliver results
• Automation is a key differentiator in MLOps
platforms
• Embrace Transparency and Predictability
