Pm.ais ummit 180917 final

CONFIDENTIAL
MLOps: From Data Science to Business ROI
Nisha Talagala
CTO, ParallelM

Growing AI Investments; Few Deployed at Scale
Survey of 3073 AI-aware C-level executives:
• 20%: AI in production
• 80%: developing, experimenting, contemplating
Out of 160 reviewed AI use cases:
• 88% did not progress beyond the experimental stage
But successful early AI adopters report:
• Profit margins 3–15% higher than industry average
Source: “Artificial Intelligence: The Next Digital Frontier?”, McKinsey Global Institute, June 2017

The ML Development and Deployment Cycle
Bulk of effort today is in the left side of this process (development)
• Many tools, libraries, etc.
• Democratization of Data Science
• Auto-ML

What makes ML uniquely challenging in production?
Part I : Dataset dependency
• ML is a ‘black box’ into which many inputs (algorithmic, human, dataset,
etc.) go to produce an output
• Difficult to get a reproducible, deterministically ‘correct’ result as input
data changes
• ML in production may behave differently than in the developer sandbox
because live data ≠ training data
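
One way to make "live data ≠ training data" measurable is to compare the distribution of a feature in production against its distribution at training time. The sketch below uses the Population Stability Index (PSI) for a single numeric feature. The function name `psi`, the bin count, and any alert threshold are illustrative assumptions, not something the talk prescribes.

```python
import math
from bisect import bisect_right


def psi(train, live, bins=10, eps=1e-6):
    """Population Stability Index between two 1-D numeric samples.

    Bins are equal-width over the training range; live values outside
    that range are counted in the first or last bin.
    """
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("training sample has no spread")
    # Interior bin edges derived from the training data only.
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(values):
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    p, q = fractions(train), fractions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: the live feature has shifted upward by 5 units.
train_sample = [i / 100 for i in range(1000)]
live_sample = [x + 5 for x in train_sample]
drift_score = psi(train_sample, live_sample)
```

A score near zero means the two samples fall into the bins in similar proportions. What score should raise an alert is a per-feature judgment call.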

Part II : Simple to Complex Practical Topologies
• Multiple loosely coupled pipelines running possibly in parallel, with
dependencies and human interactions
• Feature engineering pipelines must match for Training and Inference
(CodeGen Pipelines can help here)
• Control pipelines, Canaries, A/B Tests etc.
• Further complexity if ensembles, federated learning, etc. are used
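
As an illustration of the canary and A/B pipelines mentioned above, the sketch below splits traffic deterministically between a primary and a canary model by hashing a request identifier. The function name `route`, the labels, and the 5% default are assumptions made for this example.

```python
import hashlib


def route(request_id: str, canary_fraction: float = 0.05) -> str:
    """Return 'canary' or 'primary' for a request, deterministically.

    The same request_id always lands on the same side, which keeps
    comparisons between the two models reproducible.
    """
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0 and 1")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "canary" if bucket < canary_fraction else "primary"


# Hypothetical usage: send roughly 10% of requests to the canary model.
targets = [route(f"req-{i}", canary_fraction=0.10) for i in range(10_000)]
canary_share = targets.count("canary") / len(targets)
```

Hashing instead of random sampling means a retried request reaches the same model, at the cost of needing a stable identifier per request.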

Part III : Heterogeneity and Scale
• Possibly differing engines (Spark, TensorFlow, Caffe, PyTorch, scikit-learn, etc.)
• Different languages (Python, Java, Scala, R, ...)
• Inference vs Training engines
• Training is frequently batch
• Inference (Prediction, Model Serving) can be REST endpoint/custom code, streaming engine,
micro-batch, etc.
• Feature manipulation done at training needs to be replicated (or factored in) at inference
• Each engine presents its own scale opportunities/issues
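
A minimal sketch of replicating training-time feature manipulation at inference: the parameters of a transformation are fitted once during training, saved as an artifact next to the model, and loaded unchanged by the serving side. The helper names and the JSON artifact format are assumptions for illustration. Real pipelines often span different engines and languages, which is where this gets harder.

```python
import json
import statistics


def fit_scaler(column):
    """Compute standardization parameters from training data only."""
    std = statistics.pstdev(column)
    # Fall back to 1.0 so a constant column does not divide by zero.
    return {"mean": statistics.fmean(column), "std": std or 1.0}


def transform(value, params):
    """Apply the same standardization at training or inference time."""
    return (value - params["mean"]) / params["std"]


# Training side: fit parameters and serialize them with the model.
train_column = [10.0, 12.0, 14.0]
params = fit_scaler(train_column)
artifact = json.dumps(params)

# Inference side: load the artifact; never refit on live data.
loaded = json.loads(artifact)
scaled = transform(12.0, loaded)
```

Because both sides call the same `transform` with the same stored parameters, a live value is scaled exactly as it would have been during training.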

Part IV : Compliance, Regulations…
• Established: Example: Model Risk Management in Financial Services
• https://www.federalreserve.gov/supervisionreg/srletters/sr1107a1.pdf
• Emerging: Example: GDPR on Reproducing and Explaining ML Decisions
• https://iapp.org/news/a/is-there-a-right-to-explanation-for-machine-learning-in-the-gdpr/
• Emerging: Example: New York City Algorithm Fairness Monitoring
• https://techcrunch.com/2017/12/12/new-york-city-moves-to-establish-algorithm-monitoring-task-force/
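
Reproducing a past ML decision requires, at minimum, knowing which model version produced it and from which inputs. The sketch below builds one such audit record per prediction. All field names are hypothetical, and what a specific regulation actually requires is outside the scope of this example.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(model_id, model_version, features, prediction, now=None):
    """Build a record tying a prediction to its model version and inputs."""
    # Canonical JSON so the same features always hash to the same value.
    payload = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return {
        "model_id": model_id,
        "model_version": model_version,
        "features": features,
        "features_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "prediction": prediction,
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
    }


# Hypothetical usage with made-up model and feature names.
record = audit_record(
    model_id="credit-risk",
    model_version="3",
    features={"income": 52000, "age": 41},
    prediction=0.17,
)
```

Storing the exact model version with each record is what lets the decision be replayed later against the same model artifact.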

Part V : Collaboration, Process
COLLABORATION
• Expertise mismatch between Data Science & Ops complicates
handoff and continuous management and optimization
PROCESS
• Many objects to be tracked and managed (algorithms, models,
pipelines, versions etc.)
• ML pipelines are code. Some approach them as code, some not
• Some ML objects (like Models and Human approvals) are not
best handled in source control repositories
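
To make the tracking problem concrete, the sketch below keeps model versions and human approvals in a small registry, separate from the pipeline code, which stays in source control and is referenced by commit. The class names, fields, and approval rule are illustrative assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ModelVersion:
    name: str
    version: int
    pipeline_commit: str   # points back into the source control repository
    artifact_uri: str      # where the trained model binary is stored
    approvals: List[str] = field(default_factory=list)


class Registry:
    """In-memory registry of model versions and their human approvals."""

    def __init__(self, required_approvals: int = 1):
        self._items: Dict[Tuple[str, int], ModelVersion] = {}
        self.required_approvals = required_approvals

    def register(self, mv: ModelVersion) -> None:
        key = (mv.name, mv.version)
        if key in self._items:
            raise ValueError(f"{mv.name} v{mv.version} is already registered")
        self._items[key] = mv

    def approve(self, name: str, version: int, approver: str) -> None:
        mv = self._items[(name, version)]
        if approver not in mv.approvals:  # count each approver once
            mv.approvals.append(approver)

    def deployable(self, name: str, version: int) -> bool:
        mv = self._items[(name, version)]
        return len(mv.approvals) >= self.required_approvals


# Hypothetical usage with placeholder identifiers.
registry = Registry(required_approvals=1)
registry.register(ModelVersion("churn", 1, "abc123", "models/churn/1"))
before = registry.deployable("churn", 1)
registry.approve("churn", 1, "ops-lead")
after = registry.deployable("churn", 1)
```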

MLOps, DevOps and SDLC
• Integrate with SDLC (Source control repositories, etc.) for code
• Integrate with DevOps for Automation, Scale and Collaboration
[Diagram comparing MLOps and DevOps. Labels: Automate; Scale; Measure Business Success; Manage Risk; Compliance & Governance; Automate; Scale; Manage ML Application; Collaborate]

MLOps – Automating the Production ML Lifecycle
[Diagram labels: ML Orchestration; ML Health; Business Impact; Model Governance; Continuous Integration/Deployment; Database; Machine Learning Models; Business Value]

How it Works – MCenter Architecture
[Diagram labels: MCenter Server; MCenter Agent (six shown); MCenter Developer Connectors; Data Science Platforms (CDSW); Analytic Engines; Data; Data Streams; Data Lakes; Models, Retraining; Control, Statistics; Events, Alerts]

MCenter Production Serving
[Diagram with steps numbered 1 to 4. Labels: Data Scientist Environment; Upload Trained Model (and optional Inference Logic); MCenter Server; Packaged as Container; MCenter Agent; Launch an ML Application; ML Compute Platforms; Deployed Container (Inference Logic, Trained Model, Serving Endpoint, MCenter Monitoring); Business Application; Prediction Request; Prediction Response]
Summary
• We are at the beginning of ML operationalization
• Much like databases (backbone of production applications) need
DBAs and software needs DevOps, ML needs MLOps
(specialized operationalization practices, tools and training)
• For more information
• https://www.mlops.org for MLOps resources
• https://www.parallelm.com
WE ARE HIRING!

Thank You
nisha.talagala@parallelm.com
