CONFIDENTIAL
MLOps: From Data Science to Business ROI
Nisha Talagala
CTO, ParallelM
Growing AI Investments; Few Deployed at Scale
Out of 160 reviewed AI use cases:
• 88% did not progress beyond the experimental stage
But successful early AI adopters report:
• Profit margins 3–15% higher than industry average
Survey of 3073 AI-aware C-level executives:
• 20%: AI in production
• 80%: developing, experimenting, contemplating
Source: “Artificial Intelligence: The Next Digital Frontier?”, McKinsey Global Institute, June 2017
The ML Development and Deployment Cycle
The bulk of effort today is on the left side of this process (development):
• Many tools, libraries, etc.
• Democratization of Data Science
• Auto-ML
What makes ML uniquely challenging in production?
Part I: Dataset Dependency
• ML is a ‘black box’ into which many inputs (algorithmic, human, dataset, etc.) go to produce an output
• Reproducible, deterministically ‘correct’ results are difficult to obtain as input data changes
• ML in production may behave differently than in the developer sandbox because live data ≠ training data
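One way to make the "live data ≠ training data" point concrete is to compare the distribution of a feature at training time against what arrives in production. The sketch below uses the Population Stability Index (PSI), a common drift score; it is an illustrative choice, not a method named in the slides. The function name, bin count, and the 1e-6 floor are all assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    train: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the training range."""
    lo, hi = min(train), max(train)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the training feature is constant

    def histogram(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = histogram(train), histogram(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A score of zero means the binned distributions match; larger values mean the live feature has moved away from what the model was trained on. Any alerting threshold is a per-application decision.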
What makes ML uniquely challenging in production?
Part II: Simple to Complex Practical Topologies
• Multiple loosely coupled pipelines, possibly running in parallel, with dependencies and human interactions
• Feature engineering pipelines must match for training and inference (CodeGen pipelines can help here)
• Control pipelines, canaries, A/B tests, etc.
• Further complexity if ensembles, federated learning, etc. are used
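A canary or A/B topology needs some rule for deciding which pipeline serves a given request. The sketch below shows one common approach, deterministic hash-based bucketing, so the same request id always lands on the same side. The function name `route` and the labels "canary" and "stable" are hypothetical and not taken from any product mentioned in the slides.

```python
import hashlib


def route(request_id: str, canary_percent: int = 10) -> str:
    """Return 'canary' for roughly canary_percent% of request ids, else 'stable'."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash onto a bucket in [0, 100).
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Hashing rather than random sampling keeps routing reproducible, which helps when a prediction later has to be traced back to the pipeline that produced it.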
What makes ML uniquely challenging in production?
Part III: Heterogeneity and Scale
• Possibly differing engines (Spark, TensorFlow, Caffe, PyTorch, scikit-learn, etc.)
• Different languages (Python, Java, Scala, R, ...)
• Inference vs. training engines
  • Training is frequently batch
  • Inference (prediction, model serving) can be a REST endpoint/custom code, a streaming engine, micro-batch, etc.
  • Feature manipulation done at training needs to be replicated (or factored in) at inference
• Each engine presents its own scale opportunities/issues
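The requirement that feature manipulation at training be replicated at inference can be met by keeping one feature function and persisting whatever statistics it needs. The following is a minimal sketch under that assumption; the `Scaler` type, the `featurize` function, and the `amount` field are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Scaler:
    """Statistics learned at training time and shipped alongside the model."""
    mean: float
    std: float


def fit_scaler(values: List[float]) -> Scaler:
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return Scaler(mean=mean, std=var ** 0.5 or 1.0)  # avoid dividing by zero later


def featurize(raw: Dict[str, float], scaler: Scaler) -> List[float]:
    """Single feature function called by both the training job and the serving path."""
    amount = raw["amount"]
    return [(amount - scaler.mean) / scaler.std, 1.0 if amount > 0 else 0.0]
```

When training and inference run on different engines or languages, the same idea applies, but the function has to be generated or ported rather than imported.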
What makes ML uniquely challenging in production?
Part IV: Compliance, Regulations…
• Established example: Model Risk Management in Financial Services
• https://www.federalreserve.gov/supervisionreg/srletters/sr1107a1.pdf
• Emerging example: GDPR on Reproducing and Explaining ML Decisions
• https://iapp.org/news/a/is-there-a-right-to-explanation-for-machine-learning-in-the-gdpr/
• Emerging: New York City Algorithm Fairness Monitoring
• https://techcrunch.com/2017/12/12/new-york-city-moves-to-establish-algorithm-monitoring-task-force/
What makes ML uniquely challenging in production?
Part V: Collaboration, Process
Collaboration
• Expertise mismatch between Data Science and Ops complicates handoff, continuous management, and optimization
Process
• Many objects to be tracked and managed (algorithms, models, pipelines, versions, etc.)
• ML pipelines are code. Some teams treat them as code, some do not
• Some ML objects (like models and human approvals) are not best handled in source control repositories
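For objects that do not fit well in source control, such as model artifacts and human approvals, a separate metadata record can tie them back to the pipeline code. The sketch below is one hypothetical shape for such a record; the `ModelRecord` type, its fields, and the approval rule are assumptions for illustration, not a description of any specific tool.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelRecord:
    name: str
    version: int
    artifact_sha256: str   # identifies the binary artifact without storing it in git
    pipeline_commit: str   # links back to the pipeline code, which does live in git
    approvals: List[str] = field(default_factory=list)

    def approve(self, reviewer: str) -> None:
        # Record each reviewer at most once.
        if reviewer not in self.approvals:
            self.approvals.append(reviewer)

    def is_deployable(self, required: int = 1) -> bool:
        return len(self.approvals) >= required


def register(name: str, version: int, artifact: bytes, pipeline_commit: str) -> ModelRecord:
    digest = hashlib.sha256(artifact).hexdigest()
    return ModelRecord(name, version, digest, pipeline_commit)
```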
MLOps, DevOps and SDLC
• Integrate with SDLC (Source control repositories, etc.) for code
• Integrate with DevOps for Automation, Scale and Collaboration
[Diagram comparing MLOps and DevOps. Labels, in two groups: Automate, Scale, Measure Business Success, Manage Risk, Compliance & Governance; and Automate, Scale, Manage ML Application, Collaborate]
MLOps – Automating the Production ML Lifecycle
[Diagram labels: ML Orchestration; ML Health; Business Impact; Model Governance; Continuous Integration/Deployment; Database; Machine Learning Models; Business Value]
How it Works – MCenter Architecture
[Diagram labels: MCenter Server; MCenter Developer Connectors; MCenter Agent (six shown); Data Science Platforms; (CDSW); Analytic Engines; Data Streams; Data Lakes; Data; Models, Retraining; Control, Statistics; Events, Alerts]
MCenter Production Serving
[Diagram with steps numbered 1 to 4. Labels: Data Scientist Environment; Upload Trained Model (and optional Inference Logic); Packaged as Container; Launch an ML Application; MCenter Server; MCenter Agent; ML Compute Platforms; Deployed Container; Serving Endpoint; Inference Logic; Trained Model; Prediction Request; Prediction Response; Business Application; MCenter Monitoring]
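The serving flow on this slide pairs a trained model with optional inference logic behind an endpoint that takes a prediction request and returns a prediction response. The sketch below illustrates that shape in plain Python. It is not MCenter's API: `make_handler`, the JSON field names, and the callable-based model are assumptions for illustration only.

```python
import json
from typing import Callable, List, Optional

Model = Callable[[List[float]], float]
InferenceLogic = Callable[[dict], List[float]]


def make_handler(
    model: Model, inference_logic: Optional[InferenceLogic] = None
) -> Callable[[str], str]:
    """Wrap a trained model (and optional pre-processing) as a request handler."""

    def handle(request_body: str) -> str:
        payload = json.loads(request_body)
        # With no inference logic supplied, expect ready-made features in the payload.
        features = inference_logic(payload) if inference_logic else payload["features"]
        return json.dumps({"prediction": model(features)})

    return handle
```

A business application would send the JSON body over whatever transport the serving endpoint exposes; a monitoring hook could wrap `handle` to record inputs and outputs.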
Summary
• We are at the beginning of ML operationalization
• Much as databases (the backbone of production applications) need DBAs and software needs DevOps, ML needs MLOps (specialized operationalization practices, tools, and training)
• For more information
• https://www.mlops.org for MLOps resources
• https://www.parallelm.com
We are hiring!
Thank You
nisha.talagala@parallelm.com