MLOps Virtual Event:
Automating ML at Scale
Matei Zaharia
Chief Technologist, Databricks
@matei_zaharia
ML is Transforming All Major Industries
Healthcare
Logistics
Telecom
Government
Banking
High Tech
Oil & Gas
Agriculture
Retail
Travel
But ML is Different from Traditional Software
Traditional Software
Goal: meet a functional specification
Quality depends only on application code
Pick one software stack
Machine Learning
Goal: optimize a metric (e.g. prediction accuracy)
Quality depends on training data and tuning parameters
Constantly evaluate and combine new libraries for the same task
So Operating ML is Complex!
▪ Many teams and systems involved
▪ Constantly update data & metrics
▪ Hard to move from development to production environments
[Diagram: pipeline stages Raw Data, Data Prep, Training, Deployment; roles Data Engineer, ML Engineer, Application Developer]
ML teams often spend >50% of time maintaining existing models
Response: ML Platforms
Software to manage the ML development and operations process, from data to experimentation to production
Examples: Google TFX, Facebook FBLearner, Uber Michelangelo, MLflow
Typical functionality:
▪ Data management
▪ Experiment management
▪ Model management
▪ Deployment for inference
▪ Reproducibility
▪ Testing & monitoring
All through a consistent interface!
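
To make "experiment management" and "model management" concrete, here is a minimal, standard-library-only sketch of what such a platform records: each run's parameters and metrics, plus a way to pick the best run. The `Run` and `ExperimentTracker` names are hypothetical and are not the API of MLflow or any other platform listed above; the parameter and metric values are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """One training attempt: the parameters used and the metrics observed."""
    run_id: int
    params: Dict[str, float]
    metrics: Dict[str, float] = field(default_factory=dict)


class ExperimentTracker:
    """Hypothetical in-memory tracker; a real platform would persist runs centrally."""

    def __init__(self) -> None:
        self.runs: List[Run] = []

    def log_run(self, params: Dict[str, float], metrics: Dict[str, float]) -> Run:
        # Copy the dicts so later changes by the caller do not alter the record.
        run = Run(run_id=len(self.runs), params=dict(params), metrics=dict(metrics))
        self.runs.append(run)
        return run

    def best_run(self, metric: str) -> Run:
        # Assumes higher is better for the chosen metric (e.g. accuracy).
        candidates = [r for r in self.runs if metric in r.metrics]
        if not candidates:
            raise ValueError(f"no run has logged metric {metric!r}")
        return max(candidates, key=lambda r: r.metrics[metric])


# Illustrative placeholder values, not real results.
tracker = ExperimentTracker()
tracker.log_run({"learning_rate": 0.1}, {"accuracy": 0.81})
tracker.log_run({"learning_rate": 0.01}, {"accuracy": 0.86})
best = tracker.best_run("accuracy")
print(best.run_id, best.params)
```

Keeping parameters and metrics together per run is what lets a team compare attempts and promote one model later; a production platform would add storage, access control, and artifact tracking on top of this idea.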
Desirable Features for an ML Platform
1. Ease of adoption by data scientists, engineers, and model users
▪ How much work does it take to use? What ML libraries are supported? Etc.
2. Integration with data infrastructure to support data versioning, monitoring, and governance across data pipeline & ML steps
3. Collaboration functions to enable sharing code, data, features, experiments and models in a central place (securely!)
Our MLOps Approach in Databricks
▪ Every org’s requirements will be different, and will change over time
▪ Provide a general platform that is easy to integrate with diverse tools
Open source machine learning platform
Transactional, versioned data lake storage
Data science & ML workspace
In This Webinar
▪ How we and other organizations perform MLOps at scale
▪ Demos and experience from two customers
▪ Live Q&A with presenters
