Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo
MLOps with
Databricks
Presentor :
Tanishka Garg
Lack of etiquette and manners is a huge turn off.
KnolX Etiquettes
 Punctuality
Join the session 5 minutes prior to the session start time. We start on
time and conclude on time!
 Feedback
Make sure to submit a constructive feedback for all sessions as it is very
helpful for the presenter.
 Silent Mode
Keep your mobile devices in silent mode, feel free to move out of session
in case you need to attend an urgent call.
 Avoid Disturbance
Avoid unwanted chit chat during the session.
1. Understanding Databricks
2. What is MLOps?
o Components of MLOps
o Benefits of MLOps
3. MLOps with Databricks
o Why Implement MLOps with Databricks?
o Development stage
o Staging stage
o Production stage
4. Components of MLOps with Databricks
5. Demo
Databricks for MLOps Presentation (AI/ML)
Introduction to Databricks
 Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI
solutions at scale. Most of the tasks include:
o Data processing scheduling and management, in particular ETL
o Generating dashboards and visualizations
o Managing security, governance, high availability, and disaster recovery
o Data discovery, annotation, and exploration
o Machine learning (ML) modeling, tracking, and model serving
o Generative AI solutions
02
What is MLOps?
 MLOps stands for Machine Learning Operations.
 MLOps is a core function of Machine Learning
engineering, focused on streamlining the process of
taking machine learning models to production, and
then maintaining and monitoring them.
 MLOps is a collaborative function, often comprising
data scientists, devops engineers, and IT.
Benefits of MLOps?
 Efficiency:
o MLOps allows data teams to achieve faster model development, deliver higher quality ML models, and faster deployment and
production.
 Scalability:
o MLOps also enables vast scalability and management where thousands of models can be overseen, controlled, managed, and
monitored for continuous integration, continuous delivery, and continuous deployment.
o Specifically, MLOps provides reproducibility of ML pipelines, enabling more tightly-coupled collaboration across data teams,
reducing conflict with devops and IT, and accelerating release velocity.
 Risk reduction:
o Machine learning models often need regulatory scrutiny and drift-check, and MLOps enables greater transparency and faster
response to such requests and ensures greater compliance with an organization's or industry's policies.
03
MLOps with Databricks
 MLOps is a set of processes and automated steps to manage
code, data, and models. It combines DevOps, DataOps, and
ModelOps.
 Databricks recommends creating separate environments for
the different stages of ML code and model development with
clearly defined transitions between stages.
o Development
o Staging
o Production
MLOps with Databricks
 Development Stage
o The focus of the development stage is experimentation.
o Data scientists develop features and models and run experiments to optimize model performance. The output of the
development process is ML pipeline code that can include feature computation, model training, inference, and monitoring.
 Staging Stage
o The focus of this stage is testing the ML pipeline code to ensure it is ready for production. All of the ML pipeline code is
tested in this stage, including code for model training as well as feature engineering pipelines, inference code, and so on.
o ML engineers create a CI pipeline to implement the unit and integration tests run in this stage. The output of the staging
process is a release branch that triggers the CI/CD system to start the production stage.
 Production Stage
o ML engineers own the production environment where ML pipelines are deployed and executed.
o These pipelines trigger model training, validate and deploy new model versions, publish predictions to downstream tables or
applications, and monitor the entire process to avoid performance degradation and instability.
Why implement MLOps with Databricks?
 A unified platform
 Scalibilty and performance
 MLflow Integration
 Simplified ML workflow
 Automated deployment
 Improved Model Training
04
What are the components of MLOps?
Exploratory Data Analysis - EDA
 Exploratory data analysis (EDA) includes methods for exploring data sets to summarize their main characteristics and
identify any problems with the data.
 EDA can also influence which algorithms you choose to apply for training ML models.
 Databricks has built-in analysis and visualization tools in both Databricks SQL and in Databricks Runtime.
Data Preparations and Feature Engineering
 Data preparation:
o Cleaning and formatting data: This includes tasks such as handling missing values or outliers, ensuring data is in the
correct format, and removing unneeded columns.
o Preprocessing data: This includes tasks like numerical transformations, aggregating data, encoding text or image data,
and creating new features.
o Combining data: This includes tasks like joining tables or merging datasets.
o Storing the data: For storing the data we use Delta Lake, as Delta Lake is the optimized storage layer that provides the
foundation for tables in a lakehouse on Databricks. Delta Lake is open source software that extends Parquet data files
with a file-based transaction log for ACID transactions and scalable metadata handling.
 Feature Engineering
o It is the process of converting raw data into features that can be used to develop machine learning models.
o For ML applications, Databricks Feature Store helps your team discover and re-use features, track feature lineage, and
publish features to online stores for real-time serving and automatic lookup.
Model Training and Tuning
 Once you’re done with feature engineering, the next step is to train and tune your model and select the right algorithm or
architecture so that your model can achieve the desired accuracy.
 Use popular open source libraries such as scikit-learn and hyperopt to train and improve model performance.
 As a simpler alternative, use automated machine learning tools such as AutoML to automatically perform trial runs and create
reviewable and deployable code.
Model serving and deployment
 Databricks Model Serving provides a unified interface to deploy, govern, and query AI models.
 Each model you serve is available as a REST API that you can integrate into your web or client application.
 Why use Model serving?
o Deploy and query any models
o Securely customize models with your private data
o Govern and monitor models
o Reduce cost with optimized inference and fast scaling
o Bring reliability and security to Modell serving
05
References
 https://www.databricks.com/
Databricks for MLOps Presentation (AI/ML)

More Related Content

Similar to Databricks for MLOps Presentation (AI/ML)

From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
Carl W. Handlin
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
Jordan Birdsell
 
Dmitry Spodarets: Modern MLOps toolchain 2023
Dmitry Spodarets: Modern MLOps toolchain 2023Dmitry Spodarets: Modern MLOps toolchain 2023
Dmitry Spodarets: Modern MLOps toolchain 2023
Lviv Startup Club
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
Databricks
 
Managing the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflowManaging the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflow
Databricks
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
Itai Yaffe
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro session
Avinash Patil
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
DataScienceConferenc1
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
Nick Pentreath
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptx
Knoldus Inc.
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
Databricks
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
DataScienceConferenc1
 
Demystifying MLOps: A Beginner's Guide To Machine Learning Operations
Demystifying MLOps: A Beginner's Guide To Machine Learning OperationsDemystifying MLOps: A Beginner's Guide To Machine Learning Operations
Demystifying MLOps: A Beginner's Guide To Machine Learning Operations
Rahul Bedi
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Databricks
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
Sandesh Rao
 
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
DataScienceConferenc1
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Edunomica
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017
Nisha Talagala
 

Similar to Databricks for MLOps Presentation (AI/ML) (20)

From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
Dmitry Spodarets: Modern MLOps toolchain 2023
Dmitry Spodarets: Modern MLOps toolchain 2023Dmitry Spodarets: Modern MLOps toolchain 2023
Dmitry Spodarets: Modern MLOps toolchain 2023
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
Managing the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflowManaging the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflow
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro session
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptx
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
[DSC DACH 23] Go with the flow – Track your machine learning lifecycle using ...
 
Demystifying MLOps: A Beginner's Guide To Machine Learning Operations
Demystifying MLOps: A Beginner's Guide To Machine Learning OperationsDemystifying MLOps: A Beginner's Guide To Machine Learning Operations
Demystifying MLOps: A Beginner's Guide To Machine Learning Operations
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
 
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
[DSC Europe 22] Why you need to think about MLOps at the beginning of your pr...
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017
 

More from Knoldus Inc.

Introduction to Argo Rollouts Presentation
Introduction to Argo Rollouts PresentationIntroduction to Argo Rollouts Presentation
Introduction to Argo Rollouts Presentation
Knoldus Inc.
 
Intro to Azure Container App Presentation
Intro to Azure Container App PresentationIntro to Azure Container App Presentation
Intro to Azure Container App Presentation
Knoldus Inc.
 
Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
Knoldus Inc.
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
Knoldus Inc.
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
Knoldus Inc.
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
Knoldus Inc.
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
Knoldus Inc.
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
Knoldus Inc.
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
Knoldus Inc.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
Knoldus Inc.
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
Knoldus Inc.
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
Knoldus Inc.
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
Knoldus Inc.
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
Knoldus Inc.
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
Knoldus Inc.
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
Knoldus Inc.
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
Knoldus Inc.
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
Knoldus Inc.
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
Knoldus Inc.
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
Knoldus Inc.
 

More from Knoldus Inc. (20)

Introduction to Argo Rollouts Presentation
Introduction to Argo Rollouts PresentationIntroduction to Argo Rollouts Presentation
Introduction to Argo Rollouts Presentation
 
Intro to Azure Container App Presentation
Intro to Azure Container App PresentationIntro to Azure Container App Presentation
Intro to Azure Container App Presentation
 
Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
 

Recently uploaded

Lessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien RiouxLessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien Rioux
crioux1
 
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating AppsecGDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
James Anderson
 
5G bootcamp Sep 2020 (NPI initiative).pptx
5G bootcamp Sep 2020 (NPI initiative).pptx5G bootcamp Sep 2020 (NPI initiative).pptx
5G bootcamp Sep 2020 (NPI initiative).pptx
SATYENDRA100
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Eric D. Schabell
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
amitchopra0215
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
ArgaBisma
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
Larry Smarr
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
jackson110191
 
Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
anupriti
 
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
uuuot
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
The Digital Insurer
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
SeasiaInfotech2
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 
AI_dev Europe 2024 - From OpenAI to Opensource AI
AI_dev Europe 2024 - From OpenAI to Opensource AIAI_dev Europe 2024 - From OpenAI to Opensource AI
AI_dev Europe 2024 - From OpenAI to Opensource AI
Raphaël Semeteys
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
UiPathCommunity
 
How to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory ModelHow to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory Model
ScyllaDB
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
BookNet Canada
 

Recently uploaded (20)

Lessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien RiouxLessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien Rioux
 
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating AppsecGDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
 
5G bootcamp Sep 2020 (NPI initiative).pptx
5G bootcamp Sep 2020 (NPI initiative).pptx5G bootcamp Sep 2020 (NPI initiative).pptx
5G bootcamp Sep 2020 (NPI initiative).pptx
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
@Call @Girls Pune 0000000000 Riya Khan Beautiful Girl any Time
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
 
Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
 
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 
AI_dev Europe 2024 - From OpenAI to Opensource AI
AI_dev Europe 2024 - From OpenAI to Opensource AIAI_dev Europe 2024 - From OpenAI to Opensource AI
AI_dev Europe 2024 - From OpenAI to Opensource AI
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
 
How to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory ModelHow to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory Model
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 

Databricks for MLOps Presentation (AI/ML)

  • 2. Lack of etiquette and manners is a huge turn off. KnolX Etiquettes  Punctuality Join the session 5 minutes prior to the session start time. We start on time and conclude on time!  Feedback Make sure to submit a constructive feedback for all sessions as it is very helpful for the presenter.  Silent Mode Keep your mobile devices in silent mode, feel free to move out of session in case you need to attend an urgent call.  Avoid Disturbance Avoid unwanted chit chat during the session.
  • 3. 1. Understanding Databricks 2. What is MLOps? o Components of MLOps o Benefits of MLOps 3. MLOps with Databricks o Why Implement MLOps with Databricks? o Development stage o Staging stage o Production stage 4. Components of MLOps with Databricks 5. Demo
  • 5. Introduction to Databricks  Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale. Most of the tasks include: o Data processing scheduling and management, in particular ETL o Generating dashboards and visualizations o Managing security, governance, high availability, and disaster recovery o Data discovery, annotation, and exploration o Machine learning (ML) modeling, tracking, and model serving o Generative AI solutions
  • 6. 02
  • 7. What is MLOps?  MLOps stands for Machine Learning Operations.  MLOps is a core function of Machine Learning engineering, focused on streamlining the process of taking machine learning models to production, and then maintaining and monitoring them.  MLOps is a collaborative function, often comprising data scientists, devops engineers, and IT.
  • 8. Benefits of MLOps?  Efficiency: o MLOps allows data teams to achieve faster model development, deliver higher quality ML models, and faster deployment and production.  Scalability: o MLOps also enables vast scalability and management where thousands of models can be overseen, controlled, managed, and monitored for continuous integration, continuous delivery, and continuous deployment. o Specifically, MLOps provides reproducibility of ML pipelines, enabling more tightly-coupled collaboration across data teams, reducing conflict with devops and IT, and accelerating release velocity.  Risk reduction: o Machine learning models often need regulatory scrutiny and drift-check, and MLOps enables greater transparency and faster response to such requests and ensures greater compliance with an organization's or industry's policies.
  • 9. 03
  • 10. MLOps with Databricks  MLOps is a set of processes and automated steps to manage code, data, and models. It combines DevOps, DataOps, and ModelOps.  Databricks recommends creating separate environments for the different stages of ML code and model development with clearly defined transitions between stages. o Development o Staging o Production
  • 11. MLOps with Databricks  Development Stage o The focus of the development stage is experimentation. o Data scientists develop features and models and run experiments to optimize model performance. The output of the development process is ML pipeline code that can include feature computation, model training, inference, and monitoring.  Staging Stage o The focus of this stage is testing the ML pipeline code to ensure it is ready for production. All of the ML pipeline code is tested in this stage, including code for model training as well as feature engineering pipelines, inference code, and so on. o ML engineers create a CI pipeline to implement the unit and integration tests run in this stage. The output of the staging process is a release branch that triggers the CI/CD system to start the production stage.  Production Stage o ML engineers own the production environment where ML pipelines are deployed and executed. o These pipelines trigger model training, validate and deploy new model versions, publish predictions to downstream tables or applications, and monitor the entire process to avoid performance degradation and instability.
  • 12. Why implement MLOps with Databricks?  A unified platform  Scalibilty and performance  MLflow Integration  Simplified ML workflow  Automated deployment  Improved Model Training
  • 13. 04
  • 14. What are the components of MLOps?
  • 15. Exploratory Data Analysis - EDA  Exploratory data analysis (EDA) includes methods for exploring data sets to summarize their main characteristics and identify any problems with the data.  EDA can also influence which algorithms you choose to apply for training ML models.  Databricks has built-in analysis and visualization tools in both Databricks SQL and in Databricks Runtime.
  • 16. Data Preparations and Feature Engineering  Data preparation: o Cleaning and formatting data: This includes tasks such as handling missing values or outliers, ensuring data is in the correct format, and removing unneeded columns. o Preprocessing data: This includes tasks like numerical transformations, aggregating data, encoding text or image data, and creating new features. o Combining data: This includes tasks like joining tables or merging datasets. o Storing the data: For storing the data we use Delta Lake, as Delta Lake is the optimized storage layer that provides the foundation for tables in a lakehouse on Databricks. Delta Lake is open source software that extends Parquet data files with a file-based transaction log for ACID transactions and scalable metadata handling.  Feature Engineering o It is the process of converting raw data into features that can be used to develop machine learning models. o For ML applications, Databricks Feature Store helps your team discover and re-use features, track feature lineage, and publish features to online stores for real-time serving and automatic lookup.
  • 17. Model Training and Tuning  Once you’re done with feature engineering, the next step is to train and tune your model and select the right algorithm or architecture so that your model can achieve the desired accuracy.  Use popular open source libraries such as scikit-learn and hyperopt to train and improve model performance.  As a simpler alternative, use automated machine learning tools such as AutoML to automatically perform trial runs and create reviewable and deployable code.
  • 18. Model serving and deployment  Databricks Model Serving provides a unified interface to deploy, govern, and query AI models.  Each model you serve is available as a REST API that you can integrate into your web or client application.  Why use Model serving? o Deploy and query any models o Securely customize models with your private data o Govern and monitor models o Reduce cost with optimized inference and fast scaling o Bring reliability and security to Modell serving
  • 19. 05