Learn to Use Databricks for Data Science
Sean Owen, Principal Solutions Architect
Austin Ford, Sr. Product Manager
Data Science is a tough job
▪ Today, companies are becoming more and more data-driven, and the ones getting the most out of their data will be the ones to succeed
▪ As a result, Data Science is now a core capability of many businesses
▪ Unfortunately, it comes with a challenging, complex workflow at scale
What does a data science workflow look like?
1. Setup
I’ve been given a business question to answer with data. Before I can even get started on the data science, I need to set up my development environment.
I need the correctly sized compute resource for my task
I need to be able to find and access the right data sources to fuel my analysis
I need to be sure my toolbox is ready with the packages and libraries required for my work
What does a data science workflow look like?
2. Data Science
Once the initial overhead of setup is complete, the real work begins.
I start with exploratory data analysis to familiarize myself with the data and form hypotheses
I uncover insights through statistical inference, modeling, or other methods
I synthesize the results of my work and the answers to the original business question
At any point, I could be sent back to the Setup phase to add another data source, change the size of my compute resource, or pull in another library.
What does a data science workflow look like?
3. Sharing Results
The most important step comes once I finish the analysis: sharing the results with my stakeholders.
I share the results with my business stakeholders via email or Slack
I formulate the results into a report or dashboard so they can be consumed
I get feedback about my work from my stakeholders and iterate with them to have the biggest impact
Our answer: The Databricks Lakehouse Platform
We want to remove the overhead so you can focus on the most important part of your work: data science
[Diagram: Lakehouse Platform]
Workloads: Data Science & Engineering, BI & SQL Analytics, Machine Learning, Real-time Data Applications
Platform layers: Data Management & Governance, Open Data Storage
Data types: Structured, Semi-structured, Unstructured, Streaming
Simple | Open | Collaborative
Reliable | Scalable | Secure
Our focus today
Databricks makes setup easy
1. Setup
The Lakehouse brings all your company’s data together into a single place so you don’t have to go digging through a variety of data sources
Easily choose the right compute resource for your task and switch as needed: single-machine VMs, GPUs, Spark clusters
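As a rough illustration of matching compute to a task, the sketch below picks one of the three options named on this slide. The `Task` class, the `suggest_compute` helper, and the 64 GB threshold are hypothetical and chosen only for illustration; they are not part of any Databricks API.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a data science task."""
    data_gb: float           # approximate size of the data to process
    needs_gpu: bool = False  # e.g. deep learning training


def suggest_compute(task, single_machine_limit_gb=64.0):
    """Return one of the three compute options listed on the slide.

    The 64 GB default is an assumed cut-off for what fits comfortably
    on one machine; tune it to your own workloads.
    """
    if task.needs_gpu:
        return "GPU"
    if task.data_gb <= single_machine_limit_gb:
        return "single-machine VM"
    return "Spark cluster"


print(suggest_compute(Task(data_gb=2)))                   # small dataset
print(suggest_compute(Task(data_gb=500)))                 # too large for one machine
print(suggest_compute(Task(data_gb=10, needs_gpu=True)))  # GPU workload
```

In practice the choice can change as the analysis evolves, which is why the slide stresses being able to switch as needed.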
Databricks’ runtimes come prepackaged with the most common data science tools, and customization is easy
Add Python libraries on top of a runtime with a single line of code
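The "single line of code" is presumably the `%pip install` notebook magic, which installs a library for the current notebook session. Since a magic command only runs inside a notebook, the sketch below shows the plain-Python equivalent. The `pip_install_command` helper is hypothetical, and the package name and version are only examples.

```python
import sys


def pip_install_command(package, version=None):
    """Build the pip command that installs `package` into the current interpreter."""
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", requirement]


# In a Databricks notebook, the equivalent is a single cell containing:
#   %pip install nltk==3.8.1
cmd = pip_install_command("nltk", "3.8.1")
print(" ".join(cmd[1:]))  # -m pip install nltk==3.8.1
```

To actually run the install outside a notebook, pass the list to `subprocess.check_call(cmd)`. Pinning a version, as in the example, helps keep the environment reproducible.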
Databricks has the tools to enable you to focus on your work
2. Data Science
Multi-language, collaborative notebooks with co-presence, commenting, and co-editing
Built-in visualizations that take you from raw data to insights in two clicks
Auto-logged revision history and a Git integration to ensure reproducibility and enable version control
Databricks lets you share results and iterate quickly
3. Sharing Results
Easily share your notebooks with stakeholders, who can view them as reports
Create a dashboard directly from your notebook’s results
Iterate with your stakeholders directly in the notebook through comments and co-presence
Getting practical: hands-on with an expert
Sean Owen
Principal Solutions Architect

BIGPPTTTTTTTTtttttttttttttttttttttt.pptxBIGPPTTTTTTTTtttttttttttttttttttttt.pptx
BIGPPTTTTTTTTtttttttttttttttttttttt.pptx
RajdeepPaul47
 
( Call ) Girls South Mumbai phone 9930687706 You Are Serach A Beautyfull Doll...
( Call ) Girls South Mumbai phone 9930687706 You Are Serach A Beautyfull Doll...( Call ) Girls South Mumbai phone 9930687706 You Are Serach A Beautyfull Doll...
( Call ) Girls South Mumbai phone 9930687706 You Are Serach A Beautyfull Doll...
seenu pandey
 
LLM powered Contract Compliance Application.pptx
LLM powered Contract Compliance Application.pptxLLM powered Contract Compliance Application.pptx
LLM powered Contract Compliance Application.pptx
Jyotishko Biswas
 

Learn to Use Databricks for Data Science

  • 1. Learn to Use Databricks for Data Science. Sean Owen, Principal Solutions Architect; Austin Ford, Sr. Product Manager.
  • 2. Data Science is a tough job. Today, companies are becoming more and more data-driven, and the ones getting the most out of their data will be the ones to succeed. As a result, Data Science is now a core capability of many businesses. Unfortunately, it comes with a challenging, complex workflow at scale.
  • 3. What does a data science workflow look like? 1. Setup: I’ve been given a business question to answer with data. Before I can even get started on the data science, I need to set up my development environment. I need the correctly sized compute resource for my task. I need to be able to find and access the right data sources to fuel my analysis. I need to be sure my toolbox is ready with the packages and libraries required for my work.
  • 4. What does a data science workflow look like? 2. Data Science: Once the initial overhead of setup is complete, the real work begins. I uncover insights through statistical inference, modeling, or other methods. I start with exploratory data analysis to familiarize myself with the data and form hypotheses. I synthesize the results of my work and the answers to the original business question. At any point, I could be sent back to the Setup phase to add another data source, change the size of my compute resource, or pull in another library.
  • 5. What does a data science workflow look like? 3. Sharing Results: The most important step comes once I finish the analysis: sharing the results with my stakeholders. I share the results with my business stakeholders via email or Slack. I formulate the results into a report or dashboard so they can be consumed. I get feedback about my work from my stakeholders and iterate with them to have the biggest impact.
  • 6. Our answer: The Databricks Lakehouse Platform. We want to remove the overhead so you can focus on the most important part of your work: data science.
  • 7. Lakehouse Platform diagram (Simple | Open | Collaborative; Reliable | Scalable | Secure). Data types: Structured, Semi-structured, Unstructured, Streaming. Workloads: Data Science & Engineering, BI & SQL Analytics, Machine Learning, Real-time Data Applications. Underlying layers: Data Management & Governance, Open Data Storage.
  • 8. The same Lakehouse Platform diagram, with the label "Our focus today".
  • 9. Databricks makes setup easy. 1. Setup: The Lakehouse brings all your company’s data together into a single place, so you don’t have to go digging through a variety of data sources. Easily choose the right compute resource for your task and switch as needed: single-machine VMs, GPUs, or Spark clusters. Databricks’ runtimes come prepackaged with the most common data science tools, and customization is easy: add Python libraries on top of a runtime with a single line of code.
  • 10. Databricks has the tools to enable you to focus on your work. 2. Data Science: Multi-language, collaborative notebooks with co-presence, commenting, and co-editing. Built-in visualizations that take you from raw data to insights in two clicks. Auto-logged revision history and a git integration to ensure reproducibility and enable version control.
  • 11. Databricks lets you share results and iterate quickly. 3. Sharing Results: Easily share your notebooks with stakeholders, who can view them as reports. Create a dashboard directly from your notebook’s results. Iterate with your stakeholders directly in the notebook through comments and co-presence.
  • 12. Getting practical: hands-on with an expert. Sean Owen, Principal Solutions Architect.