Azure Databricks
Azure. Cloud for all.
50+ Azure regions
Trusted, Productive, Intelligent, Hybrid
Learn more:
Microsoft.com/datacenter
© Microsoft Corporation
Open Source Support
Applications
Infrastructure
Management
Databases & middleware
App frameworks & tools
DevOps
Modern Data Estate
Azure Databricks, C# Corner Toronto, Feb 2019, Heather Grandy
Microsoft Cloud Big Data Building Blocks
Data > Intelligence > Action
Data Sources: Apps, Sensors and devices, Data
Stages: Information Management; Big Data Stores; Machine Learning and Analytics; Intelligence; Dashboards & Visualizations
Services: Data Factory, Event Hubs, IoT Hub, Azure SQL Data Warehouse, Azure SQL DB, Data Lake Store, Machine Learning, HDInsight (Hadoop and Spark), Stream Analytics, Azure Databricks, Cognitive Services, Bot Framework, Power BI
Action: People, Automated Systems, Apps (Web, Mobile, Bots)
Bring Big Data to Everyone on the Cloud
From control to ease of use (reduced administration)
Big Data Analytics
IaaS Clusters: Azure Marketplace (HDP | CDH | MapR). Any Hadoop technology, any distribution.
Managed Clusters: Azure HDInsight. Workload optimized, managed clusters.
Azure Databricks: Frictionless & optimized Spark clusters.
Big Data as-a-service: Azure Data Lake Analytics. Data Engineering in a Job-as-a-service model.
Big Data Storage
Azure Data Lake Store, Azure Storage
What is Spark?
Spark: A Brief History
Apache Spark
A unified, open source, parallel data processing framework for Big Data Analytics
Spark Core Engine
Spark SQL: interactive queries
Spark Structured Streaming and Spark Streaming: stream processing
Spark MLlib: machine learning
GraphX: graph computation
Cluster managers: YARN, Mesos, Standalone Scheduler
Apache Spark
Three Spark APIs: Resilient Distributed Datasets (RDDs), DataFrames, Datasets
Spark Structured Streaming: stream processing
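To make the difference between these APIs concrete, here is a minimal PySpark sketch (not part of the original slides) that computes the same per-region totals with the RDD API and with the DataFrame API. The sample data and every function name are hypothetical. The typed Dataset API exists only in Scala and Java, so it is not shown here.

```python
from collections import defaultdict

# Hypothetical sample data: (region, amount) pairs.
SALES = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]


def totals_plain(rows):
    """Plain-Python reference: sum the amounts per region."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def totals_rdd(spark, rows):
    """RDD API: low-level functional transformations on key/value pairs."""
    rdd = spark.sparkContext.parallelize(rows)
    return dict(rdd.reduceByKey(lambda a, b: a + b).collect())


def totals_dataframe(spark, rows):
    """DataFrame API: named columns, planned by the Spark SQL optimizer."""
    df = spark.createDataFrame(rows, ["region", "amount"])
    summed = df.groupBy("region").sum("amount")  # result column: "sum(amount)"
    return {row["region"]: row["sum(amount)"] for row in summed.collect()}


def demo():
    """Run both Spark versions on a local session; requires the pyspark package."""
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]").appName("api-demo").getOrCreate()
    )
    try:
        expected = totals_plain(SALES)
        assert totals_rdd(spark, SALES) == expected
        assert totals_dataframe(spark, SALES) == expected
    finally:
        spark.stop()
```

Calling `demo()` on a machine with `pyspark` installed starts a local session and checks that both Spark versions agree with the plain-Python reference. The DataFrame version is usually the better default, since it gives Spark a schema to optimize against; the RDD version is closer to what the engine does underneath.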
Spark: Benefits
Performance
Using in-memory computing, Spark is considerably faster than Hadoop MapReduce (up to 100x in some tests). Can be used for batch and real-time data processing.
Developer Productivity
Easy-to-use APIs for processing large datasets. Includes 100+ operators for transforming data.
Ecosystem
Spark has built-in support for many data sources, a rich ecosystem of ISV applications, and a large developer community. Available on multiple public clouds (AWS, Google and Azure) and from multiple on-premises distributors.
Unified Engine
Integrated framework includes higher-level libraries for interactive SQL queries, stream analytics, ML and graph processing. A single application can combine all types of processing.
What is Azure Databricks?
Databricks: Company Overview
What is Azure Databricks?
A fast, easy and collaborative Apache® Spark™ based analytics platform optimized for Azure
Best of Databricks Best of Microsoft
Designed in collaboration with the founders of Apache Spark
One-click setup; streamlined workflows
Interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
Native integration with Azure services (Power BI, SQL DW, Cosmos DB, Blob Storage)
Enterprise-grade Azure security (Active Directory integration, compliance, enterprise-grade SLAs)
Differentiated Experience on Azure
Enhance productivity
Get started quickly by launching your new Spark environment with one click.
Share your insights in powerful ways through rich integration with Power BI.
Improve collaboration amongst your analytics team through a unified workspace.
Innovate faster with native integration with the rest of the Azure platform.
Build on the most compliant cloud
Simplify security and identity control with built-in integration with Active Directory.
Regulate access with fine-grained user permissions to Azure Databricks' notebooks, clusters, jobs and data.
Build with confidence on the trusted cloud backed by unmatched support, compliance and SLAs.
Scale without limits
Operate at massive scale without limits globally.
Accelerate data processing with the fastest Spark engine.
Azure Databricks
Enhance Productivity; Build on secure & trusted cloud; Scale without limits
Data in: Cloud storage, Data warehouses, Hadoop storage, IoT / streaming data
Collaborative Workspace: Data engineer, Data scientist, Business analyst
Deploy Production Jobs & Workflows: Multi-stage pipelines, Job scheduler, Notification & logs
Optimized Databricks Runtime Engine: Databricks I/O, Apache Spark, Serverless
Data out: REST APIs, Machine learning models, BI tools, Data exports, Data warehouses
How to get started
Engage Microsoft experts for a workshop to help identify high-impact scenarios.
Already using Azure? Try Azure Databricks now, or create a free Azure account to start using Azure Databricks.
Learn more about Azure Databricks: www.azure.com/databricks
Microsoft OpenHacks
OpenHacks are developer-focused events where participants can learn through hands-on experience.
Next event: IoT + Data OpenHack in Toronto: Tuesday, March 5th to Thursday, March 7th, 2019
Learn more at https://openhack.microsoft.com/
Thank you!
