Big Data in Azure
Big Data is Changing
Traditional Data
Warehousing
… data warehousing has reached the
most significant tipping point since
its inception. The biggest, possibly
most elaborate data management
system in IT is changing.
– Gartner, “The State of Data Warehousing”*
* Donald Feinberg, Mark Beyer, Merv Adrian, Roxane Edjlali (Gartner), The State of Data Warehousing in 2012 (Stamford, CT.: Gartner, 2012)
Data Sources
ETL
Data Warehouse
BI and Analytics
Big Data is Driving Transformative Changes
Data characteristics
Traditional: relational data with highly modeled schema
Big Data: all data with schema agility
Costs
Traditional: specialized hardware
Big Data: commodity hardware
Culture
Traditional: operational reporting, focus on rear-view analysis
Big Data: experimentation leading to intelligent action, with machine learning, graph, A/B testing
Big Data Introduces a
Culture of Experimentation
Tangerine instantly adapts to customer feedback to
offer customers what they want, when they want it
“I can see us…creating predictive, context-aware financial
services applications that give information based on
the time and where the customer is.”
Billy Lo
Head of Enterprise Architecture
Scenario
Lack of insight for targeted campaigns
Inability to support data growth
Solution
Azure HDInsight (Hadoop-as-a-service) with the Analytics
Platform System (APS) enables instant analysis of social
sentiment and customer feedback across digital, face-to-face
and phone.
Result
• Reduced time to customer insight
• Ability to make changes to campaigns or adjust product
rollouts based on real-time customer reactions
• Ability to offer incentives and new services to retain—
and grow—its customer base
However, there are challenges to Big Data…
Obtaining skills
and capabilities
Determining how
to get value
Integrating with
existing IT investments
*Gartner: Survey Analysis – Hadoop Adoption Drivers and Challenges (Stamford, CT.: Gartner, 2015)
But, Microsoft has done it before
We needed to better leverage data and analytics to do
more experimentation
So we:
• Designed a data lake for everyone to put their data into
• Built tools approachable by any developer
• Created machine learning tools for collaborating
across large experiment models
Result:
• Across Microsoft, ten thousand developers doing
experimentation leading to better insights
• Leading to growth in our Microsoft businesses:
• Office productivity revenue (45% YoY)*
• Intelligent Cloud (100% YoY)*
• Bing search share doubles
[Chart: Growth of data @ Microsoft, 2010 to 2015, scale labeled Petabytes to Exabytes. Sources shown: Windows, SMSG, Live, Bing, CRM/Dynamics, Xbox Live, Office365, Malware Protection, Microsoft Stores, Commerce Risk, Skype, LCA, Exchange, Yammer]
* Microsoft. FY16 Q4 Results, URL: http://www.microsoft.com/en-us/Investor/earnings/FY-2016-Q4/press-release-webcast
Microsoft is now taking
everything we’ve
learned on this journey
and bringing it to our
customers
Technology. Cost. Culture.
Big Data as a Cornerstone of Cortana Intelligence
Stages: Data, Intelligence, Action
Data Sources: Apps, Sensors and devices
Information Management: Event Hubs, Data Catalog, Data Factory
Big Data Stores: SQL Data Warehouse, Data Lake Store
Machine Learning and Analytics: HDInsight (Hadoop / Spark), Stream Analytics, Data Lake Analytics, Machine Learning
Intelligence: Cortana, Bot Framework, Cognitive Services
Dashboards & Visualizations: Power BI
Action: People, Automated Systems, Apps (Web, Mobile, Bots)
Azure
HDInsight
Hadoop and Spark
as a Service on Azure
Fully-managed Hadoop and Spark
for the cloud
100% Open Source Hortonworks
data platform
Clusters up and running in minutes
Managed, monitored and supported
by Microsoft with the industry’s best SLA
Familiar BI tools for analysis, or open source
notebooks for interactive data science
63% lower TCO than deploying your own
Hadoop on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
Comprehensive Set of Managed Apache Big Data Projects
• Scale to petabytes on demand
• Process unstructured and semi-structured data
• Develop in Java, .NET, and more
• Skip buying and maintaining hardware
• Deploy in Windows or Linux
• Spin up an Apache Hadoop cluster in minutes
• Visualize your Hadoop data in Excel
• Easily integrate on-premises Hadoop clusters
Core Engine
Batch
Map Reduce
Script
Pig
SQL
Hive
NoSQL
HBase
Streaming
Storm
In-Memory
Spark
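The batch engine listed above follows the MapReduce model. As a rough illustration only, the sketch below runs the map, shuffle, and reduce steps of a word count in plain Python on one machine. It does not use Hadoop or HDInsight, and all function names are hypothetical, chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence on an in-memory input."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big Data in Azure"]))
```

On a real cluster the map and reduce steps run in parallel across nodes and the shuffle moves data over the network; the single-process version here only shows the shape of the computation.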
Azure
Data Lake Store
A Hyper-Scale
Repository for Big Data
Analytics Workloads
Hadoop File System (HDFS) for the cloud
No limits to scale
Store any data in its native format
Enterprise-grade access control,
encryption at rest
Optimized for analytic workload performance
Azure Data Lake Store
Distributed, parallel file system in
the cloud
Performance-tuned and optimized
for analytics
No fixed size limits
Stores all data types
Highly available with local & geo
redundant storage
WebHDFS REST API
Supported by leading
Hadoop distros
Role-based security
Low latency and high throughput
workloads
[Diagram labels: HDInsight, Analytics Service, Store, YARN, HDFS, U-SQL. Data types shown: Clickstream, Sensors, Video, Social, Web, Devices, Relational, Applications]
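The slide mentions a WebHDFS REST API. As a minimal sketch, the helper below only builds a WebHDFS-style URL of the form /webhdfs/v1/<path>?op=<OPERATION>; it sends no request. The host name is a hypothetical placeholder, the use of HTTPS with that host pattern is an assumption, and authentication is left out entirely.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS-style URL: https://<host>/webhdfs/v1/<path>?op=<OP>."""
    # Drop the leading slash so the path joins cleanly after /webhdfs/v1/.
    clean_path = quote(path.lstrip("/"))
    # The operation name goes first, followed by any extra query parameters.
    query = urlencode({"op": op.upper(), **params})
    return f"https://{host}/webhdfs/v1/{clean_path}?{query}"


if __name__ == "__main__":
    # "example-account.azuredatalakestore.net" is a hypothetical endpoint.
    host = "example-account.azuredatalakestore.net"
    print(webhdfs_url(host, "/clickstream/2016", "LISTSTATUS"))
    print(webhdfs_url(host, "/clickstream/2016/day1.csv", "OPEN", offset="0"))
```

LISTSTATUS and OPEN are standard WebHDFS operations for listing a directory and reading a file. A real call would also need a valid access token and an HTTP client.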
Azure
Data Lake Analytics
A new distributed
analytics service
Distributed analytics service built on
Apache YARN
Elastic scale per query lets users focus on
business goals—not configuring hardware
Includes U-SQL—a language that unifies the
benefits of SQL with the expressive
power of C#
Integrates with Visual Studio to develop,
debug, and tune code faster
Federated query across Azure data sources
Enterprise-grade role based access control
Typical Azure Big Data Architecture
[Diagram components: Data sources (Apps, Sensors and devices), Azure API Management, Backend Services, Event Hubs, Stream Analytics, Machine Learning, HDInsight (Apache Spark), Storage, SQL Data Warehouse, Power BI, Azure Data Factory & Azure Data Catalog]
Highest availability
guarantee in the industry
for peace of mind
• Managed, monitored and
supported by Microsoft
• Enterprise-leading SLA—
99.9% uptime
• No IT resources needed for
upgrades and patching
• Microsoft monitors your
deployment so you don’t
have to
*Applies to HDInsight only
99.9% SLA
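To put the 99.9% figure in concrete terms, the sketch below converts an uptime percentage into the downtime it would permit over a period. This is plain arithmetic under the assumption that uptime is measured over the whole period; the actual SLA terms define how uptime is measured and are not reproduced here.

```python
def downtime_budget_minutes(uptime_percent: float, period_days: int) -> float:
    """Minutes of downtime permitted in a period at a given uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for days in (30, 365):
        minutes = downtime_budget_minutes(99.9, days)
        print(f"{days} days at 99.9% uptime: about {minutes:.1f} minutes of downtime")
```

At 99.9%, this works out to roughly 43 minutes in a 30-day month and a little under 9 hours in a 365-day year.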
Runs in the Most Datacenters Worldwide
Azure doubling compute
and storage every 6 months
*Applies to HDInsight only
Central US
Iowa
West US
California
East US
Virginia
North Central US
Illinois
South Central US
Texas
Brazil South
Sao Paulo State
West Europe
Netherlands
China North*
Beijing
China South*
Shanghai
Japan East
Tokyo, Saitama
Japan West
Osaka
East Asia
Hong Kong
SE Asia
Singapore
Australia South East
Victoria
Australia East
New South Wales
India Central
Pune
North Europe
Ireland
East US 2
Virginia
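The "doubling every 6 months" claim on this slide compounds quickly. The sketch below works out the implied growth factor, assuming the doubling rate stays constant, which the slide does not state.

```python
def growth_factor(months: int, doubling_period_months: int = 6) -> float:
    """Capacity multiple after `months`, doubling once per doubling period."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    for years in (1, 2, 3):
        print(f"After {years} year(s): {growth_factor(12 * years):.0f}x")
```

At a steady six-month doubling rate, capacity would be 4x after one year, 16x after two, and 64x after three.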
Lower Total Cost
of Ownership
• No hardware
• Hadoop support included with
Azure support
• Pay only for what you use
• Independently scale storage
and compute
• No need to hire specialized
operations team
• 63% lower total cost of
ownership than on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud
with Microsoft Azure HDInsight”
Recognized by
Top Analysts
Forrester Wave for Big Data
Hadoop Cloud
• Named industry leader by
Forrester with the most
comprehensive, scalable, and
integrated platforms*
• Recognized for its cloud-first
strategy that is paying off*
*The Forrester Wave™: Big Data Hadoop Cloud Solutions, Q2 2016.
Microsoft Data
Science Summit
Get hands-on with the latest cutting-edge technologies
in Big Data, Machine Learning and Open Source
at the Microsoft Data Science Summit.
Hear from thought leaders, data scientists, engineers and
customers solving real-world problems, and make expert connections
to help you put these technologies to work for your business.
September 26-27, 2016
Atlanta, GA
Register Now!
aka.ms/microsoftdatasciencesummit
Target audience
• Data Scientists
• Big Data Engineers
• Machine Learning Practitioners/Engineers
• Data Science/Engineering Managers
Why attend
Readiness with architectural guidance &
hands-on training to operationalize
solutions at scale
Real-world examples of how to apply
machine learning & data science techniques
to your business
Networking with the experts and the
community to bring your data to life
© 2016 Microsoft Corporation. All rights reserved.

Editor's Notes

  1. Get it at //aka.ms/forresterwave