Hadoop Perspectives for 2017
Simplifying Big Data Integration
Goals of the Modern Data Architecture
• Centralize all your data
• Turn raw data into insights
• Maintain governance, compliance and
security standards
• Eliminate complexities within IT
Global Leader in Big Iron to Big Data Solutions
Syncsort Confidential and Proprietary - do not copy or distribute
Deep engineering and go-to-market relationships with strategic
partners in Big Iron and Big Data ecosystems
Marquee global customer base of leaders and emerging
businesses across all major industries
PEARL RIVER, NY
SINGAPORE
JAPAN
Trusted Industry Leadership
Providing world-class data management software to large enterprises for
nearly 50 years with a focus on customer value and unparalleled support
Lower Cost, Higher Performance
Our fast, efficient software helps power business and operational analytics
while optimizing resource utilization to dramatically reduce the cost of
managing mainframe and legacy systems
Connecting Big Iron To Big Data
Our solutions address the complex set of requirements for delivering rich
data from mainframe and legacy sources to Big Data environments
Our Strategy: Simplify Big Data Integration
• Deploy on premises or in the cloud
• Choose among multiple execution frameworks – Hadoop, Spark, Linux, Unix,
Windows
• Integrate streaming and batch data with a single data pipeline for innovative
applications, like IoT
• Future-proof applications to avoid re-writing jobs in order to take advantage of
innovations in new execution frameworks
• Access and integrate ALL enterprise data sources – including mainframe – for
advanced analytics
Simplify Big Data Integration with Syncsort
• Access – Get best-in-class data ingestion capabilities for Hadoop: mainframes, RDBMS, MPP, JSON, Parquet, Avro, ORC, NoSQL, Kafka and more.
• Integrate – Single interface for streaming and batch processes. Single data pipeline for all enterprise data, batch or streaming.
• Comply – Secure data access, data governance and lineage. Seamless integration with Kerberos, Apache Ranger, Apache Ambari, Cloudera Manager, Cloudera Navigator and Sentry.
• Simplify – Design once, deploy anywhere and insulate your organization from a rapidly changing ecosystem. Future-proof your applications for new compute frameworks, on premises or in the cloud.
Hadoop Perspectives for 2017
Polling Question
What stage of Hadoop implementation is your organization in?
– Researching/Evaluating
– Proof of Concept/Pilot
– In production
– None of the above
About the Survey – Who Responded
270+ IT decision makers
36% with revenue of $1B+
INDUSTRY
– Financial Services: 22%
– Insurance: 7%
– Data Services: 5%
– Government: 7%
– Healthcare: 10%
– Non-profit: 3%
– Retail: 8%
– Software/SaaS: 5%
– Telco: 8%
– Information Technology: 16%
– Travel/Hospitality: 3%
– Other: 6%
ROLE
– BI/Data Analyst: 12%
– CIO/CTO/COO: 7%
– Data Architect: 18%
– Data Scientist: 9%
– Developer: 17%
– Other IT Exec: 5%
– IT Manager: 21%
– Other: 11%
Compute Frameworks and Cloud Deployment
Multiple frameworks are the norm – 54% use more than one framework
Migration away from MapReduce, toward Spark
More organizations are deploying on premises – but almost a third have a
hybrid deployment
[Charts: compute frameworks in use and deployment models; category labels for the percentage values were not recovered]
Polling Question
Do you plan on changing execution frameworks in the next 12 months?
– Yes
– No
– I don’t know
Data Sources
Traditional data sources are most
popular for filling the data lake
Newer sources like streaming data
and NoSQL databases on the rise
Even more pronounced for Financial
Services & Insurance:
– 2x more likely to ingest MF data
– 20% more likely to pull data
from RDBMS
Mainframe & Hadoop
71% of the Fortune 500 relies on
mainframe data to conduct 30 billion
business transactions a day
– 96 of the world’s top 100 banks
– 9 out of the world’s 10 largest
insurance companies
– 23 out of the top 25 U.S.
retailers
New solutions make it easier to take
advantage of this critical data with
Hadoop
HOW VALUABLE IS BEING ABLE TO
ACCESS/INTEGRATE MAINFRAME DATA
INTO HADOOP?
The Role of Hadoop
Hadoop & the Data Warehouse
– 70% of all EDWs are performance
and capacity constrained
– Shifting ELT workloads to Hadoop is
a popular remedy
– However, most don’t expect Hadoop
to replace their existing EDWs
Nearly half of respondents see
Hadoop as a:
– Way to power better, faster analytics
– Cost effective storage/data archive
strategy
Top Use Cases for Hadoop
Polling Question
What are the top use cases for Hadoop in your organization?
– Active archive
– Advanced/predictive analytics
– Data discovery/visualization
– ETL
– Clickstream/social media data analysis
– Internet of Things
– EDW/Mainframe offload
– Operational analytics
– Real-time analytics
Significant Business Benefits Expected
Polling Question
What is the #1 challenge/barrier for implementing Hadoop in your organization?
– Lack of skills/staffing
– Cost
– Connectivity to existing data sources/apps
– Rapid change in tools/technology
– Difficulty moving data in/out of Hadoop
– Uncertainty
Top 3 Challenges/Barriers to Hadoop Adoption
Skills/Staffing:
– 58% of respondents ranked it as a
significant challenge
– Top challenge reported for the 3rd
year in a row
Rapid Change:
– 53% of respondents ranked it as a
significant challenge
– Difficulty keeping up with evolving
compute frameworks and Hadoop-
related tools
Connectivity:
– 48% of respondents ranked it as a
significant challenge
What’s coming in 2017
1. Success begets success
2. Big Data insights will take their seat in the boardroom
3. Data governance grows in importance
4. Tools will become more flexible to adapt to changing and hybrid environments
5. Organizations will seek simplicity to sustain & scale early success
Thank You
