avkash@bigdataperspective.com
Let's Start and Define Big Data
Introduction to Big Data Analytics on Apache Hadoop
How Hadoop Fits in This Scenario
http://www.packtpub.com/using-cloudera-impala/book 
http://www.amazon.com/Simplifying-Windows-Azure-HDInsight-Service/dp/0735673802 
http://blogs.msdn.com/b/microsoft_press/archive/2014/05/27/free-ebook-introducing-microsoft-azure-hdinsight.aspx 
https://www.linkedin.com/in/avkashchauhan
Hadoop is an open-source (Java-based), “scalable”, “fault-tolerant” platform for storing and processing large amounts of unstructured data, distributed across machines.
Flexibility: A single repository for storing and analyzing any kind of data, not bounded by schema.
Scalability: Scale-out architecture divides the workload across multiple nodes using a flexible distributed file system.
Low Cost: Deployed on commodity hardware and an open-source platform.
Fault Tolerant: Continues working even if node(s) go down.
A system that moves computation to where the data is.
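The fault-tolerance and "move computation to the data" points above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the block size, the replication factor of 3, the round-robin placement, and the names `place_blocks` and `schedule` are all hypothetical and chosen for illustration; they are not Hadoop's actual placement or scheduling logic.

```python
BLOCK_SIZE = 4    # toy block size in bytes (assumption; real blocks are far larger)
REPLICATION = 3   # copies kept of every block (assumption for this sketch)


def place_blocks(data, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Split data into blocks and assign each block to `replication` distinct nodes."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    placement = {}
    for idx in range(len(blocks)):
        # Round-robin placement: consecutive nodes starting at position idx.
        placement[idx] = [nodes[(idx + k) % len(nodes)] for k in range(replication)]
    return blocks, placement


def schedule(placement, live_nodes):
    """Assign each block's task to a live node that already stores that block."""
    assignment = {}
    for idx, holders in placement.items():
        candidates = [n for n in holders if n in live_nodes]
        if not candidates:
            raise RuntimeError(f"block {idx} lost: no live replica")
        assignment[idx] = candidates[0]  # computation goes to the data
    return assignment


nodes = ["node1", "node2", "node3", "node4"]
blocks, placement = place_blocks(b"hello hadoop world!", nodes)

healthy = schedule(placement, set(nodes))
degraded = schedule(placement, set(nodes) - {"node1"})  # simulate a node failure
```

In the degraded run every block is still scheduled, because each block has a replica on some surviving node; the work simply lands on a different machine that already holds the data.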
Hadoop Landscape
Hadoop Core Components
Data Storage: HDFS
Data Processing: MapReduce/YARN
Hadoop Common
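The MapReduce processing model named above can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and the two input strings stand in for file splits that would normally live in HDFS.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle, and reduce over a list of input splits."""
    pairs = [pair for split in splits for pair in map_phase(split)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


splits = ["big data on hadoop", "hadoop stores big data"]
counts = word_count(splits)
```

Each split is mapped independently, which is what lets a cluster run the map step in parallel on the nodes that hold the data.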
Cloud
Applying Hadoop to Save $$
Concept of Data Lake
Hadoop in Cloud
Big Data Analytics
EDW (enterprise data warehouse), OLAP (online analytical processing), ODS (operational data store)
Big Data Analytics With Hadoop
Data Storage
  Amazon: S3
  HDInsight: Azure Blobs
  Directives: Direct access from the compute machines for very fast data delivery.

Processing
  Amazon: EC2
  HDInsight: Azure Compute
  Directives: Dedicated machines, ready to start with a specific version of the Hadoop runtime.

Processing Libraries
  Amazon: Java-based, or any other language supported through Hadoop Streaming
  HDInsight: .NET-based code
  Directives: Users upload their own processing binaries/libraries.

Results
  Amazon: S3
  HDInsight: Azure Blobs
  Directives: Once the job is completed, the results are stored back to the same data storage used as the source.

Visualization
  Amazon: Custom
  HDInsight: Custom
  Directives: Third-party applications can connect to the storage to perform visualization.
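The Hadoop Streaming option mentioned above lets a mapper and reducer be written in any language that reads lines from standard input and writes tab-separated key/value lines to standard output, with the framework sorting by key between the two steps. The sketch below imitates that contract locally. The names `mapper`, `reducer`, and `run_locally` and the sample log lines are assumptions made for illustration; on a real cluster the two functions would read `sys.stdin` and write `sys.stdout` instead of in-memory buffers.

```python
import io
from itertools import groupby


def mapper(lines, out):
    """Emit 'key<TAB>1' for the first field of every non-empty input line."""
    for line in lines:
        fields = line.split()
        if fields:
            out.write(f"{fields[0]}\t1\n")


def reducer(sorted_lines, out):
    """Sum the counts per key; input lines must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        out.write(f"{key}\t{total}\n")


def run_locally(text):
    """Imitate map -> sort -> reduce in memory (stand-in for the cluster)."""
    mapped = io.StringIO()
    mapper(io.StringIO(text), mapped)
    shuffled = sorted(mapped.getvalue().splitlines(keepends=True))
    reduced = io.StringIO()
    reducer(shuffled, reduced)
    return reduced.getvalue()


sample_log = "ERROR disk full\nINFO job started\nERROR node lost\n"
result = run_locally(sample_log)
```

The reducer relies on sorted input so that all lines for one key arrive together, which is why `run_locally` sorts the mapper output before reducing.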