What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial | Simplilearn
What’s in it for you?
1. History of Spark
2. What is Spark?
3. Hadoop vs Spark
4. Components of Apache Spark
5. Spark Architecture
6. Applications of Spark
7. Spark Use Case
History of Apache Spark
2009: Started as a project at the UC Berkeley AMPLab
2010: Open sourced under a BSD license
2013: Donated to the Apache Software Foundation; Spark became an Apache top-level project in February 2014
2014: Used by Databricks to sort large-scale datasets and set a new world record
What is Apache Spark?
Apache Spark is an open-source data processing engine that processes data, in batches or in near real time, across clusters of computers using simple programming constructs. It can cache working data in memory, but it relies on external systems for durable storage
Supports various programming languages
Developers and data scientists incorporate Spark into their applications to rapidly query, analyze, and transform data at scale
Hadoop vs Spark
Speed
Hadoop: Processing data using MapReduce is slow, since intermediate results are written to disk between steps
Spark: Processes data up to 100 times faster than MapReduce for workloads that fit in memory, as the processing is done in-memory
Processing model
Hadoop: Performs batch processing of data
Spark: Performs both batch processing and real-time processing of data
Code size
Hadoop: MapReduce jobs need more lines of code; they are typically written in Java, which is more verbose
Spark: Needs fewer lines of code; it is implemented in Scala and offers concise high-level APIs
Security
Hadoop: Supports Kerberos authentication, which is difficult to manage
Spark: Supports authentication via a shared secret. It can also run on YARN, leveraging the capability of Kerberos
Spark Features
Fast processing: Spark’s Resilient Distributed Datasets (RDDs) save time on read and write operations, so it can run ten to a hundred times faster than Hadoop MapReduce
In-memory computing: In Spark, data is stored in RAM, so it can access the data quickly and accelerate the speed of analytics
Flexible: Spark supports multiple languages and allows developers to write applications in Java, Scala, R, or Python
Fault tolerance: RDDs are designed to handle the failure of any worker node in the cluster. Lost partitions are recomputed from their lineage, so a node failure does not lose data
Better analytics: Spark has a rich set of SQL queries, machine learning algorithms, complex analytics, etc. With all these functionalities, analytics can be performed better
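The fault-tolerance feature can be illustrated with a small plain-Python sketch. It does not use Spark; the class name LineagePartition and its methods are hypothetical. The assumption is that a partition remembers its input source and the transformations applied to it, so a lost in-memory copy can be rebuilt by replaying those steps.

```python
from typing import Callable, List


class LineagePartition:
    """Toy partition that remembers how to rebuild itself (not a Spark class)."""

    def __init__(self, load_source: Callable[[], List[int]]):
        self.load_source = load_source  # how to re-read the original input
        self.steps: List[Callable[[List[int]], List[int]]] = []  # recorded lineage
        self.cached = None              # in-memory copy; may be lost
        self.loads = 0                  # how many times the source was read

    def map(self, fn):
        # Record the step instead of running it immediately.
        self.steps.append(lambda rows: [fn(r) for r in rows])
        return self

    def filter(self, pred):
        self.steps.append(lambda rows: [r for r in rows if pred(r)])
        return self

    def compute(self):
        # Rebuild from the source only when no in-memory copy exists.
        if self.cached is None:
            self.loads += 1
            rows = self.load_source()
            for step in self.steps:
                rows = step(rows)
            self.cached = rows
        return self.cached

    def lose(self):
        """Simulate a worker failure that drops the in-memory copy."""
        self.cached = None


part = (
    LineagePartition(lambda: [1, 2, 3, 4, 5])
    .map(lambda x: x * 10)
    .filter(lambda x: x > 20)
)
before = part.compute()  # first computation
part.lose()              # the "worker" holding the data fails
after = part.compute()   # replayed from lineage
```

After the simulated failure, the second compute() re-reads the source and replays the recorded steps, so the result matches the first one.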
Components of Apache Spark
• Spark Core
• Spark SQL
• Spark Streaming
• MLlib
• GraphX
Components of Spark – Spark Core
Spark Core is the base engine for large-scale parallel and distributed data processing
It is responsible for:
• memory management
• fault recovery
• scheduling, distributing and monitoring jobs on a cluster
• interacting with storage systems
Resilient Distributed Dataset
Spark Core is built around RDDs (Resilient Distributed Datasets): immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel
An RDD supports two kinds of operations:
• Transformation: operations (such as map, filter, join, union) that are performed on an RDD and yield a new RDD containing the result
• Action: operations (such as reduce, first, count) that return a value after running a computation on an RDD
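The difference between transformations and actions can be sketched in plain Python. This is a single-machine toy, not Spark's API: the class name MiniRDD is hypothetical, and only map, filter, reduce, first and count from the list above are modelled. Transformations return a new object and do no work; actions run the recorded steps and return a value.

```python
from functools import reduce as fold


class MiniRDD:
    """Toy, single-machine stand-in for an RDD (illustrative only)."""

    def __init__(self, source, ops=()):
        self._source = list(source)  # input data, never modified
        self._ops = tuple(ops)       # pending transformations, in order

    # Transformations: return a new MiniRDD and do no work yet.
    def map(self, fn):
        return MiniRDD(self._source, self._ops + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self._source, self._ops + (("filter", pred),))

    def _run(self):
        # Apply every recorded transformation to the source data.
        rows = iter(self._source)
        for kind, fn in self._ops:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)

    # Actions: trigger the computation and return a value.
    def count(self):
        return len(self._run())

    def first(self):
        return self._run()[0]

    def reduce(self, fn):
        return fold(fn, self._run())


numbers = MiniRDD(range(1, 6))
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
total = even_squares.reduce(lambda a, b: a + b)
```

Here even_squares is only a description of work until reduce, first or count is called, and numbers itself is left unchanged, which mirrors the immutability described above.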
Components of Spark – Spark SQL
The Spark SQL framework component is used for structured and semi-structured data processing
Spark SQL Architecture (layers from top to bottom):
• DataFrame DSL, Spark SQL and HQL
• DataFrame API
• Data Source API
• CSV, JSON, JDBC
Components of Spark – Spark Streaming
Spark Streaming is a lightweight API that allows developers to perform batch processing and real-time streaming of data with ease
Provides secure, reliable, and fast processing of live data streams
Data flow: input data stream, then batches of input data, then the Streaming Engine, then batches of processed data
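The batch-oriented data flow can be sketched in plain Python. This is not the Spark Streaming API: the function names micro_batches and process_batch are hypothetical, and batches here are cut by element count for simplicity, whereas Spark Streaming cuts them by time interval.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Cut an incoming stream into small batches of input data."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_batch(batch: List[str]) -> Dict[str, int]:
    """Stand-in for the engine: count the events seen in one batch."""
    counts: Dict[str, int] = {}
    for event in batch:
        counts[event] = counts.get(event, 0) + 1
    return counts


events = ["play", "pause", "play", "stop", "play"]
results = [process_batch(b) for b in micro_batches(events, batch_size=2)]
```

Each element of results is one batch of processed data, produced by running the same batch logic over a small slice of the stream.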
Components of Spark – Spark MLlib
MLlib is a low-level machine learning library that is simple to use, is scalable, and compatible with various programming languages
MLlib eases the deployment and development of scalable machine learning algorithms
It contains implementations of various machine learning algorithms, such as:
• Clustering
• Classification
• Collaborative Filtering
Components of Spark – GraphX
GraphX is Spark’s own graph computation engine and data store
Provides a uniform tool for:
• ETL
• Exploratory data analysis
• Interactive graph computations
Spark Architecture
Apache Spark uses a master-slave architecture that consists of a driver, which runs on a master node, and multiple executors, which run across the worker nodes in the cluster
Diagram: Master Node (Driver Program, SparkContext), Cluster Manager, Worker Nodes (each with an Executor containing a Cache and Tasks)
• The Master Node has a Driver Program
• The Spark code behaves as a driver program and creates a SparkContext, which is a gateway to all the Spark functionalities
• Spark applications run as independent sets of processes on a cluster
• The driver program and SparkContext take care of the job execution within the cluster
• A job is split into multiple tasks that are distributed over the worker nodes
• When an RDD is created in the SparkContext, it can be distributed across various nodes
• Worker nodes are slaves that run different tasks
• The Executor is responsible for the execution of these tasks
• Worker nodes execute the tasks assigned by the Cluster Manager and return the results back to the SparkContext
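The job flow in the bullets above can be mimicked on one machine with Python's standard thread pool. This is only an analogy, not Spark code: the function names split_into_partitions, task and run_job are hypothetical, threads stand in for executors, and the calling code plays the role of the driver.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Driver side: split a dataset into roughly equal partitions."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def task(partition):
    """Executor side: one task computes a partial result for one partition."""
    return sum(x * x for x in partition)


def run_job(data, num_workers=2):
    """Split a job into tasks, run them on workers, and combine the results."""
    partitions = split_into_partitions(data, num_workers * 2)
    with ThreadPoolExecutor(max_workers=num_workers) as executors:
        partial_results = list(executors.map(task, partitions))
    # The partial results come back to the driver, which combines them.
    return sum(partial_results)


job_result = run_job(list(range(1, 11)))
```

The job (sum of squares) is split into one task per partition, the tasks run on the pool's workers, and the partial sums are returned to the caller for the final combination.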
Spark Cluster Managers
1. Standalone mode: By default, applications submitted to the standalone mode cluster will run in FIFO order, and each application will try to use all available nodes
2. Apache Mesos: An open-source project to manage computer clusters, which can also run Hadoop applications
3. Hadoop YARN: The cluster resource manager of Hadoop 2. Spark can be run on YARN
4. Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications
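Each cluster manager is selected through the master URL given to a Spark application, for example with the --master option of spark-submit. The helper below is a hypothetical convenience function, not part of Spark, and the host name is a placeholder; the URL schemes and the default ports 7077 (standalone) and 5050 (Mesos) follow the Spark documentation.

```python
def master_url(manager: str, host: str = "", port: int = 0) -> str:
    """Return the master URL string for a given cluster manager.

    Hypothetical helper for illustration; Spark itself only needs the string.
    """
    if manager == "standalone":
        return f"spark://{host}:{port or 7077}"
    if manager == "mesos":
        return f"mesos://{host}:{port or 5050}"
    if manager == "yarn":
        # The cluster location is read from the Hadoop configuration.
        return "yarn"
    if manager == "kubernetes":
        # The port must be written out explicitly, even when it is 443.
        return f"k8s://https://{host}:{port or 443}"
    raise ValueError(f"unknown cluster manager: {manager}")


standalone = master_url("standalone", "cluster.example.com")
on_yarn = master_url("yarn")
on_k8s = master_url("kubernetes", "cluster.example.com", 6443)
```

The resulting string is what would be passed as the master setting; the application code itself stays the same across the four cluster managers.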
Applications of Spark
Banking: JPMorgan uses Spark to detect fraudulent transactions, analyze the business spends of an individual to suggest offers, and identify patterns to decide how much to invest and where to invest
E-Commerce: Alibaba uses Spark to analyze large sets of data such as real-time transaction details, browsing history, etc. in the form of Spark jobs and provides recommendations to its users
Healthcare: IQVIA is a leading healthcare company that uses Spark to analyze patients’ data, identify possible health issues, and diagnose them based on the patients’ medical history
Entertainment: Entertainment and gaming companies like Netflix and Riot Games use Apache Spark to showcase relevant advertisements to their users based on the videos that they watch, share, and like
Spark Use Case
Conviva is one of the world’s leading video streaming companies
Video streaming is a challenge, especially with increasing demand for high-quality streaming experiences
Conviva collects data about video streaming quality to give its customers visibility into the end-user experience they are delivering
Using Apache Spark, Conviva delivers a better quality of service to its customers by reducing screen buffering and learning in detail about the network conditions in real-time
This information is stored in the video player to manage live video traffic coming from 4 billion video feeds every month, to ensure maximum retention
Using Apache Spark, Conviva has created an auto diagnostics alert
• It automatically detects anomalies along the video streaming pipeline and diagnoses the root cause of the issue
• Reduces waiting time before the video starts
• Avoids buffering and recovers the video from a technical error
• The goal is to maximize viewer engagement
How to Become Strategy Manager 2023 ? | Strategic Management | Roadmap | Simp...
Simplilearn
 
Top 20 Devops Engineer Interview Questions And Answers For 2023 | Devops Tuto...
Top 20 Devops Engineer Interview Questions And Answers For 2023 | Devops Tuto...Top 20 Devops Engineer Interview Questions And Answers For 2023 | Devops Tuto...
Top 20 Devops Engineer Interview Questions And Answers For 2023 | Devops Tuto...
Simplilearn
 
🔥 Big Data Engineer Roadmap 2023 | How To Become A Big Data Engineer In 2023 ...
🔥 Big Data Engineer Roadmap 2023 | How To Become A Big Data Engineer In 2023 ...🔥 Big Data Engineer Roadmap 2023 | How To Become A Big Data Engineer In 2023 ...
🔥 Big Data Engineer Roadmap 2023 | How To Become A Big Data Engineer In 2023 ...
Simplilearn
 
🔥 AI Engineer Resume For 2023 | CV For AI Engineer | AI Engineer CV 2023 | Si...
🔥 AI Engineer Resume For 2023 | CV For AI Engineer | AI Engineer CV 2023 | Si...🔥 AI Engineer Resume For 2023 | CV For AI Engineer | AI Engineer CV 2023 | Si...
🔥 AI Engineer Resume For 2023 | CV For AI Engineer | AI Engineer CV 2023 | Si...
Simplilearn
 
🔥 Top 5 Skills For Data Engineer In 2023 | Data Engineer Skills Required For ...
🔥 Top 5 Skills For Data Engineer In 2023 | Data Engineer Skills Required For ...🔥 Top 5 Skills For Data Engineer In 2023 | Data Engineer Skills Required For ...
🔥 Top 5 Skills For Data Engineer In 2023 | Data Engineer Skills Required For ...
Simplilearn
 




What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial | Simplilearn

  • 2-8. What’s in it for you? 1. History of Spark 2. What is Spark? 3. Hadoop vs Spark 4. Components of Apache Spark (Spark Core, Spark SQL, Spark Streaming, Spark MLlib, GraphX) 5. Spark Architecture 6. Applications of Spark 7. Spark Use Case
  • 9. History of Apache Spark. 2009: Started as a project at UC Berkeley AMPLab
  • 10. 2010: Open sourced under a BSD license
  • 11. 2013: Donated to the Apache Software Foundation; Spark became an Apache top-level project in 2014
  • 12. 2014: Used by Databricks to sort large-scale datasets and set a new world record
  • 13. What is Apache Spark?
  • 14. Apache Spark is an open-source data processing engine that holds data in memory and processes it, in batches or in near real time, across clusters of computers using simple programming constructs
  • 15. It supports various programming languages
  • 16. Developers and data scientists incorporate Spark into their applications to rapidly query, analyze, and transform data at scale
  • 17. Hadoop vs Spark
  • 18. Speed. Hadoop: processing data with MapReduce is slow. Spark: processes data up to 100 times faster than MapReduce because the work is done in memory
  • 19. Processing model. Hadoop: performs batch processing of data. Spark: performs both batch processing and real-time processing of data
  • 20. Code size. Hadoop: MapReduce jobs, typically written in Java, need more lines of code and take more time to write. Spark: programs need fewer lines of code; Spark itself is implemented in Scala
  • 21. Security. Hadoop: supports Kerberos authentication, which is difficult to manage. Spark: supports authentication via a shared secret; it can also run on YARN and leverage Kerberos
  • 22. Spark Features
  • 23. Fast processing: Spark’s Resilient Distributed Datasets (RDDs) save time on read and write operations, so it can run ten to a hundred times faster than Hadoop MapReduce
  • 24. In-memory computing: Spark keeps data in RAM, so it can access the data quickly and speed up analytics
  • 25. Flexible: Spark supports multiple languages and lets developers write applications in Java, Scala, R, or Python
  • 26. Fault tolerance: RDDs are designed to handle the failure of any worker node in the cluster. Lost partitions can be recomputed, so data loss is reduced to zero
  • 27. Better analytics: Spark offers SQL queries, machine learning algorithms, complex analytics, and more. With all these capabilities, analytics can be performed better
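The fault-tolerance point above can be illustrated with a small sketch. This is a toy model in plain Python, not Spark's API: the function `build_partition`, the sample values, and the list-of-functions "lineage" are all assumptions made for illustration. The idea it tries to show is that a lost in-memory partition can be rebuilt by replaying the recorded steps against the original source data.

```python
# Toy model only: this is NOT Spark's API. It mimics how a partition that
# lived on a failed worker can be rebuilt from its lineage.

def build_partition(source_rows, lineage):
    """Apply each recorded step (a plain function) to the source rows."""
    rows = list(source_rows)
    for step in lineage:
        rows = step(rows)
    return rows

# Source data, assumed to sit in durable storage (hypothetical values).
source = [1, 2, 3, 4, 5, 6]

# Lineage: the recipe that produced the partition, not the data itself.
lineage = [
    lambda rows: [r * 10 for r in rows],       # map-like step
    lambda rows: [r for r in rows if r > 20],  # filter-like step
]

cached = build_partition(source, lineage)     # held in a worker's memory
cached_before_failure = list(cached)

cached = None                                 # simulate losing the worker
recovered = build_partition(source, lineage)  # recompute from the lineage
print(recovered == cached_before_failure)     # prints True
```

Keeping the recipe rather than a second copy of the data is the design choice worth noticing here: recovery costs compute time, not extra storage.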
  • 28. Components of Spark
  • 29-33. Components of Apache Spark: Spark Core, Spark SQL, Spark Streaming, MLlib, GraphX
  • 34. Components of Spark – Spark Core
  • 35. Spark Core is the base engine for large-scale parallel and distributed data processing
  • 36. It is responsible for: memory management; fault recovery; scheduling, distributing, and monitoring jobs on a cluster; interacting with storage systems
  • 37. Resilient Distributed Dataset: Spark Core is built around RDDs (Resilient Distributed Datasets), immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel. Transformations are operations (such as map, filter, join, union) performed on an RDD that yield a new RDD containing the result. Actions are operations (such as reduce, first, count) that return a value after running a computation on an RDD
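To make the transformation/action split concrete, here is a minimal sketch in plain Python. `ToyRDD` is a hypothetical class written for this note, not Spark's RDD implementation; it only assumes the behaviour described on the slide: transformations return a new dataset and do no work, while actions trigger the computation and return a value.

```python
from functools import reduce as fold

class ToyRDD:
    """Toy stand-in for an RDD (hypothetical class, not Spark's API)."""

    def __init__(self, data, steps=()):
        self._data = tuple(data)    # immutable source data
        self._steps = tuple(steps)  # pending transformations, not yet run

    # Transformations: return a new ToyRDD and compute nothing yet.
    def map(self, fn):
        return ToyRDD(self._data, self._steps + (("map", fn),))

    def filter(self, fn):
        return ToyRDD(self._data, self._steps + (("filter", fn),))

    def _compute(self):
        """Run every pending step; only actions call this."""
        rows = list(self._data)
        for kind, fn in self._steps:
            if kind == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows

    # Actions: run the computation and return a plain value.
    def count(self):
        return len(self._compute())

    def first(self):
        return self._compute()[0]

    def reduce(self, fn):
        return fold(fn, self._compute())

numbers = ToyRDD(range(1, 11))
evens_squared = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

print(evens_squared.count())                     # 5
print(evens_squared.first())                     # 4
print(evens_squared.reduce(lambda a, b: a + b))  # 220
```

Note that `numbers` is untouched by the chained calls, which mirrors the immutability mentioned above.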
  • 38. Components of Spark – Spark SQL
  • 39. The Spark SQL framework component is used for structured and semi-structured data processing
  • 40. Spark SQL Architecture layers: DataFrame DSL and Spark SQL and HQL; DataFrame API; Data Source API; data sources (CSV, JSON, JDBC)
  • 41. Components of Spark – Spark Streaming
  • 42. Spark Streaming is a lightweight API that allows developers to perform batch processing and real-time streaming of data with ease. It provides secure, reliable, and fast processing of live data streams
  • 43. Streaming Engine: input data stream → batches of input data → batches of processed data
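The stream-to-batches flow on the slide above can be sketched roughly as follows. This is plain Python, not the Spark Streaming API, and it simplifies one thing on purpose: batches are cut by element count, whereas Spark Streaming cuts them by time interval. The helper names `micro_batches` and `process`, and the sample events, are hypothetical.

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Cut an input stream (any iterable) into fixed-size batches."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def process(batch):
    """Hypothetical per-batch job: count events by type."""
    counts = {}
    for event in batch:
        counts[event] = counts.get(event, 0) + 1
    return counts

# Input data stream (made-up player events).
events = ["play", "pause", "play", "stop", "play", "pause", "stop"]

# Batches of input data -> batches of processed data.
processed = [process(batch) for batch in micro_batches(events, batch_size=3)]
print(processed)
```

Each batch is handled by the same code a batch job would use, which is why one engine can serve both batch and streaming workloads.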
  • 44. Components of Spark – Spark MLlib
  • 45. MLlib is a low-level machine learning library that is simple to use, scalable, and compatible with various programming languages. MLlib eases the development and deployment of scalable machine learning algorithms
  • 46. It contains implementations of various machine learning algorithms: clustering, classification, collaborative filtering
  • 47. Components of Spark – GraphX
  • 48. GraphX is Spark’s own graph computation engine and data store
  • 49. It provides a uniform tool for ETL, exploratory data analysis, and interactive graph computations
  • 50. Spark Architecture
  • 51. Apache Spark uses a master-slave architecture that consists of a driver, which runs on a master node, and multiple executors, which run across the worker nodes in the cluster. The master node has a driver program. The Spark code behaves as a driver program and creates a SparkContext, which is a gateway to all Spark functionality
  • 52. Cluster Manager: Spark applications run as independent sets of processes on a cluster. The driver program and SparkContext take care of job execution within the cluster
  • 53. Worker Nodes (each runs an Executor with a cache and tasks): A job is split into multiple tasks that are distributed over the worker nodes. When an RDD is created in the SparkContext, it can be distributed across various nodes. Worker nodes are slaves that run different tasks
  • 54. The Executor is responsible for the execution of these tasks. Worker nodes execute the tasks assigned by the Cluster Manager and return the results to the SparkContext
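The driver/executor flow described in the architecture slides can be mimicked on one machine. The sketch below is plain Python, not Spark: a thread pool stands in for the executors on worker nodes, and the function names, partition count, and the sum-of-squares job are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_partitions(data, num_partitions):
    """Driver side: slice the data set into roughly equal partitions."""
    return [data[i::num_partitions] for i in range(num_partitions)]

def task(partition):
    """Executor side: the work done on one partition (sum of squares)."""
    return sum(x * x for x in partition)

def run_job(data, num_executors=2, num_partitions=4):
    """Split a job into tasks, run them on 'executors', combine results."""
    partitions = split_into_partitions(data, num_partitions)
    # The thread pool stands in for executors running on worker nodes.
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        partial_results = list(executors.map(task, partitions))
    # Results come back to the driver, which combines them.
    return sum(partial_results)

print(run_job(list(range(1, 101))))  # 338350
```

There are more partitions than executors here, so each executor works through several tasks in turn, which is the usual situation on a real cluster.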
  • 55. Spark Cluster Managers. 1. Standalone mode: by default, applications submitted to a standalone mode cluster run in FIFO order, and each application tries to use all available nodes
  • 56. 2. Apache Mesos: an open-source project to manage computer clusters; it can also run Hadoop applications
  • 57. 3. Apache YARN: the cluster resource manager of Hadoop 2. Spark can run on YARN
  • 58. 4. Kubernetes: an open-source system for automating deployment, scaling, and management of containerized applications
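The default standalone behaviour described above (FIFO order, each application taking every available core) has a practical consequence that a small simulation makes visible. This is a toy scheduler in plain Python, not Spark's scheduler; the application names and workloads are made up, and the model assumes work divides perfectly across cores.

```python
from collections import deque

def fifo_schedule(apps, total_cores):
    """Return (app_name, start_time, end_time) tuples for a FIFO queue.

    Each app is (name, core_seconds_of_work). Every app takes all cores,
    so the next app cannot start until the current one finishes.
    """
    queue = deque(apps)
    clock = 0.0
    timeline = []
    while queue:
        name, work = queue.popleft()
        duration = work / total_cores
        timeline.append((name, clock, clock + duration))
        clock += duration
    return timeline

# Hypothetical applications: (name, core-seconds of work).
apps = [("etl", 80), ("report", 20), ("adhoc-query", 4)]
for name, start, end in fifo_schedule(apps, total_cores=4):
    print(f"{name}: starts at {start:.0f}s, ends at {end:.0f}s")
```

The one-second query waits 25 seconds behind the two larger jobs. In a real standalone cluster, capping the cores an application may claim (for example with the `spark.cores.max` setting) lets several applications run side by side.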
  • 59. Applications of Spark
  • 60. Banking: JPMorgan uses Spark to detect fraudulent transactions, analyze an individual’s business spending to suggest offers, and identify patterns to decide how much to invest and where to invest
  • 61. E-Commerce: Alibaba uses Spark jobs to analyze large sets of data, such as real-time transaction details and browsing history, and provides recommendations to its users
  • 62. Healthcare: IQVIA is a leading healthcare company that uses Spark to analyze patients’ data, identify possible health issues, and diagnose them based on medical history
  • 63. Entertainment: Entertainment and gaming companies like Netflix and Riot Games use Apache Spark to show relevant advertisements to their users based on the videos that they watch, share, and like
  • 64. Spark Use Case
  • 67. Spark Use Case. Conviva is one of the world’s leading video streaming companies. Video streaming is a challenge, especially with increasing demand for high-quality streaming experiences. Conviva collects data about video streaming quality to give its customers visibility into the end-user experience they are delivering.
  • 69. Spark Use Case. Conviva is one of the world’s leading video streaming companies. Using Apache Spark, Conviva delivers a better quality of service to its customers by removing screen buffering and learning in detail about network conditions in real time. This information is stored in the video player to manage live video traffic coming from 4 billion video feeds every month, to ensure maximum retention.
  • 74. Spark Use Case. Conviva is one of the world’s leading video streaming companies. Using Apache Spark, Conviva has created an auto diagnostics alert. It automatically detects anomalies along the video streaming pipeline and diagnoses the root cause of the issue. It reduces waiting time before the video starts, avoids buffering, and recovers the video from a technical error. The goal is to maximize viewer engagement.
