Data Engineer Master Program v2
MASTER’S PROGRAM
In collaboration with IBM
www.simplilearn.com
Contents
About the Course
Program Outcomes
Courses
Electives
Certificates
About the Course
Key Features
About IBM and Simplilearn collaboration
About Simplilearn
Simplilearn is a leader in digital skills training, focused on the emerging
technologies that are transforming our world. Our blended learning approach
drives learner engagement and is backed by the industry’s highest completion
rates. Partnering with professionals and companies, we identify their unique
needs and provide outcome-centric solutions to help them achieve their
professional goals.
Learning Path - Data Engineer
PySpark Training
Big Data and Hadoop Administrator
MongoDB Developer and Administrator
Apache Cassandra
Apache Spark and Scala
Electives
• Scala for Data Science
• Spark for Scala Analytics
• Simplifying Data Pipelines with Apache Kafka
Big Data Engineer Master’s Program Outcomes
• Master tools and skills including Data Model Creation, Database Interfaces, Advanced Architecture, Spark, Scala, RDD, SparkSQL, Spark Streaming, Spark ML, GraphX, Sqoop, Flume, Pig, Hive, Impala, and Kafka Architecture.
• Learn how Kafka is used in the real world, including its architecture and components, get hands-on experience connecting Kafka to Spark, and work with Kafka Connect.
Who Should Enroll in this Program?
IT professionals
Big Data for Data Engineering
This introductory course from IBM will teach you the basic concepts and
terminologies of Big Data, and its real-life applications across multiple
industries. You will gain insights on how to improve business productivity
by processing large volumes of data and extracting valuable information
from them.
Key Learning Objectives
• Understand what Big Data is, sources of Big Data, and real-life examples
• Learn about the key difference between Big Data and Data Science
• Master how to use Big Data for operational analysis and better customer service
• Know the ecosystem of Big Data and the Hadoop framework
Course curriculum
Lesson 1 - What is Big Data?
Data Engineering with Hadoop
Apache Hadoop is one of the most in-demand technologies for analyzing
Big Data. This introductory Hadoop course by IBM will give you an
overview of what Hadoop is and its components, such as MapReduce and
HDFS. Additionally, this course will teach you to work with large data
sets and use Hadoop’s method of distributed processing.
Key Learning Objectives
• Understand Hadoop’s architecture and primary components, such as MapReduce and Hadoop Distributed File System (HDFS)
• Add and remove nodes from Hadoop clusters, check the available disk space on each node, and modify configuration parameters
• Learn about Apache projects that are part of the Hadoop ecosystem, including Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, Flume, and more
Course curriculum
Lesson 1 - Introduction to Hadoop
Data Engineering with Scala
Kickstart your learning of Scala with this introductory course and
familiarize yourself with Scala programming. This course is carefully
crafted by IBM. Upon completion, you will be able to write your own Scala
code, perform Big Data analysis using Scala, and create your own Scala
projects.
Key Learning Objectives
• Create your own Scala project
• Understand basic object-oriented programming methodologies in Scala
• Work with data in Scala, including pattern matching, applying synthetic methods, and handling options, failures, and futures
Course curriculum
Lesson 1 - Introduction
Lesson 4 - Collections
Big Data Hadoop and Spark Developer
Simplilearn’s Big Data Hadoop Training Course helps you master Big
Data and Hadoop ecosystem tools, such as HDFS, YARN, MapReduce,
Hive, Impala, Pig, HBase, Spark, Flume, Sqoop, and the Hadoop frameworks,
along with other concepts of the Big Data processing life cycle. Throughout
this online instructor-led Hadoop training, you will work on real-time
projects in retail, tourism, finance, and other domains. This Big Data course
also prepares you for Cloudera’s CCA175 Big Data certification.
Key Learning Objectives
• Learn how to navigate the Hadoop ecosystem and understand how to optimize its use
• Ingest data using Sqoop, Flume, and Kafka
• Implement partitioning, bucketing, and indexing in Hive
Course curriculum
Lesson 1 - Introduction to Big Data and Hadoop
Lesson 3 - Data Ingestion into Big Data Systems and ETL
Python for Data Science
Kickstart your learning of Python for Data Science with this introductory
course and familiarize yourself with programming. This course is carefully
crafted by IBM. Upon completion, you will be able to write your own Python
scripts, perform fundamental hands-on data analysis using the Jupyter-based
lab environment, and create your own Data Science projects using
IBM Watson.
Key Learning Objectives
• Write your first Python program by implementing concepts of variables, strings, functions, loops, and conditions
• Understand the nuances of lists, sets, dictionaries, conditions and branching, and objects and classes
• Work with data in Python, such as reading and writing files, and loading, working with, and saving data with Pandas
Course curriculum
Lesson 1 - Python Basics
PySpark Training
PySpark Training will provide an in-depth overview of Apache Spark, the
open-source query engine for processing large datasets, and how to
integrate it with Python using the PySpark interface. The course will show
you how to build and implement data-intensive applications as you dive
into the world of high-performance machine learning leveraging Spark
RDD, Spark SQL, Spark MLlib, Spark Streaming, HDFS, Sqoop, Flume,
Spark GraphX, and Kafka.
Key Learning Objectives
• Understand how to leverage the functionality of Python as you deploy it in the Spark ecosystem
• Master Apache Spark architecture and how to set up a Python environment for Spark
• Learn about various techniques for collecting data, how RDDs contrast with DataFrames, how to read data from files and HDFS, and how to work with schemas
Course curriculum
Lesson 1 - A brief primer on PySpark
Big Data and Hadoop Administrator
This Big Data and Hadoop Administrator training course will equip you
with the skills and methodologies necessary to excel in the Big Data
Analytics industry. With this Hadoop Admin training, you’ll learn to work
with the adaptable, versatile frameworks based on the Apache Hadoop
ecosystem, including Hadoop installation and configuration, and cluster
management with Sqoop, Flume, Pig, Hive, Impala, and Cloudera. You’ll
learn Big Data implementations that have security, speed, and scale.
Key Learning Objectives
• Understand the fundamentals and characteristics of Big Data and the various scalability options available to help manage huge quantities of data
• Master the concepts of the Hadoop framework, including architecture, the Hadoop distributed file system, and deployment of Hadoop clusters using core or vendor-specific distributions
• Work with Hadoop clients, client nodes, and web interfaces like HUE to interact with a Hadoop cluster
• Use cluster planning and tools for data ingestion into Hadoop clusters, and carry out cluster monitoring activities
Course curriculum
Lesson 1 - Big Data and Hadoop Introduction
MongoDB Developer and Administrator
Become an expert MongoDB developer and administrator by gaining
an in-depth knowledge of NoSQL and mastering the skills of data modeling,
ingestion, querying, sharding, and data replication. The course includes
industry-based projects in the e-learning and telecom domains. It is best
suited for database administrators, software developers, system
administrators, and analytics professionals.
Key Learning Objectives
• Develop expertise in writing Java and Node.js applications using MongoDB
• Master the skills of replication and sharding of data in MongoDB to optimize read/write performance
Course curriculum
Lesson 1 - Introduction to NoSQL databases
Apache Cassandra
This Apache Cassandra certification training will develop your expertise
in working with the high-volume Cassandra database management system
as part of the Big Data Hadoop framework. With this Cassandra training,
you will learn Cassandra concepts, features, architecture, and data
model, and how to install, configure, and monitor open-source databases.
The Cassandra course is ideal for software developers and analytics
professionals who wish to further their careers in the Big Data field.
Key Learning Objectives
• Describe the need for Big Data and NoSQL
• Explain the fundamental concepts of Cassandra and its architecture
• Describe the architecture of Cassandra
• Demonstrate data model creation in Cassandra
Course curriculum
Lesson 1 - Introduction to Big Data and NoSQL Databases
Course curriculum
Lesson 1 - Introduction to Spark
Elective Course
Certificates
Upon completion of this Master’s Program, you will receive certificates
from IBM and Simplilearn for the Big Data Engineer courses in the learning path.
These certificates will testify to your skills as an expert in Data Engineering.
Upon program completion, you will also receive an industry-recognized Master’s
Certificate from Simplilearn.
Advisory board member
USA
INDIA