Master in Data Science-Syllabus
Detailed Syllabus
Key Features:
Experiential Learning
Offline/Online Classes
Case studies and assignments
Hands on Projects
Mentoring Sessions
Job Assistance
Learning Pathway:
• Learn Python programming from scratch
• Statistical and mathematical essentials for Data Science
• Data Science with Python
• Machine Learning
• Natural language processing
• Database
• Django, Flask Application Development
• Data visualization techniques
Program Outcomes:
Deep understanding of data structure and data manipulation.
Understand and use linear and non-linear regression models and classification techniques for data
analysis.
A comprehensive knowledge of supervised, unsupervised, and reinforcement learning models
such as linear regression, logistic regression, clustering, decision trees, naive Bayes, support vector
machines, random forest, K-NN, and K-means.
Gain expertise in mathematical computing using the NumPy and Scikit-Learn packages.
Gain expertise in Exploratory data analysis using pandas, matplotlib and seaborn.
Gain expertise in time series modeling.
Understand deep reinforcement learning techniques applied in Natural Language Processing
Understand the different components of the Hadoop ecosystem and learn to work with HBase, its
architecture and data storage, learning the difference between HBase and RDBMS, and use Hive
for partitioning.
Understand MapReduce and its characteristics
www.smeclabs.com
SMEC
1. Learn Python Program from Scratch
Programming is an increasingly important skill; this program will establish your proficiency in handling
basic programming concepts. By the end of this program, you will understand object-oriented
programming; basic programming concepts such as data types, variables, strings, loops, and functions;
and software engineering using Python. It includes 25+ practice sessions across all modules.
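As a small illustration of the concepts named above (variables, strings, loops, functions, and a class), here is a minimal sketch. The Student class, the grade thresholds, and the sample marks are hypothetical and exist only for this example; they are not taken from the course material.

```python
class Student:
    """A hypothetical record used only to illustrate classes and methods."""

    def __init__(self, name, marks):
        self.name = name          # string attribute
        self.marks = list(marks)  # list of numeric scores

    def average(self):
        # Guard against an empty list to avoid ZeroDivisionError.
        return sum(self.marks) / len(self.marks) if self.marks else 0.0


def grade(score):
    """Map a numeric score to a letter grade using conditional branching."""
    if score >= 80:
        return "A"
    elif score >= 60:
        return "B"
    return "C"


students = [Student("Asha", [90, 85, 77]), Student("Ravi", [55, 62, 70])]

# Loop over the objects and build a name -> grade dictionary.
report = {}
for s in students:
    report[s.name] = grade(s.average())

print(report)
```

The same pattern (a class holding data, a function holding a rule, a loop combining them) scales to larger exercises.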
1.1 Objectives:
• Course Introduction
• Programming
2. Statistical and Mathematical Essentials for Data Science
Statistics is the science of assigning a probability through the collection, classification, and
analysis of data. A foundational part of Data Science, this session will enable you to define statistics
and its essential terms, explain measures of central tendency and dispersion, and comprehend
skewness, correlation, regression, and distributions. Understanding the data is key to performing
exploratory data analysis and justifying your conclusions for a business or scientific problem.
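The measures mentioned above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the hours and scores values are invented, and the pearson and skewness helpers are hypothetical names written for this example (skewness here is the population form, the third central moment divided by the variance to the power 1.5).

```python
import statistics

# Invented sample data: study hours and exam scores for five learners.
hours = [2, 4, 6, 8, 10]
scores = [50, 60, 70, 80, 90]

mean_score = statistics.mean(scores)      # central tendency
median_score = statistics.median(scores)  # central tendency
stdev_score = statistics.stdev(scores)    # dispersion (sample standard deviation)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def skewness(x):
    """Population skewness: third central moment over variance ** 1.5."""
    m = statistics.mean(x)
    n = len(x)
    m2 = sum((a - m) ** 2 for a in x) / n
    m3 = sum((a - m) ** 3 for a in x) / n
    return m3 / m2 ** 1.5


print(mean_score, median_score, round(stdev_score, 3))
print(round(pearson(hours, scores), 3), skewness(scores))
```

Because the invented scores rise in a straight line with the hours, the correlation comes out at 1, and because they are symmetric around the mean, the skewness is 0.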
2.1 Objectives:
2.2 Program Curriculum:
1. Introduction
2. Sample or Population Data?
3. The Fundamentals of Descriptive Statistics
4. Measures of Central Tendency, Asymmetry, and Variability
5. Practical Example: Descriptive Statistics
6. Distributions
7. Estimators and Estimates
8. Confidence Intervals
9. Practical Example: Inferential Statistics
10. Hypothesis Testing: Introduction
11. Practical Example: Hypothesis Testing
12. The Fundamentals of Regression Analysis
13. Assumptions for Linear Regression Analysis
14. Dealing with Categorical Data
15. Practical Example: Regression Analysis
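To make the regression items in this curriculum concrete, the following is a minimal sketch of simple linear regression by ordinary least squares, written in plain Python. The fit_line and r_squared helpers and the sample points are hypothetical; in practice a library such as Scikit-Learn would typically be used instead.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Invented points that lie exactly on y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)
print(slope, intercept, r_squared(x, y, slope, intercept))
```

With noisy real data the R-squared value drops below 1, which is one way to judge how well a linear model fits.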
3.1 Objectives:
Write your first Python program by implementing concepts of variables, strings, functions, loops,
and conditions.
Understand the concepts of lists, sets, dictionaries, conditions and branching, objects and classes.
Work with data in Python such as reading and writing files, loading, working, and saving data
with Pandas
Gain an in-depth understanding of Data Science processes, data wrangling, data exploration, data
visualization, hypothesis building, and testing.
Install the required Python environment and other auxiliary tools and libraries.
Understand the essential concepts of Python programming such as data types, tuples, lists,
dictionaries, basic operators and functions.
Perform high-level mathematical computing using the NumPy package and its vast library of
mathematical functions.
Perform data analysis and manipulation using data structures and tools provided in the Pandas
package.
Gain expertise in Machine Learning using the Scikit-Learn package
Gain an in-depth understanding of supervised learning and unsupervised learning models such as
linear regression, logistic regression, clustering, dimensionality reduction, K-NN and pipeline
Use the matplotlib library of Python for data visualization
Extract useful data from websites by performing web scraping using Python.
1. Python Basics
2. Python Data Structures
3. Python Programming Fundamentals
4. Working with Data in Python
5. Data Science Overview
6. Data Analytics Overview
7. Statistical Analysis and Business Applications
8. Python Environment Setup and Essentials
9. Mathematical Computing with Python (NumPy)
10. Data Manipulation with Pandas
11. Machine Learning with Scikit-Learn
12. Natural Language Processing with Scikit-Learn
13. Data Visualization in Python using Matplotlib
14. Web Scraping with Beautiful Soup
15. Working with NumPy Arrays
4. Machine Learning
This module builds expertise in Machine Learning, a subfield of Artificial Intelligence that automates data
analysis so that computers can learn and adapt through experience to do specific tasks without explicit
programming. You will master Machine Learning concepts and techniques, including supervised and
unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms,
preparing you for a role that requires advanced Machine Learning knowledge.
4.1 Objectives:
Master the concepts of supervised and unsupervised learning, recommendation engine, and time
series modeling
Gain practical mastery over principles, algorithms, and applications of Machine Learning through
hands-on projects
Acquire thorough knowledge of the statistical and heuristic aspects of Machine Learning
Implement models such as support vector machines, kernel SVM, naive Bayes, decision tree
classifier, random forest classifier, logistic regression, K-means clustering and more in Python
Validate Machine Learning models and decode various accuracy metrics. Improve the final
models using another set of optimization algorithms
Comprehend the theoretical concepts and how they relate to the practical aspects of Machine
Learning
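As an illustration of validating a model and reading its accuracy metrics, the sketch below computes accuracy, precision, recall, and F1 for a binary classifier in plain Python. The function names and the label lists are hypothetical; Scikit-Learn provides equivalent metrics, but none of its API is assumed here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented ground-truth labels and predictions from a hypothetical model.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

result = classification_metrics(y_true, y_pred)
print(result)
```

Accuracy alone can mislead on imbalanced data, which is why precision and recall are usually reported alongside it.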
5. Database
A database is an organized collection of structured information, or data, typically stored electronically in
a computer system. A database is usually controlled by a database management system (DBMS).
Company data is stored in databases and later retrieved using Python to build analytics and bring
insight to business problems.
5.1 Objectives:
Understand the fundamentals of SQL databases
Methods to structure and configure your database
Author efficient SQL statements and clauses
Manage your SQL database
5.2 Program curriculum:
1. Introduction to SQL
2. Database Normalization and Entity-Relationship (ER) Model
3. Installation and configuration to set up MySQL
4. Understanding Database and Tables
5. Learn Operators, Constraints, and Data Types
6. Understanding functions, Subqueries, Operators, and Derived Tables in SQL
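The curriculum above can be previewed with a short sketch that creates tables, inserts rows, and runs a join, an aggregate, and a subquery from Python. As an assumption made for portability, it uses the standard-library sqlite3 module with an in-memory database as a stand-in for MySQL; the customers and orders tables and their rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used here as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Tables with data types and constraints.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           amount REAL NOT NULL
       )"""
)

# Parameterized inserts avoid building SQL strings by hand.
cur.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi")])
cur.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                [(1, 120.0), (1, 80.0), (2, 50.0)])
conn.commit()

# Join plus aggregate function: total order amount per customer above a threshold.
cur.execute(
    """SELECT c.name, SUM(o.amount) AS total
       FROM customers AS c
       JOIN orders AS o ON o.customer_id = c.id
       GROUP BY c.name
       HAVING SUM(o.amount) > ?
       ORDER BY total DESC""",
    (60,),
)
big_spenders = cur.fetchall()

# Subquery: customers who have at least one order below 60.
cur.execute(
    """SELECT name FROM customers
       WHERE id IN (SELECT customer_id FROM orders WHERE amount < 60)"""
)
small_order_customers = cur.fetchall()

conn.close()
print(big_spenders, small_order_customers)
```

The SQL statements are standard enough to carry over to MySQL largely unchanged, although the connection code and the parameter placeholder style differ between drivers.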
6. Flask Web Development
Flask is a microframework designed to let developers create and scale web apps quickly and simply.
It is built on WSGI, a standard way for web servers to pass requests to web applications or frameworks.
6.1 Objectives:
Understand the fundamentals of Flask
Installation and creation of a Flask application
The structure and scope of Flask
Deployment
6.2 Program Curriculum:
1. Routing: Flask's routing system maps URLs to Python functions, allowing you to define your
application's URL structure.
2. Templates: Flask uses Jinja2 as its template engine, allowing you to easily generate dynamic HTML
pages.
3. Forms: Flask's form handling makes it easy to process user input and validate data.
4. Sessions and cookies: Flask's session and cookie management features allow you to store user data
and maintain state across multiple requests.
5. Database integration: Flask can be used with a variety of databases, including MySQL.
6. RESTful APIs: Flask is a popular choice for building RESTful APIs due to its simplicity and
flexibility.
7. Extensions: Flask has a wide range of extensions available for adding functionality to your
application, including Flask-WTF, Flask-Security, and Flask-Mail.
8. Flask-Login: Flask-Login is an extension for handling user authentication and authorization in Flask
applications.
9. Flask-RESTful: Flask-RESTful is an extension for building RESTful APIs with Flask, providing
additional features for API development.
10. Deployment: Flask can be deployed to a variety of environments, including traditional web hosting,
cloud-based services, and containers.
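The routing idea from item 1 of this curriculum can be sketched as follows. This is a minimal example that assumes the flask package is installed; the two routes, the course_id parameter, and the level query parameter are hypothetical names chosen for illustration.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/")
def index():
    # The URL "/" is mapped to this Python function.
    return "Hello from Flask"


@app.route("/courses/<int:course_id>")
def course(course_id):
    # <int:course_id> converts the URL segment to an int before the call.
    # request.args holds query-string values such as ?level=advanced.
    level = request.args.get("level", "beginner")
    return jsonify({"course_id": course_id, "level": level})
```

Saved as a module, the app can be started with Flask's development server (for example `flask --app <module name> run`), or exercised without a server through `app.test_client()`.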
7. FastAPI
FastAPI is a modern, high-performance web framework for building APIs with Python. Creating
APIs is an important part of making your software accessible. In machine learning, they allow different
applications to share data and work together, saving time and effort.
7.1 Objectives:
Path Parameters
Query Parameters
Cookie parameters
Handling errors
8. Django Web Development
Django is a Python-based web framework that allows you to quickly create efficient web applications.
It ships with built-in features such as the Django admin interface and a default SQLite3 database,
giving you ready-made components for rapid development.
8.1 Objectives:
Understand the fundamentals of Django
Django’s MVT Architecture and its Influence
Installation and Creation of a Django website
Deployment
4. Forms: Django provides a powerful form handling system, which makes it easy to process user input
and validate data.
5. Admin site: Django comes with a built-in admin site, which provides an easy-to-use interface for
managing your application's data.
6. Authentication: Django provides built-in authentication mechanisms to handle user authentication
and authorization.
7. Middleware: Middleware is a mechanism in Django that allows you to process requests and
responses before they are handled by views.
8. URLs: Django's URL routing system allows you to map URLs to views and organize your
application's URLs.
9. Testing: Django provides a built-in testing framework that makes it easy to write tests for your
application.
10. Deployment: Django can be deployed to a variety of environments, including traditional web
hosting, cloud-based services, and containers.
9. Big Data Analytics
9.1 Objectives:
Understand the concepts of big data in the industry
Explore the data sources of big data and the challenges involved in solving big data problems
Possibilities of big data and the Hadoop framework to solve a wide range of problems
Understand Hadoop's architecture and primary components, such as MapReduce and the Hadoop
Distributed File System (HDFS)
Learn to add and remove nodes from Hadoop clusters, check the available disk space on each
node, and modify configuration parameters
Data analytics with Scala
Big data analytics using Spark
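To give a feel for the MapReduce model named in these objectives, here is a toy word count in plain Python. It runs in a single process and only mimics the map, shuffle, and reduce stages; it is not Hadoop code, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key into a single result."""
    return key, sum(values)


# Invented input records; on a cluster these would be blocks of a large file.
documents = ["big data needs big tools", "data tools scale"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
grouped = shuffle(mapped)
counts = dict(reduce_phase(k, v) for k, v in grouped.items())
print(counts)
```

On a real cluster the map and reduce calls run in parallel on different nodes, and the framework performs the shuffle across the network.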
10.1 Objectives:
10.2 Program Curriculum:
12. Kafka, MQTT and AWS IoT
Kafka and MQTT are two complementary technologies; together they make it possible to build end-to-end
IoT integration from the edge to the data center. MQTT is a widely used, ISO-standard messaging
protocol based on the publish-subscribe pattern. MQTT has many implementations, such as Mosquitto
and HiveMQ, and is mainly used in Internet of Things scenarios (such as connected cars or smart homes).
However, MQTT is not built for high scalability, long-term storage, or easy integration with legacy
systems. Apache Kafka is a highly scalable distributed streaming platform. Kafka ingests, stores,
processes, and forwards high volumes of data from thousands of IoT devices.
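The publish-subscribe pattern described above can be sketched with a toy in-memory broker. This is an assumption-laden illustration only: the Broker class, the topic names, and the list standing in for a durable log are hypothetical, and no real MQTT or Kafka client API is used or implied.

```python
from collections import defaultdict


class Broker:
    """Toy publish-subscribe broker: subscribers register per exact topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber of the topic.

        Returns the number of subscribers that received the message.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = Broker()
log = []  # stands in for a durable, append-only log such as a Kafka topic


def forward_to_log(topic, payload):
    # A bridge component: consume from the lightweight broker, append to storage.
    log.append((topic, payload))


broker.subscribe("sensors/temperature", forward_to_log)
delivered = broker.publish("sensors/temperature", 21.5)
ignored = broker.publish("sensors/humidity", 40)  # no subscriber for this topic

print(delivered, ignored, log)
```

The split mirrors the division of labour in the text: the broker only routes messages to current subscribers, while the log keeps them for later processing.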
12.1 Objectives:
Technologies:
Python:
Introduction to Python and Computer Programming, Data Types, Variables, Basic Input-Output
Operations, Basic Operators, Boolean Values, Conditional Execution, Loops, Lists and List Processing,
Logical and Bitwise Operations, Functions, Tuples, Dictionaries, Sets, and Data Processing, Modules,
Packages, String and List Methods, Exceptions, File Handling, Regular Expressions,
Database, the Object-Oriented Approach: Classes, Methods, Objects, and the Standard Objective
Features; Exception Handling, and Working with Files.
Matplotlib:
Scatter plots, Bar charts, Histograms, Stack charts, Legend, Title, Style, Figures and
subplots, Plotting functions in pandas, Labelling and arranging figures, Saving plots.
Seaborn:
Style functions, Color palettes, Distribution plots, Categorical plots, Regression plots, Axis
grid objects.
NumPy
Creating NumPy arrays, Indexing and slicing in NumPy, Downloading and parsing data,
Creating multidimensional arrays, NumPy data types, Array attributes, Indexing and slicing, Creating
array views and copies, Manipulating array shapes, I/O.
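Several of the NumPy topics listed above (array creation, attributes, slicing, reshaping, views versus copies) fit in one short sketch. It assumes the numpy package is installed; the variable names and values are arbitrary.

```python
import numpy as np

a = np.arange(12)        # one-dimensional array holding 0..11
m = a.reshape(3, 4)      # reshaping a contiguous array returns a view, not a copy

print(m.shape, m.ndim, m.dtype)   # array attributes

row_copy = m[1].copy()   # explicit copy: owns its own data
row_view = m[1]          # basic indexing/slicing returns a view onto the same data

row_view[0] = 100        # writes through to both m and a
row_copy[1] = -1         # affects only the copy

print(a[4], m[1, 1], row_copy)
print(np.shares_memory(a, row_view), np.shares_memory(a, row_copy))
```

Views avoid copying large arrays, but an unintended write through a view is a common source of bugs, so calling `.copy()` is the safer choice whenever the original must stay unchanged.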
Pandas:
Using multilevel series, Series and DataFrames, Grouping, Aggregating, Merging DataFrames,
Generating summaries, Grouping data into logical pieces, Manipulating dates, Creating metrics for analysis,
Data wrangling, Merging and joining, Data Munging using Pandas, Building a Predictive Model.
Flask
Django
Django Installation, Django Project, Django Admin Interface, App, MVT, Model, Views,
Templates, Files Handling, Forms, Validation, File Upload, Database Connectivity, Database
Migrations, Django Middleware, Request and Response, Django Exceptions, Django Session,
Django Redirects
Scikit-learn:
Scikit Learn Overview, Plotting a graph, Identifying features and labels, Saving and
opening a model, Classification, Train / test split, What is KNN? What is SVM?, Linear regression,
Logistic vs linear regression, KMeans, Neural networks, Overfitting and underfitting, Backpropagation,
Cost function and gradient descent, CNNs
Keras:
Introduction to Deep Learning, Biological Neural Networks, Artificial Neural Networks,
Activation Functions, Introduction to Deep Learning Libraries, Regression Models with Keras,
Classification Models with Keras, Deep Neural Networks, Convolutional Neural Networks, Recurrent
Neural Networks.
TensorFlow
NLTK
MySQL
MySQL – Introduction, Installation, Create Database, Drop Database, Selecting Database, Data
Types, Create Tables, Drop Tables, Insert Query, Select Query, WHERE Clause, Update Query,
DELETE Query, LIKE Clause, Sorting Results, Using Joins, Handling NULL Values, ALTER
Command, Aggregate functions, MySQL Clauses, MySQL Conditions.
MongoDB
No Schema, Install MongoDB, How MongoDB Works, Insert First Data, CRUD
Operations, Insert Many, Update and Update Many, Delete and Delete Many, Diving Deep into find,
Difference between Update and Update Many, Projection, Intro to Embedded Documents, Embedded
Documents in Action, Adding Arrays, Fetching Data from Structured Data, Schema Types, Types
of Data in MongoDB, Relationships between data.
Web Scraping:
Java
Features of Java, Java basics, if statement, Loops, Arrays, Switch, Methods, defining a
class, Access Modifiers, Scope and lifetime of variables, Creating an Object, Object invocation,
Method Overloading, Constructor, this, Inheritance, overriding, super, final, Local Classes, Anonymous
Classes, Static classes, Inner class, Nested Classes, Abstract class, Interfaces, Packages, Access control,
Basic java.lang Package, Exception Handling, java.util Package, Collection Frameworks, I/O and
streaming, DBMS & RDBMS, JDBC
Hadoop
Introduction to Big Data and Hadoop, Commands to monitor the cluster, Hadoop Architecture,
Distributed Storage (HDFS) and YARN, What is HDFS, Need for HDFS, Regular File System vs HDFS,
Characteristics of HDFS, HDFS Architecture and Components, High Availability Cluster
Implementations, HDFS Component File System Namespace, Data Block Split, Data Replication
Topology, HDFS Command Line, Demo: Common HDFS Commands.
HBase
HBase Overview, Data Model, Configuration, Shell, Write, MemStore, General Commands,
Creating a Table using HBase Shell, Creating a Table Using API, Listing a Table using HBase Shell,
Listing Tables Using API, Enabling a Table, Describe & Alter, Drop a Table, Create Data, Update Data,
Read Data, Delete Data.
Hive
Introduction to Hive, Hive SQL over Hadoop MapReduce, Hive Architecture, Interfaces to
Run Hive Queries, Running Beeline from Command Line, Hive DDL and DML, Creating New Table,
Data Types, File Format Types, Data Serialization, Hive Table and Avro Schema, Hive Optimization:
Partitioning, Bucketing, and Sampling, Data Insertion, Data Representation and Import Using Hive.
Scala
Basics of Functional Programming and Scala, Functional Programming, Programming With
Scala, Basic Literals and Arithmetic Programming, Logical Operators, Arrays, Lists, Tuples, Sets, Maps,
Type Inference, Classes, Objects, Functions in Scala, Type Inference, Functions, Anonymous Function
and Class, Exception Handling, File Operations
Apache Spark
Apache Kafka
AWS IoT
Configuring and deploying AWS for IoT data stream processing with services such
as AWS IoT Core, MQTT message streaming to AWS IoT, AWS IoT Device Management and AWS
IoT Analytics, Understanding AWS IoT APIs and SDKs, Industry use cases and applications in Data
Science pipelines