DS - Program Curriculum
Data Science
Introduction
Understanding UpGrad Coding Console
Understanding Primary Actions
Understanding Statuses & Important Pointers
Introduction
Getting Started - Installation
Introduction to Jupyter Notebook
The Basics
Introduction to Python
Sharpen your Data Analysis skills with Python, which is the language of choice for simplicity, readability and quick deployment.
Data Structures in Python
Lists
Tuples
Dictionaries
Sets
Introduction
If-Elif-Else
Loops
Control Structures and Functions
Comprehensions
Functions
Map, Filter, and Reduce
Introduction
NumPy Basics
Creating NumPy Arrays
Structure and Content of Arrays
Introduction to NumPy
Subset, Slice, Index and Iterate through Arrays
Multidimensional Arrays
Computation Times in NumPy and Standard Python Lists
Introduction
Basic Operations
Operations on NumPy Arrays
Operations on Arrays
Introduction
Reading Delimited and Relational Databases
Reading Data From Websites
Getting and Cleaning Data
Getting Data From APIs
Reading Data From PDF Files
Cleaning Datasets
Python for Data Science: Assignment
An assignment based on the concepts learnt in Python for Data Science.
Python Assignment
Honesty Pledge
Assignment: Problem Statement
Assignment Rubrics
Assignment: Submission
Introduction
Defining Data Warehouse
Structure of Data Warehouse
Database design
OLAP vs. OLTP
Star Schema
How to Use a Star Schema - A Demonstration
Data Warehouse Schema - Industry Example
Introduction
Adding and Deleting Columns
Changing Column Name and Data Type
Creating Table from existing table
Updating Table
DATA SCIENCE TOOLKIT
Introduction
Introduction to User defined Functions
User Defined Functions and Stored Procedures
User Defined Functions (Application)
Introduction to Stored Procedures
Stored Procedures (Application)
Introduction
Optimisation in Select Clause
Optimisation in Where Clause
Query Optimisation
Optimisation in Group by and Order by
Optimisation in Joins
Optimisation in Window Function
Assignment
Apply the basics of investing and your knowledge of Data Science to determine when to buy and sell a stock.
SQL Assignment - Stock Market Analysis
Problem Introduction
Data Set
Grading Criteria
Submission
Introduction
Understanding the Excel Interface
Slicing and Dicing Data - Sort and Filter
Report Making I: Basic Formatting
Introduction
Delimited Files
Discovering Shortcuts
Introduction to Formulae
Data Analysis in Excel
Taught by one of the most renowned data scientists in the country (S. Anand, CEO, Gramener), this module takes you from a beginner-level Excel user to an almost professional user.
Data Analysis in Excel - I
Complex Functions
Cell Referencing and Text Functions
Logical Formulae
Anand's Anecdotes
Creating and Formatting Charts
Types of Charts
Anecdotes - II
Introduction
Creating a Pivot Table
Analysing Data in a Pivot Table
Filtering Data in a Pivot Table
Anand's Anecdotes - Pivot Tables
Data Analysis in Excel - II
VLOOKUP - Linking Data from multiple files & tables
Anand's Anecdotes - VLOOKUP
Common Errors in Excel
Anand's Anecdotes
Ungraded Assignment
Introduction
Data Formats and Tableau Interface
Introduction
Bar Charts
Visualisation using Tableau
Learn an important and widely used tool for Data Analysts - Tableau.
Visualising and Analysing Data in Tableau
Scatter Plots and Pie Charts
Tree Maps
Dual Axes Charts
Introduction
Histograms
Box Plots
Visualising and Analysing Data with Tableau - II
Area Maps
Calculations in Tableau
Dashboard and Stories
Introduction
Analytics Problem Solving
This module covers concepts of the CRISP-DM framework for business problem solving.
The CRISP-DM Framework - Business and Data Understanding
Define the Business Problem - Business Understanding
Owning an IPL Team - Business Understanding
Understanding Raw Data
Preparing Data for Analysis
Objectives
Downloads
Investment Case Study
Apply the basics of investing and your knowledge of Data Science to determine when to buy and sell a stock.
Investment Case Group Project
Checkpoints - Part 1
Checkpoints - Part 2
Evaluation Rubric
Final Submission
PG Program in
Data Science
Introduction to Module
Basics of Probability
Joint Probability and Conditional Probability
Bayes' Theorem
Assessments I
Inferential Statistics - Practice Session
Standardized Normal Distribution and Z-Score
Assessments II
Introduction to Sampling Methods
Sampling and Estimation
Assessments III
Introduction
Understanding Hypothesis Testing
Concepts of Hypothesis Testing - I
Null and Alternate Hypotheses
Making a Decision
Critical Value Method
Critical Value Method - Examples
Introduction
p-value Method
Concepts of Hypothesis Testing - II
p-value Method - Examples
Types of Errors
Introduction
Hypothesis Testing - Additional Resources
Z-Test
T-Test
Chi-Square Test
P-Value Approach
F-Test
General Guidelines
Assignment: Statistics
Assignment based on the concepts learnt in Inferential Statistics and
Assignment
Problem Statement
STATISTICS AND EXPLORATORY DATA ANALYSIS
Introduction to EDA
Introduction
Public and Private Data
Data Sourcing
Private Data
Public Data
Public Data Exercise
Introduction
Fixing Rows and Columns
Missing Values
Data Cleaning
Standardising Values
Invalid Values
Filtering Data
Introduction
Data Description
Unordered Categorical Variables - Univariate Analysis
Univariate Analysis
Ordered Categorical Variables - Univariate Analysis
Introduction
Bivariate Analysis on Continuous Variables
Bivariate Analysis
Business Problems Involving Correlation
Practice Questions
Bivariate Analysis on categorical variables
Introduction
What are Derived Metrics?
Types of Derived Metrics: Type Driven Metrics
Derived Metrics
Types of Derived Metrics: Business Driven Metrics
Practice Questions
Types of Derived Metrics: Data Driven Metrics
Course Overview
Introduction: Data Visualisation
Introduction
Data Visualisation Toolkit
Introduction
Uber Supply-Demand Gap
and solve Uber supply-demand gap problem
Uber Supply-Demand Gap
Evaluation Rubric
Submission
Course Wrap - EDA and Statistics
Course Wrap - EDA and Statistics
Additional resources
Here, you will find all the additional content for the course as and when it is added to this module.
Basics of Probability
Pre-Reads
Optional Questions
Discrete Probability Distributions
Pre-Reads
Optional Questions
Exploratory Data Analysis
Power Law
Recommended Additional Content
Election Data: Case Study
Introduction
Introduction to Machine Learning
Regression Line
Simple Linear Regression
Best Fit Line
Strength of Simple Linear Regression
Simple Linear Regression in Python
Coding Practice - Simple Linear Regression
Introduction
Multiple Linear Regression
Modelling in Python - I
Modelling in Python - II
Housing Case Study
Derived Variables
Linear Regression
Regression helps us to determine the strength of the relationship between one dependent variable and a series of other changing variables.
Multiple Linear Regression
VIF - Variance Inflation Factor
Housing Case Study Predictions
Variable Selection Using RFE
Assumptions of Linear Regression
Feature Selection
Coding Practice - Building a Multiple Linear Regression Model
Introduction
Linear Regression: Revision
Prediction vs Projection
Media Company Case Study
Introduction
Making Predictions
Model Building - Coding Exercise
Model Evaluation
Introduction
Commonly Faced Challenges in Implementation of Logistic Regression
Logistic Regression: Industry Applications - Part II
Model Evaluation (A Second Look)
Model Validation and Importance of Stability
Tracking of Model Performance Over Time
Introduction
Understanding Clustering
Introduction to Clustering
Practical Example of Clustering - Customer Segmentation
Introduction
Steps of the Algorithm
K Means Algorithm
K Means as Coordinate Descent
K Means Clustering
K Means++ Algorithm
Visualising the K Means Algorithm
Practical Consideration in K Means Algorithm
Cluster Tendency
Introduction
Data Preparation
Introduction
K-Mode Clustering
K-Mode in Python
Other Forms of Clustering
K-Prototype in Python
DBSCAN Clustering
Practice Question
Gaussian Mixture Model
Introduction
The Why And What of PCA
Building Blocks of PCA
Illustration - Finding Principal Components
Unsupervised Learning: Principal Component Analysis
This module will cover the concepts of PCA, which is an unsupervised machine learning technique mainly used in dimensionality reduction. It will also cover practical applications of PCA in Python and real
Principal Component Analysis
Comprehension - Calculating the Principal Components
Singular Value Decomposition
SVD Example - Image Compression
Practice Questions
Introduction
PCA: Python Implementation
PCA in Python
Practical Considerations and Alternatives
Optional Assignment (MNIST Dataset)
Comprehension: PCA, SVD and Eigenvectors
HR Analytics Case Study
Use your skills to predict which employee is going to leave the company in the near future.
HR Analytics Case Study
Problem Statement
Evaluation Rubric
Submission
Introduction
Introduction to SVM
Concept of a Hyperplane in 2D
SVM - Maximal Margin Classifier
Practice Questions
Concept of a Hyperplane in 3D
Maximal Margin Classifier
Introduction
The Soft Margin Classifier
The Slack Variable
SVM - Soft Margin Classifier
Comprehension-1: Notion of Slack Variables
Support Vector Machine
Learn the fundamentals of SVMs and use them to detect spam emails, recognise alphabets and more!
Cost of Misclassification
SVM R-Lab
Introduction
Introduction to Kernels
Mapping Nonlinear Data to Linear Data
Feature Transformation
Kernels
The Kernel Trick
R Lab - Kernels
Shiny App - Types of kernels
Choosing a Kernel Function
Letter Recognition Using SVM
Assignment - Support Vector Machine
Use image classification to identify handwritten digits.
Assignment - Support Vector Machine
Problem Statement
Evaluation Rubric
Submission
Introduction
Introduction to Decision Trees
Interpreting a Decision Tree
Introduction to Decision Trees
Comprehension - Decision Tree Classification in Python
Regression with Decision Trees
Introduction
Concept of Homogeneity
Introduction
Tree Model (Optional)
Tree models represent the way we make decisions. Learn how decisions are made in this powerful classification algorithm.
Advantages and Disadvantages
Tree Truncation
Tree Pruning
Truncation and Pruning
Building Decision Trees in Python
Choosing Tree Hyperparameters in Python
Coding Practice Questions
Comprehension - Hyperparameters
Introduction
Ensembles
Comprehension - Ensembles
Creating a Random Forest
Random Forests
Comprehension - OOB (Out-of-Bag) Error
Comprehension - Time Taken to Build a Random Forest
Random Forests Lab
Coding Practice Questions
Introduction
Introduction to Boosting
Understanding Stationarity
Understanding White Noise
Time Series* (Optional)
In this module, you will learn how to analyse and forecast a series that varies with time.
ACF & PACF Plots
Working with Stationary Time Series
AR & MA Modelling
ARMA Modelling
Model Evaluation
Introduction
Introduction to Model Selection
Model and Learning Algorithm
Principles of Model Selection
Simplicity, Complexity and Overfitting
Bias-Variance Tradeoff
Comprehension - Bias Variance Tradeoff
Model Selection
You are preparing for a competitive exam. Should you learn some tricks for it or focus on the fundamentals? Model Selection has the answer.
Regularization
Introduction
Regularization and Hyperparameters
Model Evaluation and Cross Validation
Model Evaluation
Model Evaluation: Python Demonstration-I
Model Evaluation: Python Demonstration-II
Cross-Validation: Motivation
Cross-Validation: Python Demonstration
Cross-Validation: Hyperparameter Tuning
Introduction
Understanding the Business Problem
Comprehension - Logistic Regression
Comparing Different Machine Learning Models - I
Model Selection - Practical Considerations
Given a business problem, how do you choose the best algorithm? Learn a few practical tips for doing this here.
Model Selection - Best Practices
Comparing Different Machine Learning Models - II
Pros and Cons of Different Machine Learning Models
End-to-End Modelling - I
CART and CHAID Trees
Choosing between Trees and Random Forests - I
Choosing between Trees and Random Forests - II
End-to-End Modelling - II
Introduction
Generalized Regression
Generalized Regression Framework-1
Generalized Linear Regression
Generalized Regression Framework-2
Systems of Linear Equations
Generalized Regression Framework-3
Generalized Regression in Python
Introduction
Regularized Regression
Ridge and Lasso Regression - I
Advanced Regression
This course takes a more advanced look at linear regression models.
Ridge and Lasso Regression - II
Ridge and Lasso Regression in Python
Model Selection Criteria-I
Course Introduction
Introduction to Big Data
Understand the big data ecosystem and the various types of job roles in the industry.
Fundamentals of Big Data
Understanding Big Data
Identifying Big Data
Conventional Data Processing Systems and Big Data
Introduction
History of Hadoop
Distributed Computing
Hadoop Terminologies
Master and Slave
Big Data Storage in Hadoop
Hadoop Distributed File System
Interaction Between Nodes in a Hadoop Cluster
Advantages of Distributed File Systems
Big Data Storage and Processing Framework - Hadoop
Learn the basics of Hadoop and its architecture - a distributed computing platform.
Comprehension - Hadoop Distributed File System (HDFS)
Introduction
YARN - Yet Another Resource Negotiator
MapReduce
Introduction
Data Ingestion with Apache Sqoop
Advantages and Industry Use Cases of Sqoop
How Sqoop Performs Import
Comprehension: How Sqoop Import Works
Creating an RDS
Introduction to Apache Sqoop
Migrating Databases to the RDS
Running Sqoop in AWS
Adding a MySQL Connector
Sqoop Commands: Listing Databases and Tables
Sqoop Commands: Import and Import-All-Tables
Sqoop Commands: Job and Eval
Introduction
Introduction to Apache Hive
Key Features of Apache Hive
Use Cases of Apache Hive
The Hive Metastore
Big Data Ingestion and Processing
In big data ingestion and processing, learn to use various tools for getting and processing data.
Introduction to Apache Hive
Hive Data Models
Creating Tables in Hive
BIG DATA
Introduction
Partitions
Hive Data Models - Partitions and Buckets
Creating and Querying Partitioned Tables
Buckets
Comprehension: Data Models (Graded Assessment)
Introduction
File Formats in Apache Hive
File Formats in Apache Hive
ORC and Compression Algorithms
Introduction
EDA and UDFs in Hive
Advanced Data Analysis in Hive
Advanced Data Analysis using Hive
Basic Text Analysis using Hive
Handling Complex Data Types using Hive
Introduction
Big Data Processing using Apache Spark
Learn Apache Spark, the newest big data framework with unprecedented performance and ease of use.
Concepts and Fundamentals of Spark
Overview of Spark
Spark vs MapReduce
Resilient Distributed Datasets (RDDs)
In-memory Processing
RDD Operations
Programming & Debugging in PySpark
Introduction: Setting Up
Schema-on-Read v/s Schema-on-Write
Comparing Spark With Hive
Working with Spark
Analysis with Spark - I: Reading & Summarising Data
Analysis with Spark - II: Plotting Data
Analysis with Spark - III: Filtering & Grouping
Analysis with Spark - IV: Model-building
Practice Analysis: Airlines Data
MLlib - I: An Overview
MLlib - II: Preparation for Model Building
MLlib - III: Building ML models
PySpark: An Alternative Library to PySpark
Solution to PySpark Practice Questions
Hive LLAP
Introduction
Understanding the Healthcare Market
Understanding the Healthcare Domain
With all the necessary DA knowledge, it is time to get into the domain details. Learn about the healthcare landscape in the US.
Introduction to the Healthcare Space
Stakeholders of the Primary Healthcare Ecosystem: Process
Stakeholders of the Primary Healthcare Ecosystem: Drivers and Metrics
Stakeholders of the Secondary Healthcare Ecosystem: Process
Stakeholders of the Secondary Healthcare Ecosystem: Drivers and Metrics
Other Stakeholders of the Healthcare Ecosystem
Introduction
Analytics Related to Patient-Physician Interactions
Clinical Decision Support Systems
Analytics Related to Patient-Hospital Interactions
Provider Analytics
In this module, you will explore the different analytics opportunities that exist in the healthcare provider space.
Provider Analytics
Management of Patient Traffic - I
Management of Patient Traffic - II (Comprehension)
Management of Patient Traffic - III
Hospital Performance Analysis - I
Hospital Performance Analysis - II (Comprehension)
Hospital Performance Analysis - III
Hospital Compare
Introduction
Payers in the US
Types of Health Insurance
Types of Insurance Plans
Benefits
Getting Familiar with the US Payer Market
Analytics Opportunities in Benefits
Coordination of Benefits
ELECTIVE - HEALTHCARE
Provider Management - I
Provider Management - II
Pay for Performance (P4P)
Analytics Opportunities in Provider Management
Payer Analytics
In this module, you will explore the different analytics opportunities that exist in the healthcare payer space.
Introduction
Life Cycle of a Health Insurance Claim
Healthcare Coding
Claims Adjudication
Analytics Opportunities in Claims Management
Analytics to Detect Fraudulent Claims
Claims and Care Management
Care Management
Care Management Framework
Risk Stratification
Evaluating a Care Management Program
Accountable Care Organisations (ACOs)
Analytics Opportunities in Care Management
Introduction
Pharmaceutical Market Overview
Drug Development Life Cycle
Areas of Analytics in Pharma
Drug Development and Sales Analytics
Pharmaceutical-Selling Process
Field Activity
Analytics in Sales
Analytics in the Pharmaceutical Industries
Learn how pharmaceutical companies harness the power of data analytics.
Sales Data
Customer Segmentation
Introduction
Structure of a Marketing Organisation
Multichannel Marketing (MCM) Management
Marketing Analytics
Patient Journey Analytics
Analytics Opportunities in Commercial Operations
Market Forecasting
Course Wrap for Healthcare
Get a brief overview of how all that you have studied in the healthcare domain finds application in the real world.
Course Wrap
Healthcare Course Wrap by Prof. RC
Interview tips by Rohit
Capstone Project
Decipher the CMS hospital star rating system using supervised and unsupervised models.
Capstone - Healthcare
Problem Statement
Mid Submission
Final Submission
Introduction
Business of ecommerce
Introduction to E-commerce
Get acquainted with the various applications of Data Analytics in the E-commerce business.
Inventory Management
Marketing in ecommerce
Data Analytics in ecommerce
Improving User Experience
Fraud Detection
Shipment Delivery
Customer Feedback
Introduction
Understanding Recommendation Systems
Content Based Filtering
Recommendation Systems
Learn about the algorithms that power the recommendation engines of the E-commerce sites.
Recommendation Systems
User Based Collaborative Filtering
Item Based Collaborative Filtering
Issues in Recommendation Systems
Recommender System in Python
Introduction
Understanding Price Markup & Markdown
Introduction
What is Market Mix Modelling (MMM)?
Introduction
Market Mix Modelling
Learn how to optimise your marketing spends in order to maximise the ROI.
Modelling the Advertising Effects - Part I
Modelling the Advertising Effects - Part II (Optional)
Modelling the Advertising Effects - Part III
Introduction
A/B Testing (Optional)
Understand the concept behind A/B testing and also learn how to execute an A/B test in Optimizely.
A/B Testing
Understanding A/B testing
Steps in A/B testing
Setting up an A/B Test in Optimizely
Introduction
Banking Products - Deposits & Lending
Introduction to Banking and Financial Services
Learn how banks make money through various banking products and also understand the customer lifecycle.
How Banks Make Money
Profitability of Credit Cards
P&L of Banks and Financial Institutions
Introduction
Customer Lifecycle
Customer Lifecycle
Customer Lifecycle - Acquisition Analytics
Customer Lifecycle - Engagement Analytics
Customer Lifecycle - Risk Analytics
Introduction
Introduction
Engagement Analytics Framework
Cross-Selling Strategies
Engagement Strategies
Types of Cross-Selling
ELECTIVE - BFSI
Cross-Selling Opportunities
Engagement Analytics
Now that you have learnt how to acquire customers, learn how to engage them and prevent their attrition.
Customer Lifetime Value (CLV)
Cross-Selling Lab
Introduction
Cross-Selling - Business Objectives
Cross-Selling Analysis
Introduction
Types of Attrition
Retention and Loyalty Management
Attrition - Credit Card
Interpreting a Credit Card Attrition Model
Introduction
Regulatory Risk Analytics (Optional)
Regulatory Risk Analytics - A Brief Introduction