

Statistical Machine Learning Syllabus

COURSE CONTEXT

SCHOOL                                     SCSET
DEPARTMENT
DEGREE                                     B. Tech.
VERSION NO. OF CURRICULUM/SYLLABUS
THAT THIS COURSE IS A PART OF              V1
DATE THIS COURSE WILL BE EFFECTIVE FROM    July-Dec, 2023
VERSION NUMBER OF THIS COURSE              2

COURSE BRIEF

COURSE TITLE      Statistical Machine Learning
COURSE CODE       CSET211
COURSE TYPE       Specialized Core-I
PRE-REQUISITES    NA
TOTAL CREDITS     4
L-T-P FORMAT      3-0-2

COURSE SUMMARY
This course covers machine learning concepts, statistical theories, supervised learning, high-dimensional data
and the role of sparsity, learning theory, risk minimization, classification and regression, and the EM algorithm. It
also covers important topics such as parametric and non-parametric methods, the theory of generalization,
regularization, the role of sparsity in high-dimensional data, and surrogate loss functions. In a broader sense, the
course offers a thorough understanding of statistical ML concepts that helps students design and implement
everyday learning applications.

COURSE-SPECIFIC LEARNING OUTCOMES (CO)


By the end of this course, students should have the following knowledge, skills and values:
CO1: To articulate key features and methods of Statistical Machine Learning (SML).
CO2: To formulate and design the given application as a statistical machine learning problem.
CO3: To implement and evaluate common statistical machine learning techniques.

How are the above COs aligned with the Program-Specific Objectives (POs) of the degree?
The course outcomes are aligned with inculcating inquisitiveness in understanding cutting-edge areas of
computer science engineering and allied disciplines, together with their potential impacts.

CO - PO Mapping

COs  POs PO1 PO2 PO3 PO4 PO5 PO6


CO1

CO2

CO3

Detailed Syllabus
Module 1 (8 hours)
Designing a learning system, Types of machine learning: Problem based learning, Supervised learning,
Unsupervised learning, Reinforcement learning, Linear Regression: Weights and Features,
Applications, Cost Functions, Finding best fit line, Gradient Descent Algorithm: Learning Algorithm,
First order derivatives, Linear regression using gradient descent, Learning theory, Risk minimization,
Learning rate, Logistic Regression, Sigmoid Function, Cost Function for Logistic Regression,
Multi-class classification, Probability Distribution, Polynomial Regression, Performance Metrics:
Classification (Confusion Matrix, Accuracy, Precision, Recall, F1-score, ROC-AUC), Regression
(MSE, MAE, RMSE, R2 Score).
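
The Module 1 topics of cost functions, first-order derivatives, and the learning rate can be sketched with a minimal linear-regression fit by batch gradient descent. The toy data and hyperparameters below are illustrative, not part of the syllabus.

```python
import numpy as np

# Synthetic data from a noisy line y = 3x + 2 (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, 100)

w, b = 0.0, 0.0          # weight and bias
lr = 0.01                # learning rate
for _ in range(2000):
    y_hat = w * X[:, 0] + b
    err = y_hat - y
    # First-order derivatives of the MSE cost with respect to w and b
    dw = np.mean(err * X[:, 0])
    db = np.mean(err)
    w -= lr * dw
    b -= lr * db

mse = np.mean((w * X[:, 0] + b - y) ** 2)   # regression performance metric
```

After training, `w` and `b` should recover roughly 3 and 2, and the MSE should approach the noise floor of the data.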

Module 2 (10 hours)


Decision Tree, Selecting the Best Splitting Attribute, CART (Gini Index), ID3 (Entropy, Information
Gain), Hyperparameters in Decision Trees, Issues in Decision Tree Learning, Overfitting and
Underfitting, Bias and Variance, Theory of Generalization, Regularization, Cross-Validation, Ensemble
Learning (Condorcet's Jury Theorem), Bagging (Bootstrap Aggregation), Random Forest, Boosting,
AdaBoost, Gradient Boosting, Feature Engineering, Feature Selection, Feature Extraction, Convexity and
Optimization, PAC Learning.
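
The CART split-selection idea from Module 2 can be illustrated with a short sketch that scans candidate thresholds on one feature and picks the split with the lowest weighted Gini impurity. The toy dataset is invented for the demo.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Return the threshold minimising the size-weighted Gini of the two sides."""
    best_t, best_score = None, float("inf")
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1, 2, 3, 4, 10, 11, 12, 13])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t, score = best_split(x, y)
```

Here the classes separate perfectly at `x <= 4`, so the chosen split has zero impurity; a full CART implementation repeats this search recursively over all features.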

Module 3 (12 hours)


Artificial Neural Network, Neural network representation, Perceptron model, Step vs. Sigmoid
function, Multilayer perceptron model, Matrix Calculus (Jacobian, Hessian Matrix), Computation
Graph, Backpropagation Algorithm, Activation Functions, Stochastic Gradient Descent, Batch
Gradient Descent, Mini-Batch Gradient Descent, Vanishing and Exploding Gradients, Overfitting
Problem, Sparsity, Regularization (Ridge, Lasso, Elastic), Dropout and Early Stopping, Bayesian
Learning: Bayes theorem and concept learning, Naïve Bayes classifier, Gibbs Algorithm, Bayesian
belief networks, The EM algorithm, Gaussian Distributions, Gaussian Mixture Models, Gaussian
Discriminant Analysis, Support Vector Machines, Hyperplane, Support Vectors, Kernels,
Non-Parametric Regression, Locally weighted regression, K-nearest neighbour.
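
The perceptron model from Module 3 can be sketched in a few lines: a step activation and the classic mistake-driven update rule. The linearly separable toy data (with an enforced margin so training provably converges) is invented for the demo.

```python
import numpy as np

# Toy data separable by the line x1 + x2 = 0; keep a margin for convergence.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(80, 2))
X = X[np.abs(X[:, 0] + X[:, 1]) > 0.3]
y = np.where(X[:, 0] + X[:, 1] > 0, 1, 0)

w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(100):                        # epochs
    for xi, yi in zip(X, y):
        y_hat = 1 if xi @ w + b > 0 else 0  # step activation
        # Perceptron rule: adjust the hyperplane only on mistakes
        w += lr * (yi - y_hat) * xi
        b += lr * (yi - y_hat)

acc = np.mean([(1 if xi @ w + b > 0 else 0) == yi for xi, yi in zip(X, y)])
```

On linearly separable data the perceptron convergence theorem guarantees the loop eventually stops making mistakes, so training accuracy reaches 1.0.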

Module 4 (12 hours)


Unsupervised learning (clustering, Association rule learning, Dimensionality reduction), Common
distance measures, k-means clustering, Elbow method, Hierarchical Clustering: agglomerative and
divisive, Dendrogram, Similarity measures for hierarchical clustering, DBSCAN, Cluster Quality (Rand
Index, Silhouette Coefficient), Dimensionality Reduction, Principal Component Analysis, Singular
Value Decomposition, T-distributed Stochastic Neighbour Embedding.
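
The k-means topic from Module 4 can be sketched with Lloyd's algorithm: assign each point to its nearest centroid, recompute centroids, and repeat until assignments stabilise. The two well-separated synthetic blobs and the deterministic initialisation (one seed per region) are choices made for the demo, not part of the syllabus.

```python
import numpy as np

# Two synthetic Gaussian blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(2)
blob_a = rng.normal([0, 0], 0.5, size=(30, 2))
blob_b = rng.normal([5, 5], 0.5, size=(30, 2))
X = np.vstack([blob_a, blob_b])

k = 2
centroids = np.vstack([X[0], X[-1]])        # one seed in each blob (for the demo)
for _ in range(100):
    # Euclidean distance of every point to every centroid
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centroids, centroids):   # assignments have stabilised
        break
    centroids = new_centroids
```

With blobs this far apart, the algorithm converges in a couple of iterations and recovers one cluster per blob; the Elbow method mentioned above would choose `k` by running this for several values and comparing within-cluster distances.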

STUDIO WORK / LABORATORY EXPERIMENTS:


Students will gain practical experience implementing different statistical methods using statistical machine
learning tools. The lab work involves formulating a problem as a statistical machine learning problem and then
implementing a solution.

TEXTBOOKS/LEARNING RESOURCES:
a) Masashi Sugiyama, Introduction to Statistical Machine Learning (1 ed.), Morgan Kaufmann, 2017.
ISBN 978-0128021217.
b) T. M. Mitchell, Machine Learning (1 ed.), McGraw Hill, 2017. ISBN 978-1259096952.

REFERENCE BOOKS/LEARNING RESOURCES:


a) Richard Golden, Statistical Machine Learning A Unified Framework (1 ed.), CRC Press, 2020. ISBN
9781351051490.

TEACHING-LEARNING STRATEGIES
The course will be taught using a combination of best teaching-learning practices. Multiple environments will be
used to enhance the outcomes, such as seminars, self-learning, MOOCs, group discussions and ICT-based tools
for class participation, alongside the classroom sessions. The teaching pedagogy emphasises hands-on
experiments and practical implementations carried out in the lab sessions. To match the latest trends in academics,
case studies, advanced topics and research-oriented topics are covered to lay the foundation and develop students'
interest, leading to further exploration of related topics. To make students aware of industry trends, one expert
lecture session will be organized to provide a platform for understanding relevant industry needs.
EVALUATION POLICY

Components of Course Evaluation      Percentage Distribution
Midterm                              15
End Semester Examination             35
Continuous Lab Evaluation            30
Quiz                                 5
Certification                        15
Total                                100
