Assignment 1
Instructions:
1. Your answers to the questions below, including plots and mathematical work, should be submitted as a single
PDF file (named studentrollnum_asgn1.pdf).
2. The codes should be submitted following the naming convention mentioned in each question. In addition,
please submit only your python code (with .py extension); in particular, do not submit in jupyter notebook
format.
3. Zip all the files and name it as studentrollnum_asgn1.zip (use only zip; no other formats are permitted)
4. Usage of machine learning libraries is not permitted. The assignment can be completed using (base) Python, numpy, and matplotlib. Pandas may be used for loading the CSV dataset.
5. Honor code: The assignments should be submitted individually. Students are encouraged to discuss the assignment. However, copying is not permitted. Students found copying will be subject to disciplinary action. In addition, during the course viva, if you are unable to explain your assignment code, you will forfeit your credit for the assignment.
Exercise 1: Linear Regression (with squared loss). This exercise uses the Boston house-prices dataset¹. One option to load the Boston dataset is by calling sklearn.datasets.load_boston. The aim of the exercise is to build a linear regression model for predicting the median home value (medv).
(a) One feature and one label: Derive an analytical expression for the coefficients of the linear regression of the form y = wx + b. Using the derived analytical expression, write a python program to compute the linear regression coefficients between the variables lstat (feature) and medv (label) in the Boston dataset. The python file should be named studentrollnum_exp1_a.py. For computing the inverse, you may use numpy.linalg.inv().
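The closed-form solution can be sketched as follows. This is illustrated on synthetic data (the actual program must use the lstat and medv columns of the Boston dataset); stacking a column of ones lets the intercept b be computed jointly with w via θ = (XᵀX)⁻¹Xᵀy, using numpy.linalg.inv() as the question suggests.

```python
import numpy as np

# Synthetic stand-in for lstat (x) and medv (y): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, size=50)

# Design matrix [x, 1]; theta = (X^T X)^{-1} X^T y gives (w, b) at once.
X = np.column_stack([x, np.ones_like(x)])
theta = np.linalg.inv(X.T @ X) @ (X.T @ y)
w, b = theta
print(w, b)  # close to the true values 3.0 and 2.0
```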
(b) Multiple features: Derive an analytical expression for the gradient of the training error for the linear regression of the form y = wᵀx + b. Using gradient descent (or stochastic gradient descent), write a python program to compute the linear regression coefficients between all 13 features and medv (label) in the Boston dataset. The python file should be named studentrollnum_exp1_b.py.
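A gradient-descent sketch of this part, on synthetic data with 13 features standing in for the Boston dataset: the gradient of the mean squared error is (2/n) Xᵀr for w and (2/n) Σ rᵢ for b, where r = Xw + b − y is the residual vector.

```python
import numpy as np

# Synthetic 13-feature regression problem (noise-free for clarity).
rng = np.random.default_rng(1)
n, d = 200, 13
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5

w = np.zeros(d)
b = 0.0
lr = 0.05  # step size
for _ in range(2000):
    r = X @ w + b - y              # residuals
    w -= lr * (2.0 / n) * (X.T @ r)
    b -= lr * (2.0 / n) * r.sum()

print(np.max(np.abs(w - w_true)), abs(b - 0.5))  # both near zero
```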
Exercise 2: Ridge Regression. This exercise again uses the Boston house-prices dataset. Ridge regression is linear regression with an ℓ₂ penalty. The training error for ridge regression is given by

J(w, b) = (1/n) Σᵢ (wᵀxᵢ + b − yᵢ)² + λwᵀw,

where the sum runs over the n training examples.
(a) Compute the gradient of J(w, b), and write down the gradient descent algorithm for computing optimal w
and b.
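For reference, differentiating J term by term gives gradients of the standard form (a sketch of the derivation asked for in part (a)):

```latex
\nabla_w J = \frac{2}{n}\sum_{i=1}^{n}\bigl(w^{\top}x_i + b - y_i\bigr)x_i + 2\lambda w,
\qquad
\frac{\partial J}{\partial b} = \frac{2}{n}\sum_{i=1}^{n}\bigl(w^{\top}x_i + b - y_i\bigr),
```

with gradient-descent updates w ← w − η ∇w J and b ← b − η ∂J/∂b for a step size η. Note that the penalty contributes only to the w update; b is not regularized.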
(b) Fix a value of λ. Split the dataset into a training set and a test set, and write a python program to compute the regression coefficients using gradient descent. At the end, your program should print the training error and the test error. The python file should be named studentrollnum_exp2_b.py.
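A minimal sketch of this part, again on synthetic data standing in for the Boston dataset; the split here is a simple 80/20 cut, and λ is fixed at an assumed value of 0.01.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))
y = X @ rng.normal(size=5) + 1.0 + 0.1 * rng.normal(size=150)

# 80/20 train/test split
X_tr, X_te = X[:120], X[120:]
y_tr, y_te = y[:120], y[120:]

lam, lr, n = 0.01, 0.05, len(y_tr)
w, b = np.zeros(5), 0.0
for _ in range(3000):
    r = X_tr @ w + b - y_tr
    # ridge gradient: data term plus 2*lam*w (the penalty does not touch b)
    w -= lr * ((2.0 / n) * (X_tr.T @ r) + 2.0 * lam * w)
    b -= lr * (2.0 / n) * r.sum()

train_err = np.mean((X_tr @ w + b - y_tr) ** 2)
test_err = np.mean((X_te @ w + b - y_te) ** 2)
print(train_err, test_err)
```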
(c) Choose a reasonable step size. Plot the training loss and the test loss as a function of λ. The goal is to find the λ that minimizes the test loss. It is hard to predict what λ will be, so you should start your search very broadly, looking over several orders of magnitude, for example λ ∈ {10⁻⁷, 10⁻⁵, 10⁻³, 10⁻¹, 1, 10, 100}. Once you find a range that works better, keep zooming in. Include this plot in your report. The python file should be named studentrollnum_exp2_c.py.
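The sweep can be sketched as below, on synthetic data standing in for Boston. One practical point: for large λ the objective becomes stiffer, so the step size here is shrunk with λ to keep gradient descent stable. The recorded losses would then be plotted against λ with matplotlib (e.g. plt.semilogx).

```python
import numpy as np

def ridge_gd(X, y, lam, steps=3000):
    """Ridge regression by gradient descent; lr shrinks for large lam."""
    n, d = X.shape
    lr = 1.0 / (2.0 * lam + 4.0)   # smaller steps when the penalty is large
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        r = X @ w + b - y
        w -= lr * ((2.0 / n) * (X.T @ r) + 2.0 * lam * w)
        b -= lr * (2.0 / n) * r.sum()
    return w, b

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=150)
X_tr, X_te, y_tr, y_te = X[:120], X[120:], y[:120], y[120:]

# coarse grid over several orders of magnitude, as suggested above
lambdas = [1e-7, 1e-5, 1e-3, 1e-1, 1.0, 10.0, 100.0]
test_losses = []
for lam in lambdas:
    w, b = ridge_gd(X_tr, y_tr, lam)
    test_losses.append(np.mean((X_te @ w + b - y_te) ** 2))
print(test_losses)  # very large lam over-shrinks w and hurts the test loss
```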
(d) For the optimum value of λ, compare the coefficients obtained from ridge regression with those obtained from linear regression. What is your inference?
¹ Please check details of the dataset at http://lib.stat.cmu.edu/datasets/boston
Exercise 3: Logistic Regression. The data set to be used is Smarket. This data set consists of percentage returns for the S&P 500 stock index over 1,250 days, from the beginning of 2001 until the end of 2005. For each date, the percentage returns for each of the five previous trading days are recorded as Lag1 through Lag5. Also recorded are Volume (the number of shares traded on the previous day, in billions), Today (the percentage return on the date in question), and Direction (whether the market was Up or Down on this date).
(a) Write a python program to fit a logistic regression model in order to predict Direction using Lag1 through
Lag5 and Volume. The python file should be named studentrollnum_exp3_a.py.
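A gradient-descent sketch of logistic regression; synthetic six-feature data stands in for the Lag1..Lag5 and Volume columns, and a 0/1 label stands in for Direction. The cross-entropy gradient with respect to w is (1/n) Xᵀ(σ(Xw + b) − y).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary-classification data with 6 features.
rng = np.random.default_rng(4)
n, d = 400, 6
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -1.0, 0.5, 0.0, 0.0, 2.0])
y = (sigmoid(X @ w_true) > rng.uniform(size=n)).astype(float)

w, b, lr = np.zeros(d), 0.0, 0.5
for _ in range(2000):
    p = sigmoid(X @ w + b)            # predicted probabilities
    w -= lr * (X.T @ (p - y)) / n     # cross-entropy gradient for w
    b -= lr * (p - y).mean()          # ... and for b

acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(acc)  # training accuracy well above chance
```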
(b) Confusion Matrix: A common performance measure for classification problems is the confusion matrix. Read more about the confusion matrix and other classification performance measures at the following wikipedia link: https://en.wikipedia.org/wiki/Confusion_matrix. Write a python program to compute the confusion matrix of the trained logistic regression model. The python file should be named studentrollnum_exp3_b.py.
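Computing the matrix itself is a simple counting loop. The sketch below uses the convention on the linked page (rows = actual class, columns = predicted class) with toy labels in place of the model's Direction predictions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Count entry [t, p] for every (true t, predicted p) pair."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels standing in for actual and predicted Direction (0=Down, 1=Up).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[3 1] [1 3]]

# Derived measures: accuracy is the trace over the total count.
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # 0.75
```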
Exercise 4: Neural Network. In this exercise, we will build a two-layer neural network model (one hidden layer) for binary classification with nonlinear class boundaries. A dataset for this can be generated using sklearn as sklearn.datasets.make_moons(100, noise=0.1). In the dataset, the input features are the x1 and x2 coordinates, and the labels (y) are binary (either 0 or 1).
Given a feature vector (the x1 and x2 coordinates), the neural network model should produce 2 outputs, corresponding to the probabilities of belonging to class 0 or class 1. The equations are as given below:

h = σ(V φ),
score = W h + b,

where φ = [1, x1, x2], V ∈ R^{L×3}, W ∈ R^{2×L}, b ∈ R^2, and score ∈ R^2. Here, L denotes the number of latent features (neurons), and σ is the activation function. You can choose either the tanh or the ReLU activation function. The score is converted to the output using the following equation:
ŷ = softmax(score),

where the softmax function²,³ takes as input a vector z of real numbers and normalizes it into a probability distribution; it is given by

softmax(z)ᵢ = exp(zᵢ) / Σⱼ exp(zⱼ).
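As footnote 3 warns, a naive implementation of this formula overflows for large scores. Subtracting max(z) before exponentiating leaves the result unchanged (the common factor cancels in the ratio) but keeps every exponent non-positive:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: shift by max(z) before exponentiating."""
    shifted = z - np.max(z)
    e = np.exp(shifted)
    return e / e.sum()

print(softmax(np.array([1000.0, 1001.0, 1002.0])))  # finite, sums to 1
```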
(a) Derive the gradient of the loss using the backpropagation algorithm.
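If the cross-entropy loss is used (a natural pairing with softmax, though other differentiable losses work too), the backpropagated gradients for one example take the form:

```latex
\frac{\partial J}{\partial \mathrm{score}} = \hat{y} - y, \qquad
\frac{\partial J}{\partial W} = (\hat{y} - y)\,h^{\top}, \qquad
\frac{\partial J}{\partial b} = \hat{y} - y, \qquad
\frac{\partial J}{\partial V} = \Bigl[\bigl(W^{\top}(\hat{y} - y)\bigr) \odot \sigma'(V\phi)\Bigr]\,\phi^{\top},
```

where y is the one-hot label vector and ⊙ denotes the elementwise product; for σ = tanh, σ'(a) = 1 − tanh²(a).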
(b) Write a python program to train the neural network using the gradient descent algorithm. The code should be named studentrollnum_exp4_b.py.
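A minimal training sketch for this network, with tanh activation and cross-entropy loss. To keep the snippet self-contained (no sklearn), an XOR-style quadrant dataset stands in for make_moons; the layer sizes follow the equations above, with φ = [1, x1, x2] supplying the hidden-layer bias.

```python
import numpy as np

def softmax_rows(S):
    """Row-wise stable softmax for an n x 2 score matrix."""
    e = np.exp(S - S.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Nonlinearly separable toy data: label 1 iff x1 and x2 share a sign.
rng = np.random.default_rng(5)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

L, n = 8, len(y)                       # L hidden neurons
Phi = np.column_stack([np.ones(n), X]) # phi = [1, x1, x2] per row
V = 0.5 * rng.normal(size=(L, 3))
W = 0.5 * rng.normal(size=(2, L))
b = np.zeros(2)
Y = np.eye(2)[y]                       # one-hot labels

lr = 1.0
for _ in range(5000):
    H = np.tanh(Phi @ V.T)             # n x L hidden activations
    P = softmax_rows(H @ W.T + b)      # n x 2 predicted probabilities
    G = (P - Y) / n                    # grad of mean cross-entropy wrt scores
    dW = G.T @ H
    db = G.sum(axis=0)
    dH = G @ W                         # backpropagate through W
    dV = ((1.0 - H ** 2) * dH).T @ Phi # tanh'(a) = 1 - tanh(a)^2
    W -= lr * dW
    b -= lr * db
    V -= lr * dV

acc = np.mean(P.argmax(axis=1) == y)
print(acc)
```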
(c) Split the dataset into train/test sets. Plot the training error and the test error as a function of the number of neurons (L). The code should be named studentrollnum_exp4_c.py.
² https://en.wikipedia.org/wiki/Softmax_function
³ You should implement the softmax function carefully to avoid overflow; for details see https://www.tutorialexample.com/implement-softmax-function-without-underflow-and-overflow-deep-learning-tutorial/
⁴ https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html