Assignment 2
Problem Statement:
Trading company X needs to predict the purchasing power of consumers in a given city in order to
decide how much to invest in that city. The company has historical data containing attributes of
various cities and the corresponding purchasing power of consumers (more details in the Datasets
section). Develop linear and ridge regression models to help the company predict the purchasing
power of consumers.
Implementation: [2+4+2+2=10]
● Exploratory data analysis and feature scaling
● Implementation of the closed-form solution approach to linear regression [LIN_MODEL_CLOSED]
● Implementation of the minibatch gradient descent approach to linear regression
[LIN_MODEL_GRAD]
● Implementation of the minibatch gradient descent approach to linear regression with
regularization (ridge regression) [LIN_MODEL_RIDGE]
**Implement [LIN_MODEL_CLOSED] from scratch. You may make use of the numpy library to
perform matrix operations.
**In general, you may use libraries to process and handle data.
**For training [LIN_MODEL_GRAD] and [LIN_MODEL_RIDGE], use a minibatch size of 256 and a total
of 50 epochs.
**The performance metric to be used for evaluating the models is Mean Squared Error (MSE).
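As a rough illustration of the three models listed above, the following is a minimal sketch using only numpy. It is not a reference solution. The function names (add_bias, mse, fit_closed_form, fit_minibatch_gd), the learning rate, the regularization strength lam, the choice not to penalise the bias term, and the synthetic data are all assumptions made for this example. Only the minibatch size of 256, the 50 epochs and the MSE metric come from the assignment.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def mse(X, y, w):
    """Mean squared error of the linear model X @ w against targets y."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))

def fit_closed_form(X, y):
    """Normal equations w = (X^T X)^+ X^T y; the pseudo-inverse tolerates a singular X^T X."""
    return np.linalg.pinv(X.T @ X) @ (X.T @ y)

def fit_minibatch_gd(X, y, lr=0.01, lam=0.0, batch_size=256, epochs=50, seed=0):
    """Minibatch gradient descent on MSE + lam * ||w[1:]||^2.

    lam=0.0 gives plain linear regression; lam>0 gives ridge regression.
    The bias weight w[0] is left unpenalised (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)  # gradient of the batch MSE
            penalty = 2.0 * lam * w                       # gradient of the ridge term
            penalty[0] = 0.0
            w -= lr * (grad + penalty)
    return w

# Synthetic stand-in data (hypothetical; replace with the scaled assignment features).
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(1024, 3))
y = X_raw @ np.array([2.0, -1.0, 0.5]) + 3.0 + 0.1 * rng.normal(size=1024)
X = add_bias(X_raw)

w_closed = fit_closed_form(X, y)
w_grad = fit_minibatch_gd(X, y, lr=0.05)
w_ridge = fit_minibatch_gd(X, y, lr=0.05, lam=0.1)
print(mse(X, y, w_closed), mse(X, y, w_grad), mse(X, y, w_ridge))
```

On well-scaled features the gradient descent weights should land close to the closed-form ones, while the ridge weights are shrunk towards zero. The learning rate and lam used here are placeholders and would normally be chosen on the validation split.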
Experiments: [2+2+3+3+3=13]
The dataset will be split into Train:Validation:Test sets in a 60:20:20 ratio.
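One possible way to produce the 60:20:20 split and apply feature scaling is sketched below. The helper names (split_60_20_20, standardize), the use of standardization as the scaling method, the random seed and the toy arrays are assumptions for illustration. The assignment itself fixes only the split ratio.

```python
import numpy as np

def split_60_20_20(X, y, seed=0):
    """Shuffle once, then cut into 60% train, 20% validation, 20% test."""
    rng = np.random.default_rng(seed)
    n = len(X)
    order = rng.permutation(n)
    n_train = n * 60 // 100
    n_val = n * 20 // 100
    train, val, test = np.split(order, [n_train, n_train + n_val])
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])

def standardize(train, *others):
    """Scale every split using the mean and std of the training split only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return [(part - mean) / std for part in (train, *others)]

# Hypothetical stand-in data; replace with the assignment dataset.
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100, dtype=float)

(X_tr, y_tr), (X_va, y_va), (X_te, y_te) = split_60_20_20(X, y)
X_tr_s, X_va_s, X_te_s = standardize(X_tr, X_va, X_te)
```

Computing the scaling statistics on the training split alone keeps information from the validation and test sets out of training.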
1. Experiment 1: EDA: Show the distribution of the features and their pair-wise correlation
5. Experiment 5:
a. Report the performance of the three models [LIN_MODEL_CLOSED],
[LIN_MODEL_GRAD] and [LIN_MODEL_RIDGE] with the optimal hyperparameters
found in the earlier experiments. Use the held-out 20% of the data (test set) to
estimate the performance, measured with MSE.
b. Report your observations with appropriate explanations.
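For Experiment 1, the numbers behind the feature distributions and pair-wise correlations can be computed with numpy before plotting them. The sketch below is only illustrative. The function name summarize_features, the bin count, the feature names and the synthetic columns are assumptions, and the real column names come from the dataset.

```python
import numpy as np

def summarize_features(X, names, bins=10):
    """Return per-feature histograms and the Pearson correlation matrix."""
    histograms = {
        name: np.histogram(X[:, j], bins=bins) for j, name in enumerate(names)
    }
    corr = np.corrcoef(X, rowvar=False)  # columns are treated as variables
    return histograms, corr

# Hypothetical stand-in features: feat_b is nearly a multiple of feat_a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=500), rng.normal(size=500)])
names = ["feat_a", "feat_b", "feat_c"]

histograms, corr = summarize_features(X, names)
print(np.round(corr, 2))
```

Each histogram entry is a (counts, bin_edges) pair that can be passed to a plotting library, and strongly correlated feature pairs show up as off-diagonal entries of corr that are close to 1 or -1.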
Datasets:
This dataset comprises sales transactions captured at a retail store. You can find the dataset here.
Data Overview:
1. A single python code (.py) containing the implementations of the models and experiments with
comments at function level. The first two lines should contain your name and roll no.
Responsible TAs:
Please write to the following TAs with any doubts or requests for clarification regarding Assignment 2:
Aditya Chawla: adityachawla700@gmail.com
Ramishetti Sai sreeja: ramishetti@iitkgp.ac.in
Deadline:
The deadline for submission is 26th August (Saturday), 11:55 PM IST. Irrespective of the time on
your device, once submission on Moodle is closed, no requests for post-deadline submission will be
entertained. No email submissions will be considered. You are therefore advised to start submitting
your solution at least one hour before the deadline.