Python For Data Science - Unit 6 - Week 4
Week 4: Assignment 4
Your last recorded submission was on 2023-02-20, 22:42 IST.
Due date: 2023-02-22, 23:59 IST.
1) Which of the following are regression problems? Assume that appropriate data is given. (1 point)
Classify web text into one of the following categories: Sports, Entertainment, or
Technology.
3) If a linear regression model achieves zero training error, can we say that all the data points lie on a hyperplane in the (d+1)-dimensional space? Here, d is the number of features. (1 point)
Yes
No
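The geometry behind question 3 can be explored with a small numerical sketch. It assumes ordinary least squares with an intercept and uses made-up data with d = 2 features; the weights, intercept, and sample size are illustrative assumptions, not part of the assignment.

```python
import numpy as np

# Hypothetical data: d = 2 features, targets generated exactly by a linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
true_w = np.array([1.5, -2.0])  # assumed weights, for illustration only
true_b = 0.7                    # assumed intercept, for illustration only
y = X @ true_w + true_b

# Ordinary least squares with an intercept column appended to X.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Mean squared error on the training data.
residuals = y - A @ coef
training_error = float(np.mean(residuals ** 2))

# Zero training error means y_i = w . x_i + b for every i, so each point
# (x_i1, x_i2, y_i) satisfies a single linear equation in 3 dimensions.
print(training_error)
```

Adding noise to `y` (for example `y + rng.normal(size=20)`) should make the training error strictly positive, since the points then no longer satisfy one linear equation.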
Read the information given below and answer the questions from 4 to 6:
Data Description:
An automotive service chain is launching its new grand service station this weekend. They offer
to service a wide variety of cars. The current capacity of the station is to check 315 cars
thoroughly per day. As an inaugural offer, they claim to freely check all cars that arrive on their
launch day, and report whether they need servicing or not!
Unexpectedly, they get 450 cars. The servicemen will not work longer than the working hours, but
the data analysts have to!
Can you save the day for the new service station?
You have been given a data set, ‘ServiceTrain.csv’, that contains some easily measured attributes of each car and a conclusion on whether a service is needed or not.
For the cars they cannot check in detail, they measure those attributes and store them in ‘ServiceTest.csv’ (https://drive.google.com/file/d/1RGrJC55RXuK2Z7TBO6vGuOYSWnclZ2ZI/view?usp=sharing).
Problem Statement: Use machine learning techniques to identify whether the cars require service or not.
4) Which of the following machine learning techniques would NOT be appropriate to solve the problem given in the problem statement? (1 point)
kNN
Random Forest
Logistic Regression
Linear regression
Prepare the data by following the steps given below, and answer questions 5 and 6.
Encode the categorical variable Service (Yes as 1 and No as 0) in both the train and test datasets.
Split the set of independent features and the dependent feature in both the train and test datasets.
Set random_state for the instance of the logistic regression class to 0.
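The preparation steps above can be sketched as follows. This is a minimal illustration with pandas and scikit-learn, not the assignment's reference solution: the two small data frames stand in for ServiceTrain.csv and ServiceTest.csv, and the column names `feature_a` and `feature_b` are hypothetical. Only the `Service` column name comes from the assignment text.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical stand-ins for ServiceTrain.csv and ServiceTest.csv.
# Only the `Service` column name comes from the assignment text.
train = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0],
    "feature_b": [0.5, 1.5, 1.0, 2.0, 3.0, 2.5, 3.5, 4.0],
    "Service": ["No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes"],
})
test = pd.DataFrame({
    "feature_a": [1.5, 2.5, 7.5, 8.5],
    "feature_b": [1.0, 1.2, 3.0, 3.8],
    "Service": ["No", "No", "Yes", "Yes"],
})

# Step 1: encode Service, Yes as 1 and No as 0, in both data sets.
for df in (train, test):
    df["Service"] = df["Service"].map({"Yes": 1, "No": 0})

# Step 2: split the independent features from the dependent feature.
X_train, y_train = train.drop(columns="Service"), train["Service"]
X_test, y_test = test.drop(columns="Service"), test["Service"]

# Step 3: logistic regression with random_state set to 0.
model = LogisticRegression(random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are actual classes (0, 1); columns are predicted classes (0, 1).
cm = confusion_matrix(y_test, y_pred)
accuracy_pct = 100 * accuracy_score(y_test, y_pred)
print(cm)
print(accuracy_pct)
```

With the real files, the two `pd.DataFrame(...)` literals would be replaced by `pd.read_csv` calls; the numbers printed here say nothing about the answers to the questions that follow.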
5) After applying logistic regression, what is/are the correct observations from the resultant confusion matrix? (1 point)
6) The logistic regression model built between the input and output variables is checked for its prediction accuracy on the test data. What is the accuracy range (in %) of the predictions made over test data? (1 point)
60 - 79
90 - 95
30 - 59
80 - 89
Standardization
Dummy variables
Correlation
The Global Happiness Index report contains the Happiness Score data with multiple features
(namely the Economy, Family, Health, and Freedom) that could affect the target variable value.
Prepare the data by following the steps given below, and answer question 8.
Split the set of independent features and the dependent feature on the given dataset.
Create training and testing data from the set of independent features and the dependent feature by splitting the original data in the ratio 3:1 respectively, and set random_state of the train/test split method to 1.
8) A multiple linear regression model is built on the Global Happiness Index dataset “GHI Report.csv” (https://drive.google.com/file/d/1oUCX0DztVDCah_AajtYrn1KzuVGCMndk/view?usp=sharing). What is the RMSE of the baseline model? (1 point)
2.00
0.50
1.06
0.75
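A baseline RMSE of the kind asked for in question 8 can be sketched as below. The data is synthetic and only stands in for "GHI Report.csv"; the target column name `Happiness Score` is an assumption, as is the choice of baseline, which here predicts the mean of the training targets for every test row (some treatments use the mean of the test targets instead, and the assignment text does not say which).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for "GHI Report.csv". Feature names follow the
# assignment text; the target name `Happiness Score` is an assumption.
rng = np.random.default_rng(42)
n = 40
data = pd.DataFrame({
    "Economy": rng.uniform(0, 2, n),
    "Family": rng.uniform(0, 2, n),
    "Health": rng.uniform(0, 1, n),
    "Freedom": rng.uniform(0, 1, n),
})
data["Happiness Score"] = 2 + data.sum(axis=1) + rng.normal(0, 0.3, n)

# Split the independent features from the dependent feature.
X = data.drop(columns="Happiness Score")
y = data["Happiness Score"]

# A 3:1 train/test ratio corresponds to test_size=0.25.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# Assumed baseline: predict the training-set mean for every test row.
baseline_pred = np.full(len(y_test), y_train.mean())
baseline_rmse = float(np.sqrt(np.mean((y_test - baseline_pred) ** 2)))
print(baseline_rmse)
```

The value printed depends entirely on the synthetic data and is unrelated to the answer options above.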
9) A regression model with the following function y = 60 + 5.2x was built to understand the impact of humidity (x) on rainfall (y). The humidity this week is 30 more than the previous week. What is the predicted difference in rainfall? (1 point)
156 mm
15.6 mm
-156 mm
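The arithmetic in question 9 can be checked with a short sketch. In a linear model the intercept cancels when two predictions are subtracted, so the predicted difference is the slope times the change in x, here 5.2 × 30. The starting humidity of 40 is a hypothetical value chosen only for illustration.

```python
def predicted_rainfall(humidity):
    """Rainfall predicted by the model y = 60 + 5.2x from the question."""
    return 60 + 5.2 * humidity


# Hypothetical humidity for the previous week; any value gives the same difference.
previous = 40
current = previous + 30

# The intercept (60) cancels, leaving 5.2 * 30.
difference = predicted_rainfall(current) - predicted_rainfall(previous)
print(round(difference, 1))
```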
10) X and Y are two variables that have a strong linear relationship. Which of the following statements are incorrect? (1 point)
One variable may or may not cause a change in the other variable.
You may submit any number of times before the due date. The final submission will be
considered for grading.