Assignment 2
Q.1. Construct a Linear Regression Model for the following data set. Predict the price for a house with area
2200 sq ft and 3 bedrooms.
Q.2. Construct a Linear Regression Model for the following data set. Predict the price for a car with age 3 years
and mileage 50,000 miles.
Q.3. Actual and predicted values are given for 10 students in the following data set.
Q.4. Use Logistic Regression to predict the result (pass or fail) of a student based on the number of study hours. Use
the model to predict whether a student will pass if the number of study hours is
(a) 4 (b) 6
Use the threshold function $\mathrm{Thresh}(x) = \begin{cases} 0, & x < 0.5 \\ 1, & x \ge 0.5 \end{cases}$ and $w = (w_0, w_1) = (-6, 1.5)$.
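The prediction step in Q.4 can be sketched as below. This is a minimal illustration, assuming the threshold is applied to the sigmoid output of the linear score w0 + w1 * x (the question does not state this explicitly); the function names `sigmoid`, `thresh`, and `predict_pass` are chosen here for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def thresh(p):
    """Threshold function from the question: 0 if p < 0.5, otherwise 1."""
    return 1 if p >= 0.5 else 0


def predict_pass(hours, w0=-6.0, w1=1.5):
    """Return (probability of pass, predicted class) for the given study hours.

    Assumption: the model is sigmoid(w0 + w1 * hours) followed by thresh().
    """
    p = sigmoid(w0 + w1 * hours)
    return p, thresh(p)


for hours in (4, 6):
    p, label = predict_pass(hours)
    print(f"hours={hours}: P(pass)={p:.4f}, predicted class={label}")
```

A predicted class of 1 is read as "pass" and 0 as "fail".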
Q.5. Construct a Decision Tree for predicting whether a student will pass an exam based on the features: Study
Hours and Number of Homework Completed.
Q.6. Suppose we have the following dataset with four attributes. Construct a Decision Tree on the basis of these
attributes.
mid No hardware Up
mid No software Up
new No hardware Up
new No software Up
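One common way to choose the split attribute when building a decision tree is information gain (as in ID3). The questions do not say which criterion to use, so treat this as an assumption. The sketch below computes entropy and information gain on hypothetical toy rows (`toy_rows` is made up for illustration and is not the assignment's dataset).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index, label_index=-1):
    """Entropy reduction obtained by splitting `rows` on one attribute."""
    labels = [row[label_index] for row in rows]
    # Group the class labels by the value of the chosen attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[label_index])
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy rows: (attribute 0, attribute 1, class label).
toy_rows = [
    ("a", "x", "yes"),
    ("a", "y", "yes"),
    ("b", "x", "no"),
    ("b", "y", "no"),
]

for i in range(2):
    print(f"gain of attribute {i}: {information_gain(toy_rows, i):.3f}")
```

Under this criterion, the attribute with the largest gain becomes the root, and the procedure is repeated on each branch until the labels in a branch are pure.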
Q.7. Derive the Mean Squared Error function in matrix form for the Linear Regression Model, i.e., derive
$L(w) = \frac{1}{m} (Xw - y_{true})^T (Xw - y_{true})$.
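As a numerical sanity check on the matrix form, the sketch below evaluates the MSE both as an average of squared errors and as r^T r / m with r = Xw - y_true. The data (`X_demo`, `w_demo`, `y_demo`) are hypothetical values chosen only to illustrate that the two forms agree; this is not a substitute for the derivation.

```python
def predictions(X, w):
    """Compute Xw, one dot product per row of X."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]


def mse_sum_form(X, w, y):
    """(1/m) * sum_i (x_i . w - y_i)^2"""
    m = len(y)
    return sum((p - t) ** 2 for p, t in zip(predictions(X, w), y)) / m


def mse_matrix_form(X, w, y):
    """(1/m) * r^T r, where r = Xw - y is the residual vector."""
    m = len(y)
    r = [p - t for p, t in zip(predictions(X, w), y)]
    return sum(r_i * r_i for r_i in r) / m  # r^T r is the dot product of r with itself


# Hypothetical data: the first column of X is the constant 1 for the bias term.
X_demo = [[1, 1], [1, 2], [1, 3]]
w_demo = [0, 1]
y_demo = [1, 2, 4]

print(mse_sum_form(X_demo, w_demo, y_demo))
print(mse_matrix_form(X_demo, w_demo, y_demo))
```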
Q.8. Derive the minimizer of the MSE $L(w)$ using the gradient, i.e., derive $w = (X^T X)^{-1} X^T y_{true}$.
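The closed-form solution can be checked numerically with a short sketch. It solves the equivalent linear system (X^T X) w = X^T y by Gauss-Jordan elimination rather than forming the inverse explicitly, which is a design choice made here and not something the question requires. The data `X_demo` and `y_demo` are hypothetical and were generated from y = 1 + 2*x1 + 3*x2.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b] with float entries.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def normal_equation(X, y):
    """Return w satisfying (X^T X) w = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Hypothetical data: the first column is the constant 1 for the intercept.
X_demo = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y_demo = [3, 4, 6, 8]

w_hat = normal_equation(X_demo, y_demo)
print([round(v, 6) for v in w_hat])
```

The same function applies to any design matrix with an intercept column and two features, provided X^T X is invertible.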
Q.9. Use the K-Means algorithm (up to the second iteration) to cluster the following eight points into three
clusters: A(2, 10), B(2, 5), C(8, 4), D(5, 8), E(7, 5), F(6, 4), G(1, 2), H(4, 9) with A(2, 10), D(5, 8) and G(1, 2)
as the initial cluster centroids. Consider the following two distance functions in two different cases.
(a) The distance function between two points $a = (x_1, y_1)$ and $b = (x_2, y_2)$ is defined as $d(a, b) = |x_1 - x_2| + |y_1 - y_2|$.
(b) The distance function between two points $a = (x_1, y_1)$ and $b = (x_2, y_2)$ is defined as $d(a, b) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
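The assign-then-update loop of K-Means with a pluggable distance function can be sketched as follows. This is a minimal illustration under some assumptions that the question does not specify: one iteration means one assignment step followed by one centroid update, ties go to the lowest-indexed centroid, and an empty cluster keeps its previous centroid. The names `kmeans`, `manhattan`, and `euclidean` are chosen here for illustration.

```python
import math

POINTS = {
    "A": (2, 10), "B": (2, 5), "C": (8, 4), "D": (5, 8),
    "E": (7, 5), "F": (6, 4), "G": (1, 2), "H": (4, 9),
}
INITIAL_CENTROIDS = [POINTS["A"], POINTS["D"], POINTS["G"]]


def manhattan(a, b):
    """Case (a): |x1 - x2| + |y1 - y2|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a, b):
    """Case (b): sqrt((x1 - x2)^2 + (y1 - y2)^2)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def kmeans(points, centroids, distance, iterations=2):
    """Run a fixed number of K-Means iterations.

    Returns the assignment from the last iteration (point name -> cluster
    index) and the centroids after the last update step.
    """
    centroids = list(centroids)
    assignment = {}
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        for name, p in points.items():
            dists = [distance(p, c) for c in centroids]
            assignment[name] = dists.index(min(dists))  # lowest index wins ties
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment, centroids


for label, dist in (("(a) Manhattan", manhattan), ("(b) Euclidean", euclidean)):
    assignment, centroids = kmeans(POINTS, INITIAL_CENTROIDS, dist)
    print(label, assignment, centroids)
```

Passing `iterations=1` shows the intermediate state after the first iteration, which is useful for checking hand calculations step by step.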
Q.10. The dataset of pass or fail in an exam of 6 students is given in the table.
Study Hours (X):      29  16  33  28  29  30
Pass (1), Fail (0):    0   0   1   0   1   1
Using logistic regression as the classifier, calculate the probability of passing for a student who studied 30 hours, where
$w = (w_0, w_1) = (-64, 2)$.