5 Multiple Linear Regression
Here, we use more than one independent variable (x1, x2, x3, etc.).
Problem: We are given home prices in Monroe Township, NJ (USA). We should predict the prices
of the following homes:
a) 3000 sq ft area, 3 bedrooms, 40 years old
b) 2500 sq ft area, 4 bedrooms, 5 years old
dataset
homeprices.csv
Note: One value in the number-of-bedrooms column is missing from the dataset, so we have to clean
the data first. We find the median of that column and substitute it into the missing cell.
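The cleaning step described in the note can be sketched on a small in-memory table. The rows and the column names (area, bedrooms, age) below are made up for illustration; the real values and headers are in homeprices.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration only; not the contents of homeprices.csv.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000, 4100],
    "bedrooms": [3.0, 4.0, np.nan, 3.0, 5.0, 4.0],  # one missing cell
    "age": [20, 15, 18, 30, 8, 8],
})

# Count missing cells per column to locate the gap.
missing_before = df.isna().sum()

# median() skips missing values, so it is computed over the 5 known cells.
median_bedrooms = df["bedrooms"].median()

# Substitute the median into the missing cell.
df["bedrooms"] = df["bedrooms"].fillna(median_bedrooms)

print(missing_before["bedrooms"], median_bedrooms)
```

The median is often preferred over the mean here because it is not pulled by a single unusually large or small value.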
Linear equation
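The equation itself is not shown here. For three independent variables, multiple linear regression has the standard form below; the symbols m_1, m_2, m_3 (coefficients) and b (intercept) are a notational assumption, not necessarily the ones used originally.

```latex
\text{price} = m_1 \cdot \text{area} + m_2 \cdot \text{bedrooms} + m_3 \cdot \text{age} + b
```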
program
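import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
# load the data (the column names area, bedrooms, age, price are assumed)
df = pd.read_csv("homeprices.csv")
# fill the missing bedrooms value with the median of that column
df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].median())
# fit the model: price = m1*area + m2*bedrooms + m3*age + b
reg = linear_model.LinearRegression()
reg.fit(df[["area", "bedrooms", "age"]], df["price"])
# intercept
reg.intercept_ # 383725.
# predict the price of a 3000 sq ft, 3-bedroom, 40-year-old house
reg.predict([[3000, 3, 40]]) # 444400.
# predict the price of a 2500 sq ft, 4-bedroom, 5-year-old house
reg.predict([[2500, 4, 5]]) # 588625.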
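To make clear what reg.predict computes, the sketch below fits a model on made-up, noise-free data and checks that a prediction equals the coefficients times the inputs plus the intercept. The rows, the coefficients (100, 20000, -3000) and the intercept (200000) are invented for illustration and are unrelated to homeprices.csv.

```python
import numpy as np
from sklearn import linear_model

# Hypothetical houses: [area, bedrooms, age]. Not the real dataset.
X = np.array([
    [2600, 3, 20],
    [3000, 4, 15],
    [3200, 4, 18],
    [3600, 3, 30],
    [4000, 5, 8],
    [4100, 6, 8],
])

# Invented "true" relationship, used to generate noise-free prices.
true_coef = np.array([100.0, 20000.0, -3000.0])
true_intercept = 200000.0
y = X @ true_coef + true_intercept

reg = linear_model.LinearRegression()
reg.fit(X, y)

house = np.array([3000, 3, 40])

# Prediction from the fitted model.
predicted = reg.predict([house])[0]

# The same value computed by hand: m1*area + m2*bedrooms + m3*age + b.
manual = reg.coef_ @ house + reg.intercept_

# With the invented numbers: 100*3000 + 20000*3 - 3000*40 + 200000 = 440000
print(predicted, manual)
```

Because the data here contain no noise, the fit recovers the invented coefficients up to floating-point error; on real data the coefficients are only least-squares estimates.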
The hiring.csv file contains hiring statistics for a firm: each candidate's experience, written test
score, and personal interview score. Based on these 3 factors, HR decides the salary. Given this
data, you need to build a machine learning model for the HR department that can help them decide
salaries for future candidates. Using this model, predict salaries for the following candidates,
dataset
hiring.csv.