Lab 1

The document describes implementing linear regression on a diabetes dataset and logistic regression on a breast cancer dataset. For linear regression, the diabetes data is split into training and test sets and the model is fit and evaluated, achieving a score of 0.58. Plots of each feature vs targets show some linear relationships. For logistic regression, the breast cancer data is split and a model is trained and predicted on, achieving 93% accuracy.

Uploaded by

sheibha

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

75 views

Lab 1

Uploaded by

sheibha

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 8

Lab1: Implementation of Linear Regression and

Logistic Regression Models

Linear Regression
Aim:
To Implement Linear regression on diabetes dataset.
Algorithm:
1. From sklearn library import the diabetes dataset and linear regression tool.
2. Load the diabetes dataset onto your notebook.
3. Create x_train, y_train, x_test, y_test. Train the x-axis with data and y-axis with target.
4. Fit the regression model on the dataset and get the coefficient values and predicted x_test
values in an array.
5. Get the linear regression score of the model.
6. Plot a graph for the above data.
7. Then create subplots for each attribute. Check the attribute that model works best with.
Program:
import numpy as np
import pandas as pd
import matplotlib. pyplot as plt
from sklearn import linear_model
from sklearn import datasets
diabetes = datasets.load_diabetes()
linreg = linear_model.LinearRegression()
x_train = diabetes.data[:-10]
y_train = diabetes.target[:-10]
x_test = diabetes.data[-10:]
y_test = diabetes.target[-10:]
linreg.fit(x_train,y_train)
linreg.coef_ ypred = linreg.predict(x_test)
ypred
y_test
linreg.score(x_test,y_test)
plt.figure(figsize=(8,12))
for f in range(0,10):
xi_test = x_test[:,f]
xi_train = x_train[:,f]
xi_test = xi_test[:,np.newaxis]
xi_train = xi_train[:,np.newaxis]
linreg.fit(xi_train, y_train)
y = linreg.predict(xi_test)
plt.subplot(5,2,f+1)
plt.scatter(xi_test,y_test,color='k')
plt.plot(xi_test, y, color='b', linewidth=3)
plt. show()

Figure 1: A linear regression represents a linear correlation between a feature and the targets

Experimental Results:
DATA DESCRIPTION:

The Diabetes dataset consider in this experiment contains physiological data collected on 442
patients and as a corresponding target an indicator of the disease progression after a year. The
physiological data occupy the first 10 columns with values that indicate respectively the
following: • Age • Sex • Body mass index • Blood pressure • S1, S2, S3, S4, S5, and S6 (six
blood serum measurements. When we execute the below line, we get the linear regression
score(R2). The linear regression score gives the variance of the feature (R2) ie the linear
relationship between the feature and the targets. The values of R2 range from 0 to 1. An R2 close
to zero indicates a model with very little explanatory power. An R2 close to one indicates a model
with more explanatory power. But in our experiment, we got 0.58 as the score which shows that
the model is providing medium explanatory power.

linreg.score(x_test,y_test)
Out[ ]: 0.58507530226905713
Next, we have drawn regression line for all 10 features, creating 10 models and seeing the
result for each of them through a linear chart.

Figure 2. Ten Linear charts showing the correlations between physiological factors and
the progression of diabetes
From the above 10 charts, we observed that features 'age', 'bmi', 'blood
pressure','s1','s2','s3','s4','s5','s6' are showing regression line with linear relationship whereas the
feature 'sex' is not showing linear relationship with the target variable(disease).

Result:
Hence we implemented Linear Regression model on diabetes dataset .
Logistic Regression

Aim: To implement logistic regression on Breast Cancer dataset.

Algorithm:
Step 1: Start the program
Step 2: Import the required libraries
Step 3: Load the breast cancer dataset
Step 4: Identify the dependent variable and independent variables and determine if the problem
is a binary classification problem.
Step 5: Clean and pre-process the data, and make sure the data is suitable for logistic regression
modeling.
Step 6: Train the logistic regression model on the selected independent variables and estimate
the coefficients of the model.
Step 7: Evaluate the performance of the logistic regression model using appropriate metrics
such as accuracy.
Step 6: End the program
Code:
#Logistic Regression.py

import numpy as np

def sigmoid(x):
return 1/(1+np.exp(-x))

class LogisticRegression():

def init(self, lr=0.001, n_iters=1000):

self.lr = lr
self.n_iters = n_iters
self.weights = None
self.bias = None

def fit(self, X, y):

n_samples, n_features = X.shape
self.weights = np.zeros(n_features)
self.bias = 0

for _ in range(self.n_iters):
linear_pred = np.dot(X, self.weights) + self.bias
predictions = sigmoid(linear_pred)

dw = (1/n_samples) * np.dot(X.T, (predictions - y))

db = (1/n_samples) * np.sum(predictions-y)

self.weights = self.weights - self.lr*dw

self.bias = self.bias - self.lr*db

def predict(self, X):

linear_pred = np.dot(X, self.weights) + self.bias
y_pred = sigmoid(linear_pred)
class_pred = [0 if y<=0.5 else 1 for y in y_pred]
return class_pred
#Train.py

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt
from LogisticRegression import LogisticRegression

bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

clf = LogisticRegression(lr=0.01)
clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)

def accuracy(y_pred, y_test):

return np.sum(y_pred==y_test)/len(y_test)

acc = accuracy(y_pred, y_test)

print(acc)
Experimental Results:
Data Description:
The Breast Cancer dataset consider in this experiment contains 2 classes ie Benign and
Malignant. The total number of samples considered is 569 and the total number of features is 30.
If the probability of the classifier is above 0.5 then it belongs to benign class and if the probability
of the classifier is below 0.5 then it will be considered for malignant class. Logistic Regression
gave the accuracy of 93% on breast cancer data.
Output:
Accuracy -93.056

Result:
From the output results, we found out that Logistic Regression has given 93% on breast cancer
dataset.

Mod 6 and 7 Worksheet
0% (4)
Mod 6 and 7 Worksheet
8 pages
QAch 07
50% (2)
QAch 07
23 pages
Machine Learning Lab Manual 06
100% (1)
Machine Learning Lab Manual 06
8 pages
Assignment 1:: Intro To Machine Learning
No ratings yet
Assignment 1:: Intro To Machine Learning
6 pages
Lesson Plan Template
No ratings yet
Lesson Plan Template
6 pages
Linear Regression - Numpy and Sklearn
No ratings yet
Linear Regression - Numpy and Sklearn
7 pages
Lecture Material 11
No ratings yet
Lecture Material 11
14 pages
Lecture Notes - Linear Regression
No ratings yet
Lecture Notes - Linear Regression
26 pages
ML PR-2
No ratings yet
ML PR-2
11 pages
Lab#10 Ai
No ratings yet
Lab#10 Ai
3 pages
Linear regression
No ratings yet
Linear regression
1 page
EXP-4 DMusingPYTHON
No ratings yet
EXP-4 DMusingPYTHON
7 pages
Lab Experiments Vi Sem-1
No ratings yet
Lab Experiments Vi Sem-1
10 pages
Machine Learning 2
No ratings yet
Machine Learning 2
45 pages
Regression Dataset Example
No ratings yet
Regression Dataset Example
14 pages
Intro To Forecasting
No ratings yet
Intro To Forecasting
15 pages
Regression Analysis
No ratings yet
Regression Analysis
16 pages
Simple Linear Regression
No ratings yet
Simple Linear Regression
30 pages
ML Minors Exp8
No ratings yet
ML Minors Exp8
6 pages
Intro to Linear and Logistic Reg
No ratings yet
Intro to Linear and Logistic Reg
5 pages
22IZ023 Nikhil - Exercise 6_ Linear Regression
No ratings yet
22IZ023 Nikhil - Exercise 6_ Linear Regression
4 pages
Exp2 Milf
No ratings yet
Exp2 Milf
7 pages
Machine Learning LAB
No ratings yet
Machine Learning LAB
20 pages
Whole ML PDF 1614408656
100% (1)
Whole ML PDF 1614408656
214 pages
Experiment 5
100% (1)
Experiment 5
6 pages
machine learning final manual
No ratings yet
machine learning final manual
45 pages
B-56 Sanket Jambhulkar MLA-2
No ratings yet
B-56 Sanket Jambhulkar MLA-2
8 pages
Monika Sree 11-07-2024
No ratings yet
Monika Sree 11-07-2024
36 pages
Em Semester Project
No ratings yet
Em Semester Project
21 pages
SC&RP - Unit 5
No ratings yet
SC&RP - Unit 5
36 pages
2.1 ML (Implementation of Simple Linear Regression in Python)
No ratings yet
2.1 ML (Implementation of Simple Linear Regression in Python)
8 pages
Linear Model
No ratings yet
Linear Model
10 pages
Data Analysis
No ratings yet
Data Analysis
8 pages
FDS Lab Question Bank
No ratings yet
FDS Lab Question Bank
11 pages
Himmatun Najah - SVM SVR Data Mining
No ratings yet
Himmatun Najah - SVM SVR Data Mining
12 pages
Logistic Regression
100% (1)
Logistic Regression
10 pages
Multiple Regression
No ratings yet
Multiple Regression
7 pages
AIH_LAB1
No ratings yet
AIH_LAB1
10 pages
utf-8''C2M1 Assignment
No ratings yet
utf-8''C2M1 Assignment
24 pages
Unit 2 Regression Analysis
No ratings yet
Unit 2 Regression Analysis
16 pages
lecture2-supervised-learning slides
No ratings yet
lecture2-supervised-learning slides
56 pages
lab-5-nguyenngocmaithi-20130120
No ratings yet
lab-5-nguyenngocmaithi-20130120
20 pages
DS Unit 2 Essay Answers
No ratings yet
DS Unit 2 Essay Answers
17 pages
Simple Linear Regression - Assignn5
No ratings yet
Simple Linear Regression - Assignn5
8 pages
Nonlinear Model
No ratings yet
Nonlinear Model
3 pages
Machine Learning
100% (5)
Machine Learning
56 pages
Introduction To Machine Learning Algorithms: Linear Regression
No ratings yet
Introduction To Machine Learning Algorithms: Linear Regression
1 page
Simple Linear Regression - Assign3
No ratings yet
Simple Linear Regression - Assign3
8 pages
Simple Linear Regression - Assign3
No ratings yet
Simple Linear Regression - Assign3
8 pages
Bussiness Report PM
No ratings yet
Bussiness Report PM
44 pages
Simple Linear Regression in Machine Learning
No ratings yet
Simple Linear Regression in Machine Learning
7 pages
Simple Linear Regression: Math Behind
No ratings yet
Simple Linear Regression: Math Behind
6 pages
Experiment1 Explanation
No ratings yet
Experiment1 Explanation
6 pages
ML 5th
No ratings yet
ML 5th
8 pages
MACHINE LEARNING AND DATA ANALYTICS USING PYTHON LAB
No ratings yet
MACHINE LEARNING AND DATA ANALYTICS USING PYTHON LAB
36 pages
PythonForML2023 Laboratory07 08 Regression Classification Update2
No ratings yet
PythonForML2023 Laboratory07 08 Regression Classification Update2
6 pages
How to Perform Simple Linear Regression in Python
No ratings yet
How to Perform Simple Linear Regression in Python
8 pages
Linear Regression
No ratings yet
Linear Regression
13 pages
Chapter 6: How To Do Forecasting by Regression Analysis
No ratings yet
Chapter 6: How To Do Forecasting by Regression Analysis
7 pages
UNIT-1 Polynomial Regression
No ratings yet
UNIT-1 Polynomial Regression
7 pages
State Space Analysis and Control Design
No ratings yet
State Space Analysis and Control Design
5 pages
Robust Bi-Objective Shortest Path Problem in Real Road Networks
No ratings yet
Robust Bi-Objective Shortest Path Problem in Real Road Networks
9 pages
Lung Cancer Detection
No ratings yet
Lung Cancer Detection
8 pages
Exercise 12: XX X XX X
No ratings yet
Exercise 12: XX X XX X
1 page
SVM - An Essay
No ratings yet
SVM - An Essay
1 page
AWGN Channel PDF
No ratings yet
AWGN Channel PDF
3 pages
4 6filter Banks
No ratings yet
4 6filter Banks
9 pages
9.5 The Simplex Method: Mixed Constraints: Dard Form. The Constraints For The Maximization Problems All Involved
No ratings yet
9.5 The Simplex Method: Mixed Constraints: Dard Form. The Constraints For The Maximization Problems All Involved
11 pages
CH02 PDF
No ratings yet
CH02 PDF
108 pages
TC 503 Digital Communication Theory: Course Teacher: Dr. Muhammad Imran Aslam
No ratings yet
TC 503 Digital Communication Theory: Course Teacher: Dr. Muhammad Imran Aslam
47 pages
Integer Programing
100% (1)
Integer Programing
22 pages
Exponential Equations Day 3
No ratings yet
Exponential Equations Day 3
6 pages
Types of Database Management Systems
No ratings yet
Types of Database Management Systems
4 pages
2.05 Application of Matrices in Cryptography
0% (1)
2.05 Application of Matrices in Cryptography
4 pages
CSC304 Lecture 6
No ratings yet
CSC304 Lecture 6
21 pages
Tuning of PID Controller: What Is A PID Control?
No ratings yet
Tuning of PID Controller: What Is A PID Control?
4 pages
Generalized Method of Moments Estimation PDF
No ratings yet
Generalized Method of Moments Estimation PDF
29 pages
Hussain, Mahmud - 2019 - pyMannKendall A Python Package For Non Parametric Mann Kendall Family of Trend Tests
No ratings yet
Hussain, Mahmud - 2019 - pyMannKendall A Python Package For Non Parametric Mann Kendall Family of Trend Tests
3 pages
03 New To Algorithms and Programming
No ratings yet
03 New To Algorithms and Programming
2 pages
SK Learn 1
No ratings yet
SK Learn 1
11 pages
Lecture06 - Interpolation by Spline Functions
No ratings yet
Lecture06 - Interpolation by Spline Functions
39 pages
Dynamic Programming: Design and Analysis of Algorithms
No ratings yet
Dynamic Programming: Design and Analysis of Algorithms
41 pages
AML Using AntiBenford subgraphs Summary
No ratings yet
AML Using AntiBenford subgraphs Summary
4 pages
Time Series Analysis
No ratings yet
Time Series Analysis
58 pages
Chp4 Crisp and Fuzzy Relation, Membership Functions1
No ratings yet
Chp4 Crisp and Fuzzy Relation, Membership Functions1
34 pages
Padeepz App AD3491 Syllabus
No ratings yet
Padeepz App AD3491 Syllabus
2 pages
Differentiation Markscheme Word
No ratings yet
Differentiation Markscheme Word
8 pages
Chapter 3.3 Project Management
No ratings yet
Chapter 3.3 Project Management
8 pages
Domain 1-cap mcq
No ratings yet
Domain 1-cap mcq
2 pages
Linked List Delete A Node
No ratings yet
Linked List Delete A Node
4 pages