Regression Models Course Project

This report analyzes the relationship between transmission type (manual or automatic) and miles per gallon (MPG) in vehicles using regression models on the mtcars dataset. A t-test showed that manual transmission vehicles have 7.245 greater MPG than automatic vehicles. However, a multiple linear regression found that while manual transmission contributed to slightly higher MPG (1.81 MPG), other factors like weight, horsepower, and number of cylinders were more significant predictors of MPG. Diagnostic checks of the regression model met assumptions of linear regression and showed it adequately explained the relationship between predictors and MPG.

Regression Models Project - Motor Trend Data Analysis Report

Kibrom Haftu
March 22, 2018

Executive Summary
This report explores the relationship between transmission type (manual or automatic) and miles per
gallon (MPG), and sets out to determine which transmission type yields a higher MPG. The
mtcars dataset was used for this analysis. A t-test between automatic and manual transmission vehicles
shows that manual transmission vehicles average 7.245 MPG more than automatic transmission vehicles.
After fitting multiple linear regressions, however, the analysis showed that transmission type contributes
far less once other variables are accounted for: a manual transmission improves MPG by only 1.81.
Weight, horsepower, and number of cylinders contribute more to the overall MPG of vehicles.

Loading Data
First, we load the mtcars dataset and convert its categorical variables from numeric class to factor class.

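library(ggplot2)
data(mtcars)
head(mtcars, n = 3)  # sample rows
dim(mtcars)          # 32 observations of 11 variables
# Convert the categorical variables from numeric to factor
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$am <- factor(mtcars$am)  # 0 = automatic, 1 = manual
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
attach(mtcars)  # lets later calls refer to columns such as mpg and am directly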
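
The 0/1 coding of am is easy to misread in later output. As a minimal sketch (not part of the original analysis), the levels can be given readable labels on a separate copy of the data; the name `cars` and the label strings are illustrative assumptions.

```r
# Work on a fresh copy so the session's mtcars object is left untouched
# ("cars" is a hypothetical name used only in this sketch)
cars <- datasets::mtcars

# In the mtcars documentation, am is coded 0 = automatic, 1 = manual
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

levels(cars$am)  # the two readable labels, in 0/1 order
table(cars$am)   # number of cars per transmission type
```

The report's own code keeps the default 0/1 levels, which is why its later output refers to "group 0" and "group 1".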

Exploratory Analysis
Appendix Figure I compares MPG for automatic and manual transmissions.
The plot suggests that MPG is noticeably higher for vehicles with a manual transmission than for those
with an automatic one.
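
The code behind Figure I is not shown in the report. One way such a comparison could be drawn is sketched below, using a base R boxplot rather than whatever the author actually used; `cars`, the axis labels, and the title are assumptions made for illustration.

```r
# Fresh, labelled copy of the data (hypothetical name "cars")
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

# Side-by-side boxplots of MPG by transmission type
boxplot(mpg ~ am, data = cars,
        xlab = "Transmission", ylab = "Miles per gallon",
        main = "MPG by transmission type")

# Group means, to back the visual impression with numbers
group_means <- tapply(cars$mpg, cars$am, mean)
group_means
```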
Inference
At this step, we take as the null hypothesis that the MPG of automatic and manual transmissions comes from
the same population, that is, that the mean MPG is equal for both.
We use a two-sample t-test to test it.

testResults <- t.test(mpg ~ am, data = mtcars)
testResults$p.value

## [1] 0.001373638

With a p-value of about 0.0014, well below 0.05, the t-test rejects the null hypothesis that the difference in mean MPG between transmission types is 0.

testResults$estimate
## mean in group 0 mean in group 1
## 17.14737 24.39231

Group 0 is automatic and group 1 is manual, so the estimated difference between the two transmissions is 24.39231 - 17.14737 = 7.24494 MPG in favor of manual.
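
The difference and its uncertainty can be read off the test object directly. The following is a self-contained sketch on a fresh copy of the data; `cars`, `welch`, and `diff_manual_minus_auto` are illustrative names, not ones used in the report.

```r
# Fresh copy of the data (hypothetical name "cars"); am stays numeric 0/1
cars <- datasets::mtcars

# t.test() runs Welch's unequal-variance test by default (var.equal = FALSE)
welch <- t.test(mpg ~ am, data = cars)

# Estimated mean difference, manual (group 1) minus automatic (group 0)
diff_manual_minus_auto <- unname(welch$estimate[2] - welch$estimate[1])
diff_manual_minus_auto

# 95% confidence interval; note it is reported for group 0 minus group 1,
# so both bounds are negative when manual cars have the higher mean
welch$conf.int
```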
Regression Analysis
First, we fit the full model, regressing MPG on all other variables.

fullModelFit <- lm(mpg ~ ., data = mtcars)


summary(fullModelFit) # results hidden
summary(fullModelFit)$coeff # results hidden

Since none of the coefficients has a p-value below 0.05, the full model does not tell us which variables
matter most.
Next, we use backward stepwise selection, which step() performs by AIC, to select a smaller set of predictors.

stepFit <- step(fullModelFit)


summary(stepFit) # results hidden
summary(stepFit)$coeff # results hidden
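
Because the summary output is hidden, it can help to print which terms backward selection kept. The sketch below repeats the fit on a fresh copy of the data; `cars` and `fullFit` are illustrative names, and `trace = 0` is added only to silence the step-by-step log.

```r
# Fresh copy with the same factor conversions as in the report
cars <- datasets::mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  cars[[v]] <- factor(cars[[v]])
}

fullFit <- lm(mpg ~ ., data = cars)

# With no scope argument, step() searches backward from the full model,
# dropping terms as long as AIC improves
stepFit <- step(fullFit, direction = "backward", trace = 0)

formula(stepFit)                      # the selected model
kept_terms <- attr(terms(stepFit), "term.labels")
kept_terms
c(full = AIC(fullFit), selected = AIC(stepFit))  # lower is better
```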

This new model has 4 variables (cylinders, horsepower, weight, transmission). The R-squared value of
0.8659 means that this model explains about 87% of the variance in MPG. The model as a whole is
statistically significant, with a p-value below 0.05. Holding the other variables constant, the
coefficients indicate that going from 4 to 6 cylinders decreases MPG by 3.03, and going from 4 to 8
cylinders decreases MPG by 2.16, since both cylinder coefficients are measured against the 4-cylinder
baseline. Each additional 100 horsepower decreases MPG by 3.21. Each additional 1000 lbs of weight
decreases MPG by 2.5. A manual transmission improves MPG by 1.81.
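
To see how much the transmission effect shrinks after adjustment, the transmission-only model can be set beside the selected one. This is a sketch under the assumption that the selected model is mpg ~ cyl + hp + wt + am, as described above; `cars`, `baseFit`, and `selFit` are illustrative names.

```r
# Fresh copy; only the factors used by the two models are converted
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)

baseFit <- lm(mpg ~ am, data = cars)                  # transmission only
selFit  <- lm(mpg ~ cyl + hp + wt + am, data = cars)  # selected model

# Manual-vs-automatic effect before and after adjusting for cyl, hp, wt
am_unadjusted <- coef(baseFit)[["am1"]]
am_adjusted   <- coef(selFit)[["am1"]]
c(unadjusted = am_unadjusted, adjusted = am_adjusted)

r2 <- summary(selFit)$r.squared
r2

# F-test for whether the extra predictors improve on the am-only model
anova(baseFit, selFit)
```

The unadjusted coefficient equals the difference in group means from the t-test, which makes the drop to the adjusted value directly comparable.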
Residuals and Diagnostics
Residual plots: see Appendix Figure II.

According to the residual plots, the following underlying assumptions appear to hold:

1. The random scatter in the Residuals vs. Fitted plot supports the assumption of independence
2. The points of the Normal Q-Q plot follow the line closely, indicating that the residuals are
approximately normally distributed
3. The random scatter in the Scale-Location plot supports the constant variance assumption
4. Since all points fall within the 0.5 Cook's distance contours, the Residuals vs. Leverage plot shows
no overly influential outliers
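
The four plots named in the list are the default panels of plot() on an lm object. A minimal sketch of producing them in one 2x2 layout follows; the report's own plotting code is not shown, so the names `cars` and `fit` and the layout are assumptions.

```r
# Refit the selected model on a fresh copy of the data
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Draw the four default diagnostic panels in a 2x2 grid:
# Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous graphics settings
```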

sum((abs(dfbetas(stepFit)))>1)

## [1] 0
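
The DFBETAS count above can be complemented by looking at which cars have the most leverage or the largest Cook's distance. The following sketch uses standard influence functions from base R; `cars` and `fit` are illustrative names, and the choice to show the top three cars is arbitrary.

```r
# Refit the selected model on a fresh copy of the data
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Number of (observation, coefficient) pairs with |DFBETAS| above 1
n_large_dfbetas <- sum(abs(dfbetas(fit)) > 1)
n_large_dfbetas

# Cars with the highest leverage and the largest Cook's distance
head(sort(hatvalues(fit), decreasing = TRUE), 3)
head(sort(cooks.distance(fit), decreasing = TRUE), 3)
```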

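No observation has a DFBETAS value above 1 in absolute value, so no single point unduly influences the coefficients. The analysis therefore meets the basic assumptions of linear regression and answers the questions posed.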
Conclusion
There is a difference in MPG based on transmission type: after adjusting for the other predictors, a
manual transmission gives a slight MPG boost of about 1.81. However, weight, horsepower, and number of
cylinders appear to matter more in determining MPG.
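Appendix Figures
[Figure I: EDA comparing MPG for automatic and manual transmissions]
[Figure II: Residual plots for the regression model]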