Spss Assignment 2
24110253
QNQ
SPSS Assignment 2
Question 1: Formulate simple and multiple linear regression equations for the given
dataset. (Use any one of the independent variables for the simple regression equation.)
Dataset#2.csv:
This dataset consists of medical costs billed by health insurance, based on several
characteristics of a beneficiary such as age, bmi (Body Mass Index), number of children, and
whether the person is a smoker (1 for yes, 0 for no). Insurance costs are the dependent
variable in this dataset.
bmi: Body mass index, an indication of whether body weight is relatively high or low
relative to height
smoker: whether the beneficiary smokes or not (1 for yes, 0 for no)
Question 2 (z-test)
Data file = “data for exercise 2”
Answer:
For the simple linear regression model, the independent variable is smoking and the
dependent variable is insurance charges. After running the model, the R-square value is 0.62,
which means that smoking explains 62% of the variance in the dependent variable, a good
percentage. The significance level of the model is less than 0.05, so we can reject the null
hypothesis and say that there is a relationship between insurance expenses and smoking. The
regression equation for this model is
Insurance Expense = 8343 + 23615 (Smoking) + error
This equation tells us that being a smoker increases a patient's predicted insurance charges
by 23615 compared with a non-smoker.
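To make the interpretation of the slope concrete, the short Python sketch below plugs the two possible values of the smoking variable into the fitted equation. The coefficients are copied from the equation above; the function name predict_simple is hypothetical and used only for illustration.

```python
# Coefficients copied from the simple regression equation in the answer.
INTERCEPT = 8343.0
SMOKER_COEF = 23615.0


def predict_simple(smoker: int) -> float:
    """Predicted insurance expense for smoker = 1 or non-smoker = 0."""
    return INTERCEPT + SMOKER_COEF * smoker


non_smoker = predict_simple(0)  # prediction equals the intercept
smoker = predict_simple(1)      # intercept plus the smoking coefficient

# The gap between the two predictions is exactly the smoking coefficient.
print(non_smoker, smoker, smoker - non_smoker)
```

Because smoking is coded 0 or 1, the intercept is the predicted expense for a non-smoker and the coefficient is the predicted difference between the two groups.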
For the multiple linear regression model, the R-square value is 0.75, the adjusted R-square
value is 0.749, and the significance level is less than 0.05, so we can reject the null
hypothesis. The regression equation is
Insurance Expense = -12102 + 23811 (Smoking) + 257.8 (Age) + 321.8 (BMI) + 473.5 (Children) + error.
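As a rough illustration of how the multiple regression equation is used, the Python sketch below evaluates it for one beneficiary. The coefficients are taken from the equation above; the function name predict_multiple and the example beneficiary (age 40, BMI 30, two children) are assumptions made up for this example and are not values from the dataset.

```python
# Coefficients copied from the multiple regression equation in the answer.
COEFS = {
    "intercept": -12102.0,
    "smoker": 23811.0,
    "age": 257.8,
    "bmi": 321.8,
    "children": 473.5,
}


def predict_multiple(smoker: int, age: float, bmi: float, children: int) -> float:
    """Predicted insurance expense from the fitted multiple regression equation."""
    return (
        COEFS["intercept"]
        + COEFS["smoker"] * smoker
        + COEFS["age"] * age
        + COEFS["bmi"] * bmi
        + COEFS["children"] * children
    )


# Hypothetical beneficiary: 40 years old, BMI 30, two children.
as_smoker = predict_multiple(smoker=1, age=40, bmi=30.0, children=2)
as_non_smoker = predict_multiple(smoker=0, age=40, bmi=30.0, children=2)

print(round(as_smoker, 1), round(as_non_smoker, 1))
```

Holding age, BMI, and number of children fixed, the two predictions differ by the smoking coefficient, which is how each coefficient in a multiple regression is read: the change in predicted expense for a one-unit change in that variable with the others held constant.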
Research Scenario
In the population, the average AQI is 124.66 with a standard deviation of 29. The manager of
an environmental protection company wants to run a new environmental campaign to see whether
it increases, decreases, or does not affect the AQI. A sample of 30 observations was taken
after running the new campaign, with a mean of 138.75.
Research Question
Answer:
We can use a z-test in this case because the standard deviation of the population is known
and the sample size is 30. The z-value is 2.644 with a significance level (p-value) of
0.00818, which is less than 0.05, so we reject the null hypothesis and conclude that the new
environmental campaign has an effect on the AQI. Because the sample mean (138.75) is higher
than the population mean (124.66), the effect is an increase. Cohen's d is 0.48280, which
gives the effect size: the sample mean lies about 0.48 population standard deviations above
the population mean.
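The calculation behind a one-sample z-test can be sketched in Python from the summary figures given in the scenario. This is a minimal sketch, not the SPSS procedure itself: the function name one_sample_z_test is hypothetical, and a two-tailed test is assumed. Working from the rounded summary values gives a z of roughly 2.66 and a Cohen's d of roughly 0.486, close to but not identical with the reported 2.644 and 0.48280, which presumably come from the raw data file.

```python
import math


def one_sample_z_test(sample_mean: float, pop_mean: float, pop_sd: float, n: int):
    """Return (z, two-tailed p-value, Cohen's d) for a one-sample z-test."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-tailed p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    # Cohen's d: mean difference expressed in population standard deviations.
    cohens_d = (sample_mean - pop_mean) / pop_sd
    return z, p_two_tailed, cohens_d


# Summary figures from the research scenario.
z, p, d = one_sample_z_test(sample_mean=138.75, pop_mean=124.66, pop_sd=29.0, n=30)
print(f"z = {z:.3f}, p = {p:.5f}, d = {d:.3f}")
```

The decision is the same either way: the p-value is well below 0.05, so the null hypothesis of no change in AQI is rejected.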