SAS Baseball Project
The Baseball dataset contains details of baseball players for the 1986 season. It also includes
variables describing each player's performance and career record.
out=work.baseball
DBMS=xlsx
replace;
run;
run;
run;
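The fragment above has lost its opening statements. A minimal sketch of an import-and-inspect sequence consistent with the surviving options (out=work.baseball, DBMS=xlsx, replace) might look like the following. The file path is a hypothetical placeholder, and the PROC CONTENTS and PROC PRINT steps are assumptions about what the two extra run; statements belonged to.

```sas
/* Hypothetical path: replace with the actual location of the workbook. */
proc import datafile="/path/to/baseball.xlsx"
    out=work.baseball
    dbms=xlsx
    replace;
run;

/* Assumed inspection step: list the variables and their types. */
proc contents data=work.baseball;
run;

/* Assumed inspection step: preview the first 10 rows. */
proc print data=work.baseball (obs=10);
run;
```

Checking the PROC CONTENTS output first helps confirm that Salary and the n* counters were read as numeric before any regression is run.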
c) Generate a list of the top 5 Home Run Players.
out=baseball_data;
by descending nHome;
run;
data top_5H;
run;
run;
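The surviving statements suggest a sort-then-subset pattern. A minimal sketch under that assumption is shown below; the input dataset work.baseball, the obs=5 option, and the final PROC PRINT are assumptions, while baseball_data, top_5H, and nHome come from the fragment above.

```sas
/* Sort players from most to fewest home runs (input dataset assumed). */
proc sort data=work.baseball out=baseball_data;
    by descending nHome;
run;

/* Keep only the first five rows of the sorted data. */
data top_5H;
    set baseball_data (obs=5);
run;

/* Assumed final step: display the result. */
proc print data=top_5H;
run;
```

With obs=5, players tied with the fifth-ranked home run total are dropped according to their position in the sorted data, so ties may need separate handling.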
out=baseball2;
by descending Salary;
run;
data Top_paid;
run;
run;
Model Salary=nHome;
run;
f) Add more explanatory variables: nAtBat, nHits, nHome, nRuns, nRBI, nBB, nOuts, nError.
run;
g) Identify from the results which factors have a high impact on Salary in comparison to Home
Runs.
Solution: From the above results, nHits, nBB, nOuts, and nAtBat are significant factors for
Salary, as the p-value for each of them is less than 0.05. The p-value for nHome is 0.7838
(> 0.05), so nHome is not statistically significant once the other variables are in the model,
and there is no evidence that it affects Salary. The p-values for nRuns, nRBI, and nError are
also greater than 0.05, so these factors are likewise not significant. nHits, nBB, nOuts, and
nAtBat therefore have a high impact on Salary compared to nHome.
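The comparison of p-values against 0.05 can also be done programmatically. The sketch below is one possible way, not necessarily the code used for this project: it assumes the data are in work.baseball, that the predictors are named as in question f), and it uses the hypothetical dataset names reg_estimates and significant. It relies on the ParameterEstimates ODS table of PROC REG, whose p-value column is named Probt.

```sas
/* Capture the coefficient table (reg_estimates is a hypothetical name). */
ods output ParameterEstimates=work.reg_estimates;

proc reg data=work.baseball;   /* input dataset assumed */
    model Salary = nAtBat nHits nHome nRuns nRBI nBB nOuts nError;
run;
quit;

/* Keep only predictors with a p-value below 0.05. */
data work.significant;         /* hypothetical name */
    set work.reg_estimates;
    where Variable ne 'Intercept' and Probt < 0.05;
run;

proc print data=work.significant noobs;
    var Variable Estimate StdErr tValue Probt;
run;
```

Filtering on Probt only identifies which coefficients are statistically distinguishable from zero; the Estimate column is still needed to judge the size and direction of each effect.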
set work.baseball;
end;
run;
run;
i) Calculate the impact of Performance Scores (ps) on Salary.
model Salary=ps;
run;
j) Explain the results.
Solution: From the above results, ps is a significant predictor of Salary, as its p-value
(< 0.0001) is less than 0.05. However, the adjusted R-square is only 0.1573, well below 0.7, so
the regression model has weak explanatory power. This implies that Salary is correlated with ps,
but ps does not explain much of the variability in Salary.
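For reference, the adjusted R-square quoted above is related to the ordinary R-square by the standard formula below, where n is the number of observations used in the fit and p is the number of predictors (here p = 1, since ps is the only regressor). The value of n is not stated in this write-up, so no numbers are substituted.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

The p-value and the adjusted R-square answer different questions: the p-value tests whether the slope on ps differs from zero, while the adjusted R-square measures how much of the variation in Salary the model accounts for. A small p-value together with a low adjusted R-square is therefore not a contradiction.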