Using Excel
1. Histograms in Excel
• Step 1: Select Tools / Data Analysis
• Step 2: Choose Histogram
• Step 3: Input data and bin ranges
Exercise 1:
• Given below are the heights, in centimetres, of 50 students:
164 155 160 162 172 171 162 160 162 159
160 158 166 172 158 163 165 164 161 158
160 170 168 157 168 166 160 162 163 167
171 164 167 158 159 160 163 167 168 159
160 162 170 168 164 160 168 165 165 160
• 1. Place the data in an ordered array.
• 2. Set up a stem-and-leaf display for these data.
• 3. Construct the frequency distribution and the percentage distribution for these data.
• 4. Construct a grouped frequency distribution table with a class interval width of 5 cm.
• 5. From this frequency distribution table, construct the bar graphs and the pie charts.
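For checking the Excel results, tasks 1 to 4 can also be sketched in Python. This is only an illustrative sketch: the variable names are made up for this example, and starting the first class at 155 cm (the smallest observation) is an assumption, since the exercise only fixes the class width.

```python
from collections import Counter

heights = [
    164, 155, 160, 162, 172, 171, 162, 160, 162, 159,
    160, 158, 166, 172, 158, 163, 165, 164, 161, 158,
    160, 170, 168, 157, 168, 166, 160, 162, 163, 167,
    171, 164, 167, 158, 159, 160, 163, 167, 168, 159,
    160, 162, 170, 168, 164, 160, 168, 165, 165, 160,
]

# 1. Ordered array
ordered = sorted(heights)

# 2. Stem-and-leaf display: stem = leading two digits, leaf = last digit
stem_leaf = {}
for h in ordered:
    stem_leaf.setdefault(h // 10, []).append(h % 10)

# 3. Frequency and percentage distributions
freq = Counter(ordered)
pct = {h: 100 * c / len(heights) for h, c in sorted(freq.items())}

# 4. Grouped frequency distribution with a class width of 5 cm.
# Assumption: the first class starts at 155, the smallest observation.
WIDTH, START = 5, 155
grouped = {}
for h in ordered:
    lower = START + ((h - START) // WIDTH) * WIDTH
    label = f"{lower}-{lower + WIDTH - 1}"
    grouped[label] = grouped.get(label, 0) + 1

for stem, leaves in stem_leaf.items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
for label, count in grouped.items():
    print(label, count, f"{100 * count / len(heights):.0f}%")
```

The grouped counts are what the bar graph and pie chart in task 5 would be drawn from.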
2. Descriptive statistics
• Use menu choice: Tools / Data Analysis / Descriptive Statistics
• Enter details in dialog box
Click OK
Microsoft Excel descriptive statistics output, using the house price data:
House Prices:
$2,000,000
$500,000
$300,000
$100,000
$100,000
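The same summary measures can be reproduced outside Excel as a cross-check. The sketch below uses Python's standard `statistics` module on the five house prices. The dictionary keys mirror the labels Excel's Descriptive Statistics tool typically prints, which is an assumption here since the actual output is not reproduced in the text.

```python
import math
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

n = len(house_prices)
std_dev = statistics.stdev(house_prices)  # sample standard deviation (n - 1)

summary = {
    "Mean": statistics.mean(house_prices),
    "Standard Error": std_dev / math.sqrt(n),
    "Median": statistics.median(house_prices),
    "Mode": statistics.mode(house_prices),
    "Standard Deviation": std_dev,
    "Sample Variance": statistics.variance(house_prices),
    "Range": max(house_prices) - min(house_prices),
    "Minimum": min(house_prices),
    "Maximum": max(house_prices),
    "Sum": sum(house_prices),
    "Count": n,
}

for label, value in summary.items():
    print(f"{label:20s}{value:>18,.2f}")
```

The single $2,000,000 price pulls the mean well above the median, which is why both measures are worth reading together for data like these.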
3. Simple Linear Regression
Sample Data for House Price Model:
House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
• Tools / Data Analysis / Regression
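As a cross-check on the Regression tool, the least-squares slope and intercept can be computed directly from the sample data. The following is a minimal sketch using the usual textbook formulas b1 = Sxy / Sxx and b0 = ȳ - b1·x̄; the variable names are illustrative only.

```python
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n

# Sums of squares and cross-products about the means
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))

b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # intercept

print(f"house price = {b0:.5f} + {b1:.5f} (square feet)")
```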
3. Simple linear regression: Excel Output
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

              Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept     98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet   0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)
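To see where the Regression Statistics and ANOVA entries come from, the sums of squares can be rebuilt from the house price data. This is a sketch under the standard definitions (SST = SSR + SSE, R Square = SSR / SST, F = MSR / MSE), not a description of Excel's internal computation.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in square_feet]

sst = sum((y - y_bar) ** 2 for y in house_price)              # Total SS
ssr = sum((f - y_bar) ** 2 for f in fitted)                   # Regression SS
sse = sum((y - f) ** 2 for y, f in zip(house_price, fitted))  # Residual SS

r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
std_error = math.sqrt(sse / (n - 2))   # standard error of the estimate
f_stat = (ssr / 1) / (sse / (n - 2))   # MSR / MSE

print(f"R Square {r_square:.5f}, Adjusted R Square {adj_r_square:.5f}")
print(f"Standard Error {std_error:.5f}, F {f_stat:.4f}")
```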
3. Regression Excel Output (continued) (Graphical Presentation)
• House price model: scatter plot and regression line
[Figure: scatter plot of House Price ($1000s) against Square Feet with the fitted regression line; slope = 0.10977, intercept = 98.248]
3. Simple Linear Regression: t Test for the Slope, b1
• t test for a population slope
– Is there a linear relationship between x and y?
• Hypotheses
– H0: β1 = 0 (no linear relationship)
– H1: β1 ≠ 0 (linear relationship does exist)
• Test statistic
– $t = \dfrac{b_1 - \beta_1}{s_{b_1}}$
– d.f. = n - 2
where:
b1 = Sample regression slope coefficient
β1 = Hypothesized slope
sb1 = Estimator of the standard error of the slope
3. t Test for the Slope: the standard error in the Excel Output
• sε = 41.33032 (Standard Error under Regression Statistics)
• sb1 = 0.03297 (Standard Error in the Square Feet row of the coefficients table)
3. t Test for the Slope (continued)
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
              Coefficients    Standard Error    t Stat     P-value
Intercept     98.24833        58.03348          1.69296    0.12892
Square Feet   0.10977         0.03297           3.32938    0.01039
Test Statistic: t = b1 / sb1 = 0.10977 / 0.03297 = 3.329
d.f. = 10 - 2 = 8
α/2 = .025 in each tail: reject H0 if t < -tα/2 or t > tα/2, otherwise do not reject H0
Decision: Reject H0
Conclusion: There is sufficient evidence of a linear relationship between square feet and house price.
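The t test for the slope can be traced step by step from the raw data. In the sketch below, sb1 is computed as sε / sqrt(Sxx), the usual formula for simple regression. The critical value 2.3060 (two-tailed, α = .05, 8 d.f.) comes from a standard t table rather than from the slides, so treat it as an assumption of this example.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and of the slope
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, house_price))
s_e = math.sqrt(sse / (n - 2))
s_b1 = s_e / math.sqrt(s_xx)

# Test statistic for H0: beta1 = 0
t_stat = (b1 - 0) / s_b1

# Assumption: critical value taken from a standard t table (8 d.f., alpha/2 = .025)
T_CRIT = 2.3060
reject_h0 = abs(t_stat) > T_CRIT

print(f"s_b1 = {s_b1:.5f}, t = {t_stat:.5f}, reject H0: {reject_h0}")
```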
4. Multiple Regression: Multiple Linear Regression Equation
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA
              df    SS           MS           F          Significance F
Regression    2     29460.027    14730.013    6.53861    0.01201
Residual      12    27033.306    2252.776
Total         14    56493.333

              Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept     306.52619       114.25389         2.68285     0.01993    57.58835     555.46404
Price         -24.97509       10.83213          -2.30565    0.03979    -48.57626    -1.37392
Advertising   74.13096        25.96732          2.85478     0.01449    17.55303     130.70888

Sales = 306.526 - 24.975 (Price) + 74.131 (Advertising)
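The coefficients in this output can be checked against the weekly pie sales data used in this example. The sketch below solves the two-predictor normal equations on mean-centred data by hand, which keeps it dependency-free. The helper names `mean` and `cross` are made up for the illustration, and this is not a claim about how Excel computes the fit.

```python
sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
         7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]            # $
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
               3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]             # $100s


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of u and v about their means."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


# Normal equations for two predictors, written on centred data
s11 = cross(price, price)
s22 = cross(advertising, advertising)
s12 = cross(price, advertising)
s1y = cross(price, sales)
s2y = cross(advertising, sales)

det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det    # Price coefficient
b2 = (s11 * s2y - s12 * s1y) / det    # Advertising coefficient
b0 = mean(sales) - b1 * mean(price) - b2 * mean(advertising)

print(f"Sales = {b0:.3f} + ({b1:.3f})(Price) + {b2:.3f}(Advertising)")
```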
4. Multiple Regression: Correlation matrix
Week    Pie Sales    Price ($)    Advertising ($100s)
1       350          5.50         3.3
2       460          7.50         3.3
3       350          8.00         3.0
4       430          8.00         4.5
5       350          6.80         3.0
6       380          7.50         4.0
7       430          4.50         3.0
8       470          6.40         3.7
9       450          7.00         3.5
10      490          5.00         4.0
11      340          7.20         3.5
12      300          7.90         3.2
13      440          5.90         4.0
14      450          5.00         3.5
15      300          7.00         2.7

Multiple regression model:
Sales = b0 + b1 (Price) + b2 (Advertising)

Correlation matrix:
               Pie Sales    Price      Advertising
Pie Sales      1
Price          -0.44327     1
Advertising    0.55632      0.03044    1
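The correlation matrix can be reproduced with the Pearson correlation formula r = Suv / sqrt(Suu · Svv). The following sketch computes the three pairwise correlations from the table; the function name `pearson_r` is illustrative.

```python
import math

sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
         7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
               3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


r_sales_price = pearson_r(sales, price)
r_sales_adv = pearson_r(sales, advertising)
r_price_adv = pearson_r(price, advertising)

print(f"Sales-Price       {r_sales_price:.5f}")
print(f"Sales-Advertising {r_sales_adv:.5f}")
print(f"Price-Advertising {r_price_adv:.5f}")
```

The near-zero correlation between Price and Advertising suggests the two predictors carry largely separate information about sales.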
4. Multiple Regression: t-tests of individual variable slopes, b1 and b2
H0: βi = 0
H1: βi ≠ 0
From Excel output:
              Coefficients    Standard Error    t Stat      P-value
Price         -24.97509       10.83213          -2.30565    0.03979
Advertising   74.13096        25.96732          2.85478     0.01449
d.f. = 15 - 2 - 1 = 12
α = .05
tα/2 = 2.1788 (α/2 = .025 in each tail: reject H0 if t < -2.1788 or t > 2.1788, otherwise do not reject H0)
The test statistic for each variable falls in the rejection region (p-values < .05)
Decision: Reject H0 for each variable
Conclusion: There is evidence that both Price and Advertising affect pie sales at α = .05
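The decision rule above amounts to a few lines of arithmetic. This sketch takes the coefficients and standard errors straight from the Excel output and the critical value 2.1788 from the slide. The dictionary layout is just one convenient way to hold them.

```python
# (coefficient, standard error) pairs copied from the Excel output
excel_output = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}

n, k = 15, 2            # observations, independent variables
df = n - k - 1          # degrees of freedom for each t test
T_CRIT = 2.1788         # t(alpha/2 = .025, d.f. = 12), as given on the slide

results = {}
for name, (coef, std_err) in excel_output.items():
    t_stat = coef / std_err                   # H0: beta_i = 0
    results[name] = (t_stat, abs(t_stat) > T_CRIT)

for name, (t_stat, reject) in results.items():
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name:12s} t = {t_stat:8.5f}  ->  {decision}")
```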
4. Multiple Regression: Standard Deviation of the Regression Model
• The standard deviation of the regression model is 47.46 (Standard Error = 47.46341 under Regression Statistics in the Excel output, with 15 observations)
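The Standard Error entry follows from the Residual row of the ANOVA table: it is the square root of MSE = SSE / (n - k - 1). A short sketch, using the figures from the output:

```python
import math

sse = 27033.306     # Residual SS from the ANOVA table
n, k = 15, 2        # observations, independent variables

mse = sse / (n - k - 1)          # Residual MS
std_error = math.sqrt(mse)       # standard deviation of the regression model

print(f"MSE = {mse:.3f}, standard error = {std_error:.5f}")
```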
4. Multiple Regression: F-Test for Overall Significance of the Model
• Test statistic: F = MSR / MSE = 14730.0 / 2252.8 = 6.5386, with 2 and 12 degrees of freedom
• P-value for the F-Test: Significance F = 0.01201
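The F statistic, R Square and Adjusted R Square can all be recovered from the three sums of squares in the ANOVA table. The sketch below does so and applies the usual decision rule of comparing Significance F with α. The choice α = .05 is an assumption carried over from the t tests in this section.

```python
ssr, sse, sst = 29460.027, 27033.306, 56493.333   # from the ANOVA table
n, k = 15, 2                                      # observations, predictors

msr = ssr / k                 # Regression MS, k degrees of freedom
mse = sse / (n - k - 1)       # Residual MS, n - k - 1 degrees of freedom
f_stat = msr / mse

r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

SIGNIFICANCE_F = 0.01201      # p-value reported by Excel
ALPHA = 0.05                  # assumption: same level as the t tests
model_is_significant = SIGNIFICANCE_F < ALPHA

print(f"F = {f_stat:.5f} with {k} and {n - k - 1} degrees of freedom")
print(f"R Square = {r_square:.5f}, Adjusted R Square = {adj_r_square:.5f}")
print(f"Reject H0 (no variable has an effect): {model_is_significant}")
```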
• Tools / Data Analysis / Anova: Single Factor