Final Exam Topics
Listed below are the major topics covered in class that are likely to appear
on the Final Exam. Good luck!
$P(Y = y, X = x) = P(Y = y \mid X = x)\,P(X = x)$
$P(Y = y) = \sum_x P(Y = y, X = x)$
$P(-2 < Z < 2) \approx 0.95$; $P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$.
$s_x^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n - 1}, \qquad s_x = \sqrt{s_x^2}$
$s_y^2 = \frac{\sum_{i=1}^n (y_i - \bar{y})^2}{n - 1}, \qquad s_y = \sqrt{s_y^2}$
Least squares estimation:
$b_0 = \bar{y} - b_1 \bar{x}, \qquad b_1 = r_{xy} \frac{s_y}{s_x}$
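A minimal sketch of these formulas in Python (NumPy only; the x and y arrays are made-up illustration data, not from the course):

```python
import numpy as np

# Made-up illustration data (any paired sample works here).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

sx = x.std(ddof=1)            # sample sd of x (divides by n - 1)
sy = y.std(ddof=1)            # sample sd of y
rxy = np.corrcoef(x, y)[0, 1] # sample correlation r_xy

b1 = rxy * sy / sx            # slope: b1 = r_xy * s_y / s_x
b0 = y.mean() - b1 * x.mean() # intercept: b0 = ybar - b1 * xbar
print(b0, b1)                 # should match np.polyfit(x, y, 1)
```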
Interpreting covariance, correlation and regression coefficients.
SST, SSR, SSE
$SST = \sum_{i=1}^n (y_i - \bar{y})^2$
$SSR = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 = \sum_{i=1}^n [(b_0 + b_1 x_i) - \bar{y}]^2$
$SSE = \sum_{i=1}^n e_i^2 = \sum_{i=1}^n (y_i - \hat{y}_i)^2 = \sum_{i=1}^n [y_i - (b_0 + b_1 x_i)]^2$
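A quick numeric check of the sums of squares and the identity $SST = SSR + SSE$, again on made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)      # least-squares fit
yhat = b0 + b1 * x                # fitted values

SST = np.sum((y - y.mean())**2)
SSR = np.sum((yhat - y.mean())**2)
SSE = np.sum((y - yhat)**2)
print(np.isclose(SST, SSR + SSE)) # SST = SSR + SSE
print(SSR / SST)                  # R^2 for simple regression
```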
$Y = \beta_0 + \beta_1 X + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$
$y_i \sim N(\beta_0 + \beta_1 x_i,\ \sigma^2)$
95% interval for $y_i$: $(\beta_0 + \beta_1 x_i) \pm 2\sigma$.
95% confidence intervals: $b_1 \pm 2 s_{b_1}$, $b_0 \pm 2 s_{b_0}$.
Hypothesis testing:
We test the null hypothesis $H_0: \beta_1 = \beta_1^0$ versus the alternative $H_1: \beta_1 \neq \beta_1^0$.
The t-stat $t = \frac{b_1 - \beta_1^0}{s_{b_1}}$ measures the number of standard errors the estimate
$b_1$ is from the proposed value $\beta_1^0$.
The p-value measures how unusual your estimate $b_1$ would be if the null
hypothesis were true.
We usually reject the null hypothesis if $|t| > 2$, $p < 0.05$, or $\beta_1^0$ is not within
the 95% confidence interval $(b_1 - 2 s_{b_1},\ b_1 + 2 s_{b_1})$.
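A sketch of the t-test computations on synthetic data (SciPy is assumed available for the p-value; here the proposed value is $\beta_1^0 = 0$):

```python
import numpy as np
from scipy import stats

# Synthetic data: true slope 2, so H0: beta1 = 0 should be rejected.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=1.0, size=50)

n = len(x)
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))      # residual standard error
sb1 = s / np.sqrt(np.sum((x - x.mean())**2)) # standard error of b1

t = (b1 - 0) / sb1                           # t-stat for H0: beta1 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)         # two-sided p-value
print(t, p, (b1 - 2 * sb1, b1 + 2 * sb1))    # reject if |t| > 2, p < 0.05
```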
Significance level and type I error.
Forecasting:
Given $X_f$, the 95% plug-in prediction interval of $Y_f$ is $(b_0 + b_1 X_f) \pm 2s$.
A large predictive error variance (high uncertainty) comes from a large $s$, a
small $n$, a small $s_x$, and a large difference between $X_f$ and $\bar{X}$.
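A sketch of the plug-in interval on synthetic data (the value $X_f = 7$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 3.0 + 1.5 * x + rng.normal(scale=2.0, size=40)

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid**2) / (len(x) - 2))  # residual standard error

xf = 7.0                                      # new X value
pred = b0 + b1 * xf
print(pred - 2 * s, pred + 2 * s)             # 95% plug-in prediction interval
```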
Statistical model:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$Y \mid X_1, \ldots, X_p \sim N(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p,\ \sigma^2)$
$\bar{e} = 0, \quad \text{Corr}(X_j, e) = 0, \quad \text{Corr}(\hat{Y}, e) = 0$
$R^2 = \text{Corr}(Y, \hat{Y})^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
The overall F-test: if the F-statistic is large (its p-value is small), we reject $H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$.
Understanding multiple linear regression
Correlation is not causation
Multiple linear regression allows us to control for all important variables by
including them in the regression model
Dependencies between the explanatory variables (Xs) will affect our interpretation of regression coefficients
Dependencies between the explanatory variables (Xs) will inflate the standard errors of regression coefficients
$s_{b_j}^2 = \frac{s^2}{\text{variation in } X_j \text{ not associated with the other } X\text{s}}$
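A small demonstration, on synthetic data, that correlation among the Xs inflates the coefficient standard errors (the $s^2 (X^\top X)^{-1}$ formula used below for the standard errors is the standard one for Gaussian linear regression):

```python
import numpy as np

# Demo: correlated predictors inflate coefficient standard errors.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)

for rho in (0.0, 0.95):                     # x2 nearly independent vs. highly correlated
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)

    X = np.column_stack([np.ones(n), x1, x2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (n - X.shape[1])   # estimate of sigma^2
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    print(f"corr={rho}: se(b1)={se[1]:.3f}, se(b2)={se[2]:.3f}")
```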
Dummy variables
Gender: Male, Female; Education level: High-school, Bachelor, Master, Doctor; Month: Jan, Feb, ..., Dec
A variable of $n$ categories can be included in a multiple linear regression
using $C$ dummy variables, where $1 \le C \le n - 1$
Representing a variable of n categories with n dummy variables will lead to
the problem of perfect multicollinearity
Interpretation: the same slope but different intercepts
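A sketch of dummy coding for a hypothetical four-quarter variable, using $C = n - 1 = 3$ dummies:

```python
import numpy as np

# Hypothetical categorical variable with n = 4 categories (quarters).
quarter = np.array(["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"])

# Use C = n - 1 = 3 dummies; "Q1" is the baseline absorbed by the intercept.
# Including a dummy for every category alongside an intercept would make the
# dummy columns sum to the intercept column -- perfect multicollinearity.
dummies = np.column_stack([(quarter == q).astype(float) for q in ("Q2", "Q3", "Q4")])
X = np.column_stack([np.ones(len(quarter)), dummies])
print(X)
```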
Interactions
Interpretation: different intercepts and slopes
Diagnostics
Model assumptions:
Statistical model:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
Polynomial regression:
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \cdots + \beta_m X^m + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
We can always increase $m$ if necessary, but $m = 2$ is usually enough.
Be very careful about over-fitting and about making predictions outside the data
range, especially if $m$ is large.
For $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$, the marginal effect of $X$ on $Y$ is
$\frac{\partial E[Y \mid X]}{\partial X} = \beta_1 + 2\beta_2 X,$
which means the slope is a function of $X$ (no longer a constant).
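A sketch of the changing slope on synthetic quadratic data:

```python
import numpy as np

# Quadratic fit: the marginal effect b1 + 2*b2*X varies with X (synthetic data).
rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, size=100)
y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(scale=0.5, size=100)

b2, b1, b0 = np.polyfit(x, y, 2)   # coefficients, highest power first
for xv in (-2.0, 0.0, 2.0):
    print(xv, b1 + 2 * b2 * xv)    # estimated slope at each X value
```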
Handling non-constant variance with the log-log transformation
Statistical model:
$Y = e^{\beta_0} X^{\beta_1} e^{\varepsilon}, \quad \varepsilon \sim N(0, \sigma^2)$, equivalently $\log(Y) = \beta_0 + \beta_1 \log(X) + \varepsilon$
Interpretation: about $\beta_1\%$ change in $Y$ per 1% change in $X$.
Example: price elasticity
95% plug-in prediction interval of $\log(Y)$: $(b_0 + b_1 \log(X)) \pm 2s$
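A sketch of estimating an elasticity by a log-log fit on synthetic price/quantity data (the true elasticity is set to $-1.2$):

```python
import numpy as np

# Log-log regression: the slope estimates the elasticity (synthetic data).
rng = np.random.default_rng(4)
price = rng.uniform(1, 10, size=100)
quantity = np.exp(5.0 - 1.2 * np.log(price) + rng.normal(scale=0.2, size=100))

b1, b0 = np.polyfit(np.log(price), np.log(quantity), 1)
print(b1)  # roughly -1.2: about a 1.2% drop in quantity per 1% price increase
```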
Log transformation of Y
Statistical model:
$\log(Y) = \beta_0 + \beta_1 X + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$Y = e^{\beta_0} e^{\beta_1 X} e^{\varepsilon}$
Interpretation: about $(100\beta_1)\%$ change in $Y$ per unit change in $X$ (if $\beta_1$ is
small).
Example: exponential growth
Time Series
Modeling non-linearity by adding $t^2$ to the regression model: the slope
changes as time changes.
95% plug-in prediction interval
Autoregressive models
Random walk model: $Y_t = \beta_0 + Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
Autoregressive model of order 1 (AR(1)):
$Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
AR(1) with a time trend: $Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
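A sketch that simulates an AR(1) series with assumed parameter values and recovers the coefficients by regressing $Y_t$ on $Y_{t-1}$:

```python
import numpy as np

# Simulate y_t = beta0 + beta1*y_{t-1} + eps_t, then fit by least squares.
rng = np.random.default_rng(5)
beta0, beta1, sigma, T = 1.0, 0.7, 1.0, 500

y = np.zeros(T)
for t in range(1, T):
    y[t] = beta0 + beta1 * y[t - 1] + rng.normal(scale=sigma)

b1, b0 = np.polyfit(y[:-1], y[1:], 1)  # lagged y as the predictor
print(b0, b1)                          # close to (1.0, 0.7)
```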
Modeling seasonality
Using no more than 11 dummy variables for 12 months; using no more than
3 dummy variables for 4 quarters
Seasonal model:
$Y_t = \beta_0 + \beta_1 \text{Jan} + \cdots + \beta_{11} \text{Nov} + \varepsilon_t$
$Y_t = \beta_0 + \beta_1 \text{Jan} + \cdots + \beta_{11} \text{Nov} + \beta_{12} Y_{t-1} + \beta_{13} t + \varepsilon_t$
Model Selection
Validate a model using out-of-sample prediction
Model selection criteria (AIC, BIC, adjusted $R^2$)
Forward regression, backward regression, stepwise regression
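A sketch comparing two nested models with one common form of AIC and BIC for Gaussian linear regression (constants dropped; smaller is better):

```python
import numpy as np

# AIC/BIC in the form n*log(SSE/n) + penalty * k, where k counts the
# estimated coefficients; BIC's log(n) penalty is harsher than AIC's 2.
rng = np.random.default_rng(6)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                  # irrelevant predictor
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

for cols in ([x1], [x1, x2]):
    X = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ b)**2)
    k = X.shape[1]
    aic = n * np.log(sse / n) + 2 * k
    bic = n * np.log(sse / n) + np.log(n) * k
    print(f"k={k}: AIC={aic:.1f}, BIC={bic:.1f}")
```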
Decision tree
Represent a payoff table with a decision tree
Time proceeds from left to right
Folding back procedure
Risk profile
Sensitivity analysis
Decision making and Bayes theorem
The value of information
Value of perfect information
Expected value of perfect information (EVPI)
Value of sample information
Expected value of sample information (EVSI)
Bayes theorem and the value of information
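A small EVPI computation on a hypothetical payoff table (two decisions, two states, made-up payoffs and prior probabilities):

```python
import numpy as np

# EVPI on a hypothetical payoff table: rows = decisions, columns = states.
payoff = np.array([[100.0, -20.0],   # risky decision
                   [ 30.0,  30.0]])  # safe decision
prob = np.array([0.4, 0.6])          # prior probabilities of the states

ev_best = max(payoff @ prob)            # best expected value without information
ev_perfect = payoff.max(axis=0) @ prob  # best decision in each state, then average
print(ev_perfect - ev_best)             # EVPI = EV(perfect info) - EV(no info)
```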
Simulation
Flip a coin, toss a die, flip two coins, toss two dice
Normal random numbers, Student's t random numbers
Understand how to simulate from a discrete distribution
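A sketch of simulating from a discrete distribution with made-up probabilities:

```python
import numpy as np

# Simulate from P(X=1)=0.2, P(X=2)=0.5, P(X=3)=0.3 (made-up probabilities).
rng = np.random.default_rng(7)
draws = rng.choice([1, 2, 3], size=100_000, p=[0.2, 0.5, 0.3])
print(np.bincount(draws)[1:] / len(draws))  # roughly [0.2, 0.5, 0.3]
print(draws.mean(), draws.var())            # estimates of E[X] and Var[X]
```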
Understand how to use simulation to estimate $P(X < x)$, $E[X]$ and $\text{Var}[X]$,
where $X$ is a random variable following some distribution.
Understand how to use simulation to demonstrate the Law of Large Numbers
Understand how to use simulation to demonstrate the sampling distribution of
the sample mean
Understand how to use simulation to demonstrate the Central Limit Theorem
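A sketch of the Central Limit Theorem by simulation: sample means of a skewed (exponential) distribution look increasingly normal, with standard deviation near $\sigma/\sqrt{n}$:

```python
import numpy as np

# CLT demo: the exponential(1) distribution has mean 1 and sd 1, so the
# sd of the sample mean should be close to 1/sqrt(n).
rng = np.random.default_rng(8)
for n in (2, 30, 200):
    means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(n, means.mean(), means.std(), 1 / np.sqrt(n))
```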
Simulation and decision making
Simulate an AR(1) + Trend + Log-transformation time series model
Using simulation to estimate the prediction intervals
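A sketch that simulates future paths of an assumed AR(1) + trend model on $\log(Y)$ and estimates a 95% prediction interval from the simulated quantiles (all parameter values are made up):

```python
import numpy as np

# Simulate log(Y_t) = beta0 + beta1*log(Y_{t-1}) + beta2*t + eps_t forward
# h steps, many times, then read off the interval from quantiles of Y.
rng = np.random.default_rng(9)
beta0, beta1, beta2, sigma = 0.5, 0.6, 0.01, 0.1
last_log_y, last_t, h, n_paths = 2.0, 100, 12, 10_000

paths = np.full(n_paths, last_log_y)          # current log(Y) for every path
for step in range(1, h + 1):
    paths = beta0 + beta1 * paths + beta2 * (last_t + step) \
            + rng.normal(scale=sigma, size=n_paths)

y_h = np.exp(paths)                           # back to the original scale
print(np.quantile(y_h, [0.025, 0.975]))       # simulated 95% prediction interval
```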
Understand how to construct a random experiment and find relevant answers by
simulating the same experiment repeatedly under identical conditions