Introduction To Logistic Regression
Suppose you have a binary outcome variable. The problem with a non-continuous
dependent variable becomes apparent when you create a scatterplot of the
relationship: the points fall in two horizontal bands, one at each value of the
outcome, and it is very difficult to decipher a relationship between the variables.
A Problem with Linear Regression
We could simplify the plot considerably by drawing a line between the means for
the two levels of the dependent variable, but this is problematic in two ways:
(a) the line seems to oversimplify the relationship and (b) for extreme values
of X it gives predictions that are not possible values of Y.
The Linear Probability Model
In the OLS regression:
Y = β0 + β1X + e, where Y ∈ {0, 1}
• The error terms are heteroskedastic
• e is not normally distributed because Y takes on only two values
• The predicted probabilities can be greater than 1 or less than 0
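The third point can be checked numerically. The following is a minimal sketch, assuming a small made-up data set (the values of x and y are hypothetical and are not from the ECLS-K data); it fits the linear probability model by least squares and evaluates the fitted line at a few values of X.

```python
import numpy as np

# Hypothetical data: x is a continuous predictor, y is a 0/1 outcome.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0])

# Ordinary least squares fit of Y = b0 + b1 * X.
# np.polyfit returns coefficients from the highest power down: [slope, intercept].
b1, b0 = np.polyfit(x, y, deg=1)


def lpm_predict(x_new):
    """Predicted 'probability' from the linear probability model."""
    return b0 + b1 * x_new


# The fitted line is unbounded, so it leaves the 0-1 range,
# here even at the smallest and largest observed values of X.
for x_new in (-5.0, 1.0, 8.0, 15.0):
    print(x_new, round(float(lpm_predict(x_new)), 3))
```

With these made-up values the intercept is about -0.25 and the slope about 0.167, so the predictions at X = 1 and X = 8 already fall just below 0 and just above 1.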
The Logistic Regression Model
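The "logit" model solves these problems:
ln[p/(1-p)] = β0 + β1X
where p is the probability that Y = 1 and ln[p/(1-p)] is the log odds (the "logit").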
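One way to see why the logit helps is to look at the transformation and its inverse. This is an illustrative sketch only; the function names logit and inverse_logit are chosen here for the example and do not come from the slides.

```python
import math


def logit(p):
    """Log odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a value z of the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# The linear predictor can be any real number,
# but the implied probability stays between 0 and 1.
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(z, inverse_logit(z))
```

Because the model is linear in the log odds rather than in the probability itself, the predicted probabilities cannot fall below 0 or rise above 1.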
Odds & Odds Ratios
Recall the definition of the odds:
odds = p / (1 - p)
The odds range from 0 to ∞. Values greater than 1 are associated with an event
that is more likely to occur than not, and values less than 1 with an event
that is less likely to occur than not.
Introducing the Odds Ratio for the Logistic Transformation
• If there is a 75% chance that it will rain
tomorrow, then 3 out of 4 times we say this, it will
rain. That means that for every three times it rains,
once it will not. The odds of it raining tomorrow
are 3 to 1. This can also be understood as
(¾)/(¼) = 3/1.
• If the odds that my pony will win the race are 1 to
3, that means that for every 4 races it runs, it will win
1 and lose 3. Therefore I should be paid $3 for
every dollar I bet.
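The arithmetic in the two examples above can be reproduced with exact fractions. This is a small sketch; the helper names odds_from_probability and probability_from_odds are made up for the illustration.

```python
from fractions import Fraction


def odds_from_probability(p):
    """odds = p / (1 - p)"""
    return p / (1 - p)


def probability_from_odds(odds):
    """Invert the definition: p = odds / (1 + odds)"""
    return odds / (1 + odds)


# Rain example: a 3/4 chance of rain gives odds of 3 (that is, 3 to 1).
rain_odds = odds_from_probability(Fraction(3, 4))

# Pony example: odds of 1 to 3 correspond to winning 1 race in 4.
pony_probability = probability_from_odds(Fraction(1, 3))

print(rain_odds)         # 3
print(pony_probability)  # 1/4
```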
Example Interpretation of coefficient
Running Logistic Regression in SPSS: Whether a Child Has an IEP (ECLS-K)
ln[p/(1-p)] = β0 + β1X
ln[p/(1-p)] = -2.424 - .46X
Change in odds = e^(β0 + β1) / e^(β0) = e^(β1)
e^(-.46) = .63
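The figures in this example can be checked with a few lines of arithmetic. In the sketch below, the intercept (-2.424) and slope (-.46) are the values shown on the slide; the predictor X is not named in this section, so the code treats it as a generic numeric predictor, and the function names are chosen only for the illustration.

```python
import math

B0 = -2.424  # intercept from the fitted equation on the slide
B1 = -0.46   # slope from the fitted equation on the slide


def predicted_odds(x):
    """Odds implied by ln[p/(1-p)] = B0 + B1 * x."""
    return math.exp(B0 + B1 * x)


def predicted_probability(x):
    """Probability implied by the same equation."""
    odds = predicted_odds(x)
    return odds / (1.0 + odds)


# Multiplicative change in the odds for a one-unit increase in X.
odds_ratio = math.exp(B1)

# The same ratio computed directly from the odds at X = 1 and X = 0.
ratio_check = predicted_odds(1.0) / predicted_odds(0.0)

print(round(odds_ratio, 2))  # 0.63
print(predicted_probability(0.0), predicted_probability(1.0))
```

An odds ratio of about .63 means that each one-unit increase in X multiplies the odds by .63, a decrease of roughly 37%. The ratio is the same whichever pair of adjacent X values is compared.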
Hypothesis Testing
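Wald = [B / s.e.(B)]²
where B is the estimated coefficient and s.e.(B) is its standard error. Under
the null hypothesis that the coefficient is 0, the statistic is distributed
chi-square with 1 degree of freedom.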
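As a rough illustration, the Wald statistic and its p-value can be computed by hand. The coefficient below is the -.46 from the example equation, but the standard error of 0.20 is a hypothetical value chosen only for this sketch; the slides as extracted do not report it. The p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def wald_test(b, se):
    """Return the Wald statistic (b / se)^2 and its chi-square(1) p-value."""
    wald = (b / se) ** 2
    # P(chi-square with 1 df > wald) = P(|Z| > sqrt(wald)) = erfc(sqrt(wald / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value


# b = -0.46 is from the example equation; se = 0.20 is an ASSUMED value.
wald, p_value = wald_test(b=-0.46, se=0.20)
print(round(wald, 2), round(p_value, 3))
```

With the assumed standard error the statistic is 5.29, which exceeds the 5% critical value of 3.84 for 1 degree of freedom; a different standard error would of course give a different conclusion.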
Logistic Regression Reflection
• What part is most confusing to you?
• What are the possible interpretations for
the part that is confusing?
• Find a partner or two and share your
questions