Logistic Regression & Practice
Logistic regression
How can we analyse these data?
The scatter plot
[Figure: scatter plot of signs of coronary disease (Yes = 1, No = 0) against age in years (0 to 100)]
Linear regression for a binary variable
Let’s try another way
Table: Prevalence (%) of signs of CD according to age group

Age group (years)   Number in group   Number diseased   Diseased (%)
20-29               5                 0                 0
30-39               6                 1                 17
40-49               7                 2                 29
50-59               7                 4                 57
60-69               5                 4                 80
70-79               2                 2                 100
80-89               1                 1                 100
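The percentages in the table can be reproduced from the raw counts. The short sketch below copies the counts from the table; the variable and function names are illustrative only.

```python
# Counts copied from the prevalence table:
# (age group, number in group, number diseased)
age_groups = [
    ("20-29", 5, 0),
    ("30-39", 6, 1),
    ("40-49", 7, 2),
    ("50-59", 7, 4),
    ("60-69", 5, 4),
    ("70-79", 2, 2),
    ("80-89", 1, 1),
]


def prevalence_percent(n_group, n_diseased):
    """Percentage of the group with signs of CD, rounded to a whole number."""
    return round(100 * n_diseased / n_group)


percentages = [prevalence_percent(n, d) for _, n, d in age_groups]

for (label, n, d), pct in zip(age_groups, percentages):
    print(f"{label}: {d}/{n} = {pct}%")
```

Grouping turns the 0/1 outcome into a proportion per group, which is what makes the S-shaped trend with age visible.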
The scatter plot now
[Figure: percentage diseased (0 to 100) plotted against age group (0 to 8)]
The Logistic Function
[Figure: S-shaped curve of the probability of disease (0.0 to 1.0) against x]
So we fit the model of the form
$$P(y \mid x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$
This is called the logistic function
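As a minimal sketch, the logistic function can be written in a few lines of Python. Here `b0` and `b1` stand for the intercept and slope of the model; the values used in the usage lines are arbitrary illustrations, not estimates from the coronary disease data.

```python
import math


def logistic(x, b0, b1):
    """P(y | x) = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    z = b0 + b1 * x
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


# Illustrative coefficients only (hypothetical, not fitted values).
example_b0, example_b1 = -5.0, 0.1
for age in (20, 50, 80):
    print(age, round(logistic(age, example_b0, example_b1), 3))
```

Whatever the coefficients, the output stays strictly between 0 and 1, which is why this curve suits a binary outcome better than a straight line.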
Logistic regression
$e^{\beta_1}$ = odds ratio
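Why the exponentiated slope is an odds ratio can be seen from a short derivation. This is a sketch that assumes the model $P(y \mid x) = e^{\beta_0 + \beta_1 x} / (1 + e^{\beta_0 + \beta_1 x})$, with $\beta_0$ the intercept and $\beta_1$ the slope:

```latex
\begin{align*}
\text{odds}(x) &= \frac{P(y \mid x)}{1 - P(y \mid x)} = e^{\beta_0 + \beta_1 x} \\
\text{logit}(x) &= \ln \text{odds}(x) = \beta_0 + \beta_1 x \\
\frac{\text{odds}(x + 1)}{\text{odds}(x)} &= \frac{e^{\beta_0 + \beta_1 (x + 1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}
\end{align*}
```

So a one-unit increase in x multiplies the odds by $e^{\beta_1}$, whatever the starting value of x.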
Fitting equation to the data
• Iterative computing
- Choice of an arbitrary value for the coefficients (usually 0)
- Computing of log-likelihood
- Variation of coefficients’ values
- Reiteration until maximisation (plateau)
• Results
- Maximum Likelihood Estimates (MLE) for $\beta_0$ and $\beta_1$
- Estimates of P(y) for a given value of x
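The iterative procedure above can be sketched with Newton-Raphson updates, which is one common way to vary the coefficients; the slides do not say which algorithm the software uses. The data are the grouped counts from the prevalence table, and using the midpoint of each age group as x is an assumption made only for this illustration, so the estimates will differ from a fit on individual ages.

```python
import math

# Grouped counts from the prevalence table. Using group midpoints as x
# is an assumption for illustration only.
ages = [25, 35, 45, 55, 65, 75, 85]
n_group = [5, 6, 7, 7, 5, 2, 1]
n_diseased = [0, 1, 2, 4, 4, 2, 1]


def prob(x, b0, b1):
    """Logistic probability for intercept b0 and slope b1."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def log_likelihood(b0, b1):
    """Binomial log-likelihood of the grouped data (constant term omitted)."""
    total = 0.0
    for x, n, d in zip(ages, n_group, n_diseased):
        p = prob(x, b0, b1)
        total += d * math.log(p) + (n - d) * math.log(1.0 - p)
    return total


def fit(max_iter=50, tol=1e-10):
    """Start at 0, update the coefficients, stop when the log-likelihood plateaus."""
    b0, b1 = 0.0, 0.0
    history = [log_likelihood(b0, b1)]
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, d in zip(ages, n_group, n_diseased):
            p = prob(x, b0, b1)
            resid = d - n * p            # observed minus expected cases
            w = n * p * (1.0 - p)        # binomial variance weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        history.append(log_likelihood(b0, b1))
        if abs(history[-1] - history[-2]) < tol:
            break
    return b0, b1, history


b0_hat, b1_hat, ll_history = fit()
print("intercept:", b0_hat, "slope:", b1_hat)
print("log-likelihood per iteration:", ll_history)
```

Printing `ll_history` shows the log-likelihood rising and then levelling off, which is the plateau mentioned in the list above.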
Multiple logistic regression
• Question
- Does a model that includes a given independent variable provide
more information about the dependent variable than a model
without this variable?
• Three tests
- Likelihood ratio statistic (LRS)
- Wald test
- Score test
Likelihood ratio statistic
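A standard form of the statistic is sketched below. The notation is an assumption: $L_0$ is the maximised likelihood of the model without the variable(s) of interest, $L_1$ that of the model with them, and $k$ the number of extra coefficients in the larger model.

```latex
\begin{align*}
G &= -2 \ln \frac{L_0}{L_1} = \left(-2 \ln L_0\right) - \left(-2 \ln L_1\right) \\
G &\sim \chi^2_k \quad \text{approximately, under } H_0 \text{: the } k \text{ extra coefficients are all } 0
\end{align*}
```

A large value of $G$ means that the model including the variable fits the data significantly better than the model without it.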
Interpretation
If an infant weighs 750 grams at birth, what is the
probability that he develops the disease?
The logit:
Notes
Logistic regression example
Which variables are binary (dichotomous)?
Baseline model: predicting that all subjects will decide to stop is correct 59.4% of the time.
• The Omnibus Tests table is used to check
that the new model is an improvement
over the baseline model.
• It uses chi-square tests to see whether there is
a significant difference between the
-2 log-likelihoods (-2LL) of the baseline model
and the new model.
• If the new model has a significantly
reduced -2LL compared to the baseline,
it suggests that the new model
explains more of the variance in the
outcome and is an improvement.
The -2LL statistic measures how poorly the model predicts the decisions: the smaller
the statistic, the better the model. It is used to compare nested (reduced) models.
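The comparison of -2LL values can be sketched in Python. The sketch assumes the new model contains gender as its only predictor, and it uses the counts reported for this example elsewhere in these slides (68 of 115 men and 60 of 200 women decided to continue). With a single binary predictor the fitted probabilities equal the observed proportions, so no iterative fitting is needed here. The p-value line uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

# Counts from the gender example in these slides.
men_continue, men_total = 68, 115
women_continue, women_total = 60, 200
all_continue = men_continue + women_continue   # 128
all_total = men_total + women_total            # 315


def binomial_loglik(events, total, p):
    """Log-likelihood of `events` successes in `total` trials at probability p."""
    return events * math.log(p) + (total - events) * math.log(1.0 - p)


# Baseline model: one common probability for everybody.
ll_baseline = binomial_loglik(all_continue, all_total, all_continue / all_total)

# New model (assumed: gender only): one probability per gender.
ll_model = (
    binomial_loglik(men_continue, men_total, men_continue / men_total)
    + binomial_loglik(women_continue, women_total, women_continue / women_total)
)

neg2ll_baseline = -2.0 * ll_baseline
neg2ll_model = -2.0 * ll_model
chi_square = neg2ll_baseline - neg2ll_model    # drop in -2LL, 1 degree of freedom

# Upper tail of chi-square with 1 df: P(Z^2 > c) = erfc(sqrt(c / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2.0))

print("-2LL baseline:", round(neg2ll_baseline, 3))
print("-2LL model:", round(neg2ll_model, 3))
print("chi-square:", round(chi_square, 3), "p-value:", p_value)
```

The drop in -2LL between the two models is the chi-square value that the omnibus test evaluates.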
Based on this result, what is the regression equation?
Our model predicts that 59% of men will decide to continue the research
The probability that men will decide to continue the research = 0.59
The probability that women will decide to continue the research = 0.30
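These two probabilities are enough to recover the coefficients of a gender-only model. The sketch below assumes gender is coded 1 for men and 0 for women (the coding is not stated in the text) and uses the underlying counts, 68 of 115 men and 60 of 200 women.

```python
import math

p_men = 68 / 115      # about 0.59
p_women = 60 / 200    # 0.30


def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


# Assumed coding: gender = 1 for men, 0 for women.
b0 = logit(p_women)                  # intercept: log odds for women
b1 = logit(p_men) - logit(p_women)   # slope: difference in log odds
odds_ratio = math.exp(b1)            # odds for men relative to odds for women

print("logit = %.3f + %.3f * gender" % (b0, b1))
print("odds ratio:", round(odds_ratio, 3))
```

Plugging gender = 1 or gender = 0 back into the logistic function with these coefficients returns the two probabilities above.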
With the cut value of 0.50:
If the probability ≥ 0.50, the subject is classified into “Continue the research”
all male subjects (115) are predicted to continue the research
If the probability < 0.50, the subject is classified into “Stop the research”
all female subjects (200) are predicted to stop the research
This rule allows us to correctly classify 68/128 = 53.1% of the subjects where
the predicted event (deciding to continue the research) was observed
Sensitivity of prediction, P(correct | event did occur) = 53.1%
This rule allows us to correctly classify 140/187 = 74.9% of the subjects where
the predicted event was not observed. This is known as the specificity of
prediction, the P(correct | event did not occur) = 74.9%
Overall our predictions were correct 208 out of 315 times, for an overall
success rate of 66%
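The sensitivity, specificity, and overall success rate follow directly from the four cells of the classification table. In this sketch the cell counts are derived from the figures quoted above; the variable names are illustrative.

```python
# Cells of the classification table, derived from the figures in the text.
true_positive = 68            # men who continued (predicted continue, did continue)
false_positive = 115 - 68     # men who stopped (predicted continue, did stop)
true_negative = 140           # women who stopped (predicted stop, did stop)
false_negative = 200 - 140    # women who continued (predicted stop, did continue)

observed_continue = true_positive + false_negative   # 128
observed_stop = true_negative + false_positive       # 187
total = observed_continue + observed_stop            # 315

sensitivity = true_positive / observed_continue      # P(correct | event did occur)
specificity = true_negative / observed_stop          # P(correct | event did not occur)
accuracy = (true_positive + true_negative) / total   # overall success rate

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"overall success rate: {accuracy:.1%}")
```

Changing the cut value from 0.50 would change which subjects are classified as "Continue the research", and therefore all three figures.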
95% CI for the predicted OR (odds ratio)
Interpretation?
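A Wald-type interval for the odds ratio is obtained by building the interval on the log-odds scale and exponentiating its limits. The sketch below assumes the interval refers to a gender-only model and takes the standard error of the log odds ratio from the 2x2 counts (68 and 47 men who continued and stopped, 60 and 140 women), which is the usual formula for a single binary predictor.

```python
import math

# 2x2 counts for the gender example (assumed to be the model in question).
men_continue, men_stop = 68, 47
women_continue, women_stop = 60, 140

# Log odds ratio (the slope b1 for a 0/1 gender variable).
log_or = math.log((men_continue * women_stop) / (men_stop * women_continue))

# Standard error of the log odds ratio from the four cell counts.
se_log_or = math.sqrt(
    1 / men_continue + 1 / men_stop + 1 / women_continue + 1 / women_stop
)

z = 1.96  # normal quantile for a 95% interval
odds_ratio = math.exp(log_or)
ci_lower = math.exp(log_or - z * se_log_or)
ci_upper = math.exp(log_or + z * se_log_or)

print("OR:", round(odds_ratio, 3))
print("95% CI:", round(ci_lower, 3), "to", round(ci_upper, 3))
```

If the interval excludes 1, the odds of continuing differ significantly between the two groups at the 5% level.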
Exercise
Use the low birth weight data set (lowbwt.sav).
Let’s consider the age of the mother as the independent variable
to predict low birth weight of her infant (low).
Model 1: Perform a simple logistic regression to derive an
equation for the probability of a low birth weight
infant from the mother's age.
Is the effect of the mother's age on low birth weight
significant?
What is the predicted odds ratio? Interpret it.
What is the 95% CI of the odds ratio? Interpret it.
What is the predicted probability of having a low birth
weight infant for a 35-year-old woman?
Result
• Model 1:
http://bis.net.vn/forums/t/484.aspx