Logistic Regression
Machine Learning
Lecture Slides
Introduction
• A popular statistical technique for predicting binomial outcomes (y = 0 or 1) is
Logistic Regression. Logistic regression predicts categorical outcomes
(binomial / multinomial values of y), whereas linear regression is suited to
predicting continuous-valued outcomes (such as the weight of a person in kg or the
amount of rainfall in cm).
• The predictions of Logistic Regression (henceforth, LogR in this article) are in
the form of probabilities of an event occurring, i.e. the probability of y = 1
given certain values of the input variables x. Thus, the results of LogR range
from 0 to 1.
• LogR models the data points using the standard logistic function, which is an
S-shaped curve given by the equation:

p = 1 / (1 + e^-(B0 + B1x1 + B2x2 + ... + Bkxk)) = 1 / (1 + e^-(Bt.X))
Concepts
• where:
• p = probability that y=1 given the values of the input features, x.
• x1,x2,..,xk = set of input features, x.
• B0, B1, .., Bk = parameter values to be estimated via the maximum likelihood method. Each of
B1, .., Bk is interpreted as the change in the log-odds for a unit change in the input feature it is associated with.
• Bt = vector of coefficients
• X = vector of input features
• Estimating the values of B0,B1,..,Bk involves the concepts of probability, odds and log odds. Let
us note their ranges first:
• Probability ranges from 0 to 1
• Odds range from 0 to ∞
• Log odds range from -∞ to +∞
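These three scales are related by the logistic function above. A minimal Python sketch (not part of the original slides) illustrating the logistic function and the conversions between probability, odds and log-odds:

import math

def logistic(z):
    # Standard logistic (sigmoid) function: maps a log-odds value z in (-inf, +inf) to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    # Odds of an event with probability p: ranges from 0 to +inf
    return p / (1.0 - p)

def log_odds(p):
    # Log-odds (logit) of an event with probability p: ranges from -inf to +inf
    return math.log(odds(p))

p = 0.29                      # example probability that y = 1
print(odds(p))                # ~0.41, in (0, +inf)
print(log_odds(p))            # ~-0.90, in (-inf, +inf)
print(logistic(log_odds(p)))  # ~0.29, back to the original probability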
Example
The task is to predict which students graduated with honours (y = 1 or 0), for 200
students with the fields female, read, write, math, hon and femalexmath. The fields describe
gender (female=1 if female), reading score, writing score, math score, honours status
(hon=1 if graduated with honours) and femalexmath, the interaction term female × math
(equal to the math score if female=1, and 0 otherwise).
The crosstab of the variable hon with female shows that there are 91 males and 109
females; 32 of the 109 females secured honours.
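As a sketch of how this crosstab might be produced in practice (the file name honors.csv and the loading step are assumptions, not part of the slides):

import pandas as pd

# Load the 200-student dataset (hypothetical file; columns as described above)
df = pd.read_csv("honors.csv")

# Cross-tabulate gender against honours status:
# rows: female (0 = male, 1 = female), columns: hon (0 = no honours, 1 = honours)
print(pd.crosstab(df["female"], df["hon"]))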
Probability:
The probability of an event is the number of instances of that event divided by the total number of
instances present.
Thus, the probability of females securing honours:
= 32/109
= 0.29
Odds:
• The odds of an event is the probability of that event occurring
(probability that y=1), divided by the probability that it does not occur.
• Thus, the odds of females securing honours:
= 32/77
= 0.4155
≈ 0.42
This is interpreted as:
1. 32/77 => For every 32 females that secure honours, there are 77 females that do not
secure honours.
2. 32/77 => There are 32 females that secure honours, for every 109 (i.e. 32 + 77) females.
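A minimal Python sketch of both calculations, using the counts from the crosstab above:

# Counts from the crosstab: 32 females with honours, 77 females without
honours_females = 32
no_honours_females = 77
total_females = honours_females + no_honours_females   # 109

p_female = honours_females / total_females              # probability ≈ 0.29
odds_female = p_female / (1 - p_female)                  # odds ≈ 0.42 (equivalently 32/77)

print(p_female, odds_female)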
Log odds:
The logit or log-odds of an event is the log of the odds. This refers to the
natural log (base e). Thus,

log[odds] = log[p / (1 - p)] = B0 + B1x1 + B2x2 + ... + Bkxk = Bt.X

Where:
1. Each of B1, .., Bk is the change in the log-odds for a unit change in the input feature it is
associated with.
2. As B0 is the coefficient not associated with any input feature, B0 = the log-odds for the
reference group, x=0 (i.e. x=male). Here, B0 = log[odds(male graduating with
honours)].
3. As B1 is the coefficient of the input feature 'female',
1. B1 = the change in log-odds obtained with a unit change in x, i.e. moving from x=male (0) to x=female (1).
2. B1 = log[odds(female)] - log[odds(male)], i.e. the log of the odds ratio of female to male.
Calculations:
• From the calculation in the section 'odds ratio (OR)',
• B1 = log(1.81) ≈ 0.593
• Thus, the LogR equation becomes
• y = log[odds(honours)] = -1.47 + 0.593 * female
• where the value of female is substituted as 0 or 1 for male and female respectively.
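The same coefficients can also be obtained by fitting the model directly via maximum likelihood. A minimal sketch using statsmodels (the DataFrame and the file name honors.csv are assumptions carried over from the earlier sketch):

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("honors.csv")    # hypothetical file with the 200 students

# Logistic regression of honours status on gender, estimated by maximum likelihood
model = smf.logit("hon ~ female", data=df).fit()
print(model.params)               # expected: Intercept ≈ -1.47, female ≈ 0.593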
• Now, let us try to find out the probability of a female securing honours when there
is only one input feature present, 'female'.
• Substitute female = 1 in: y = -1.47 + 0.593 * female
• Thus, y = log[odds(female)] = -1.47 + 0.593 * 1 = -0.877
• As the log-odds = -0.877,
• odds = e^(Bt.X) = e^(-0.877) = 0.416
• And, the probability is calculated as:

probability = odds / (1 + odds) = 0.416 / (1 + 0.416) ≈ 0.29

Thus, the probability of a female securing honours, when 'female' is the only input
feature present, is 0.29.
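A short Python sketch verifying this chain of calculations, with the coefficient values taken from the slides above:

import math

b0, b1 = -1.47, 0.593              # estimated intercept and coefficient of 'female'

log_odds = b0 + b1 * 1             # female = 1  ->  -0.877
odds = math.exp(log_odds)          # ≈ 0.416
probability = odds / (1 + odds)    # ≈ 0.29

print(log_odds, odds, probability)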