Discrete Choice Analysis I: Moshe Ben-Akiva
Moshe Ben-Akiva
Fall 2008
● Introduction
● A Simple Example
● The Random Utility Model
● Specification and Estimation
● Forecasting
● IIA Property
● Nested Logit
Outline of this Lecture
● Introduction
● A simple example – route choice
● The Random Utility Model
– Systematic utility
– Random components
● Derivation of the Probit and Logit models
– Binary Probit
– Binary Logit
– Multinomial Logit
Continuous vs. Discrete Goods
[Figure: two panels. Continuous goods: indifference curves u1, u2, u3 in the (x1, x2) plane. Discrete goods: axes bus and auto.]
Discrete Choice Framework
● Decision-Maker
– Individual (person/household)
– Socio-economic characteristics (e.g. age, gender, income, vehicle ownership)
● Alternatives
– Decision-maker n selects one and only one alternative from a choice
set Cn={1,2,…,i,…,Jn} with Jn alternatives
● Attributes of alternatives (e.g. travel time, cost)
● Decision Rule
– Dominance, satisfaction, utility etc.
Choice: Travel Mode to Work
Consumer Choice
• Consumers maximize utility
– Choose the alternative that has the maximum utility (and
falls within the income constraint)
U(bus)=?
U(auto)=?
Constructing the Utility Function
● U(bus) = U(walk time, in-vehicle time, fare, …)
U(auto) = U(travel time, parking cost, …)
● Assume linear (in the parameters)
U(bus) = β1×(walk time) + β2×(in-vehicle time) + …
● Parameters represent tastes, which may vary over people, e.g. a cost term scaled by income:
… + β3×(cost/income) + …
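A minimal sketch of the linear-in-parameters form in Python. The slides give no numbers, so the coefficient values, attribute levels, and the names `linear_utility`, `betas`, and `bus` are all hypothetical.

```python
# Linear-in-parameters systematic utility: a weighted sum of attributes.
# All coefficient and attribute values below are hypothetical illustrations.

def linear_utility(betas, attributes):
    """Return sum_k beta_k * x_k over the named attributes."""
    return sum(betas[name] * value for name, value in attributes.items())

# Assumed taste parameters (negative: more time or cost lowers utility).
betas = {"walk_time": -0.10, "in_vehicle_time": -0.05, "cost_over_income": -2.0}

# Assumed attributes of the bus alternative for one traveler.
bus = {"walk_time": 5.0, "in_vehicle_time": 20.0, "cost_over_income": 0.05}

u_bus = linear_utility(betas, bus)
print(f"U(bus) = {u_bus:.2f}")  # U(bus) = -1.60
```

Dividing cost by income is one way to let the cost sensitivity differ across travelers while keeping a single parameter β3.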
Deterministic Binary Choice
[Figure: P(bus) plotted against U(bus) − U(auto); a step from 0 to 1 at U(bus) − U(auto) = 0.]
Probabilistic Choice
● Random utility model
Ui = V(attributes of i; parameters) + εi
● What is in the epsilon?
Analysts’ imperfect knowledge:
– Unobserved attributes
– Unobserved taste variations
– Measurement errors
– Use of proxy variables
● U(bus) = β1×(walk time) + β2×(in-vehicle time) + β3×(cost/income) + … + ε_bus
Probabilistic Binary Choice
[Figure: P(bus) plotted against V(bus) − V(auto).]
A Simple Example: Route Choice
Route choice      Income                                    Total
                  Low (k=1)   Medium (k=2)   High (k=3)
Tolled (i=1)         10           100            90          200
Free (i=2)          140           200            60          400
Total               150           300           150          600
A Simple Example: Route Choice
Probabilities
● (Marginal) probability of choosing toll road P(i = 1)
P̂(i = 1) = 200 / 600 = 1/3
● (Joint) probability of choosing toll road and having medium
income: P(i=1, k=2)
P̂(i = 1, k = 2) = 100 / 600 = 1/6
$\sum_{i=1}^{2} \sum_{k=1}^{3} P(i, k) = 1$
Conditional Probability P(i|k)
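● Marginal probability: $P(k) = \sum_i P(i, k)$
● Conditional probabilities:
$P(i \mid k) = \frac{P(i, k)}{P(k)}, \quad P(k) \neq 0$
$P(k \mid i) = \frac{P(i, k)}{P(i)}, \quad P(i) \neq 0$
● If i and k are independent: $P(i \mid k) = P(i)$ and $P(k \mid i) = P(k)$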
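The joint, marginal, and conditional probabilities can be checked against the route-choice counts. This is a small sketch using exact fractions; the function names are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Observed counts from the route-choice table: counts[i][k],
# i = 1 (tolled), 2 (free); k = 1 (low), 2 (medium), 3 (high income).
counts = {
    1: {1: 10, 2: 100, 3: 90},
    2: {1: 140, 2: 200, 3: 60},
}
total = sum(sum(row.values()) for row in counts.values())  # 600

def joint(i, k):
    """Sample estimate of P(i, k)."""
    return Fraction(counts[i][k], total)

def marginal_i(i):
    """P(i) = sum over k of P(i, k)."""
    return sum(joint(i, k) for k in counts[i])

def marginal_k(k):
    """P(k) = sum over i of P(i, k)."""
    return sum(joint(i, k) for i in counts)

def cond_i_given_k(i, k):
    """P(i | k) = P(i, k) / P(k)."""
    return joint(i, k) / marginal_k(k)

print(marginal_i(1))         # 1/3
print(joint(1, 2))           # 1/6
print(cond_i_given_k(1, 3))  # 3/5
```

Here P(i = 1 | k = 3) = 3/5 differs from P(i = 1) = 1/3, so route choice and income are not independent in this sample.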
Model : P(i|k)
● Behavioral model:
Probability (Route Choice|Income) = P(i|k)
● Unknown parameters
P(i = 1| k = 1) = π 1
P(i = 1| k = 2) = π 2
P(i = 1| k = 3) = π 3
Example: Model Estimation
● Estimation (sample frequencies):
π̂1 = 1/15, π̂2 = 1/3, π̂3 = 3/5
● Sampling distribution (standard error), e.g. for π̂3:
$s_3 = \sqrt{\frac{\hat{\pi}_3 (1 - \hat{\pi}_3)}{N_3}} = \sqrt{\frac{3/5 \cdot (1 - 3/5)}{150}} = 0.040$
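As a sketch of the estimation step, the frequencies and their binomial standard errors can be computed from the sample counts. The variable names are illustrative.

```python
import math

# Sample counts by income group k: (number choosing the toll road, group size N_k).
sample = {1: (10, 150), 2: (100, 300), 3: (90, 150)}

estimates = {}
for k, (tolled, n_k) in sample.items():
    pi_hat = tolled / n_k                               # sample frequency
    std_err = math.sqrt(pi_hat * (1.0 - pi_hat) / n_k)  # binomial standard error
    estimates[k] = (pi_hat, std_err)
    print(f"k={k}: pi_hat={pi_hat:.3f}, s={std_err:.3f}")
```

For k = 3 this reproduces the standard error of 0.040 given on the slide.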
Example: Forecasting
● Toll Road share under existing income distribution: 33%
● New income distribution
Route choice                   Low (k=1)      Medium (k=2)     High (k=3)       Total   Share
Tolled (i=1)                   1/15×45 = 3    1/3×300 = 100    3/5×255 = 153    256     43%
Free (i=2)                     42             200              102              344     57%
New income distribution        45             300              255              600
Existing income distribution   150            300              150              600
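The forecast applies the estimated conditional probabilities to a new income distribution. A minimal sketch, with `toll_share` as an illustrative helper name:

```python
from fractions import Fraction

# Estimated P(toll | income group k) from the sample.
pi_hat = {1: Fraction(1, 15), 2: Fraction(1, 3), 3: Fraction(3, 5)}

def toll_share(group_sizes):
    """Predicted toll-road share for a population given as {k: N_k}."""
    tolled = sum(pi_hat[k] * n for k, n in group_sizes.items())
    return tolled / sum(group_sizes.values())

existing = {1: 150, 2: 300, 3: 150}
new = {1: 45, 2: 300, 3: 255}

print(f"{float(toll_share(existing)):.1%}")  # 33.3%
print(f"{float(toll_share(new)):.1%}")       # 42.7%
```

The forecast assumes the conditional probabilities P(i|k) stay fixed while only the income distribution changes.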
The Random Utility Model
● Choice probability:
P(i|Cn) = P(Uin ≥ Ujn, ∀ j ∈ Cn)
= P(Uin - Ujn ≥ 0, ∀ j ∈ Cn)
= P(Uin = max_{j ∈ Cn} Ujn)
● For binary choice:
Pn(1) = P(U1n ≥ U2n)
= P(U1n – U2n ≥ 0)
The Random Utility Model
U1 = −β1 t1 − β2 c1 + ε1
U2 = −β1 t2 − β2 c2 + ε2
β1, β2 > 0
The Random Utility Model
● Ordinal utility
– Decisions are based on utility differences
– Unique up to order-preserving transformation
U1 = (−β1 t1 − β2 c1 + ε1 + K)λ
U2 = (−β1 t2 − β2 c2 + ε2 + K)λ
β1, β2, λ > 0
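A small numerical check of the ordinal-utility point: adding a constant K to both utilities and multiplying by λ > 0 leaves the chosen alternative unchanged. All parameter values, attribute levels, and error draws below are hypothetical.

```python
# Choice depends only on which utility is larger, so adding a constant K
# to both utilities and multiplying by lam > 0 cannot change it.
# All numbers below are hypothetical.

def utility(t, c, eps, beta1=0.1, beta2=0.5, K=0.0, lam=1.0):
    """U = (-beta1*t - beta2*c + eps + K) * lam."""
    return (-beta1 * t - beta2 * c + eps + K) * lam

def chooses_1(K=0.0, lam=1.0):
    """True if alternative 1 has the larger utility."""
    u1 = utility(t=30.0, c=2.0, eps=0.3, K=K, lam=lam)
    u2 = utility(t=20.0, c=4.5, eps=-0.1, K=K, lam=lam)
    return u1 >= u2

base = chooses_1()
transformed = [
    chooses_1(K=K, lam=lam)
    for K in (-5.0, 0.0, 12.0)
    for lam in (0.1, 1.0, 7.0)
]
print(base, all(choice == base for choice in transformed))  # True True
```

This is why the scale of utility has to be fixed by a normalization before the parameters can be identified.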
The Random Utility Model
[Figure: scatter of observed choices in the (t1 − t2, c1 − c2) plane, with • for Choice = 1 and + for Choice = 2. The line V1 = V2 separates the region V1 > V2 from the region V1 < V2; the regions where Alt. 1 or Alt. 2 is dominant are marked.]
U1 = −(β1/β2)·t1 − c1 + ε1
U2 = −(β1/β2)·t2 − c2 + ε2
β = β1/β2 = "value of time"
U1 − U2 = −(β1/β2)·(t1 − t2) − (c1 − c2) + (ε1 − ε2)
The Systematic Utility
– Socio-economic variables
• Examples: income, gender, education
Random Terms
Binary Choice
● Choice set Cn = {1,2} ∀n
Pn(1) = P(1|Cn) = P(U1n ≥ U2n)
= P(V1n + ε1n ≥ V2n + ε2n)
= P(V1n - V2n ≥ ε2n - ε1n)
= P(V1n - V2n ≥ εn) = P(Vn ≥ εn) = Fε(Vn)
where εn = ε2n − ε1n and Vn = V1n − V2n
Binary Probit
● “Probit” name comes from Probability Unit
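ε1n ~ N(0, σ1²)
ε2n ~ N(0, σ2²)
With f the density of εn = ε2n − ε1n, normal with mean 0 and standard deviation σ:
$f(\varepsilon) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{\varepsilon}{\sigma}\right)^2}$
$P_n(1) = F_\varepsilon(V_n) = \int_{-\infty}^{V_n} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{\varepsilon}{\sigma}\right)^2} \, d\varepsilon = \Phi\left(\frac{V_n}{\sigma}\right)$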
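The binary probit probability Φ(Vn/σ) has no closed form, but it can be evaluated with the error function. A minimal sketch using only the standard library; the utility values passed in are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(v1, v2, sigma=1.0):
    """P_n(1) = Phi((V1n - V2n) / sigma)."""
    return std_normal_cdf((v1 - v2) / sigma)

print(probit_prob(0.0, 0.0))            # 0.5: equal systematic utilities
print(round(probit_prob(1.0, 0.0), 4))  # 0.8413
```

Only the ratio Vn/σ enters the probability, which is why σ has to be normalized.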
Binary Probit Normalization
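● Relationship between utility scale µ* and scale parameter σ:
$\mathrm{Var}(\mu^* \varepsilon_n) = 1 \iff \mu^{*2} \, \mathrm{Var}(\varepsilon_n) = 1 \implies \mu^* = \frac{1}{\sqrt{\mathrm{Var}(\varepsilon_n)}} = \frac{1}{\sigma}$
● Usual normalization: σ = 1, implying µ* = 1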
Binary Logit Model
● “Logit” name comes from Logistic Probability Unit
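ε1n ~ ExtremeValue(0, µ): $F_\varepsilon(\varepsilon_{1n}) = \exp\left[-e^{-\mu \varepsilon_{1n}}\right]$
ε2n ~ ExtremeValue(0, µ): $F_\varepsilon(\varepsilon_{2n}) = \exp\left[-e^{-\mu \varepsilon_{2n}}\right]$
εn ~ Logistic(0, µ): $F_\varepsilon(\varepsilon_n) = \frac{1}{1 + e^{-\mu \varepsilon_n}}$
$P_n(1) = F_\varepsilon(V_n) = \frac{1}{1 + e^{-\mu V_n}}$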
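The binary logit probability is a closed-form function of the utility difference. A minimal sketch; the utility values are hypothetical.

```python
import math

def logit_prob(v1, v2, mu=1.0):
    """P_n(1) = 1 / (1 + exp(-mu * (V1n - V2n)))."""
    return 1.0 / (1.0 + math.exp(-mu * (v1 - v2)))

# Hypothetical systematic utilities: alternative 1 = bus, alternative 2 = auto.
p_bus = logit_prob(-1.6, -1.0)
print(f"P(bus) = {p_bus:.3f}")  # P(bus) = 0.354
```

Because the two alternatives exhaust the choice set, P(auto) is simply 1 − P(bus).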
Why Logit?
● Probit does not have a closed form – the choice probability is an
integral.
● The logistic distribution is used because:
– It approximates a normal distribution quite well.
– It is analytically convenient
– Gumbel can also be “justified” as an extreme value
distribution
Logit Model Normalization
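$\mathrm{Var}(\mu^* \varepsilon_n) = 1 \implies \mu^* = \frac{1}{\sqrt{\mathrm{Var}(\varepsilon_n)}} = \frac{\mu \sqrt{3}}{\pi}$
where $\mathrm{Var}(\varepsilon_n) = \mathrm{Var}(\varepsilon_{2n} - \varepsilon_{1n}) = \frac{2\pi^2}{6\mu^2}$
• Usual normalization: µ = 1, implying µ* = √3/π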
Limiting Cases
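● Recall: Pn(1) = P(Vn ≥ εn) = Fε(V1n – V2n)
● With logit, $F_\varepsilon(V_n) = \frac{1}{1 + e^{-\mu V_n}} = \frac{e^{\mu V_{1n}}}{e^{\mu V_{1n}} + e^{\mu V_{2n}}}$
● What happens as µ → 0?
[Figure: logit probability curves plotted against Vn = V1n – V2n for µ = 10 and µ = 0.1.]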
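The limiting behavior can be seen numerically by evaluating the binary logit probability at a fixed utility difference for several values of µ. The value of `v` below is hypothetical.

```python
import math

def logit_prob(v, mu):
    """Binary logit P_n(1) for V_n = V1n - V2n and scale mu."""
    return 1.0 / (1.0 + math.exp(-mu * v))

v = 0.5  # hypothetical utility difference in favor of alternative 1
for mu in (0.001, 0.1, 1.0, 10.0, 100.0):
    print(f"mu={mu:>7}: P(1)={logit_prob(v, mu):.4f}")
```

As µ approaches 0 the probability tends to 1/2 whatever the utility difference, since the random term dominates. As µ grows, the probability approaches the deterministic step: 1 when Vn > 0 and 0 when Vn < 0.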
Re-formulation
● Pn(i) = P(Uin ≥ Ujn)
$= \frac{1}{1 + e^{-\mu (V_{in} - V_{jn})}} = \frac{e^{\mu V_{in}}}{e^{\mu V_{in}} + e^{\mu V_{jn}}}$
● If Vin and Vjn are linear in their parameters:
$P_n(i) = \frac{e^{\mu \beta' x_{in}}}{e^{\mu \beta' x_{in}} + e^{\mu \beta' x_{jn}}}$
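The step between the logistic form and the ratio form is to multiply numerator and denominator by $e^{\mu V_{in}}$. Written out:

```latex
\begin{aligned}
P_n(i) &= \frac{1}{1 + e^{-\mu (V_{in} - V_{jn})}} \\
       &= \frac{e^{\mu V_{in}}}{e^{\mu V_{in}} + e^{\mu V_{in}} \, e^{-\mu V_{in} + \mu V_{jn}}} \\
       &= \frac{e^{\mu V_{in}}}{e^{\mu V_{in}} + e^{\mu V_{jn}}}
\end{aligned}
```

The ratio form is the one that extends directly to more than two alternatives.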
Multiple Choice
● Choice set Cn: Jn alternatives, Jn ≥ 2
P(i | Cn) = P[Vin + εin ≥ Vjn + εjn, ∀ j ∈ Cn]
= P[(Vin + εin) = max_{j ∈ Cn} (Vjn + εjn)]
= P[εjn − εin ≤ Vin − Vjn, ∀ j ∈ Cn]
Multiple Choice
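● Extreme value (Gumbel) density of each ε:
$f(\varepsilon) = \mu e^{-\mu \varepsilon} \exp\left[-e^{-\mu \varepsilon}\right]$
– Variance: π²/(6µ²)
● Choice probability:
$P(i \mid C_n) = \frac{e^{\mu V_{in}}}{\sum_{j \in C_n} e^{\mu V_{jn}}}$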
Multiple Choice – An Example
● Choice set Cn = {auto, bus, walk} ∀ n
$P(\text{auto} \mid C_n) = \frac{e^{\mu V_{\text{auto},n}}}{e^{\mu V_{\text{auto},n}} + e^{\mu V_{\text{bus},n}} + e^{\mu V_{\text{walk},n}}}$
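A minimal sketch of the multinomial logit probabilities for the three-alternative example. The systematic utilities are hypothetical, and subtracting the largest utility before exponentiating is a numerical-stability choice added here, not something from the slides.

```python
import math

def mnl_probs(utilities, mu=1.0):
    """Multinomial logit: P(i) = exp(mu*V_i) / sum_j exp(mu*V_j)."""
    # Subtracting the max utility leaves the ratios unchanged and avoids overflow.
    v_max = max(utilities.values())
    weights = {alt: math.exp(mu * (v - v_max)) for alt, v in utilities.items()}
    denom = sum(weights.values())
    return {alt: w / denom for alt, w in weights.items()}

# Hypothetical systematic utilities for one traveler.
v_n = {"auto": -1.0, "bus": -1.6, "walk": -2.5}
probs = mnl_probs(v_n)
for alt, p in probs.items():
    print(f"P({alt} | Cn) = {p:.3f}")
```

With only two alternatives in the dictionary, the same function reduces to the binary logit formula.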
Next Lecture
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.