Micro Lecture Notes

Lecture notes for Core Microeconomics

Uploaded by

Justus Meyer
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
ECON0065: Microeconomics

Lecture Notes

Konrad Mierendorff
These lecture notes are based on Ian Preston’s notes from previous
years who I thank for making them available to me. All errors in the
present version are, of course, entirely mine.

2019/2020, Version 1, University College London


Copyright © 2014–2019 Konrad Mierendorff

Introductory Remarks
The lecture notes follow closely the material taught in class. Due
to limited time, not all topics can be presented in depth. These notes
contain some additional material that fills some gaps. They also pro-
vide some additional more mathematical discussions and proofs that
are intended for those who have the necessary background. These additions
are typeset in a different font to indicate clearly that they are
supplemental material that is not examinable.
CHAPTER 1

Consumer Theory

1.1. Consumption Set and Budget Set


The consumption set $X$ is the set of all conceivable consumption
bundles $x$. Usually we set $X = \mathbb{R}^n_+$, where $n$ is the number of
goods. The elements of $X$ are vectors $x = (x_1, \dots, x_n)$, and $x_i$ is the
consumption level of good $i$. We use superscripts to denote different
consumption bundles $x^1, x^2, \dots$. The budget set $B \subset X$ is the set of
affordable bundles.
In the standard model of consumer choice, individuals can purchase
unlimited quantities at constant prices $p \in \mathbb{R}^n_{++}$, subject to a total
budget $y \ge 0$. Notice that we require all prices to be strictly positive,
i.e., $p \gg 0$. The budget set is the Walrasian, competitive or linear
budget set:
$$B(y, p) = \{x \in \mathbb{R}^n_+ \mid p^\top x \le y\}$$
We may sometimes write B instead of B(y, p) when it is clear from
the context which budget and prices are given, but if in doubt, it is
advisable to use the more precise notation B(y, p) that includes the
arguments.
Lemma 1. $B(y, p)$ is a convex and compact set with affine (upper)
boundary $p^\top x = y$.
Proof. If you are unsure why this is true, use the proof as an
exercise to review the definitions of convex sets and compact sets. □
The maximum affordable quantity of any good $i$ is $y/p_i$, and the slopes
$$\left.\frac{dx_i}{dx_j}\right|_{p^\top x = y,\; x_{-ij} = \text{const}} = -\frac{p_j}{p_i}, \qquad i, j = 1, \dots, n,$$
are constant and independent of the total budget $y$.
In practical applications budget constraints are frequently kinked
or discontinuous as a consequence for example of taxation or non-linear
pricing. In this course we will not consider such complications.
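The definition of the linear budget set can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the numbers ($y = 12$, $p = (2, 3)$) are illustrative assumptions. It tests membership in $B(y, p)$, computes the maximum affordable quantities $y/p_i$, and verifies that a convex combination of two affordable bundles is again affordable.

```python
from typing import List, Sequence


def cost(x: Sequence[float], p: Sequence[float]) -> float:
    """Expenditure p'x of bundle x at prices p."""
    return sum(pi * xi for pi, xi in zip(p, x))


def in_budget_set(x, y, p, tol=1e-12) -> bool:
    """True if x >= 0 and p'x <= y, i.e. x lies in B(y, p)."""
    return all(xi >= 0 for xi in x) and cost(x, p) <= y + tol


def max_affordable(y, p) -> List[float]:
    """Largest affordable quantity y / p_i of each good."""
    return [y / pi for pi in p]


def mix(x0, x1, alpha):
    """Convex combination alpha * x0 + (1 - alpha) * x1."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(x0, x1)]


# Illustrative numbers only (hypothetical budget and prices).
y, p = 12.0, [2.0, 3.0]
corners = max_affordable(y, p)   # intercepts of the budget line
x0 = [corners[0], 0.0]           # spend everything on good 1
x1 = [0.0, corners[1]]           # spend everything on good 2
midpoint = mix(x0, x1, 0.5)      # lies on the budget line as well
print(corners, in_budget_set(midpoint, y, p))
```

A finite check of this kind illustrates convexity for particular bundles; it does not replace the proof of Lemma 1.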
1.2. Marshallian demands, elasticities and types of goods
For given prices $p$ and budget $y$, a consumer chooses a bundle
$x = f(y, p) \in B$ known as Marshallian demand, uncompensated
demand, competitive demand, or market demand. In general,
the consumer may be prepared to choose more than one bundle and
$f(y, p)$ is a demand correspondence. In this introduction we will ignore
this possibility and will impose assumptions that guarantee that
a single bundle is chosen. Therefore we will always treat $f(y, p)$ as a
(vector-valued) function. We denote the individual demands for each
good $i$ by $f_i(y, p)$, so that $f(y, p) = (f_1(y, p), \dots, f_n(y, p))$.
It is useful to remember some terminology to classify the effects of
changes in $y$ and $p$ on demand for, say, the $i$th good:
Effects of changes in the total budget $y$.
• The path traced out by demands in $x$-space as $y$ increases
  is called the income expansion path whereas the graph of
  $f_i(y, p)$ as a function of $y$ is called the Engel curve. If Engel
  curves and income expansion paths are linear, then an individual
  consumes all goods in the same proportion, independent
  of their wealth. This is empirically not very plausible.
  The following notions are used to describe how the amount
  of consumption of each good and the share of each good in
  total expenditure change when consumers become richer or
  poorer.
• For differentiable demand functions we can summarise the dependence
  of $x_i$ on $y$ in the total budget elasticity
  $$\epsilon_i = \frac{y}{x_i}\frac{\partial x_i}{\partial y} = \frac{\partial \ln x_i}{\partial \ln y}.$$
  Notice that an elasticity captures the percentage change of
  some variable (here $x_i$) in response to a percentage change in
  some other variable (here $y$).
  Using elasticities instead of derivatives has the advantage that
  they are independent of the units in which both quantities are
  measured.
  If changes are discrete, the total budget elasticity is
  $$\frac{\Delta x_i / x_i}{\Delta y / y} = \frac{y}{x_i}\frac{\Delta x_i}{\Delta y},$$
  where $\Delta x_i = f_i(y + \Delta y, p) - f_i(y, p)$ and $\Delta y$ are the discrete
  changes. Taking the limit as $\Delta y \to 0$, we obtain the differential
  expression above.
• If demand for a good rises with the total budget ($\epsilon_i > 0$), then
  we say it is a normal good and if it falls ($\epsilon_i < 0$), we say it
  is an inferior good.
• If the budget share of a good, $w_i = p_i x_i / y$, rises with the
  total budget ($\epsilon_i > 1$), then we say it is a luxury or income
  elastic and if it falls ($\epsilon_i < 1$), we say it is a necessity or
  income inelastic.
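The relation between the discrete and the differential form of the total budget elasticity can be illustrated numerically. The sketch below is not from the original notes. The demand function `demand_i` is a purely hypothetical example, chosen only because its elasticity with respect to $y$ is easy to compute by hand (it equals 0.5, so the good would be normal and a necessity).

```python
import math


def demand_i(y: float, p_i: float) -> float:
    """Hypothetical demand for good i (an assumption for illustration)."""
    return math.sqrt(y) / p_i


def discrete_budget_elasticity(f, y: float, p_i: float, dy: float) -> float:
    """(dx_i / x_i) / (dy / y) for a discrete change dy in the budget."""
    dx = f(y + dy, p_i) - f(y, p_i)
    return (dx / f(y, p_i)) / (dy / y)


y, p_i = 100.0, 2.0
for dy in (10.0, 1.0, 0.01):
    # The printed values approach the differential elasticity as dy shrinks.
    print(dy, discrete_budget_elasticity(demand_i, y, p_i, dy))
```

Because the elasticity is a ratio of percentage changes, rescaling the units of $x_i$ (for instance multiplying `demand_i` by a constant) leaves the result unchanged.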
Effects of changes in the own price $p_i$.
• The path traced out by demands $f(y, p)$ in $x$-space as $p_i$ increases
  is called the offer curve whereas the graph of $f_i(y, p)$
  as a function of $p_i$ is called the demand curve.
• For differentiable demands we can summarise the dependence
  of $x_i$ on its own price in the (uncompensated) own price
  elasticity
  $$\eta_{ii} = \frac{p_i}{x_i}\frac{\partial x_i}{\partial p_i} = \frac{\partial \ln x_i}{\partial \ln p_i}.$$
• If uncompensated demand for a good rises with its own price,
  $\eta_{ii} > 0$, then we say it is a Giffen good.
• If the budget share of a good rises with its own price ($\eta_{ii} > -1$),
  then we say it is price inelastic and if it falls ($\eta_{ii} < -1$), we
  say it is price elastic.
Effects of changes in another price $p_j$, $j \ne i$.
• For differentiable demands we can summarise the dependence of $x_i$
  on other prices in the (uncompensated) cross price elasticity
  $$\eta_{ij} = \frac{p_j}{x_i}\frac{\partial x_i}{\partial p_j} = \frac{\partial \ln x_i}{\partial \ln p_j}.$$
• If uncompensated demand for a good rises with the price of
  another, $\eta_{ij} > 0$, then we can say it is an (uncompensated)
  substitute whereas if it falls with the price of another,
  $\eta_{ij} < 0$, then we can say it is an (uncompensated)
  complement. These definitions of complementarity and
  substitutability, however, are not ideal since they may not be
  symmetric, i.e., $x_i$ could be a substitute for $x_j$ while $x_j$ is a
  complement for $x_i$. (An example can be found in the exercises
  below.) A better definition, guaranteed to be symmetric, is
  one based on the concept of compensated demand to be
  introduced below.
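The terminology of this section can be collected in a small lookup routine. This is only a sketch for revision purposes, not part of the original notes. The function name `classify_good` is made up, and the thresholds are the ones used in the text: $\epsilon_i \gtrless 0$ and $\epsilon_i \gtrless 1$ for the total budget elasticity, $\eta_{ii} > 0$ for Giffen goods, and $\eta_{ii} \gtrless -1$ for price (in)elasticity.

```python
from typing import List


def classify_good(eps_i: float, eta_ii: float) -> List[str]:
    """Labels for good i given its budget and own price elasticities."""
    labels = []
    # Demand rises or falls with the total budget.
    if eps_i > 0:
        labels.append("normal")
    elif eps_i < 0:
        labels.append("inferior")
    # Budget share rises or falls with the total budget.
    if eps_i > 1:
        labels.append("luxury")
    elif eps_i < 1:
        labels.append("necessity")
    # Demand rises with the own price.
    if eta_ii > 0:
        labels.append("Giffen")
    # Budget share rises or falls with the own price.
    if eta_ii > -1:
        labels.append("price inelastic")
    elif eta_ii < -1:
        labels.append("price elastic")
    return labels


print(classify_good(1.5, -2.0))
print(classify_good(-0.3, 0.2))
```

Boundary cases such as $\epsilon_i = 1$ or $\eta_{ii} = -1$ receive no label here, since the text only defines the strict cases.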

1.2.1. Additional Exercises / Review Questions.


(1) Draw income expansion paths and Engel curves to illustrate
normal and inferior goods, luxuries, and necessities. Give
examples for each case.
(2) Show that $dw_i/dy > 0$ is equivalent to $\epsilon_i > 1$ as claimed in the
text.
(3) Show that a good is price elastic if and only if $\eta_{ii} < -1$ as
claimed in the text.
(4) Consider a consumer who consumes two goods ($n = 2$) and has
the utility function $u(x_1, x_2) = \ln x_1 + x_2$, where $\ln$ stands
for the natural logarithm. Derive Marshallian demand. Show
that the notions of (uncompensated) complementarity or
substitutability are not symmetric. (This problem requires deriving
Marshallian demand from utility maximization, see Section
1.6.5.)

1.3. Preferences
We will now develop the standard theory of demand, based on the
idea that demand is revealed preference. In other words, if we
observe that a consumer chooses the bundle $f(y, p)$, we assume that
she prefers $f(y, p)$ over all other bundles $x \in B(y, p)$. If we also
assume that the preferences of consumers are stable across different choice
problems, we can try to recover the preferences from observed choices
and then use the knowledge about preferences to predict behaviour.
For example, this will be useful if we want to predict the effect of
changes in indirect taxes (which affect the prices the consumer faces).
The interpretation of demand as revealed preference also allows us to
make welfare statements. Based on information about preferences that
we can recover from choice data, we can for example determine whether
a proposed tax reform will make consumers better off or worse off.
Formally we define preferences as a binary relation $\succsim$ over
consumption bundles $x \in X$.
Definition 1. A binary relation $R$ on $X$ is given by a subset of
the ordered pairs $(x^1, x^2) \in X \times X$:
$$R \subset X \times X.$$
If $(x^1, x^2) \in R$ we write $x^1 R x^2$, and if $(x^1, x^2) \notin R$ we write $x^1 \not R x^2$.
Before we look at preference relations, let us consider a few simpler
examples.
(1) The linear order on $\mathbb{R}$ given by “$\ge$”: We have a binary relation
$R_{\ge} \subset \mathbb{R} \times \mathbb{R}$ given by
$$(x^1, x^2) \in R_{\ge} \text{ if and only if } x^1 \ge x^2.$$
In this case we can write $x^1 \ge x^2$ or $x^1 R_{\ge} x^2$ interchangeably.
Note that this binary relation has several nice properties: It
is complete, which means that for every $x^1, x^2 \in \mathbb{R}$ we have
$x^1 \ge x^2$, or $x^2 \ge x^1$, or both. It is also transitive, which
means that for all $x^1, x^2, x^3 \in \mathbb{R}$, if $x^1 \ge x^2$ and $x^2 \ge x^3$, then
we have $x^1 \ge x^3$.
(2) Not every binary relation is complete. One example is the
relation on $\mathbb{R}$ given by “$>$”. Here we have
$$(x^1, x^2) \in R_{>} \text{ if and only if } x^1 > x^2.$$
Clearly this relation is incomplete because for $x^1 = x^2$, we
have neither $x^1 > x^2$ nor $x^2 > x^1$.
We note that $R_{>}$ has a property that is not satisfied by $R_{\ge}$: it
is asymmetric, which means that for all $x^1, x^2 \in \mathbb{R}$, $x^1 > x^2$
implies $x^2 \not> x^1$.
(3) If we want to formalize preferences, we need to consider binary
relations on $\mathbb{R}^n$. Examples of binary relations on $\mathbb{R}^n$ are the
component-wise orders “$\ge$” or “$\gg$”. We have
$$x^1 \ge x^2 \iff \forall i : x^1_i \ge x^2_i,$$
and
$$x^1 \gg x^2 \iff \forall i : x^1_i > x^2_i.$$
It is a good exercise to verify that “$\ge$” is transitive but neither
complete nor asymmetric, and “$\gg$” is transitive and asymmetric
but not complete.
Before we continue, let us summarize the three properties of binary
relations discussed so far:
Definition 2. A binary relation $R$ on a set $X$ is called
• complete if for every $x^1, x^2 \in X$ we have $x^1 R x^2$, or $x^2 R x^1$,
  or both,
• transitive if for every $x^1, x^2, x^3 \in X$, $x^1 R x^2$ and $x^2 R x^3$
  implies $x^1 R x^3$,
• asymmetric if for all $x^1, x^2 \in X$, $x^1 R x^2$ implies $x^2 \not R x^1$.
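On a finite set, the three properties of Definition 2 can be checked by brute force. The following sketch is not part of the original notes. It represents a binary relation as a Python set of ordered pairs, exactly as in Definition 1, and applies the checks to “$\ge$” and “$>$” on a small set of integers and to the component-wise orders on a small grid in $\mathbb{R}^2$. The function names and the choice of test sets are illustrative assumptions.

```python
from itertools import product


def is_complete(R, X) -> bool:
    """For every pair a, b in X: a R b, or b R a, or both."""
    return all((a, b) in R or (b, a) in R for a, b in product(X, repeat=2))


def is_transitive(R, X) -> bool:
    """a R b and b R c implies a R c."""
    return all((a, c) in R
               for a, b, c in product(X, repeat=3)
               if (a, b) in R and (b, c) in R)


def is_asymmetric(R) -> bool:
    """a R b implies not b R a."""
    return all((b, a) not in R for (a, b) in R)


# "greater or equal" and "greater" on a finite subset of the real line.
X = range(4)
R_geq = {(a, b) for a, b in product(X, repeat=2) if a >= b}
R_gt = {(a, b) for a, b in product(X, repeat=2) if a > b}

# Component-wise orders on a small grid in R^2.
G = list(product(range(3), repeat=2))
R_cw_geq = {(a, b) for a, b in product(G, repeat=2)
            if all(ai >= bi for ai, bi in zip(a, b))}
R_cw_gg = {(a, b) for a, b in product(G, repeat=2)
           if all(ai > bi for ai, bi in zip(a, b))}

for name, R, S in [(">=", R_geq, X), (">", R_gt, X),
                   ("cw >=", R_cw_geq, G), ("cw >>", R_cw_gg, G)]:
    print(name, is_complete(R, S), is_transitive(R, S), is_asymmetric(R))
```

A check on a finite subset can only reveal violations or illustrate a property; the general statements for $\mathbb{R}$ and $\mathbb{R}^n$ still require a proof.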
Now let us turn to preferences. We will consider a binary relation
$\succsim$ on $X$ which we interpret as weak preference. For $x^1, x^2 \in X$,
$x^1 \succsim x^2$ means that bundle $x^1$ is at least as good as bundle $x^2$. The
preferences of a consumer are thus represented by a binary relation.
How should we think about that? The binary relation $\succsim$ is a list of all
ordered pairs of consumption bundles $(x^1, x^2)$ such that $x^1$ is weakly
preferred to $x^2$. If some pair $(x^1, x^2)$ is not in the list, then $x^1$ is
not weakly preferred to $x^2$. The list is an exhaustive description of
the consumer’s preferences. If we assume that the binary relation $\succsim$
is complete, then $(x^2, x^1)$ must be in the list if $(x^1, x^2)$ is not in the
list. This means that, for each pair of bundles, the consumer
weakly prefers one over the other. Similarly, if the binary relation $\succsim$
is transitive, then if $(x^1, x^2)$ and $(x^2, x^3)$ are in the list, then $(x^1, x^3)$
must also be in the list. We will call a consumer rational if she has
weak preferences over all pairs of bundles and if her preferences are
transitive.
Definition 3. A binary relation $\succsim$ on $X$ is called a (weak) preference
relation if it is complete and transitive.
For a given weak preference relation $\succsim$ we define the strict preference
relation $\succ$ by
$$x^1 \succ x^2 \iff x^1 \succsim x^2 \text{ and } x^2 \not\succsim x^1,$$
and the indifference relation $\sim$ by
$$x^1 \sim x^2 \iff x^1 \succsim x^2 \text{ and } x^2 \succsim x^1.$$
We could have started by considering the strict preference relation
$\succ$ and derived $\succsim$ and $\sim$ from $\succ$. In this case we would have to make
assumptions on $\succ$ that guarantee that $\succsim$ is complete and transitive.¹
An intuitive property of the strict preference relation is asymmetry:
If a bundle $x^1$ is strictly preferred over $x^2$, then $x^2$ cannot be strictly
preferred over $x^1$. We show this formally:
Lemma 2. $\succ$ is asymmetric.²
Proof. Suppose $x^1 \succ x^2$ for some $x^1, x^2 \in X$. Then we have from
the definition of $\succ$ that $x^1 \succsim x^2$ and $x^2 \not\succsim x^1$. This implies that we
cannot have $x^2 \succ x^1$ because that would require $x^1 \not\succsim x^2$ and $x^2 \succsim x^1$.
Hence we have $x^2 \not\succ x^1$. Since this argument holds for all $x^1, x^2 \in X$
with $x^1 \succ x^2$, the binary relation $\succ$ is asymmetric. □
A bit more difficult to see is that the strict preference relation is
transitive if the weak preference relation is transitive:
Lemma 3. If $\succsim$ is transitive, then $\succ$ is also transitive.
Proof. We have to show that for any $x^1, x^2, x^3$ such that
$x^1 \succ x^2 \succ x^3$, we also have $x^1 \succ x^3$. Suppose by contradiction that
$x^1 \succ x^2 \succ x^3$ but $x^1 \not\succ x^3$. This implies from the definition of
the strict preference relation that (a) $x^1 \not\succsim x^3$ or (b) $x^3 \succsim x^1$. Case
(a) is impossible because $x^1 \succ x^2 \succ x^3$ implies $x^1 \succsim x^2 \succsim x^3$ and
transitivity of $\succsim$ implies $x^1 \succsim x^3$. Hence we must have $x^3 \succsim x^1$ (case
(b)), if $x^1 \succ x^2 \succ x^3$ and $x^1 \not\succ x^3$. But $x^3 \succsim x^1$ together with
$x^1 \succsim x^2 \succsim x^3$ implies $x^3 \succsim x^1 \succsim x^2 \succsim x^3$ which implies $x^2 \sim x^3$. This
is a contradiction because we assumed $x^2 \succ x^3$. Hence case (b) is also
impossible and we cannot have $x^1 \not\succ x^3$ if $x^1 \succ x^2 \succ x^3$. Hence we
must have $x^1 \succ x^3$. Since $x^1, x^2, x^3$ are arbitrary, this shows that $\succ$
is transitive. □
Transitivity of strict preferences implies that we cannot have a strict
cycle in the preferences of an individual, i.e. we cannot have
$$x^3 \succ x^1 \succ x^2 \succ x^3.$$
This is an intuitive aspect of rationality and it is reassuring that the
assumptions that $\succsim$ is complete and transitive (which we called
“rational”) rule out such strict preference cycles.
Finally we note the (intuitive) property that a bundle is not strictly
preferred to itself: $x \not\succ x$. A binary relation with this property is called
irreflexive.
Lemma 4. $\succ$ is irreflexive.
Proof. The (simple) proof is left as an exercise. □
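The definitions of the strict preference and indifference relations, and the statements of Lemmas 2 to 4, can be illustrated on a small example. The sketch below is not part of the original notes. The four alternatives and their ranking (`a` best, `b` and `c` tied, `d` worst) are hypothetical, and the weak preference relation is stored as a set of ordered pairs.

```python
from itertools import product

# Hypothetical example: a is best, b and c are tied, d is worst.
X = ["a", "b", "c", "d"]
level = {"a": 3, "b": 2, "c": 2, "d": 1}
weak = {(s, t) for s, t in product(X, repeat=2) if level[s] >= level[t]}

# Derived relations, following the definitions in the text.
strict = {(s, t) for (s, t) in weak if (t, s) not in weak}
indiff = {(s, t) for (s, t) in weak if (t, s) in weak}

# Properties claimed in Lemmas 2, 3 and 4.
asymmetric = all((t, s) not in strict for (s, t) in strict)
transitive = all((s, u) in strict
                 for (s, t) in strict
                 for (t2, u) in strict if t == t2)
irreflexive = all((s, s) not in strict for s in X)

print(sorted(strict))
print(asymmetric, transitive, irreflexive)
```

The example confirms the lemmas for one particular relation only. The proofs in the text cover every complete and transitive $\succsim$.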
¹ For a development of the theory of preferences that starts with $\succ$, see the
textbook by Kreps (1988).
² Note that in the proof we do not use that $\succsim$ is complete and transitive. It
follows directly from the definition of $\succ$.

It will be useful to think about preferences in terms of the indifference
set of any given bundle $x^0$. This is the set of all bundles
$x^1$ for which the decision maker is indifferent between $x^0$ and $x^1$. In
similar fashion we define a weakly preferred set and a weakly
dispreferred set.
Definition 4. Let $\succsim$ be a preference relation. Then for any bundle
$x^0 \in X$, the weakly preferred set, upper contour set, or
no-worse-than set is defined as
$$\succsim(x^0) := \{x^1 \in X \mid x^1 \succsim x^0\},$$
the weakly dispreferred set, lower contour set, or no-better-than
set is defined as
$$\precsim(x^0) := \{x^1 \in X \mid x^0 \succsim x^1\},$$
and the indifference set is defined as
$$\sim(x^0) := \succsim(x^0) \cap \precsim(x^0).$$
Note that we have defined the indifference set with reference to a
particular bundle $x^0$, but if we have two bundles $x^0 \sim x^1$ we will just
say that they are in the same indifference set because $\sim(x^0) = \sim(x^1)$.
Indifference sets are ordered by the preference relation. Instead
of specifying preferences for each pair of bundles $x^0, x^1 \in X$, we may
just as well describe all indifference sets and how they are ordered.
This contains exactly the same information as the original preference
relation. When we draw indifference curves and note the direction of
preference, this is exactly what we are doing.

1.4. Utility Representations


Working with preference relations can be cumbersome. Therefore
we define utility functions $u : X \to \mathbb{R}$ that represent preferences.
Definition 5. A function $u : X \to \mathbb{R}$ is a utility function
representing the preference relation $\succsim$ if for all $x^1, x^2 \in X$
$$u(x^1) \ge u(x^2) \iff x^1 \succsim x^2.$$
How do we find a utility function that represents a given preference
relation? We have to assign a unique real number to each indifference
set. If $x^1 \sim x^2$ then clearly $x^1$ and $x^2$ must have the same utility level.
Conversely, if $x^1 \not\sim x^2$, we must have $u(x^1) \ne u(x^2)$. Furthermore, the
utility levels must represent the same ordering of indifference sets as
the preference relation.
It is easy to find a utility representation if we have preferences
defined over a set $X$ with only a finite number of elements. Suppose
we have a finite number of bundles ordered by a preference ordering as
follows:
$$x^0 \succsim x^1 \succsim \dots \succsim x^k \succsim \dots \succsim x^{k+\ell} \succsim \dots \succsim x^K.$$
Then we can choose numbers (these will be the utility levels)
$$v^0 \ge v^1 \ge \dots \ge v^K.$$
If two consecutive bundles $x^i, x^{i+1}$ satisfy $x^i \succ x^{i+1}$, they must get
different utility levels because clearly $x^i \not\sim x^{i+1}$. Hence we have to
choose $v^i > v^{i+1}$ in this case. Conversely if there are bundles $k, \dots, k + \ell$
such that $x^k \sim \dots \sim x^{k+\ell}$, these bundles are all in the same indifference
set, and therefore we must set $v^k = \dots = v^{k+\ell}$. Using these numbers,
we can define a utility function that represents $\succsim$: $u(x^k) = v^k$. It is easy
to check that this function indeed represents the preference relation $\succsim$.
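For a finite set of bundles, a utility representation can also be computed mechanically. The sketch below is not part of the original notes and uses a counting construction that differs slightly from the chain of numbers $v^0 \ge \dots \ge v^K$ described above: it sets $u(x)$ equal to the number of bundles that $x$ is weakly preferred to. For a complete and transitive relation this yields equal numbers within an indifference set and strictly larger numbers for strictly preferred bundles. The bundle names and the ranking in `tiers` are hypothetical.

```python
def utility_from_preferences(X, weakly_prefers):
    """u(x) = number of bundles z in X such that x is weakly preferred to z."""
    return {x: sum(1 for z in X if weakly_prefers(x, z)) for x in X}


# Hypothetical finite example: tiers listed from best to worst,
# bundles in the same tier are indifferent.
tiers = [["x0"], ["x1", "x2"], ["x3"]]
rank = {b: k for k, tier in enumerate(tiers) for b in tier}
X = list(rank)


def weakly_prefers(a, b) -> bool:
    """a is at least as good as b if a's tier is not below b's tier."""
    return rank[a] <= rank[b]


u = utility_from_preferences(X, weakly_prefers)

# Check Definition 5: u(a) >= u(b) if and only if a is weakly preferred to b.
represents = all((u[a] >= u[b]) == weakly_prefers(a, b) for a in X for b in X)
print(u, represents)
```

Any strictly increasing relabelling of these numbers would represent the same preferences, which is the sense in which the representation is ordinal.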
Is it always possible to find a utility function that represents a
given preference relation $\succsim$? (In the following discussion, I will appeal
to your intuition and do not present rigorous arguments because they
would be beyond the mathematical level of this course.)³ We saw that
for a set $X$ with a finite number of possible bundles the answer is
yes. Our model of consumer behaviour, however, is formulated for
$X = \mathbb{R}^n_+$ which is infinite.⁴ This is convenient because we can work
with derivatives etc., but it has the unfortunate side-effect that there
exist preference relations for which it is impossible to define a utility
function. Here is an example of such a preference relation.
Example 1 (Lexicographic Preferences). Suppose that $X = \mathbb{R}^2_+$
and $\succsim$ is given by:
$$x^1 \succsim x^2 \iff x^1_1 > x^2_1 \ \text{ or } \ \big(x^1_1 = x^2_1 \text{ and } x^1_2 \ge x^2_2\big).$$
This is a well defined preference relation, that is, it is complete and
transitive. (Carefully check that these properties are fulfilled!)
However, it is impossible to assign utility levels to all bundles in $X = \mathbb{R}^2_+$ in
a way that the utilities represent the preferences. What is the problem?
Notice that there are no two bundles $x^1 \ne x^2$ such that $x^1 \sim x^2$. If $x^1$
and $x^2$ differ in at least one component, the consumer is not indifferent.
Each bundle has its own separate indifference set. But this implies
that we have to assign a different utility level for each bundle $x \in \mathbb{R}^2_+$.
This is impossible!

Here is why (for the mathematically savvy): To define the utility function
just for bundles where the first component is fixed at some value $x_1$
requires an interval of utility levels $[u(x_1, 0), u(x_1, \infty))$, where $u(x_1, \infty) =
\lim_{x_2 \to \infty} u(x_1, x_2)$ and we have $u(x_1, \infty) > u(x_1, 0)$. All other bundles
must be assigned utility levels either strictly below $u(x_1, 0)$ or (weakly)
above $u(x_1, \infty)$. Therefore, for every $x_1 \in \mathbb{R}_+$ the utility function must
“occupy” an interval of positive length and all intervals must be disjoint.
This is impossible because there can only be a countable number of disjoint
intervals that have positive length in the real numbers.

³ If you want to know more, a good starting point is again the book by Kreps
(1988).
⁴ More precisely, the problems described in the following arise because the real
numbers are uncountably infinite. (You can safely ignore this remark if you do not
know what that means.)
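The lexicographic preference relation of Example 1 is easy to write down as a comparison rule. The sketch below is not part of the original notes. It checks completeness and transitivity on a small, arbitrarily chosen grid of bundles, and confirms that on this grid no two distinct bundles are indifferent. Such a finite check only illustrates the claims, since the impossibility of a utility representation concerns the full set $\mathbb{R}^2_+$.

```python
from itertools import product


def lex_weakly_prefers(x, z) -> bool:
    """Good 1 decides; good 2 only breaks ties in good 1."""
    return x[0] > z[0] or (x[0] == z[0] and x[1] >= z[1])


def indifferent(x, z) -> bool:
    return lex_weakly_prefers(x, z) and lex_weakly_prefers(z, x)


# Arbitrary finite grid of bundles in R^2_+ (illustrative assumption).
grid = list(product([0.0, 0.5, 1.0, 2.0], repeat=2))

complete = all(lex_weakly_prefers(a, b) or lex_weakly_prefers(b, a)
               for a, b in product(grid, repeat=2))
transitive = all(lex_weakly_prefers(a, c)
                 for a, b, c in product(grid, repeat=3)
                 if lex_weakly_prefers(a, b) and lex_weakly_prefers(b, c))
# Every indifference set on the grid contains exactly one bundle.
singletons = all(a == b for a, b in product(grid, repeat=2) if indifferent(a, b))

print(complete, transitive, singletons)
```

For pairs of numbers, this rule coincides with Python's built-in tuple comparison `x >= z`, which also compares component by component from the left.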

To avoid preferences that have “too many” indifference sets, we
assume that preferences are continuous. We give three equivalent
definitions of continuity that we use interchangeably.
Definition 6. A preference relation $\succsim$ is continuous if any one
of the following properties is satisfied (they are all equivalent):
(1) For all $x^0 \in X$, both $\succsim(x^0)$ and $\precsim(x^0)$ are closed sets.
(2) For any sequences of bundles $x^i$ and $r^i$ such that $x^i \succsim r^i$ for
all $i$ that have limits $x = \lim x^i$ and $r = \lim r^i$, we have $x \succsim r$.
(3) For any bundles $x^0$, $x^1$ and $x^2$ such that $x^0 \succ x^1 \succ x^2$ there
exists $\alpha \in (0, 1)$ such that
$$\alpha x^0 + (1 - \alpha) x^2 \sim x^1.$$
Note that if $\succsim(x^0)$ and $\precsim(x^0)$ are closed, then they contain their
boundaries and therefore the intersection of $\succsim(x^0)$ and $\precsim(x^0)$, which
is $\sim(x^0)$, also contains their boundaries. This implies that $\sim(x^0)$
is “large.” For example for $n = 2$, the boundary of $\succsim(x^0)$ must be a
line. It turns out that continuity of a preference relation $\succsim$ guarantees
that indifference sets are “large enough” so that there exists a utility
function representing $\succsim$. Moreover, the utility function can be chosen
to be continuous. This is a theorem proven by Debreu (1959).
Theorem 1. If $\succsim$ is a continuous preference relation, then there
exists a continuous function $u : X \to \mathbb{R}$ such that $u(x^0) \ge u(x^1)$ if and
only if $x^0 \succsim x^1$.
The proof of this theorem is quite difficult. In the textbook you
find an easier proof under the additional assumption that preferences
are strictly monotonic.
Definition 7. A preference relation $\succsim$ is strictly monotonic if for
all $x^1, x^2 \in X$: $x^1 \ge x^2$ implies $x^1 \succsim x^2$, and $x^1 \gg x^2$ implies
$x^1 \succ x^2$.
If $\succsim$ is a continuous and strictly monotonic preference relation, then
the indifference curves take a simple form. Consider for example the case of
two goods ($n = 2$). In a diagram with $x_1$ on the horizontal axis and
$x_2$ on the vertical axis, they are weakly downward sloping lines. Moreover
each indifference curve crosses the 45-degree line and because of
monotonicity, more preferred consumption bundles are on indifference
curves that are further away from the origin. (It is a good exercise to
think through why all these properties must hold if $\succsim$ is continuous and
strictly monotonic.) With all these properties, constructing a utility
function becomes quite simple. Consider the bundles on a fixed
indifference curve. We define the utility of all these bundles as the distance
between $(0, 0)$ and the point where the indifference curve crosses the
45-degree line. Doing so for all indifference curves yields a utility
function that represents $\succsim$ and one can show that it is also continuous. To
summarize we have the following Theorem:
Theorem 2. If $\succsim$ is a continuous and strictly monotonic preference
relation, then there exists a continuous and strictly monotonic function
$u : X \to \mathbb{R}$ that represents $\succsim$.⁵
The formal proof is beyond the scope of this lecture but the arguments
outlined for the case $n = 2$ capture its main steps.
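The construction along the 45-degree line can be sketched in code for $n = 2$. The example below is not part of the original notes. It assumes that preferences are available only through a comparison oracle `weakly_prefers`, which is hypothetical here and generated from an arbitrary continuous, strictly monotonic scoring rule. The routine finds the point $(t, t)$ on the 45-degree line that is indifferent to a given bundle by bisection and returns its distance $\sqrt{2}\,t$ from the origin as the utility level.

```python
import math


def weakly_prefers(a, b) -> bool:
    """Hypothetical preferences, assumed continuous and strictly monotonic."""
    score = lambda x: math.sqrt(x[0]) + 2.0 * math.sqrt(x[1])
    return score(a) >= score(b)


def utility_45(x, prefers=weakly_prefers, iters=60) -> float:
    """Distance from the origin to the point (t, t) that is indifferent to x."""
    # By monotonicity (max, max) is weakly preferred to x,
    # and x is weakly preferred to (min, min).
    lo, hi = min(x), max(x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if prefers((mid, mid), x):
            hi = mid   # (mid, mid) is at least as good as x: move down
        else:
            lo = mid   # x is strictly better than (mid, mid): move up
    return math.sqrt(2.0) * hi


for bundle in [(4.0, 1.0), (1.0, 4.0), (3.0, 3.0)]:
    print(bundle, utility_45(bundle))
```

The bisection uses only preference comparisons, which mirrors the idea of the construction: the utility number is read off from the indifference curve, not from any pre-existing utility function. Because bisection is approximate, two bundles on the same indifference curve may receive values that differ by rounding error.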
Note that the utility function is not unique: if $u(\cdot)$ represents
preferences then so does any function $\phi(u(\cdot))$ where $\phi(\cdot)$ is strictly increasing. All
that matters for describing choice is the ordering over bundles induced
by the utility function and it is therefore said to be an ordinal utility
representation.

1.5. Demand
Now we want to connect preferences and Marshallian demand. We
start by deriving a Marshallian demand function $f^{\succsim}(y, p)$ from a given
continuous preference relation $\succsim$ which is represented by a continuous
utility function $u$. Since we assume that the consumer chooses an
optimal bundle $x \in B(y, p)$, $x = f^{\succsim}(y, p)$ must satisfy
$$x \succsim x' \text{ for all } x' \in B(y, p). \tag{1.5.1}$$
We can formulate this in terms of the utility function:
$$x \in \arg\max_{x' \in B(y, p)} u(x').$$
This is the utility maximization problem. Remember that we
assumed that $u$ is continuous and moreover, Lemma 1 shows that $B(y, p)$
is compact. If we maximize a continuous function over a compact set,
Weierstrass’ Theorem implies that there exists an optimal element in
$B(y, p)$. For the utility maximization problem this implies the
existence of an optimal consumption bundle in the budget set.
The optimal choice $x$ may not be unique for some $(y, p)$ because the
consumer may be indifferent between several bundles $x^1 \sim x^2 \sim x^3 \sim
\dots$ and at the same time, all of these bundles $x^k$ can be optimal. In this
case we would have to define $f^{\succsim}(y, p)$ as a demand correspondence
as follows:
$$x \in f^{\succsim}(y, p) \text{ if } x \succsim x' \text{ for all } x' \in B(y, p).$$

⁵ A function $f(x)$ is called strictly monotonic if $f(x) \ge f(y)$ for $x \ge y$ and
$f(x) > f(y)$ for $x \gg y$.
In the following we will assume that the preferences are such that
for each budget set $B(y, p)$, there is a unique optimal bundle. Therefore
we will always treat $f^{\succsim}(y, p)$ as a function. To guarantee uniqueness,
we assume that preferences are strictly convex:
Definition 8. The preference relation $\succsim$ is
• convex if for all $x^1 \succsim x^0$, $\alpha x^1 + (1 - \alpha) x^0 \succsim x^0$ for all
  $\alpha \in [0, 1]$.
• It is strictly convex if for all $x^1 \succsim x^0$ such that $x^1 \ne x^0$,
  $\alpha x^1 + (1 - \alpha) x^0 \succ x^0$ for all $\alpha \in (0, 1)$.
We can state this property equivalently in terms of utility functions.
Lemma 5. Let $\succsim$ be a preference relation with utility representation
$u$. Then $\succsim$ is (strictly) convex if and only if $u$ is (strictly)
quasi-concave.
Proof. The proof is left for you as an exercise. □
Note that the budget set is convex. If we maximize a strictly
quasi-concave function on a convex set, the maximizer is unique. To
summarize we have
Theorem 3. Let $\succsim$ be a strictly monotonic, continuous and strictly
convex preference relation represented by a strictly monotonic,
continuous and strictly quasi-concave utility function $u$. Then, for each
budget $B(y, p)$, $y \ge 0$, $p \gg 0$, there exists a unique optimal bundle
$x$ that satisfies (1.5.1) and also is the unique optimal solution to the
utility maximization problem. The unique optimal bundle defines the
Marshallian demand $f^{\succsim}(y, p)$ that corresponds to $\succsim$.
Note that if $\succsim$ is strictly convex, $x = f^{\succsim}(y, p)$ must satisfy (1.5.1) with $\succsim$
replaced by $\succ$ (where we exclude $x' = x$):
$$x \succ x' \text{ for all } x' \in B(y, p) \text{ such that } x' \ne x. \tag{1.5.2}$$
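The utility maximization problem can be solved numerically in simple cases. The following sketch is not part of the original notes and rests on several assumptions: there are two goods, the utility function `u` is a hypothetical example that is strictly monotonic and strictly quasi-concave, and the search is restricted to the budget line $p^\top x = y$ (which is where the optimum lies for monotonic preferences). Under strict quasi-concavity the utility along the budget line has a single peak, so a ternary search over the budget share of good 1 finds the unique maximizer.

```python
import math


def u(x) -> float:
    """Hypothetical utility, strictly monotonic and strictly quasi-concave."""
    return math.sqrt(x[0]) + math.sqrt(x[1])


def marshallian_demand(u, y, p, iters=200):
    """Maximize u on the budget line p1*x1 + p2*x2 = y (two goods only)."""
    p1, p2 = p
    # s is the share of the budget spent on good 1.
    bundle = lambda s: (s * y / p1, (1.0 - s) * y / p2)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if u(bundle(a)) < u(bundle(b)):
            lo = a   # the peak cannot lie to the left of a
        else:
            hi = b   # the peak cannot lie to the right of b
    return bundle(0.5 * (lo + hi))


x_star = marshallian_demand(u, 12.0, (1.0, 2.0))
print(x_star)
```

For this particular utility function the first-order conditions can be solved by hand, which gives a useful cross-check of the numerical result. With a utility function that is not strictly quasi-concave the search could stop at any of several optimal bundles, which corresponds to the demand correspondence discussed above.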

1.6. Properties of demands


1.6.1. Budget-Balancedness. We know that demands must lie
within the budget set: $p^\top f^{\succsim}(y, p) \le y$. If the consumer’s expenditure
exhausts the total budget then this holds as an equality, $p^\top f^{\succsim}(y, p) =
y$, which is known as adding up, Walras’ law or budget balancedness.
Clearly this must hold if preferences are strictly monotonic.
Lemma 6. Let $\succsim$ be a continuous and strictly monotonic preference
relation. Then $f^{\succsim}(y, p)$ satisfies budget balancedness.
Proof. Suppose by contradiction that $p^\top f^{\succsim}(y, p) < y$ for some
$y > 0$ and $p \gg 0$. Then the consumer can afford to increase
consumption by $\varepsilon = \big(y - p^\top f^{\succsim}(y, p)\big) / \sum_{i=1}^n p_i$ for all goods so that she
consumes $f^{\succsim}_i(y, p) + \varepsilon$. This is affordable because
$$\sum_{i=1}^n p_i \big(f^{\succsim}_i(y, p) + \varepsilon\big) = p^\top f^{\succsim}(y, p) + \varepsilon \sum_{i=1}^n p_i = p^\top f^{\succsim}(y, p) + y - p^\top f^{\succsim}(y, p) = y.$$
Strict monotonicity implies that the new bundle is strictly preferred to
the old bundle. Therefore $f^{\succsim}(y, p)$ is not optimal which is a
contradiction. □

If you think about the proof carefully, you will notice that strict
monotonicity is a much stronger assumption on $\succsim$ than what is needed
for the proof. All that is necessary is that for each bundle $x \in X$, there
exist some goods such that the consumer strictly prefers to change the
consumption levels of these goods by small amounts (e.g. increase what
she likes and decrease what she dislikes, and in small quantities so that
the overall expenditure is not changed much). Preferences with this
property are called locally non-satiated.
Definition 9. A preference relation $\succsim$ is locally non-satiated
if for any bundle $x^0 \in X$ and any $\varepsilon > 0$ there exists another bundle
$x^1 \in X$ where $|x^1 - x^0| < \varepsilon$ and $x^1 \succ x^0$.
1.6.2. Homogeneity. Utility maximization or preference
maximization implies that the demand of a consumer depends on $y$ and $p$
only insofar as these determine the budget set $B$. For example in the
utility maximization problem, $p$ and $y$ only appear in the constraint
as arguments of $B(y, p)$. This implies that changes in the values of $y$
and $p$ that leave the budget set unchanged should not change demands.
Hence, since scaling $y$ and $p$ simultaneously by the same factor does
not affect $B$, demands are homogeneous of degree zero.
Lemma 7. Marshallian demand $f^{\succsim}(y, p)$ is homogeneous of degree
zero, that is
$$f^{\succsim}(\lambda y, \lambda p) = f^{\succsim}(y, p), \qquad \forall \lambda > 0.$$
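Budget balancedness and homogeneity of degree zero are easy to verify for a concrete demand function. The sketch below is not part of the original notes. The demand function is a hypothetical example in which the consumer spends fixed shares of the budget on each good (the form that arises from Cobb-Douglas utility). The shares and the numbers for $y$, $p$ and the scaling factor are illustrative assumptions.

```python
import math


def demand(y, p, shares=(0.3, 0.7)):
    """Hypothetical demand: spend the fixed share a_i of y on good i."""
    return [a * y / pi for a, pi in zip(shares, p)]


def scaled_demand(y, p, lam):
    """Demand after multiplying the budget and all prices by lam > 0."""
    return demand(lam * y, [lam * pi for pi in p])


y, p = 100.0, [2.0, 5.0]
base = demand(y, p)

# Budget balancedness: expenditure equals the total budget.
expenditure = sum(pi * xi for pi, xi in zip(p, base))
balanced = math.isclose(expenditure, y)

# Homogeneity of degree zero: scaling y and p leaves demand unchanged.
homogeneous = all(math.isclose(a, b)
                  for a, b in zip(base, scaled_demand(y, p, 3.0)))

print(base, balanced, homogeneous)
```

Scaling only the budget, or only the prices, changes the budget set and in general changes demand, so the property concerns joint scaling by the same factor.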
1.6.3. The Weak Axiom of Revealed Preferences (WARP).
The weak axiom of revealed preference (WARP) is a very basic
property of $f^{\succsim}$ that can be derived from the asymmetry of the strict preference
relation.
Definition 10. A Marshallian demand function $f(y, p)$ satisfies
the Weak Axiom of Revealed Preference if for any two budget
sets $B(y^0, p^0)$ and $B(y^1, p^1)$ with $x^0 = f(y^0, p^0) \ne x^1 = f(y^1, p^1)$, we
have that $x^0 \notin B(y^1, p^1)$ whenever $x^1 \in B(y^0, p^0)$.
In words this means that if $x^1$ is affordable at budget $B(y^0, p^0)$,
then $x^0$ must not be affordable at budget $B(y^1, p^1)$. This is a natural
consistency property of demand: If $x^1$ is affordable at budget $B(y^0, p^0)$,
then the consumer chose $x^0$ when both $x^0$ and $x^1$ were affordable.
Hence $x^0$ is revealed preferred to $x^1$. Clearly, if we have another
budget where both $x^0$ and $x^1$ are affordable, then it cannot be the
case that $x^1$ is chosen. Hence, if at the second budget $B(y^1, p^1)$ the
consumer chooses $x^1$, it must be the case that $x^0$ is not affordable.
Otherwise the choice of $x^1$ would be sub-optimal because $x^0$ is already
revealed preferred to $x^1$: the consumer would be better off choosing
$x^0$ at $B(y^1, p^1)$.
This consistency property is fulfilled for demand derived from
preferences:
Proposition 1. Let $\succsim$ be a continuous and strictly convex preference
relation. Then $f^{\succsim}$ satisfies WARP.⁶
Proof. Continuity and strict convexity together imply that $f^{\succsim}(y, p)$
is given by (1.5.2). Consider two budgets $B(y^0, p^0)$ and $B(y^1, p^1)$ with
$x^0 = f^{\succsim}(y^0, p^0) \ne x^1 = f^{\succsim}(y^1, p^1)$ and suppose $x^1 \in B(y^0, p^0)$. (1.5.2)
implies that $x^0 \succ x'$ for all $x' \in B(y^0, p^0)$ with $x' \ne x^0$. Since $x^1 \in B(y^0, p^0)$ we
therefore have $x^0 \succ x^1$. Now suppose by contradiction that $x^0 \in
B(y^1, p^1)$. Since $x^1 \succ x'$ for all $x' \in B(y^1, p^1)$ with $x' \ne x^1$, this implies $x^1 \succ x^0$,
but this is not possible because $\succ$ is asymmetric, which follows from
Lemma 2. □
Remark 1. If a demand function f satisfies WARP, then it is
homogeneous of degree zero. It is a good exercise to prove this obser-
vation!
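For a finite list of observed choices, WARP as stated in Definition 10 can be tested directly. The sketch below is not part of the original notes. Each observation is a triple of a chosen bundle, a budget and a price vector, and all numbers are hypothetical. The function reports a violation whenever two different chosen bundles are each affordable at the budget where the other one was chosen.

```python
def affordable(x, y, p, tol=1e-9) -> bool:
    """True if bundle x costs at most y at prices p."""
    return sum(pi * xi for pi, xi in zip(p, x)) <= y + tol


def satisfies_warp(observations) -> bool:
    """observations: list of (x, y, p) with x the bundle chosen from B(y, p)."""
    for i, (x0, y0, p0) in enumerate(observations):
        for j, (x1, y1, p1) in enumerate(observations):
            if i == j or x0 == x1:
                continue
            # x1 was affordable when x0 was chosen, so x0 is revealed
            # preferred to x1; then x0 must not be affordable at budget j.
            if affordable(x1, y0, p0) and affordable(x0, y1, p1):
                return False
    return True


# Hypothetical data: (bundle, budget, prices).
consistent = [((8.0, 2.0), 12.0, (1.0, 2.0)),
              ((4.0, 4.0), 12.0, (2.0, 1.0))]
violating = [((2.0, 5.0), 12.0, (1.0, 2.0)),
             ((5.0, 2.0), 12.0, (2.0, 1.0))]

print(satisfies_warp(consistent), satisfies_warp(violating))
```

In the second data set each chosen bundle costs less than the budget at the other observation's prices, so each is revealed preferred to the other, which contradicts the asymmetry of $\succ$.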
1.6.4. Preferences and Marshallian Demand. So far we have
derived Marshallian demand from utility maximization and we have
seen that this implies some structure: Any demand system $f$ that is derived
from utility maximization satisfies WARP and budget balancedness. A
natural question is if the converse holds: Suppose we are given a demand
system $f$ that satisfies WARP and budget balancedness. Can we always
find a preference relation $\succsim$ (and hence a utility function) such that
the induced Marshallian demand $f^{\succsim}$ coincides with the given demand
system $f$?
If there are only two goods, it turns out that WARP and budget
balancedness of $f$ are sufficient for the existence of a preference relation
$\succsim$ such that $f = f^{\succsim}$.
If there are more goods, we can use transitivity of the preference
relation $\succsim$ to obtain a stronger condition that must be satisfied by
Marshallian demand:
⁶ Without the assumption of strict convexity, demand is not necessarily unique
and $f^{\succsim}$ is a correspondence.

Definition 11. A Marshallian demand function $f(y, p)$ satisfies
the Strong Axiom of Revealed Preference (SARP) if for any
sequence of budget sets $B(y^0, p^0), \dots, B(y^k, p^k)$ with $x^0 \ne x^1 \ne \dots \ne
x^k$, where $x^\ell = f(y^\ell, p^\ell)$, and $x^{\ell+1} \in B(y^\ell, p^\ell)$ for all $\ell = 0, \dots, k - 1$,
we have that $x^0 \notin B(y^k, p^k)$.
The Strong Axiom considers sequences of budgets such that the
optimal choice at budget $\ell + 1$ is affordable at budget $\ell$, but it is not
chosen at $\ell$. Therefore the demand reveals that $x^\ell$ is strictly preferred
to $x^{\ell+1}$. Let us use the binary relation $P$ to denote strict revealed
preference. We thus have $x^\ell P x^{\ell+1}$ for all $\ell = 0, \dots, k - 1$ or in other words
$x^0 P x^1 P \dots P x^k$. The strong axiom then requires that $x^0$ is not
affordable when $x^k$ is chosen, i.e. $x^0 \notin B(y^k, p^k)$. If $x^0$ was affordable when
$x^k$ is chosen, then we would have $x^k P x^0$ which implies a cycle in the
strict revealed preferences:
$$x^0 P x^1 P \dots P x^k P x^0.$$
The strong axiom rules out this cycle (i.e. a violation of transitivity)
in strict revealed preferences $P$ by requiring $x^k \not P x^0$.⁷
It turns out that SARP together with budget-balancedness is not
only necessary but also sufficient for a demand function f (y, p) to be
consistent with utility maximization. See your textbook for references
to a proof.
So far we have discussed the problem of making sure that a given set
of demand functions $f(y,p)$ can be derived from utility maximization.
A related question arises if we observe a finite data set
$(x^i, B(y^i,p^i))_{i=1,\dots,K}$ about the consumption choices of a
single individual, where $x^i$ is the observed demand when the budget
was $B(y^i,p^i)$. When is such a data set consistent with the
hypothesis that the consumer is maximizing rational preferences?
Afriat's Theorem tells us that this is the case whenever the data set
satisfies budget balancedness and SARP.
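The finite-data question can be made concrete with a small check. The
following Python sketch is only an illustration: the function name
`violates_sarp` and the sample observations are assumptions and do not
appear in these notes. It takes observed bundles and prices, assumes
budget balancedness (so the budget at observation i is the cost of the
chosen bundle), builds the direct strict revealed preference relation P,
and searches for a cycle in its transitive closure.

```python
def violates_sarp(bundles, prices, tol=1e-12):
    """Return True if the observations contain a strict revealed-preference cycle."""
    k = len(bundles)

    def cost(p, x):
        return sum(pi * xi for pi, xi in zip(p, x))

    # reveals[i][j]: x^j was affordable at budget i, but the different bundle x^i was chosen.
    reveals = [[False] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if bundles[i] != bundles[j]:
                reveals[i][j] = cost(prices[i], bundles[j]) <= cost(prices[i], bundles[i]) + tol

    # Warshall's algorithm: transitive closure of the relation P.
    for m in range(k):
        for i in range(k):
            for j in range(k):
                if reveals[i][m] and reveals[m][j]:
                    reveals[i][j] = True

    # A cycle exists exactly when some bundle is (indirectly) revealed preferred to itself.
    return any(reveals[i][i] for i in range(k))


# Hypothetical data: two goods, two observations each.
consistent = violates_sarp([(2.0, 1.0), (1.0, 2.0)], [(1.0, 2.0), (2.0, 1.0)])
cyclic = violates_sarp([(1.0, 2.0), (2.0, 1.0)], [(1.0, 2.0), (2.0, 1.0)])
```

In the first data set neither chosen bundle is affordable at the other
budget, so no revealed preference arises. In the second, each bundle is
affordable when the other is chosen, which produces a cycle of length two.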

We have formulated WARP for the consumer choice problem. In this
framework, a bundle $x^0$ is revealed preferred to an alternative bundle
$x^1$ if $x^0$ is chosen from a budget $B(y^0,p^0)$ at which $x^1$ is
affordable, that is, $x^1 \in B(y^0,p^0)$.
Alternatively, we could consider more abstract choice situations in
which the set of possible choices is any subset $A \subset X$, where $X$
is some finite set of conceivable alternatives. We can reformulate WARP
for this setting (how would you do that?). It turns out that in this
situation, WARP alone is sufficient for choices to be consistent with
preference maximization and we do not need to formulate a version of
SARP.⁸ Loosely speaking,
⁷ Remember that $P$ is strict revealed preference, which must be
transitive by Lemma (3) and irreflexive by Lemma (4) if the consumer
maximizes a preference relation.
⁸ See for example Chapter 1 in Mas-Colell, Whinston, and Green (1995) or
Chapter 2 in Kreps (1988).

SARP is needed because Marshallian demand $f(y,p)$ only specifies
demand from budget sets rather than for all possible sets of consumption
bundles.

1.6.5. Optimality Conditions, Indifference Curves. Suppose
we have a preference relation that is represented by a differentiable,
strictly monotonic, and strictly concave utility function $u$. Then the
utility maximization problem
\[ \max_x\; u(x) \quad \text{s.t. } p^\top x \le y \tag{UMP} \]
can be solved using the Kuhn-Tucker conditions. If the optimal solution
specifies positive consumption levels $x_i > 0$ for all goods
$i = 1,\dots,n$, the first order conditions are necessary and sufficient
if the utility function is strictly concave. This gives rise to the
familiar condition that marginal rates of substitution
\[ MRS_{ij} = \frac{\partial u/\partial x_i}{\partial u/\partial x_j} \]
must equal the price ratio $p_i/p_j$. Marshallian demand can be obtained
by solving the system of equations⁹
\[ p^\top x = y, \quad \text{and} \quad MRS_{i,1} = \frac{p_i}{p_1} \text{ for } i = 2,\dots,n. \]
The assumption that preferences are convex (or the utility function
quasi-concave) implies that upper contour sets are convex sets and
$MRS_{ij}$ is diminishing (in magnitude) as $x_i$ increases. This yields
the familiar picture of convex indifference curves for the case $n = 2$.
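As an illustration of the optimality conditions, consider a
Cobb-Douglas utility $u(x) = \prod_i x_i^{\alpha_i}$ with
$\sum_i \alpha_i = 1$, for which solving the system above gives
$x_i = \alpha_i y / p_i$. The Python sketch below is not part of the
notes: the function names and the numbers are assumptions chosen for
illustration. It checks that the budget is exhausted and that the MRS
equals the price ratio at the computed demand.

```python
def cobb_douglas_demand(y, p, alpha):
    """Marshallian demand x_i = alpha_i * y / p_i (assumes sum(alpha) == 1)."""
    return [a * y / pi for a, pi in zip(alpha, p)]


def mrs(x, alpha, i, j):
    """MRS_ij = (du/dx_i)/(du/dx_j) = (alpha_i/x_i)/(alpha_j/x_j) for Cobb-Douglas."""
    return (alpha[i] / x[i]) / (alpha[j] / x[j])


# Hypothetical budget, prices and preference parameters.
y, p, alpha = 10.0, [2.0, 1.0, 4.0], [0.5, 0.3, 0.2]
x = cobb_douglas_demand(y, p, alpha)

# Budget balancedness: total spending equals y.
spend = sum(pi * xi for pi, xi in zip(p, x))

# Tangency: MRS_{i,1} equals p_i / p_1 (goods are indexed from 0 in the code).
tangency_gaps = [mrs(x, alpha, i, 0) - p[i] / p[0] for i in range(1, len(p))]
```

Both `spend - y` and every entry of `tangency_gaps` should be zero up to
floating-point error.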

1.7. Duality
1.7.1. Hicksian demands. Marshallian demand maximises utility
for a given total budget $y$ and prices $p$. The same quantities also
minimize the expenditure necessary to reach a given utility $u$ given
prices $p \gg 0$.
Therefore, we consider the dual problem of expenditure minimization
\[ \min_x\; p^\top x \quad \text{s.t. } u(x) \ge u, \tag{EMP} \]
which is contrasted with the primal problem given by the utility
maximization problem (UMP).¹⁰ The quantities solving this problem can
be written as functions of utility $u$ and prices $p$ and are called the
⁹ If you are unsure how to get these conditions, derive them step by
step as an exercise! You should also make sure that you can easily derive
Marshallian demand for a standard utility function such as Cobb-Douglas.
¹⁰ We will always use a continuous utility function, which is guaranteed
to exist if preferences are continuous.

Hicksian demands or compensated demands, which we write as $g(u,p)$.
First order conditions for this problem are clearly similar to those
for solutions of the primal problem:
\[ p_i = \mu \frac{\partial u}{\partial x_i}, \qquad i = 1,\dots,n, \]
where $\mu$ is the Lagrange multiplier on the utility constraint (again
make sure you can derive a solution to the expenditure minimization
problem!).
Hicksian demands satisfy
• homogeneity of degree zero in prices, $g(u,\lambda p) = g(u,p)$ for
  all $\lambda > 0$,
• the utility constraint is satisfied with equality if preferences
  are strictly monotonic, $u(g(u,p)) = u$.
  (The argument is as follows: if the constraint were slack (i.e. a
  strict inequality) at the optimal solution, we could reduce the
  quantity consumed of all goods by $\varepsilon > 0$. Because
  preferences are strictly monotonic, this reduces utility, but if
  $\varepsilon$ is sufficiently small, then for the new bundle
  $x^1 = g(u,p) - (\varepsilon,\dots,\varepsilon)$ the constraint is
  still slack ($u(x^1) > u$) because the utility function is continuous.
  On the other hand, the expenditure for $x^1$ is smaller than for
  $g(u,p)$, and hence $g(u,p)$ cannot be an optimal solution to the
  expenditure minimization problem: $x^1$ also fulfils the constraint
  and leads to a lower expenditure. Hence the constraint cannot be slack
  at the optimum.)
• demands are unique if preferences are strictly convex. (Try to prove
  this yourself. Draw a diagram first.)
1.7.2. Indirect utility function and expenditure function.
We can define functions giving the values of the primal and dual
problems. These are known as the indirect utility function
\[ v(y,p) = \max_x\; u(x) \quad \text{s.t. } p^\top x \le y \]
and the expenditure function
\[ e(u,p) = \min_x\; p^\top x \quad \text{s.t. } u(x) \ge u. \]
These functions can be derived from the corresponding demands by
evaluating the objective functions at those demands, i.e.,
\[ v(y,p) = u(f(y,p)), \qquad e(u,p) = p^\top g(u,p). \]
The duality between the two problems can be expressed by noting
the equality of the quantities solving the two problems
\[ f(e(u,p),p) = g(u,p), \qquad f(y,p) = g(v(y,p),p), \]
or noting that $v(y,p)$ and $e(u,p)$ are inverses of each other in their
first arguments
\[ v(e(u,p),p) = u, \qquad e(v(y,p),p) = y. \]
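The duality relations can be checked numerically for a concrete utility
function. The sketch below is an illustration under stated assumptions,
not part of the notes: it uses a two-good Cobb-Douglas utility
$u(x) = x_1^{a} x_2^{1-a}$ with an assumed share $a = 0.3$, together
with its standard closed-form demands, indirect utility and expenditure
function. All function names are hypothetical.

```python
A = 0.3  # assumed Cobb-Douglas exponent on good 1


def utility(x):
    return x[0] ** A * x[1] ** (1 - A)


def marshallian(y, p):
    return (A * y / p[0], (1 - A) * y / p[1])


def hicksian(u, p):
    # Solution of the expenditure minimization problem for Cobb-Douglas utility.
    g1 = u * (A * p[1] / ((1 - A) * p[0])) ** (1 - A)
    g2 = u * ((1 - A) * p[0] / (A * p[1])) ** A
    return (g1, g2)


def indirect_utility(y, p):
    return utility(marshallian(y, p))


def expenditure(u, p):
    return sum(pi * gi for pi, gi in zip(p, hicksian(u, p)))


# Hypothetical budget and prices.
y, p = 10.0, (2.0, 1.0)
u = indirect_utility(y, p)

income_gap = expenditure(u, p) - y                       # e(v(y,p),p) = y
utility_gap = indirect_utility(expenditure(2.0, p), p) - 2.0  # v(e(u,p),p) = u
demand_gap = max(abs(a - b) for a, b in zip(marshallian(y, p), hicksian(u, p)))
```

All three gaps should vanish up to rounding, which mirrors the two pairs
of identities stated above.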

The expenditure function has the following properties:
• It is homogeneous of degree one in prices $p$,
  $e(u,\lambda p) = \lambda e(u,p)$. The Hicksian demands are
  homogeneous of degree zero, so the total cost of purchasing them must
  be homogeneous of degree one:
  \[ e(u,\lambda p) = \lambda p^\top g(u,\lambda p) = \lambda p^\top g(u,p) = \lambda e(u,p). \]
• It is (weakly) increasing in $p$ and strictly increasing in $u$.
• It is concave in prices: for $\lambda \in [0,1]$,
  \begin{align*}
  e(u,\lambda p^1 + (1-\lambda)p^0)
    &= (\lambda p^1 + (1-\lambda)p^0)^\top g(u,\lambda p^1 + (1-\lambda)p^0) \\
    &= \lambda\, p^{1\top} g(u,\lambda p^1 + (1-\lambda)p^0)
       + (1-\lambda)\, p^{0\top} g(u,\lambda p^1 + (1-\lambda)p^0) \\
    &\ge \lambda e(u,p^1) + (1-\lambda) e(u,p^0).
  \end{align*}
  To show that the last inequality holds we show that
  \[ p^{1\top} g(u,\lambda p^1 + (1-\lambda)p^0) \ge e(u,p^1) \]
  and
  \[ p^{0\top} g(u,\lambda p^1 + (1-\lambda)p^0) \ge e(u,p^0). \]
  To see that these two inequalities must hold, notice that
  $g(u,\lambda p^1 + (1-\lambda)p^0)$ must lead to a (weakly) larger
  expenditure at prices $p^1$ than $g(u,p^1)$, because $g(u,p^1)$ is a
  minimizer subject to the constraint that $u(x) \ge u$, and
  $u(g(u,\lambda p^1 + (1-\lambda)p^0)) \ge u$. This implies the first
  inequality, and the argument for the second inequality is identical
  with $g(u,p^1)$ replaced by $g(u,p^0)$.
These are all of the properties that an expenditure function must have.
In other words, a function $\tilde e(u,p)$ is an expenditure function
derived from preferences if and only if it satisfies homogeneity, (weak)
monotonicity and concavity.
The properties of the indirect utility function follow from those of
the expenditure function given the inverse relationship between them:
• It is homogeneous of degree zero in total budget $y$ and prices $p$,
  $v(\lambda y,\lambda p) = v(y,p)$. This should also be apparent from
  the homogeneity properties of Marshallian demands.
• It is weakly decreasing in $p$ and strictly increasing in $y$.
• It is quasi-convex in prices: for $\lambda \in [0,1]$,
  \[ v(y,\lambda p^1 + (1-\lambda)p^0) \le \max\bigl(v(y,p^1), v(y,p^0)\bigr). \]
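Homogeneity and concavity of the expenditure function are easy to
inspect numerically. The following sketch is illustrative only: it
assumes the Cobb-Douglas expenditure function
$e(u,p) = u\,(p_1/a)^a (p_2/(1-a))^{1-a}$ with $a = 1/2$, and the price
vectors and the weight `lam` are arbitrary choices.

```python
def expenditure(u, p, a=0.5):
    """Cobb-Douglas expenditure function (an assumed example, not a general formula)."""
    return u * (p[0] / a) ** a * (p[1] / (1 - a)) ** (1 - a)


u = 1.0
p0, p1 = (2.0, 1.0), (1.0, 2.5)
lam = 0.4
mix = tuple(lam * b + (1 - lam) * c for b, c in zip(p1, p0))

# Homogeneity of degree one: e(u, 3p) - 3 e(u, p) should be zero.
homogeneity_gap = expenditure(u, tuple(3 * q for q in p0)) - 3 * expenditure(u, p0)

# Concavity: e at the mixed price vector is at least the mixture of the e values.
concavity_slack = expenditure(u, mix) - (
    lam * expenditure(u, p1) + (1 - lam) * expenditure(u, p0)
)
```

A non-negative `concavity_slack` at one pair of price vectors is of
course only a spot check, not a proof of concavity.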
1.7.3. Shephard's Lemma and Slutsky Matrix. We will see
in the following that the expenditure function (especially the concavity
property) allows us to derive all the properties of demand that are
implied by utility maximization.
First we show that Hicksian demand can be derived by simple
differentiation, a result that is known as Shephard's Lemma.

Lemma 8. Hicksian demand is related to the expenditure function as
\[ \frac{\partial e(u,p)}{\partial p_i} = g_i(u,p). \]
Proof. Since $e(u,p) = \min_{x:\,u(x) \ge u} p^\top x$, the envelope
theorem implies that the derivative of $e(u,p)$ with respect to any price
$p_i$ is given by the minimizer $x_i^* = g_i(u,p)$. □
This Lemma together with concavity of the expenditure function
has immediate implications for the matrix of cross-price derivatives of
compensated demand. We define the Slutsky matrix as the matrix
$S$ with elements
\[ S_{ij} = \frac{\partial g_i(u,p)}{\partial p_j}. \]
Lemma 9. If the expenditure function is twice continuously
differentiable, the Slutsky matrix is symmetric and negative
semi-definite.
Proof. By Shephard's Lemma we have
\[ S_{ij} = \frac{\partial g_i(u,p)}{\partial p_j} = \frac{\partial^2 e(u,p)}{\partial p_j\,\partial p_i}. \]
By Young's Theorem, the cross partial derivatives of a twice
continuously differentiable function are symmetric. Hence
\[ S_{ij} = \frac{\partial^2 e(u,p)}{\partial p_j\,\partial p_i} = \frac{\partial^2 e(u,p)}{\partial p_i\,\partial p_j} = S_{ji}. \]
Finally, note that $S$ is the Hessian matrix of $e(u,\cdot)$ for fixed
$u$, and $e(u,p)$ is concave in $p$. Therefore $S$ is negative
semi-definite. □
The fact that the Slutsky matrix is symmetric implies that we
can define complementarity and substitutability in a symmetric way
in terms of Hicksian demand. Two goods are (compensated) substitutes
if $S_{ij} > 0$ and (compensated) complements if $S_{ij} < 0$. Symmetry
implies that $i$ is a complement of $j$ if and only if $j$ is a
complement of $i$ if complementarity is defined in terms of compensated
demands. This is not true of uncompensated demands because (as we will
see below) income effects are not symmetric. It is therefore preferable
to base definitions of complements and substitutes on compensated
demands.
So far we have only derived restrictions on Hicksian demand, which
is not observable. The Slutsky equation allows us to translate these
properties to Marshallian demand.
Lemma 10. For all $i,j = 1,\dots,n$ we have
\[ \frac{\partial g_i(u,p)}{\partial p_j} = \frac{\partial f_i(y,p)}{\partial p_j} + \frac{\partial f_i(y,p)}{\partial y}\, f_j(y,p). \]

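Proof. We use the duality relationship $g(u,p) = f(e(u,p),p)$.
Differentiating w.r.t. $p_j$ we obtain
\begin{align*}
\frac{\partial g_i(u,p)}{\partial p_j}
  &= \frac{\partial f_i(y,p)}{\partial p_j} + \frac{\partial f_i(y,p)}{\partial y}\,\frac{\partial e(u,p)}{\partial p_j} \\
  &= \frac{\partial f_i(y,p)}{\partial p_j} + \frac{\partial f_i(y,p)}{\partial y}\, g_j(u,p) \\
  &= \frac{\partial f_i(y,p)}{\partial p_j} + \frac{\partial f_i(y,p)}{\partial y}\, f_j(y,p).
\end{align*}
In the first line, we have substituted $y = e(u,p)$. The second line
follows from Shephard's Lemma. The last follows from
$g_j(u,p) = f_j(e(u,p),p) = f_j(y,p)$. □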
We have shown that if a function $f$ is a Marshallian demand function
derived from (UMP) then it satisfies
• budget-balancedness,
• homogeneity,
• and the Slutsky matrix given by
  \[ S_{ij} = \frac{\partial f_i(y,p)}{\partial p_j} + \frac{\partial f_i(y,p)}{\partial y}\, f_j(y,p) \]
  is symmetric and negative semi-definite.
These conditions are also sufficient, that is, if a function $f$
satisfies them, then there exists a preference relation $\succsim$ such
that $f = f^{\succsim}$. These conditions are also known as
integrability conditions.¹¹
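Since the Slutsky matrix can be computed from Marshallian demand alone,
the conditions can be inspected numerically for a given demand system.
The sketch below is only an illustration: it assumes a Cobb-Douglas
demand system with share 0.3 and approximates the derivatives by central
finite differences. The function names and the step size `h` are
assumptions.

```python
def demand(y, p, a=0.3):
    """Assumed two-good Cobb-Douglas Marshallian demand."""
    return [a * y / p[0], (1 - a) * y / p[1]]


def slutsky_matrix(y, p, h=1e-6):
    """S_ij = df_i/dp_j + (df_i/dy) * f_j, using central finite differences."""
    n = len(p)
    x = demand(y, p)
    df_dy = [(hi - lo) / (2 * h) for hi, lo in zip(demand(y + h, p), demand(y - h, p))]
    S = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(p)
        up[j] += h
        dn = list(p)
        dn[j] -= h
        df_dpj = [(hi - lo) / (2 * h) for hi, lo in zip(demand(y, up), demand(y, dn))]
        for i in range(n):
            S[i][j] = df_dpj[i] + df_dy[i] * x[j]
    return S


y, p = 10.0, (2.0, 1.0)
S = slutsky_matrix(y, p)
symmetry_gap = abs(S[0][1] - S[1][0])
own_effects = [S[0][0], S[1][1]]  # compensated own-price effects
```

For this example the off-diagonal entries agree and the diagonal entries
are negative, as required. With two goods, negative semi-definiteness
additionally requires a non-negative determinant, which here equals zero
up to numerical error.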
The Slutsky equation also allows us to decompose the price derivative
of Marshallian demand. Solving for $\partial f_i(y,p)/\partial p_j$ we
get
\[ \frac{\partial f_i(y,p)}{\partial p_j} = \frac{\partial g_i(u,p)}{\partial p_j} - \frac{\partial f_i(y,p)}{\partial y}\, f_j(y,p). \]
The first term is called the substitution effect and the second term
is called the income effect. If we look at the own price effect (setting
$i = j$) we get
\[ \frac{\partial f_i(y,p)}{\partial p_i} = \frac{\partial g_i(u,p)}{\partial p_i} - \frac{\partial f_i(y,p)}{\partial y}\, f_i(y,p). \]
Note that the substitution effect is never positive because
$S_{ii} = \partial g_i(u,p)/\partial p_i \le 0$ by negative
semi-definiteness of the Slutsky matrix. This
is the compensated law of demand. The compensated demand of
¹¹ To see why, note that from Shephard's lemma and duality we have
\[ \frac{\partial e(u,p)}{\partial p_i} = f_i(e(u,p),p) \]
for $i = 1,\dots,n$. The integrability conditions guarantee that this
system of partial differential equations can be integrated to obtain a
solution $e(u,p)$ and that the solution satisfies all properties of an
expenditure function.

a good is decreasing in its own price. For uncompensated demand the
law of demand does not always hold. Note that the income effect can
be positive or negative. It is positive for an inferior good
($\partial f_i(y,p)/\partial y < 0$). If a good is strongly inferior,
uncompensated demand may increase in its own price and we have a Giffen
good. In the converse case, if $i$ is a normal good, the income effect is
negative and has the same sign as the substitution effect. Therefore the
uncompensated law of demand holds for normal goods.
To conclude, we show a result analogous to Shephard's Lemma for
the indirect utility function and Marshallian demand:
If we differentiate $v(e(u,p),p) = u$ with respect to any price $p_i$
and use $u = v(y,p)$ and $y = e(u,p)$, we get
\begin{align*}
\frac{\partial v(y,p)}{\partial p_i} + \frac{\partial v(y,p)}{\partial y}\,\frac{\partial e(u,p)}{\partial p_i} &= 0 \\
-\frac{\partial v(y,p)/\partial p_i}{\partial v(y,p)/\partial y} &= \frac{\partial e(u,p)}{\partial p_i} \\
-\frac{\partial v(y,p)/\partial p_i}{\partial v(y,p)/\partial y} &= g_i(v(y,p),p) \\
&= f_i(y,p).
\end{align*}
For the third line we have used Shephard's Lemma and for the last line
the duality relationship between $g$ and $f$. The equation
\[ f_i(y,p) = -\frac{\partial v(y,p)/\partial p_i}{\partial v(y,p)/\partial y} \]
is called Roy's identity. Its importance is that it allows uncompensated
demands to be deduced simply from the indirect utility function,
again solely by differentiation.
In many ways it is therefore easier to derive a demand system by
beginning with $v(y,p)$ or $e(u,p)$ than by solving the consumer problem
directly given $u(x)$.
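Roy's identity can be tried out on a concrete indirect utility function.
The sketch below is illustrative and rests on assumptions: it uses the
Cobb-Douglas indirect utility
$v(y,p) = y\,(a/p_1)^a((1-a)/p_2)^{1-a}$ with $a = 0.3$ and replaces the
derivatives by central finite differences. The names `roy_demand` and
`h` are hypothetical.

```python
def indirect_utility(y, p, a=0.3):
    """Assumed Cobb-Douglas indirect utility function."""
    return y * (a / p[0]) ** a * ((1 - a) / p[1]) ** (1 - a)


def roy_demand(y, p, i, h=1e-6):
    """Demand for good i via Roy's identity: -(dv/dp_i) / (dv/dy)."""
    up = list(p)
    up[i] += h
    dn = list(p)
    dn[i] -= h
    dv_dp = (indirect_utility(y, up) - indirect_utility(y, dn)) / (2 * h)
    dv_dy = (indirect_utility(y + h, p) - indirect_utility(y - h, p)) / (2 * h)
    return -dv_dp / dv_dy


y, p = 10.0, (2.0, 1.0)
x_roy = [roy_demand(y, p, i) for i in range(2)]

# Closed-form Cobb-Douglas demand for comparison: a*y/p_1 and (1-a)*y/p_2.
x_closed = [0.3 * y / p[0], 0.7 * y / p[1]]
```

Up to finite-difference error, `x_roy` coincides with the closed-form
demand, without the utility maximization problem ever being solved.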
1.7.4. Summary.
Utility maximisation problem            Cost minimisation problem
max u(x) s.t. p^T x ≤ y          →      min p^T x s.t. u(x) ≥ u
            ↓                                       ↓
Uncompensated demands                   Compensated demands
x = f(y, p)                      →      x = g(u, p)
            ↕                                       ↕
Indirect utility function               Expenditure function
v(y, p)                          →      e(u, p)

1.8. Welfare
One of the many applications of the consumer choice model is to
evaluate welfare effects of price changes. This is an important topic
for public economics and economics of industrial organization, where
the welfare effects of different policy interventions (e.g., taxation or
regulations that influence the competitive environment) are compared.
We focus on the following application. Suppose that in the status
quo, that is, before a new policy is implemented, a consumer has a
given budget $y > 0$ and prices are given by $p^0 \gg 0$. We would like
to investigate how a change from $p^0$ to a new price vector
$p^1 \gg 0$ changes the consumer's welfare if her budget $y$ is
unchanged.¹² We also assume that we are in an ideal world where enough
data is available to reliably estimate expenditure functions and demand
functions without discussing how this is achieved.
What do we mean by changes in consumer welfare? A natural
measure of the welfare of a consumer is the indirect utility function
$v(y,p)$. A price change from $p^0$ to $p^1$ makes the consumer worse off
(her welfare decreases) if and only if
\[ v(y,p^0) > v(y,p^1). \]
In other words, our welfare criterion compares the utility achieved by
the optimal choice of the consumer in the status quo to the utility
achieved by the optimal choice after the price change.
Since utility functions are not unique, we can choose any indirect
utility function that is convenient for our analysis. If we use
$\hat v(y,p) = \phi(v(y,p))$ where $\phi$ is a strictly increasing
function, we measure changes in an alternative utility function
$\hat u(x) = \phi(u(x))$ that represents the same preference relation.
Therefore any welfare comparison remains unchanged. One particular
indirect utility function can be derived from the consumer's expenditure
function. Let us fix an arbitrary price vector $\bar p \gg 0$ and set
$\phi(u) = e(u,\bar p)$. Since $e(u,\bar p)$ is strictly increasing
in $u$ we have:
\[ v(y,p^0) > v(y,p^1) \iff e(v(y,p^0),\bar p) > e(v(y,p^1),\bar p). \]
Hence the function $w(\tilde p) = e(v(y,\tilde p),\bar p)$ measures the
welfare effect of changes in the price vector $\tilde p$. One can
interpret the change in $e(v(y,\tilde p),\bar p)$, i.e.,
\[ e(v(y,p^1),\bar p) - e(v(y,p^0),\bar p) \tag{1.8.1} \]
as the money-value of the change in welfare due to the change in prices.
We have thus constructed a money-metric (indirect) utility function.
So far this money-value associated with the welfare change depends on
the (arbitrarily chosen) price vector $\bar p$. Therefore, it is not
¹² We could extend the discussion to allow for changes in $y$.

useful to talk about the welfare change as being equivalent to the
monetary value given in (1.8.1). This monetary value changes with
$\bar p$. There are two particular choices of $\bar p$ where a natural
interpretation emerges, and these give rise to two commonly used welfare
measures:
If we set $\bar p = p^0$, (1.8.1) is called the equivalent variation:
\begin{align*}
EV(p^0,p^1,y) &= e(\underbrace{v(y,p^1)}_{=:u^1},p^0) - e(v(y,p^0),p^0) \\
              &= e(u^1,p^0) - y.
\end{align*}
With this definition, the consumer is indifferent between the price
change and receiving a monetary transfer of $EV(p^0,p^1,y)$, because
this is the additional wealth needed to achieve $u^1$ at prices $p^0$:
\[ y + EV(p^0,p^1,y) = e(u^1,p^0). \]
Indeed we have
\[ u^1 = v(e(u^1,p^0),p^0) = v(y + EV(p^0,p^1,y),p^0). \]
The consumer is worse off if and only if the equivalent variation is
negative. (As an exercise, make sure that you fully understand why
the last sentence is correct!)
If we set $\bar p = p^1$, we obtain the compensating variation:
\begin{align*}
CV(p^0,p^1,y) &= e(v(y,p^1),p^1) - e(\underbrace{v(y,p^0)}_{=:u^0},p^1) \\
              &= y - e(u^0,p^1).
\end{align*}
With this definition the consumer is indifferent between the price change
together with a budget reduction of $CV(p^0,p^1,y)$, and the original
situation before the price change. The reduction adjusts the budget such
that exactly the old utility level $u^0$ can be achieved at the new
prices:
\[ y - CV(p^0,p^1,y) = e(u^0,p^1). \]
Again, the consumer is worse off if and only if the compensating
variation is negative.
Since both $EV$ and $CV$ measure the change in an indirect utility
function that corresponds to the same preferences, we have $EV > 0$ if
and only if $CV > 0$. The two welfare measures always give the same
answer to the question: does the price change make the consumer
better off?
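A small numerical example may help to fix the two definitions. The
sketch below is only an illustration under assumptions: it uses a
two-good Cobb-Douglas consumer with equal shares, for whom
$v(y,p) = y\,(a/p_1)^a((1-a)/p_2)^{1-a}$ and
$e(u,p) = u\,(p_1/a)^a(p_2/(1-a))^{1-a}$, and an arbitrary price change.
The function and variable names are hypothetical.

```python
A = 0.5  # assumed Cobb-Douglas exponent on good 1


def v(y, p):
    """Indirect utility of the assumed Cobb-Douglas consumer."""
    return y * (A / p[0]) ** A * ((1 - A) / p[1]) ** (1 - A)


def e(u, p):
    """Expenditure function of the same consumer."""
    return u * (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A)


y = 1.0
p0, p1 = (2.0, 1.0), (1.0, 2.5)   # status quo prices and new prices
u0, u1 = v(y, p0), v(y, p1)

ev = e(u1, p0) - y   # equivalent variation: money metric at old prices
cv = y - e(u0, p1)   # compensating variation: money metric at new prices

# Interpretation checks: a transfer of EV at old prices yields u1,
# and a budget reduction of CV at new prices restores u0.
ev_check = v(y + ev, p0) - u1
cv_check = v(y - cv, p1) - u0
```

In this example the consumer is worse off at $p^1$, so both measures are
negative, but they differ in magnitude because Cobb-Douglas demand has
income effects.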

In this section, we consider the welfare of a single consumer. It is
harder to make a welfare judgement if there are many consumers. If the
welfare of all consumers increases or decreases in response to the price
change, there is an unambiguous answer to the question whether the price
change increases welfare. But what if some consumers are made better off
and some are made worse off? It has been suggested to use the sum of the
compensating variations across consumers as a welfare measure. A price
change that

leads to a positive sum of CV's is called a Kaldor-Hicks improvement.
The interpretation is as follows: if the sum of the compensating
variations is positive, then there exist wealth transfers such that
together with the price changes, no consumer is worse off and the sum of
the transfers is zero. If the wealth transfers are feasible (i.e. enough
information is available to determine the transfers), then this change
(price change together with transfers) is a Pareto improvement (see
Chapter 4 for a definition of Pareto improvement). If the transfers are
not implemented, there is a potential for a Pareto improvement but the
price change alone may leave some consumers worse off. In practice, the
informational requirement will typically prevent the implementation of
the transfers. I am including this remark because the Kaldor-Hicks
improvement as a welfare measure has some paradoxical properties: in
particular, there are sequences of price changes
$p^0 \to \dots \to p^k$ where all individual price changes are
Kaldor-Hicks improvements but $p^k = p^0$. This shows that the
Kaldor-Hicks improvement violates transitivity.

Now we want to derive a formula that allows us to compute the
equivalent variation and compensating variation directly from Hicksian
demand, as the area under the Hicksian demand curve. Then we move
on to compare this to the area under the Marshallian demand curve,
which is known as Consumer Surplus.
To simplify the exposition, we look at a price change where the
prices of all goods but $x_1$ are held constant. That is, we have
\[ p^0_i = p^1_i, \quad \forall i = 2,\dots,n. \]
Let $p^0_{-1} = (p^0_2,\dots,p^0_n)$. Then we have
$p^0 = (p^0_1,p^0_{-1})$ and $p^1 = (p^1_1,p^0_{-1})$. With this notation
we have
\begin{align*}
EV(p^0,p^1,y) &= e(u^1,p^0) - y \\
  &= e(u^1,p^0) - e(u^1,p^1) \\
  &= e(u^1,(p^0_1,p^0_{-1})) - e(u^1,(p^1_1,p^0_{-1})) \\
  &= -\int_{p^0_1}^{p^1_1} \frac{d e(u^1,(z,p^0_{-1}))}{dz}\,dz \\
  &= -\int_{p^0_1}^{p^1_1} g_1(u^1,(z,p^0_{-1}))\,dz \qquad \text{(Shephard's Lemma)}
\end{align*}

So we obtain the equivalent variation as the area under the Hicksian
demand curve. Notice that $p^0_1 < p^1_1$ implies that $EV(p^0,p^1,y)$ is
negative (or zero if $g_1(u^1,(z,p^0_{-1})) = 0$ for all
$z \in [p^0_1,p^1_1]$). So a price increase leads to a reduction in
welfare.
Using essentially the same steps, we can compute the compensating
variation

\begin{align*}
CV(p^0,p^1,y) &= y - e(u^0,p^1) \\
  &= e(u^0,p^0) - e(u^0,p^1) \\
  &= e(u^0,(p^0_1,p^0_{-1})) - e(u^0,(p^1_1,p^0_{-1})) \\
  &= -\int_{p^0_1}^{p^1_1} \frac{d e(u^0,(z,p^0_{-1}))}{dz}\,dz \\
  &= -\int_{p^0_1}^{p^1_1} g_1(u^0,(z,p^0_{-1}))\,dz. \qquad \text{(Shephard's Lemma)}
\end{align*}

We see that the two measures are different because $EV$ is the area
under the Hicksian demand curve evaluated at $u^1$ and $CV$ is the area
under the Hicksian demand curve evaluated at $u^0$. But as we remarked
before, the sign of the two measures is always the same.
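The area formulas can be compared with the direct definitions
numerically. The sketch below is illustrative and based on assumptions:
a two-good Cobb-Douglas consumer with equal shares, a change in the price
of good 1 only, and a simple midpoint rule for the integrals. The Hicksian
demand for good 1 is obtained from Shephard's Lemma applied to the
closed-form expenditure function. All names are hypothetical.

```python
A = 0.5  # assumed Cobb-Douglas exponent on good 1


def v(y, p):
    return y * (A / p[0]) ** A * ((1 - A) / p[1]) ** (1 - A)


def e(u, p):
    return u * (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A)


def g1(u, p):
    # Shephard's Lemma: derivative of e(u, p) with respect to p_1.
    return A * e(u, p) / p[0]


def integrate(func, a, b, n=2000):
    """Midpoint rule for the integral of func from a to b."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


y = 1.0
p0, p1 = (2.0, 1.0), (3.0, 1.0)   # only the price of good 1 rises
u0, u1 = v(y, p0), v(y, p1)

# Minus the area under the Hicksian demand curve at u1 (EV) and at u0 (CV).
ev_area = -integrate(lambda z: g1(u1, (z, p0[1])), p0[0], p1[0])
cv_area = -integrate(lambda z: g1(u0, (z, p0[1])), p0[0], p1[0])

# Direct definitions for comparison.
ev_direct = e(u1, p0) - y
cv_direct = y - e(u0, p1)
```

The two routes agree up to integration error. Both measures are negative
for the price increase, and they differ because the two integrals use
Hicksian demand at different utility levels.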
Equivalent and compensating variation coincide if there are no income
effects for the good whose price is changed when moving from $p^0$ to
$p^1$. Suppose that $p^0_{-i} = p^1_{-i} = p_{-i}$ for some $i$, that is,
only the price of good $i$ changes, and there are no income effects for
good $i$, that is, $\partial f_i(y,p)/\partial y = 0$. This implies that
\[ g_i(u^0,p) = f_i(e(u^0,p),p) = f_i(e(u^1,p),p) = g_i(u^1,p). \]
We have
\begin{align*}
EV(p^0,p^1,y) &= -\int_{p^0_i}^{p^1_i} g_i(u^1,(z,p_{-i}))\,dz, \\
CV(p^0,p^1,y) &= -\int_{p^0_i}^{p^1_i} g_i(u^0,(z,p_{-i}))\,dz.
\end{align*}
Hence we have (using $g_i(u^0,p) = g_i(u^1,p)$ from above):
\begin{align*}
EV(p^0,p^1,y) &= -\int_{p^0_i}^{p^1_i} g_i(u^1,(z,p_{-i}))\,dz \\
  &= -\int_{p^0_i}^{p^1_i} g_i(u^0,(z,p_{-i}))\,dz \\
  &= CV(p^0,p^1,y).
\end{align*}
Moreover, in the absence of income effects, we can write
\[ g_i(u^0,p) = g_i(u^1,p) = f_i(y,p) = f_i(\bar y,p) \]
for an arbitrary budget $\bar y$. Hence we have
\begin{align*}
EV(p^0,p^1,y) &= CV(p^0,p^1,y) \\
  &= -\int_{p^0_i}^{p^1_i} f_i(\bar y,(z,p_{-i}))\,dz.
\end{align*}

The welfare measures $EV$ and $CV$ can then be obtained as the area
under the Marshallian demand curve. This is the Marshallian Consumer
Surplus:
\[ CS(p^0,p^1) = -\int_{p^0_i}^{p^1_i} f_i(\bar y,(z,p^0_{-i}))\,dz. \tag{1.8.2} \]
Notice that Consumer Surplus generally does not coincide with the
equivalent or compensating variation. They are only equal in the
absence of income effects for the goods whose prices are changed.
If there are income effects, Consumer Surplus is not a proper measure
of welfare. In general, it lacks the foundation as the change in
some indirect utility function. In the next section we will introduce
quasi-linear utility functions where income effects are absent for all
but one good. For models with quasi-linear utility, consumer surplus is
a proper welfare measure and this justifies its frequent use in applied
work. But you should bear in mind that this abstracts from income
effects for the goods whose price changes are studied.¹³
So far we have not made any observation that would lead us to
prefer $EV$ over $CV$ or vice versa. This changes if we want to compare
the welfare effects of different price changes. Suppose again that the
status quo is given by a price level $p^0$ and wealth level $y$. If we
want to compare the effects of two different price changes that lead to
the vectors $p^1$ and $p^2$ respectively, only the equivalent variation
is an appropriate tool. For each price change, it measures the change in
the same indirect utility function
\[ e(v(y,p),p^0). \]
Therefore, if $EV(p^0,p^1,y) > EV(p^0,p^2,y)$ we have
\begin{align*}
 & e(v(y,p^1),p^0) - e(v(y,p^0),p^0) > e(v(y,p^2),p^0) - e(v(y,p^0),p^0) \\
\iff\; & e(v(y,p^1),p^0) > e(v(y,p^2),p^0) \\
\iff\; & v(y,p^1) > v(y,p^2).
\end{align*}
In contrast, the compensating variation for the first price change
measures the change in the indirect utility function
\[ e(v(y,p),p^1), \]
whereas the compensating variation for the second price change measures
the change in
\[ e(v(y,p),p^2). \]

¹³ We will discuss below in a supplemental section why the use of CS can
lead to incorrect welfare conclusions.

In general, these two measures are not comparable. If
$CV(p^0,p^1,y) > CV(p^0,p^2,y)$ we have
\begin{align*}
 & \underbrace{e(v(y,p^1),p^1)}_{=y} - e(v(y,p^0),p^1) > \underbrace{e(v(y,p^2),p^2)}_{=y} - e(v(y,p^0),p^2) \\
\iff\; & e(v(y,p^0),p^2) > e(v(y,p^0),p^1).
\end{align*}
But this is an inequality that compares the expenditures needed to
reach $u^0 = v(y,p^0)$ at the two price vectors $p^1$ and $p^2$. This
inequality does not imply $v(y,p^1) > v(y,p^2)$. Therefore, compensating
variation cannot be used to compare the welfare effects of two different
price changes.

Focussing on a single price change was useful to simplify the exposition,
but deriving a formula for $EV$ and $CV$ if the whole price vector changes
follows the same logic. Consider first the equivalent variation. We
decompose the price change $p^0 \to p^1$ into intermediate steps. At each
step, we only change the price of a single good.
\begin{align*}
\tilde p^0 &= p^0 \\
\tilde p^1 &= (p^1_1,p^0_{-1}) \\
\tilde p^2 &= (p^1_1,p^1_2,p^0_3,\dots,p^0_n) \\
 &\;\;\vdots \\
\tilde p^{n-2} &= (p^1_1,\dots,p^1_{n-2},p^0_{n-1},p^0_n) \\
\tilde p^{n-1} &= (p^1_{-n},p^0_n) \\
\tilde p^n &= p^1.
\end{align*}
The change from one intermediate price vector $\tilde p^{i-1}$ to the next
$\tilde p^i$ is only in the price for good $i$. With this decomposition of
the price change, we can calculate the equivalent variation step by step.
\begin{align*}
EV(p^0,p^1,y) &= e(u^1,p^0) - y \\
  &= \sum_{i=1}^n \bigl[ e(u^1,\tilde p^{i-1}) - e(u^1,\tilde p^i) \bigr] \\
  &= -\sum_{i=1}^n \int_{p^0_i}^{p^1_i} \frac{d e(u^1,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))}{dz}\,dz \\
  &= -\sum_{i=1}^n \int_{p^0_i}^{p^1_i} g_i(u^1,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz. \qquad \text{(Shephard's Lemma)}
\end{align*}
In this formula we change prices one-by-one and for each individual price
change, we calculate the area under the Hicksian demand curve. Adding
up all these areas yields the equivalent variation. When you apply this
formula, you have to be careful to always use the utility level $u^1$
associated with the final price vector and not
$\tilde u^i = v(y,\tilde p^i)$. Therefore, you

cannot calculate $EV(p^0,p^1,y)$ as the sum of individual equivalent
variations for individual price changes:
\[ EV(p^0,p^1,y) \neq \sum_{i=1}^n EV(\tilde p^{i-1},\tilde p^i,y). \]
For the compensating variation we obtain the same formula with $u^1$
replaced by $u^0$:
\[ CV(p^0,p^1,y) = -\sum_{i=1}^n \int_{p^0_i}^{p^1_i} g_i(u^0,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz. \]
You may be worried that you get different results in this formula if you
change prices one-by-one but in a different order. In the formula above we
started with the price of good one, then changed the price of good two
and so forth until we changed the price of good $n$. Is it guaranteed that
we get the same result if we first change the price of good $n$, then of
$n-1$ and so forth until good 1? It turns out that the answer is yes,
because of the symmetry of the Slutsky matrix.¹⁴
Equivalent and compensating variation coincide if there are no income
effects for the goods whose prices are changed when moving from $p^0$ to
$p^1$. Obviously this cannot be true if all prices change because it
cannot be the case that demand for all goods is independent of income.
Therefore, let us suppose that when moving from $p^0$ to $p^1$, prices
change only for a subset of goods $I \subset \{1,\dots,n\}$ and assume
that for all goods $i \in I$, there are no income effects, i.e.,
$\partial f_i(y,p)/\partial y = 0$. This implies that for $i \in I$,
\[ g_i(u^0,p) = f_i(e(u^0,p),p) = f_i(e(u^1,p),p) = g_i(u^1,p). \]
If only prices for goods $i \in I$ change, we have
\begin{align*}
EV(p^0,p^1,y) &= -\sum_{i \in I} \int_{p^0_i}^{p^1_i} g_i(u^1,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz, \\
CV(p^0,p^1,y) &= -\sum_{i \in I} \int_{p^0_i}^{p^1_i} g_i(u^0,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz.
\end{align*}

¹⁴ For those with a strong mathematical background: symmetry implies that
$g(u,p)$ is a conservative vector field (it satisfies the integrability
condition $\partial g_i(u,p)/\partial p_k = \partial g_k(u,p)/\partial p_i$)
which can be integrated along arbitrary paths to obtain the
expenditure function. If you do not understand this, don't worry, just
remember that symmetry is important for this formula to work.

Hence we have (using $g_i(u^0,p) = g_i(u^1,p)$ from above):
\begin{align*}
EV(p^0,p^1,y) &= -\sum_{i \in I} \int_{p^0_i}^{p^1_i} g_i(u^1,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz \\
  &= -\sum_{i \in I} \int_{p^0_i}^{p^1_i} g_i(u^0,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz \\
  &= CV(p^0,p^1,y).
\end{align*}
Moreover, in the absence of income effects, we can write for $i \in I$
\[ g_i(u^0,p) = g_i(u^1,p) = f_i(y,p) = f_i(\bar y,p) \]
for an arbitrary budget $\bar y$. Hence we have
\begin{align*}
EV(p^0,p^1,y) &= CV(p^0,p^1,y) \\
  &= -\sum_{i \in I} \int_{p^0_i}^{p^1_i} f_i(\bar y,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz.
\end{align*}
The welfare measures $EV$ and $CV$ can be obtained as the area under
the Marshallian demand curve (summed up over individual price changes).
This is the Marshallian Consumer Surplus:
\[ CS(p^0,p^1) = -\sum_{i \in I} \int_{p^0_i}^{p^1_i} f_i(\bar y,(p^1_1,\dots,p^1_{i-1},z,p^0_{i+1},\dots,p^0_n))\,dz. \tag{1.8.3} \]
Notice that Consumer Surplus generally does not coincide with the
equivalent or compensating variation. They are only equal in the absence
of income effects for the goods whose prices are changed.
Moreover, if there are multiple goods for which the price is changed,
consumer surplus is not well defined. Notice that in (1.8.3), prices are
changed in a particular order. We have argued above for $EV$ and $CV$
that the order does not matter because of the symmetry of the Slutsky
matrix. For Marshallian demand, however, the substitution matrix
$(\partial f_i/\partial p_k)_{ik}$ is generally not symmetric if there are
income effects. Therefore, the value of $CS$ depends on the order of the
price changes and consumer surplus is not well defined.
To conclude we give a (slightly lengthy) example that demonstrates
this: Suppose an individual consumes two goods $x = (x_1,x_2)$ and the
utility function is given by
\[ u(x) = (x_1 - \gamma_1)^{\alpha_1} (x_2 - \gamma_2)^{\alpha_2}, \]
where $\alpha_i \ge 0$, $\alpha_1 + \alpha_2 = 1$, and $\gamma_i > 0$.
Marshallian demand for this utility function is given by the following
functions,
\[ f_i(y,p) = \gamma_i + \frac{\alpha_i}{p_i}\,(y - p^\top \gamma), \tag{1.8.4} \]
provided that $y > p^\top \gamma$ so that demand for good $i$ exceeds
$\gamma_i$. Let us assume in the following that this is the case. We set
$\alpha_1 = \alpha_2 = 1/2$,

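$\gamma_1 = 1/4$, $\gamma_2 = 1/8$, and $y = 1$ and compare the two price
vectors $p^0 = (2,1)$ and $p^1 = (1,\tfrac{5}{2})$. Inserting demand into
the utility function we obtain the indirect utility:
\[ v(y,p) = \frac{1}{2}\,\frac{y - \frac{p_1}{4} - \frac{p_2}{8}}{\sqrt{p_1 p_2}}. \]
Evaluating this at $y = 1$ and $p^0$, $p^1$ we get
\begin{align*}
v(1,p^0) &= \frac{1}{2}\,\frac{1 - \frac{2}{4} - \frac{1}{8}}{\sqrt{2}} = \frac{3}{16\sqrt{2}} \approx .1326, \\
v(1,p^1) &= \frac{1}{2}\,\frac{1 - \frac{1}{4} - \frac{1}{8}\cdot\frac{5}{2}}{\sqrt{5/2}} = \frac{7}{32\sqrt{5/2}} \approx .1383.
\end{align*}
This implies that the consumer prefers the price vector $p^1$.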
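Now let us compute the consumer surplus for this price change: Given
the parameter restrictions, the demand system simplifies to
\begin{align*}
f_1(1,p) &= \frac{1}{4} + \frac{1}{2p_1} - \frac{1}{8} - \frac{p_2}{16p_1} = \frac{1}{8} + \frac{1}{2p_1} - \frac{p_2}{16p_1}, \\
f_2(1,p) &= \frac{1}{8} + \frac{1}{2p_2} - \frac{1}{16} - \frac{1}{8}\frac{p_1}{p_2} = \frac{1}{16} + \frac{1}{2p_2} - \frac{1}{8}\frac{p_1}{p_2}.
\end{align*}
Since multiple prices change at the same time, we consider two price
changes $p^0 \to \tilde p \to p^1$, where
$\tilde p = (p^1_1,p^0_2) = (1,1)$. The change in consumer surplus is
\begin{align*}
 & -\int_{p^0_1}^{p^1_1} f_1(1,p_1,p^0_2)\,dp_1 - \int_{p^0_2}^{p^1_2} f_2(1,p^1_1,p_2)\,dp_2 \\
 &= -\int_2^1 \Bigl(\frac{1}{8} + \frac{1}{2p_1} - \frac{p^0_2}{16p_1}\Bigr)dp_1 - \int_1^{5/2} \Bigl(\frac{1}{16} + \frac{1}{2p_2} - \frac{1}{8}\frac{p^1_1}{p_2}\Bigr)dp_2 \\
 &= \int_1^2 \Bigl(\frac{1}{8} + \frac{1}{2p_1} - \frac{1}{16p_1}\Bigr)dp_1 - \int_1^{5/2} \Bigl(\frac{1}{16} + \frac{1}{2p_2} - \frac{1}{8}\frac{1}{p_2}\Bigr)dp_2 \\
 &= \frac{1}{8} + \int_1^2 \Bigl(\frac{1}{2p_1} - \frac{1}{16p_1}\Bigr)dp_1 - \frac{1}{16}\cdot\frac{3}{2} - \int_1^{5/2} \Bigl(\frac{1}{2p_2} - \frac{1}{8}\frac{1}{p_2}\Bigr)dp_2 \\
 &= \frac{1}{32} + \frac{7}{16}\int_1^2 \frac{1}{p_1}\,dp_1 - \frac{3}{8}\int_1^{5/2} \frac{1}{p_2}\,dp_2 \\
 &= \frac{1}{32} + \frac{7}{16}\ln 2 - \frac{3}{8}\ln\frac{5}{2} \\
 &\approx -.0091.
\end{align*}
We find that consumer surplus goes down if we move from $p^0$ to $p^1$.
This would indicate that the consumer is better off at $p^0$. However, we
have shown above by comparing the indirect utility at the two price
vectors that this is not the case. The consumer has a higher indirect
utility at $p^1$. This illustrates that the change in Consumer Surplus
cannot be used as a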

welfare measure for this consumer. The reason is that the demand
functions have income effects, and therefore, the change in consumer
surplus is different from the true welfare measures, the Equivalent
Variation or Compensating Variation.
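The numbers in this example, and the dependence of consumer surplus on
the order of the price changes, can be reproduced with a short script.
The sketch below is an illustration, not part of the notes: it uses the
demand system of the example with shares 1/2, subsistence quantities 1/4
and 1/8 and $y = 1$, integrates Marshallian demand with a midpoint rule,
and compares the two possible orders of changing the prices. All function
names are hypothetical, and `math.prod` assumes Python 3.8 or later.

```python
import math

ALPHA = (0.5, 0.5)     # exponents of the example
GAMMA = (0.25, 0.125)  # subsistence quantities of the example
Y = 1.0


def demand(i, p):
    """Marshallian demand for good i in the example's demand system."""
    discretionary = Y - sum(pk * gk for pk, gk in zip(p, GAMMA))
    return GAMMA[i] + ALPHA[i] * discretionary / p[i]


def indirect_utility(p):
    return math.prod((demand(i, p) - GAMMA[i]) ** ALPHA[i] for i in range(2))


def integrate(func, a, b, n=20000):
    """Midpoint rule; returns a signed value if b < a."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def cs_change(p_start, p_end, order):
    """Minus the summed areas under Marshallian demand, changing prices in `order`."""
    p = list(p_start)
    total = 0.0
    for i in order:
        def demand_along_price(z, i=i, base=tuple(p)):
            q = list(base)
            q[i] = z
            return demand(i, q)
        total -= integrate(demand_along_price, p_start[i], p_end[i])
        p[i] = p_end[i]
    return total


p0, p1 = (2.0, 1.0), (1.0, 2.5)
cs_good1_first = cs_change(p0, p1, order=(0, 1))  # order used in the text
cs_good2_first = cs_change(p0, p1, order=(1, 0))  # reverse order
```

With good 1 changed first, the script reproduces the value of about
-.0091 from the text. Changing the price of good 2 first gives a positive
value instead, so even the sign of the consumer surplus change depends on
the order, while indirect utility is unambiguously higher at $p^1$.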

1.9. Grouping and Separability


The need to simplify the analysis both in theoretical and in em-
pirical work often requires that we consider grouping of goods and
analysing their demand in isolation from the rest of the consumer prob-
lem. For example, we may want to analyse the market for telecommuni-
cation services independently from the market for fruit and vegetables.
This seems a reasonable simplification but we want to understand if we
can develop a model where we see the assumptions needed for such a
simplification. It turns out that we can either justify the simplification
using assumptions about price movements or by assumptions about
preferences. We will turn to this topic at the end of this section. To
prepare the analysis, we first consider two classes of preferences, that
are also frequently used in applications.

1.9.1. Quasilinear Preferences. Quasilinearity requires that
indifference curves all have the same shape, in the sense of being
translated versions of each other. In this case there exists one good $j$ such
that indifference between any two bundles $x^0$ and $x^1$ is not affected if
the same amount of $j$ is added to both bundles, that is,
\[
x^0 \sim x^1 \iff \forall z_j : x^0 + (z_j, 0) \sim x^1 + (z_j, 0).
\]
In terms of the utility function, preferences are quasilinear if and
only if $u(x) = \phi(v(x_{-j}) + x_j)$ for some sub-utility function $v : \mathbb{R}^{n-1}_+ \to \mathbb{R}$
and a strictly increasing function $\phi : \mathbb{R} \to \mathbb{R}$. Most of the time we
will use the standard specification $u(x) = v(x_{-j}) + x_j$ for which utility
is measured in units of good $j$. We should bear in mind, however, that
utility is still ordinal.
For quasilinear preferences, income expansion paths are straight
lines parallel to the $j$th axis and quantities of all goods except the $j$th
good are independent of $y$ given $p$, provided that the fixed quantities
of the goods in question remain affordable. This means that for all
goods $i \neq j$, we have $\partial f_i(y, p)/\partial y = 0$. In particular, this implies (using
the Slutsky equation) that Hicksian and Marshallian demands react
to price changes in the same way because there are no income effects.
This justifies the use of Marshallian Consumer Surplus as a welfare
measure in models with quasi-linear preferences as we have seen in the
previous section.
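
As a small illustration of the absence of income effects, consider the hypothetical specification $u(x_1, x_2) = 2\sqrt{x_1} + x_2$ (this utility function is an assumption made for the example, it does not appear elsewhere in these notes). The sketch below computes Marshallian demand, including the corner case in which the interior quantity of good 1 is not affordable.

```python
def quasilinear_demand(y, p1, p2):
    """Demand for u(x1, x2) = 2*sqrt(x1) + x2 (hypothetical example)."""
    x1_interior = (p2 / p1) ** 2          # from MRS = 1/sqrt(x1) = p1/p2
    if p1 * x1_interior <= y:             # interior quantity is affordable
        return x1_interior, (y - p1 * x1_interior) / p2
    return y / p1, 0.0                    # corner: all income spent on good 1

for income in (0.5, 2.0, 5.0, 10.0):
    print(income, quasilinear_demand(income, p1=1.0, p2=1.0))
```

Once income is large enough to afford the interior quantity of good 1, every additional unit of income is spent on good 2.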
There is one caveat to this conclusion: in typical models with
quasi-linear preferences, e.g. applications in IO, one analyses a small group of
commodities $x_{-j}$ and treats $x_j$ as a basket of all remaining goods
rather than a single good, or as the amount of money spent on all other
goods. In the following we will see what assumptions on the
underlying utility function justify this. An important property towards this
justification is homotheticity.
1.9.2. Homothetic Preferences. Preferences are homothetic if
indifference is invariant to scaling up consumption bundles: $x^0 \sim x^1$
implies $\lambda x^0 \sim \lambda x^1$ for any $\lambda > 0$. This imposes no restriction on the
shape of any one indifference curve considered in isolation but implies
that all indifference curves have the same shape in the sense that those
further out are magnified versions from the origin of those further in.
As a consequence, marginal rates of substitution are constant along
rays through the origin.
This implies that the income expansion paths are rays through the
origin: Ratios between chosen quantities are independent of $y$ given
$p$ and hence $f(y, p)$ is a linear function of income. Consequently, the
budget shares of all commodities are independent of income.
Homotheticity clearly holds if the utility function is homogeneous of
degree one: $u(\lambda x) = \lambda u(x)$ for $\lambda > 0$. In fact, up to increasing
transformation, this is the only class of utility functions which give homothetic
preferences, i.e., preferences are homothetic iff $u(x) = \phi(\psi(x))$ where
$\psi(\lambda x) = \lambda \psi(x)$ for $\lambda > 0$ and $\phi$ is strictly increasing.
Given that homothetic preferences lead to Marshallian demand that
is linear in income we can define a commodity basket $\alpha(p) = f(1, p)$
(which depends on the whole price vector) and write Marshallian
demand as
\[
f(y, p) = y f(1, p) = y\,\alpha(p).
\]
If we use the homogeneous utility representation of preferences, which
necessarily exists, then doubling $y$ doubles demands and utility. Thus,
the corresponding indirect utility function is also linear in $y$:
\[
v(y, p) = u(f(y, p)) = u(y\,\alpha(p)) = y\,u(\alpha(p)) = \frac{y}{a(p)},
\]
where we set $a(p) = 1/u(\alpha(p))$. Note that the price of $\alpha(p)$ is one, so
the consumer gets $u(\alpha(p))$ units of utility for an expenditure of one.
The price of one unit of utility is therefore $a(p) = 1/u(\alpha(p))$. So we
can interpret the price index $a(p)$ as the amount of money needed to
“buy one unit of utility” if the price vector is $p$. The indirect utility
function depends on $p$ only through the one-dimensional “price index”
$a(p)$ rather than the whole price vector. If we solve $u = y/a(p)$ for $y$
we get the expenditure function
\[
e(u, p) = u\,a(p).
\]
Again we see that we can interpret the price index $a(p)$ as the amount
of money needed to “buy one unit of utility” if the price vector is $p$.
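
The relation between the basket $\alpha(p)$, the price index $a(p)$, indirect utility and expenditure can be illustrated numerically. The sketch below assumes a Cobb-Douglas utility function $u(x) = x_1^{0.3} x_2^{0.7}$, which is homogeneous of degree one; the functional form, the weight 0.3 and the function names are illustrative choices, not part of the notes.

```python
def utility(x, a=0.3):
    """Cobb-Douglas utility, homogeneous of degree one (assumed example)."""
    return x[0] ** a * x[1] ** (1 - a)

def basket(p, a=0.3):
    """alpha(p) = f(1, p): Cobb-Douglas demand at income 1."""
    return (a / p[0], (1 - a) / p[1])

def price_index(p, a=0.3):
    """a(p) = 1 / u(alpha(p)): cost of one unit of utility."""
    return 1.0 / utility(basket(p, a), a)

def demand(y, p, a=0.3):
    """f(y, p) = y * alpha(p)."""
    return tuple(y * q for q in basket(p, a))

p, y = (2.0, 5.0), 40.0
v = utility(demand(y, p))
print(v, y / price_index(p))     # indirect utility, computed in two ways
print(v * price_index(p))        # expenditure e(v, p), recovers the income y
```

Both ways of computing indirect utility agree up to rounding, and multiplying the utility level by the price index returns the original income.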
Homothetic preferences play an important role in justifying models
with quasi-linear preferences in which $x_j$ is a basket of goods that is
not of interest to the modeller. We will see this below.
1.9.3. Separability. Suppose that we partition the consumption
bundle $x$ into groups $x_{I_1}, x_{I_2}, x_{I_3}, \dots$, where each $I_k$ is a subset of
goods, all goods are in one of the subsets: $\bigcup_{k=1,2,\dots} I_k = I$, and for $k \neq \ell$
we have no overlap: $I_k \cap I_\ell = \emptyset$. In applications we often assume that
preferences over sub-bundles for a group are independent or separable
from consumption in the other groups, and this separability can take
various forms. We have already seen a very strong form of separability
in the case of quasi-linear preferences, where one commodity ($j$) enters
the utility function additively and linearly. The additivity implies that
marginal rates of substitution $MRS_{ik}$ between the commodities $i, k \neq j$
are independent of the consumption level $x_j$. On top of this, the
linearity in $x_j$ implies that the marginal rate of substitution between
any good $i \neq j$ and $j$ is independent of $x_j$. This latter property implies
that if we start with a budget where $f_j(y, p) > 0$, then an increase in
income at constant prices is absorbed completely by good $j$ because a
change of any $x_i$, $i \neq j$, would imply that $MRS_{ij}$ no longer equals the
price ratio $p_i/p_j$.¹⁵
We now look at weaker notions of separability that retain some but
not all of these properties.
1.9.3.1. Weak separability. We say that the group $I_1$ is a weakly
separable group if preferences over $x_{I_1}$ are independent of the quantities
in other groups. Formally, weak separability requires that if we have
for $x^0_{I_1}, x^1_{I_1}$ that
\[
(x^0_{I_1}, x_{I_2}, \dots) \succsim (x^1_{I_1}, x_{I_2}, \dots),
\]
then
\[
(x^0_{I_1}, x'_{I_2}, \dots) \succsim (x^1_{I_1}, x'_{I_2}, \dots)
\]
for all $x'_{I_2}, \dots$. In words, if two bundles $x^0$ and $x^1$ differ only in the
consumption levels of the first group $I_1$, then changing the consumption
bundles in the groups $I_k$, $k \neq 1$ (in the same way for both $x^0$ and
$x^1$) does not change the preferences between $x^0$ and $x^1$.
Stated in terms of the utility function, weak separability requires
that $u(\cdot)$ has the structure
\[
u = \phi(\psi_{I_1}(x_{I_1}), x_{I_2}, \dots)
\]
for some within-group utility function $\psi_{I_1}(\cdot)$. Note that this implies the
MRS between goods $i, j \in I_1$ is independent of quantities of goods
outside the group:
\[
\left.\frac{dx_i}{dx_j}\right|_{u=\text{const.}} = -\frac{\partial \psi_{I_1}/\partial x_j}{\partial \psi_{I_1}/\partial x_i}.
\]
This is not true for the MRS between $i, j$ if at least one of $i$ and $j$ is
not in $I_1$.

¹⁵ You might wonder what would happen if $u$ is also linear in $i$, e.g., $u(x) =
v(x_{-ij}) + a x_i + x_j$. In this case we can only have $f_j(y, p) > 0$ if $a/p_i \le 1/p_j$,
because otherwise the consumer could substitute $i$ for $j$ until $x_j = 0$ and increase
his utility (because $MRS_{ij}$ differs from the price ratio). But if $a/p_i \le 1/p_j$, by the
same logic, it is still optimal that $x_j$ absorbs all additional income.

A second implication is that the $MRS_{ij}$ between two goods $i, j \notin I_1$
depends on the consumption in the weakly separable group only
through the level of $\psi_{I_1}(x_{I_1})$. It does not change with the composition
of $x_{I_1}$ as long as $\psi_{I_1}(x_{I_1})$ is unchanged. For example, if fruit and
vegetables are a weakly separable group, then there may be different
baskets of fruit and vegetables that yield the same subutility, and the
preferences over other commodities do not depend on which of these
baskets the consumer consumes (as long as the subutility for fruit and
vegetables is the same).
1.9.3.2. Strong separability. Next, suppose that utility can be
written in an additive form
\[
u = \psi_{I_1}(x_{I_1}) + \psi_{I_2}(x_{I_2}) + \psi_{I_3}(x_{I_3}) + \dots
\]
It is immediately clear that with this utility function, every group is a
weakly separable group. But we can also combine arbitrary groups to
obtain new weakly separable groups.¹⁶ For example if $\tilde I = I_i \cup I_j \cup
I_k \cup \dots$ and $\tilde J = \{\ell \notin \tilde I\}$, strong separability implies that if
\[
(x^0_{\tilde I}, x_{\tilde J}) \succsim (x^1_{\tilde I}, x_{\tilde J}) \tag{1.9.1}
\]
then
\[
\forall x'_{\tilde J} : (x^0_{\tilde I}, x'_{\tilde J}) \succsim (x^1_{\tilde I}, x'_{\tilde J}). \tag{1.9.2}
\]
Note that $\tilde I$ can be an arbitrary union of groups, but a group cannot be
split. The condition on preferences stated here is a necessary condition.
If there are more than three groups $I_k$, then the converse also holds, i.e.,
the condition is also sufficient. Preferences which admit an additively
separable utility representation are called strongly separable, additively
separable or block-additive.
1.9.4. Composite Commodities. Finally, we consider
assumptions on preferences that allow us to use a utility specification often
used in practice. In particular it allows us to analyse consumption for one
group, neglecting demand for and relative prices of the goods outside
that group. Suppose we divide all goods into two groups $I$ and $J$ and
we want to analyse demand for group $I$ (this could be a single
commodity or many goods). In applications, it is often assumed that the goods
in group $J$ can be aggregated into a composite commodity, say $\hat x_J \in \mathbb{R}$,
with a price index $\hat p_J \in \mathbb{R}$. Note that these are one-dimensional
variables so we need to find a model that allows us to represent the whole
consumption bundle $x_J$ and its price vector $p_J$ by single-dimensional
variables.

¹⁶ Note that this is still weaker than the quasi-linearity assumption which implies
that $MRS_{ij}$ is independent of $x_j$, one of the goods for which we calculate the MRS.
Strong separability only implies that $MRS_{ij}$ is independent of all $x_k$ if $k$ does not
belong to the same groups as $i$ or $j$, in particular we cannot have $j = k$.
The goal is then to look at a simplified problem. The general
utility maximization problem without the composite commodity can be
written as
\[
\max_{(x_I, x_J)} u(x_I, x_J) \quad \text{s.t.} \quad p_I^\top x_I + p_J^\top x_J \le y. \tag{UMP}
\]
Here, we have written $x_I$ and $x_J$ separately to highlight the two groups
but the problem is the same as the standard utility maximization
problem with $u(x)$ in the objective and $p^\top x \le y$ in the constraints.
A first simplification that uses the composite commodity is the
following:
\[
\max_{(x_I, \hat x_J)} \hat u(x_I, \hat x_J) \quad \text{s.t.} \quad p_I^\top x_I + \hat p_J \hat x_J \le y. \tag{$\widehat{\mathrm{UMP}}$}
\]
Note that the composite commodity enters the utility function and
we don’t impose any more structure except that we can “hide” the
consumption vector $x_J$ behind the composite commodity and its price
vector $p_J$ behind the price index. The first goal of this section is to
show that under some assumptions, it is indeed possible to simplify
(UMP) and analyse ($\widehat{\mathrm{UMP}}$), without changing the optimal bundle $x_I$
for the groups of goods that we are interested in.

Proposition 2. If $J$ is a weakly separable group, and
preferences for group $J$ are homothetic, then there exist a composite
commodity $\hat x_J$ and a utility function $\hat u$ such that the optimal bundle $\hat x_I^*$
in the problem ($\widehat{\mathrm{UMP}}$) coincides with the optimal bundle $x_I^*$ in
the problem (UMP).

Proof. Weak separability implies that we may use a utility
representation of the form
\[
u(x_I, x_J) = v(x_I, \psi(x_J)).
\]
Homothetic preferences imply that we can take $\psi$ to be homogeneous
of degree one.¹⁷ Therefore we can write (UMP) as
\[
\max_{(x_I, x_J)} v(x_I, \psi(x_J)) \quad \text{s.t.} \quad p_I^\top x_I + p_J^\top x_J \le y. \tag{1.9.3}
\]
Imagine that instead of solving this problem directly, we proceed
in two steps:
(1) Suppose we have already decided the two budget shares for
groups $I$ and $J$, which we denote by $y_I$ and $y_J$. Given $y_J$
and price $p_J$, we can determine optimal demand for group $J$
which we denote by $f^J(y_J, p_J)$. (Since $J$ is a weakly separable
group, this does not depend on the consumption bundle for group $I$.)
Inserting this into the sub-utility function we obtain
\[
v_J(y_J, p_J) = \psi(f^J(y_J, p_J)),
\]
and we call $v_J(y_J, p_J)$ the indirect sub-utility function for
group $J$.
(2) In the second step, we replace the sub-utility function in (1.9.3)
by the indirect sub-utility function and the expenditure for
group $J$, which is $p_J^\top x_J$, by $y_J$. This yields the problem:
\[
\max_{(x_I, y_J)} v(x_I, v_J(y_J, p_J)) \quad \text{s.t.} \quad p_I^\top x_I + y_J \le y.
\]
Solving this yields the optimal bundle for group $I$ and the
budget share for group $J$.
This approach is called “two-stage budgeting”.¹⁸
Now let us look closer at Step 1 outlined above. For given $y_J$ and
price $p_J$, we solve the following problem:
\[
\max_{x_J} \psi(x_J) \quad \text{s.t.} \quad p_J^\top x_J \le y_J.
\]

¹⁷ To be fully precise, homothetic preferences imply that there is some strictly
increasing function $\varphi$ and a linear-homogeneous function $\hat\psi$ such that
\[
\psi(x_J) = \varphi(\hat\psi(x_J)).
\]
But then we can use
\[
w(x_I, \hat\psi(x_J)) = v(x_I, \varphi(\hat\psi(x_J)))
\]
instead of $v$. Now the second argument of $w$ is the linear-homogeneous function
$\hat\psi$. Since we can always switch to $w$ and $\hat\psi$, for any $v$ and $\psi$ we started with, it is
without loss to use a utility representation where the sub-utility function for group
$J$ is linear-homogeneous.

¹⁸ It is important to note that we apply Step 1 only to group $J$. To determine
the optimal bundle for group $I$, we need to know more than the budget share $y_I$ and
the prices $p_I$. Since $I$ is not assumed to be a weakly separable group, $MRS_{i,k}$ for
$i, k \in I$ also depends on $x_J$ through $\psi(x_J)$. So we cannot solve for $x_I$ completely
independently of $x_J$. If on the other hand, we had more than one weakly separable
group we could apply step one for each group separately and then proceed with step
two to determine all budget shares. (This will be used in the third demonstration
exercise.)

Remember that homothetic preferences imply that the optimal
bundle is linear in the total budget:
\[
f^J(y_J, p_J) = y_J\,\alpha(p_J),
\]
where $\alpha(p_J)$ is the consumption basket chosen if $y_J = 1$. Furthermore,
recall that the indirect subutility function is given by
\[
v_J(y_J, p_J) = \psi(y_J\,\alpha(p_J)) = y_J\,\psi(\alpha(p_J)) = \frac{y_J}{\hat p_J},
\]
where we have used that $\psi$ is homogeneous of degree one for the
second equality. For the last equality we define the price index $\hat p_J =
1/\psi(\alpha(p_J))$ which is the expenditure needed to get one unit of
sub-utility for given prices $p_J$. We can therefore define the composite
commodity as a basket with the same composition as $\alpha(p_J)$, scaled
to a size that yields exactly one unit of sub-utility. $\hat x_J = y_J/\hat p_J$ is the
number of units of the composite commodity that the consumer buys
if the budget share is $y_J$.
In Step 2 we solve
\[
\max_{(x_I, y_J)} v\!\left(x_I, \frac{y_J}{\hat p_J}\right) \quad \text{s.t.} \quad p_I^\top x_I + y_J \le y.
\]
Using $\hat x_J = y_J/\hat p_J$ we get $v(x_I, \hat x_J)$ and $p_I^\top x_I + \hat x_J \hat p_J \le y$ so we have
obtained a problem of the form ($\widehat{\mathrm{UMP}}$). This completes the proof of
the Proposition. □
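
The two-stage procedure can be checked on a simple example. The sketch below assumes a hypothetical three-good economy with $I = \{1\}$, $J = \{2, 3\}$, a Cobb-Douglas sub-utility $\psi(x_2, x_3) = x_2^{B} x_3^{1-B}$ and $v(x_1, \psi) = x_1^{A}\psi^{1-A}$. The weights A and B and all function names are illustrative assumptions. The function `direct_demand` uses the well-known closed-form Cobb-Douglas solution of (UMP), while `two_stage_demand` follows Steps 1 and 2 of the proof.

```python
A, B = 0.4, 0.25   # hypothetical Cobb-Douglas weights

def psi(xJ):
    """Linear-homogeneous sub-utility for group J = {2, 3}."""
    return xJ[0] ** B * xJ[1] ** (1 - B)

def direct_demand(y, p1, pJ):
    """Closed-form solution of (UMP) for u = x1**A * psi(x2, x3)**(1 - A)."""
    return (A * y / p1,
            (1 - A) * B * y / pJ[0],
            (1 - A) * (1 - B) * y / pJ[1])

def two_stage_demand(y, p1, pJ):
    # Step 1: basket chosen within group J when the group budget is y_J = 1.
    alpha = (B / pJ[0], (1 - B) / pJ[1])
    p_hat = 1.0 / psi(alpha)               # price index of the composite good
    # Step 2: Cobb-Douglas problem in (x1, x_hat) with prices (p1, p_hat).
    x1 = A * y / p1
    x_hat = (1 - A) * y / p_hat
    yJ = p_hat * x_hat                     # budget allocated to group J
    return (x1, yJ * alpha[0], yJ * alpha[1])

print(direct_demand(100.0, 2.0, (1.0, 4.0)))
print(two_stage_demand(100.0, 2.0, (1.0, 4.0)))
```

Both functions return the same bundle up to rounding, which is the content of Proposition 2 for this particular specification.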

We often see further simplifications in applications. The strongest
simplification is to work with a problem of the form
\[
\max_{(x_I, \tilde y_J)} \psi_I(x_I) + \tilde y_J \quad \text{s.t.} \quad \tilde p_I^\top x_I + \tilde y_J \le \tilde y \tag{$\widetilde{\mathrm{UMP}}$}
\]
instead of ($\widehat{\mathrm{UMP}}$). This requires stronger assumptions:
Proposition 3. If preferences are additively separable with
\[
u(x) = \psi_I(x_I) + \psi_J(x_J), \tag{1.9.4}
\]
and $\psi_J$ is homogeneous of degree one, then there exists a composite
commodity $\hat x_J$ such that the optimal bundle $\tilde x_I^*$ in the problem ($\widetilde{\mathrm{UMP}}$)
coincides with the optimal bundle $x_I^*$ in the problem (UMP).
Note that we have made the stronger assumption of additive
separability instead of weak separability. Moreover, it is not enough to
assume that preferences for group $J$ are homothetic. This would only
yield a simplified problem of the form
\[
\max_{(x_I, \tilde y_J)} \psi_I(x_I) + \varphi(\tilde y_J) \quad \text{s.t.} \quad \tilde p_I^\top x_I + \tilde y_J \le \tilde y.
\]
We need to make the additional assumption that the function $\psi_J(x_J)$
in the additive utility representation (1.9.4) is linear-homogeneous.¹⁹
Step 1 is unchanged so we can directly proceed to Step 2. With
the additive separability, Step 2 involves the maximization problem
\[
\max_{(x_I, y_J)} \psi_I(x_I) + \frac{y_J}{\tilde p_J} \quad \text{s.t.} \quad p_I^\top x_I + y_J \le y.
\]
To bring this into the form ($\widetilde{\mathrm{UMP}}$), we divide both sides of the budget
constraint by the price index $\tilde p_J$:
\[
\max_{(x_I, y_J)} \psi_I(x_I) + \frac{y_J}{\tilde p_J} \quad \text{s.t.} \quad \left(\frac{p_I}{\tilde p_J}\right)^{\!\top} x_I + \frac{y_J}{\tilde p_J} \le \frac{y}{\tilde p_J}.
\]
Finally we set $\tilde y_J = y_J/\tilde p_J$, $\tilde p_I = p_I/\tilde p_J$, $\tilde y = y/\tilde p_J$ which yields ($\widetilde{\mathrm{UMP}}$).
In summary, we have used strong separability and homothetic
preferences to justify a separation that allows us to disregard individual
prices for commodities in $J$ and instead consider a composite
commodity and a price index.
This sort of preference structure justifies common empirical simpli-
fications such as
• analysing spending on one period’s demands as a function of
that period’s budget and prices ignoring patterns of spending
in other periods
• analysing spending on commodity demands as a function of
total commodity spending and prices ignoring decisions on
allocation of time
• analysing spending on non-durable goods as a function of
non-durable spending and prices ignoring patterns of spending on
durable goods
• and many others.
1.10. Market Demand and Aggregation
So far we have considered a single consumer and derived properties
of demand from the assumption that the consumer chooses a bundle
that is optimal given her preference relation.
An important question is whether the structure we derived for
individual demand functions carries over to market demand. For
concreteness, consider a situation in which $K$ consumers have budgets
$B(y^1, p), \dots, B(y^K, p)$. Notice that the total budget $y^k$ may be
different for different consumers but all consumers face the same prices.
Let us also maintain our assumption that the price vector is
exogenously given and consumers can buy unlimited quantities at constant
prices. The Marshallian demand for an individual consumer is denoted
by $f^k(y^k, p)$. Note that demands for two consumers $k \neq \ell$ may be
different even if $y^k = y^\ell$, because they may have different preference
relations (or utility functions). Market demand for good $i$ is given by
\[
\bar f_i(y^1, \dots, y^K, p) = \sum_{k=1}^{K} f_i^k(y^k, p).
\]

¹⁹ To be precise, note that we cannot switch to a different utility representation
as in footnote 17 because this would destroy additive separability.
In general, market demand has a different structure than individual
demand. While individual demand is a function of a single budget $y^k$,
market demand depends on the budgets of all consumers $y^1, \dots, y^K$.
The first question we want to answer is under what conditions we
can assume that market demand only depends on the aggregate income
$\bar y = \sum_{k=1}^{K} y^k$ and prices $p$ so that we can write
\[
\bar f_i(y^1, \dots, y^K, p) = \bar f_i(\bar y, p). \tag{1.10.1}
\]
The second question is whether market demand, if it can be written
as a function of aggregate income only, has the same properties as the
demand function of an individual. If (1.10.1) holds for all budgets $y^1, \dots, y^K$
and all price vectors, we say exact aggregation holds.
We start with the first question. If (1.10.1) holds, the market
demand for good $i$ does not depend on the income distribution. For
example, suppose we increase the income of one consumer $k$ by £1 and
decrease the income of another consumer $\ell \neq k$ by £1. Then if (1.10.1)
holds, market demand must be unchanged since the aggregate income $\bar y$
is unchanged. This implies that income effects must be constant across
individuals and wealth levels. More formally, this means that for each
good $i$ and for all $k, \ell = 1, \dots, K$ and all $y^k, y^\ell > 0$:
\[
\frac{\partial f_i^k(y^k, p)}{\partial y^k} = \frac{\partial f_i^\ell(y^\ell, p)}{\partial y^\ell} = \beta_i(p). \tag{1.10.2}
\]
Since $f_i^k(0, p) = 0$, this implies that demand must be linear in income:
\[
f_i^k(y^k, p) = f_i^k(0, p) + \int_0^{y^k} \frac{\partial f_i^k(z, p)}{\partial y^k}\,dz = \int_0^{y^k} \beta_i(p)\,dz = y^k \beta_i(p).
\]
We have seen that linear Marshallian demand is obtained if preferences
are homothetic. If we want to obtain linear demand for all goods so
that individual demand can be aggregated to a market demand
function $\bar f_i(\bar y, p)$ for all markets with an unrestricted range of incomes $y^k$,
homothetic preferences are also necessary (which is not proven here).
Moreover, notice that in (1.10.2), the functions $\beta_i(p)$ must be the same
for all consumers to obtain a market demand function. This answers
the first question and it is easy to see that once we assume that all
consumers have identical homothetic preferences, the market demand
function can be derived from these preferences. Therefore, under these
assumptions we can model market demand as Marshallian demand of a
single representative agent. Needless to say, the assumption of identical
homothetic preferences for all consumers is empirically not plausible.
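
A quick numerical illustration of exact aggregation, assuming Cobb-Douglas consumers (the budget shares and incomes below are made-up numbers): with identical homothetic preferences, a redistribution of income that leaves aggregate income unchanged does not affect market demand, whereas with different preferences it does.

```python
def cobb_douglas_demand(y, p, shares):
    """Individual Marshallian demand with constant budget shares."""
    return [s * y / q for s, q in zip(shares, p)]

def market_demand(incomes, p, share_list):
    """Sum of individual demands, good by good."""
    total = [0.0] * len(p)
    for y, shares in zip(incomes, share_list):
        for i, x in enumerate(cobb_douglas_demand(y, p, shares)):
            total[i] += x
    return total

p = (1.0, 2.0)
identical = [(0.5, 0.5), (0.5, 0.5)]   # same preferences for both consumers
different = [(0.5, 0.5), (0.2, 0.8)]   # consumer 2 has different budget shares

# Redistribute 10 units of income from consumer 1 to consumer 2.
before, after = (60.0, 40.0), (50.0, 50.0)

print(market_demand(before, p, identical), market_demand(after, p, identical))
print(market_demand(before, p, different), market_demand(after, p, different))
```

The first line prints the same market demand twice; in the second line market demand changes although aggregate income is 100 in both cases.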
The question of aggregation has many aspects which are not
covered in this introductory course. For example, one can get slightly
weaker conditions under which an aggregate demand function of the
form (1.10.1) exists if the range of possible income levels for each
consumer is restricted. If we only want to model market demand for a
range of incomes $y > \underline{y}$ that are strictly positive, we could set an
intercept $f_i^k(0, p) = \alpha_i^k(p)$ (positive or negative) which may be different for
different consumers and work with a more general demand specification
\[
f_i^k(y^k, p) = \alpha_i^k(p) + y^k \beta_i(p)
\]
as long as $f_i^k(y, p) > 0$. A good starting point on the topic is Deaton
and Muellbauer (1980).

1.11. Demand with Endowments


Until now we have assumed consumers’ budgets are in the form
of endowed nominal income and that their nominal incomes do not
therefore change as prices change. If consumers have endowments of
goods then this is no longer true. The exchangeable values of these
endowments change with prices. Suppose the endowment is $\omega \in \mathbb{R}^n_+$.
The budget constraint becomes $y + p^\top \omega \ge p^\top x$ or equivalently $y \ge p^\top z$
where $z = x - \omega$ denotes the net demands or excess demands. (The
assumption that leads to this constraint is that consumers can freely
buy and also sell at constant prices $p$. This just extends the assumption
about market conditions that we made before.)
Quantities consumed will be $x = f(y + p^\top \omega, p)$. Let $\zeta(y, p, \omega) =
f(y + p^\top \omega, p) - \omega$ denote the Marshallian net demand function.
Differentiating and using the Slutsky equation for demand without
endowments, we obtain a version of the Slutsky equations for demand with
endowments:
\begin{align*}
\frac{\partial \zeta_i}{\partial p_j} &= \frac{\partial f_i}{\partial p_j} + \frac{\partial f_i}{\partial y}\,\omega_j \\
&= \frac{\partial f_i}{\partial p_j} + \frac{\partial f_i}{\partial y}\,(x_j - z_j) \\
&= \frac{\partial f_i}{\partial p_j} + \frac{\partial f_i}{\partial y}\,x_j - \frac{\partial \zeta_i}{\partial y}\,z_j \\
&= \frac{\partial g_i}{\partial p_j} - \frac{\partial \zeta_i}{\partial y}\,z_j \qquad \text{(using Slutsky Eq.)} \\
&= \frac{\partial g_i}{\partial p_j} - \frac{\partial \zeta_i}{\partial y}\,(x_j - \omega_j).
\end{align*}
For the second and the last line we have used $\omega = x - z$ and for the
third line we have used $\partial \zeta_i(y, p, \omega)/\partial y = \partial f_i(y + p^\top \omega, p)/\partial y$.
Note that this has the same form as the conventional Slutsky
equation but expressed in terms of net demands rather than gross demands.
The direction of the income effect now depends upon whether the
individual is a net buyer or seller of the good which changes in price (that is
to say, whether $z_j > 0$ or $z_j < 0$). If the individual is a net seller of
good $i$ ($z_i < 0$), then the uncompensated demand curve need not slope
down for a normal good; in fact income and substitution effects will
be opposing. We can only conclude that the “law of demand” holds,
i.e., demand of good $i$ is decreasing in $p_i$, if either $i$ is a normal good
and the consumer is a net buyer of good $i$, or if $i$ is an inferior good
and the consumer is a net seller of good $i$. In these cases the income
effect and the substitution effect have the same sign.
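
The Slutsky equation with endowments can be verified numerically. The sketch below assumes Cobb-Douglas preferences $u(x) = x_1^{A} x_2^{1-A}$ with a hypothetical weight $A = 0.5$ and approximates the price derivatives by central finite differences. In the example, the consumer has no nominal income and is endowed only with good 1, so she is a net seller of a normal good.

```python
A = 0.5  # hypothetical Cobb-Douglas weight on good 1

def net_demand_1(y, p, omega):
    """Marshallian net demand zeta_1 = f_1(y + p.omega, p) - omega_1."""
    wealth = y + p[0] * omega[0] + p[1] * omega[1]
    return A * wealth / p[0] - omega[0]

def hicksian_1(u, p):
    """Compensated demand for good 1 with u(x) = x1**A * x2**(1 - A)."""
    return u * (A * p[1] / ((1 - A) * p[0])) ** (1 - A)

def slutsky_terms(y, p, omega, eps=1e-6):
    """Return (d zeta_1/d p_1, substitution effect, income effect)."""
    wealth = y + p[0] * omega[0] + p[1] * omega[1]
    x = (A * wealth / p[0], (1 - A) * wealth / p[1])
    u = x[0] ** A * x[1] ** (1 - A)
    up, down = (p[0] + eps, p[1]), (p[0] - eps, p[1])
    total = (net_demand_1(y, up, omega) - net_demand_1(y, down, omega)) / (2 * eps)
    substitution = (hicksian_1(u, up) - hicksian_1(u, down)) / (2 * eps)
    income = -(A / p[0]) * (x[0] - omega[0])   # -(d zeta_1/dy) * z_1
    return total, substitution, income

total, substitution, income = slutsky_terms(0.0, (1.0, 1.0), (10.0, 0.0))
print(total, substitution, income)
```

In this particular example the (negative) substitution effect and the (positive) income effect cancel, so net demand does not respond to the own price at all.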

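Extending the conventional notions of indirect utility and expenditure
functions, define
\[
\tilde v(y, p, \omega) = v(y + p^\top \omega, p) = \max_x u(x) \quad \text{s.t.} \quad p^\top x \le y + p^\top \omega
\]
and
\[
\tilde e(u, p, \omega) = e(u, p) - p^\top \omega = \min_x p^\top (x - \omega) \quad \text{s.t.} \quad u(x) \ge u.
\]
These functions are related to the net demands by relationships similar to
the conventional Roy’s identity and Shephard’s lemma,
\[
-\frac{\partial \tilde v(y, p, \omega)/\partial p_i}{\partial \tilde v(y, p, \omega)/\partial y} = \zeta_i(y, p, \omega)
\]
and
\[
\frac{\partial \tilde e(u, p, \omega)}{\partial p_i} = g_i(u, p) - \omega_i.
\]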

1.12. Labour supply


Labour supply choice is a key example of demand with endowments.
We treat this as a two good choice problem where the consumer has an
endowment of one of the goods, time. The second good is a composite
commodity that represents all commodities (except “time”).
Combining all commodities into a single good $c$ with a price $p$ can
be justified by invoking homothetic (weak) separability of goods from
leisure. The consumer has an endowment $T$ of time of which $\ell$ is
consumed (sometimes referred to as leisure) and $h = T - \ell$ is sold to
the market at a price of $w$ (the wage rate). Note that the consumer
is always a net seller of time since she can enjoy at most $T$ hours of
leisure. The consumer possibly also has an endowment of unearned
income $y$ so the budget constraint is
\[
w\ell + pc \le y + wT.
\]
The value of total available resources $y + wT$ is sometimes called full
income. Equivalently, the value of earned and unearned income needs
to be enough to cover the value of consumption,
\[
pc \le y + wh.
\]
If $y > 0$, we must impose the additional constraint that $\ell \le T$. The
individual cannot enjoy more leisure than her endowment $T$. This
implies that the budget set is no longer linear. In the following, we
make the assumption that optimal choice of leisure and consumption
leads to $\ell < T$.²⁰ Utility is $u(c, \ell)$.
Uncompensated demand for time is
\[
\ell = f_\ell(y + wT, w, p) = \zeta_\ell(y, p, w, T) + T.
\]
To determine under what conditions labour supply is increasing in $w$
(or leisure is decreasing in $w$), we consider the Slutsky equation:
\begin{align*}
\frac{\partial \zeta_\ell}{\partial w} &= \frac{\partial f_\ell}{\partial w} + \frac{\partial f_\ell}{\partial y}\,T \\
&= \frac{\partial g_\ell}{\partial w} - \frac{\partial \zeta_\ell}{\partial y}\,(\ell - T).
\end{align*}
Note that the two terms have opposite sign if time is a normal good
since the consumer must be a net seller of time, $\ell - T \le 0$. Hence the
sign of the labour supply response to a wage change is unambiguous
only if leisure is an inferior good.
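If we let $\eta(y, p, w, T) = T - f_\ell(y + wT, w, p) = -\zeta_\ell(y, p, w, T)$ denote the uncompensated
labour supply function and $\gamma_\ell(u, p, w) = T - g_\ell(u, p, w)$ the Hicksian
labour supply function, then we have the corresponding Slutsky equation
for labour supply
\begin{align*}
\frac{\partial \zeta_\ell}{\partial w} &= \frac{\partial g_\ell}{\partial w} - \frac{\partial \zeta_\ell}{\partial y}\,(\ell - T) \\
\frac{\partial \zeta_\ell}{\partial w} &= \frac{\partial g_\ell}{\partial w} + \frac{\partial \zeta_\ell}{\partial y}\,h \\
-\frac{\partial \eta}{\partial w} &= -\frac{\partial \gamma_\ell}{\partial w} - \frac{\partial \eta}{\partial y}\,h \\
\frac{\partial \eta}{\partial w} &= \frac{\partial \gamma_\ell}{\partial w} + \frac{\partial \eta}{\partial y}\,h.
\end{align*}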
Letting
\[
\tilde v(y, p, w, T) = \max_{(c, \ell)} u(c, \ell) \quad \text{s.t.} \quad y + wT \ge w\ell + pc,
\]
then Roy’s identity for labour supply is given by
\[
\frac{\partial \tilde v(y, p, w, T)/\partial w}{\partial \tilde v(y, p, w, T)/\partial y} = \eta(y, p, w, T).
\]
Note that this equation for net supply rather than net demand has a
change of sign from the conventional Roy’s identity.

²⁰ If this was not the case, the optimal bundle would be the corner solution $\ell = T$
and $c = y/p$ and the analysis of the model becomes uninteresting. In other words
we are studying the intensive margin of labour supply rather than the extensive
margin. In this simple model, the extensive margin is not very interesting.
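
Roy’s identity for labour supply can be checked on an example. The following sketch assumes Cobb-Douglas utility $u(c, \ell) = c^{a}\ell^{1-a}$ with a hypothetical weight $a = 0.6$ and restricts attention to the interior case $\ell < T$; derivatives of the indirect utility function are approximated by central finite differences.

```python
def labour_supply(y, p, w, T, a=0.6):
    """Hours h = T - leisure for u(c, l) = c**a * l**(1 - a) (interior case)."""
    leisure = (1 - a) * (y + w * T) / w
    assert leisure < T, "corner solution: the consumer would not work"
    return T - leisure

def indirect_utility(y, p, w, T, a=0.6):
    """Utility at the optimal bundle, given full income y + w*T."""
    full_income = y + w * T
    c, leisure = a * full_income / p, (1 - a) * full_income / w
    return c ** a * leisure ** (1 - a)

def roy_ratio(y, p, w, T, eps=1e-6):
    """(dv/dw) / (dv/dy), approximated by central finite differences."""
    dv_dw = (indirect_utility(y, p, w + eps, T)
             - indirect_utility(y, p, w - eps, T)) / (2 * eps)
    dv_dy = (indirect_utility(y + eps, p, w, T)
             - indirect_utility(y - eps, p, w, T)) / (2 * eps)
    return dv_dw / dv_dy

y, p, w, T = 10.0, 1.0, 5.0, 16.0
print(labour_supply(y, p, w, T), roy_ratio(y, p, w, T))
print(labour_supply(y, p, 6.0, T) > labour_supply(y, p, w, T))
print(labour_supply(0.0, p, 5.0, T), labour_supply(0.0, p, 6.0, T))
```

For this specification labour supply is $aT - (1-a)y/w$: it rises with the wage when unearned income is positive, and it is independent of the wage when $y = 0$, because income and substitution effects then offset each other exactly.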

1.13. Inter-temporal choice


The inter-temporal nature of consumer choice can be dealt with by
treating the vectors of commodities consumed in different time periods,
$t = 0, 1, \dots, T$, as different bundles, $x_t = (x_{t1}, \dots, x_{tn})$. An
inter-temporal budget constraint then requires that the discounted present
value of the lifetime spending stream does not exceed the discounted present
value of the lifetime income stream, $y_t$, $t = 0, 1, \dots, T$:
\[
\sum_{t=0}^{T} \frac{p_t^\top x_t}{(1 + r)^t} \le \sum_{t=0}^{T} \frac{y_t}{(1 + r)^t} = Y,
\]
where $r$ is the market interest rate.²¹


On the one hand, it is natural in such a context to assume some
sort of separability of preferences across periods, since goods in
different periods are not consumed at the same time. On the other hand,
any such assumption restricts the role of memory and anticipation in
determining preferences, ruling out habits in consumption, for example.
Preferences are often assumed strongly inter-temporally separable:
\[
u = \sum_{t=0}^{T} \psi_t(x_t),
\]
for some concave functions $\psi_t(\cdot)$. Thus the MRS between goods
consumed in any two periods (or in the same period) is independent of
the quantities consumed in any third (other) period. To see the
restriction this imposes consider for example the consumption of tobacco: If
preferences are strongly inter-temporally separable, the MRS between
tobacco and other goods today is independent of past tobacco
consumption. This is not very plausible if tobacco is addictive.
If, furthermore, within period preferences are homothetic then the
problem of allocating spending across periods can be written as
\[
\max \sum_{t=0}^{T} v_t(c_t) \quad \text{s.t.} \quad \sum_{t=0}^{T} \frac{c_t\,a_t(p_t)}{(1 + r)^t} \le \sum_{t=0}^{T} \frac{y_t}{(1 + r)^t}
\]
for some concave functions $v_t(\cdot)$ and time-specific price indices $a_t(\cdot)$.²²
$c_t$ is the number of units of a basket of goods consumed in period $t$.
The inter-temporal choice problem now has the form of a problem of
demand with endowments where the goods are total consumptions in
each period and endowments are incomes in each period. Effects of
interest rate changes therefore depend upon whether the consumer is a
net seller or net buyer of these goods or, in other words, whether they
are a saver or a borrower.

²¹ Note that we assume that a consumer can borrow and lend at the same
interest rate. In real credit markets this is typically not the case for reasons that are
beyond the current model, such as default risk or agency problems. We abstract
from this complication in order to get some basic understanding of inter-temporal
choice. If a consumer is always a net borrower or net lender throughout her life,
the analysis applies with an interest rate equal to the rate at which she can borrow
or lend, respectively, even if these are not identical.
Frequently, in applications we see $v_t(c_t) = v(c_t)/(1 + \delta)^t$ for some
concave function $v(\cdot)$ (which of course comes at the price of an
additional assumption on preferences). The parameter $\delta > 0$ is a subjective
discount rate reflecting down-weighting of future utility relative to the
present and therefore capturing impatience. Assuming for simplicity
that prices $p_t$ are constant over time, then first order conditions for
inter-temporal choice require
\[
\frac{v'(c_t)}{v'(c_s)} = \left(\frac{1 + \delta}{1 + r}\right)^{t - s}.
\]
If $\delta = r$ then $c_t = c_s$ given concavity of $v(\cdot)$ so concavity can be seen
as capturing the desire to smooth the consumption stream. If $r > \delta$
then chosen consumption will follow a rising path with a steepness
determined by the degree of concavity in $v(\cdot)$.
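
To see how the first order condition shapes the consumption path, the sketch below assumes a CRRA specification $v(c) = c^{1-\sigma}/(1-\sigma)$, so that $v'(c) = c^{-\sigma}$, and normalizes the price index to one in every period. Both the functional form and the parameter values are assumptions made for illustration. The first order condition then gives $c_t = c_0\,g^t$ with $g = ((1+r)/(1+\delta))^{1/\sigma}$, and $c_0$ is pinned down by the budget constraint.

```python
def consumption_path(Y, r, delta, sigma, T):
    """Optimal path for v(c) = c**(1 - sigma) / (1 - sigma), price index 1.

    Y is the present value of lifetime income; periods run from 0 to T.
    """
    growth = ((1 + r) / (1 + delta)) ** (1 / sigma)   # from the Euler equation
    c0 = Y / sum((growth / (1 + r)) ** t for t in range(T + 1))
    return [c0 * growth ** t for t in range(T + 1)]

flat = consumption_path(Y=100.0, r=0.05, delta=0.05, sigma=2.0, T=10)
rising = consumption_path(Y=100.0, r=0.05, delta=0.02, sigma=2.0, T=10)
steeper = consumption_path(Y=100.0, r=0.05, delta=0.02, sigma=0.5, T=10)

print([round(c, 3) for c in flat])
print([round(c, 3) for c in rising])
print([round(c, 3) for c in steeper])
```

A larger $\sigma$ (a more concave $v$) produces a flatter path for the same gap between $r$ and $\delta$, which is the sense in which concavity captures the desire to smooth consumption.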

1.13.1. Time Consistency and Inconsistency. So far, we have
analysed the problem of inter-temporal choice from the perspective of
the initial period. In the homothetic separable model, the consumer has
preferences over streams of consumption $c_0, \dots, c_T$ and chooses the optimal
stream in the initial period. Now we want to allow for the possibility that the
consumer makes choices in every period $t$. To do so, we need to specify
a utility function (or preferences) that determine choices in each
period. One formulation that is standard in economics is to assume that
at every point in time $t$, the consumer uses the same utility function
\[
u_t(c_t, \dots, c_T) = \sum_{\tau = t}^{T} \frac{v(c_\tau)}{(1 + \delta)^{(\tau - t)}}.
\]
Note that we obtain $u_t$ from $u_0$ by deleting the terms for $\tau < t$ in the
sum on the right-hand side and rescaling by the constant factor $(1 + \delta)^t$,
which does not change the preferences represented. If this is the case, we can calculate the
marginal rate of substitution between $c_\tau$ and $c_{\tau+1}$ from the perspective
of period $t$ (when the consumer has to make a choice):
\[
\frac{v'(c_\tau)}{(1 + \delta)^{(\tau - t)}} \cdot \frac{(1 + \delta)^{(\tau + 1 - t)}}{v'(c_{\tau+1})} = \frac{v'(c_\tau)\,(1 + \delta)}{v'(c_{\tau+1})}.
\]
Note that this marginal rate of substitution is independent of the time
of decision $t$. Moreover, we can make the same observation for the
marginal rates of substitution between any two periods $\tau$ and $\tau + k$.
This implies that the consumer has the same preferences about
saving/borrowing in period $\tau$, no matter whether she plans her saving in
advance, or decides in period $\tau$. If she has determined her optimal
consumption stream $c_0^*, \dots, c_T^*$ in period 0 and interest rates and prices
$p_t$ do not change over time, then she will never revise her planned
consumption stream and deviate from $c_0^*, \dots, c_T^*$. Such behaviour is called
time consistent.

²² To obtain this specification we can use the same arguments as in the chapter
on composite commodities. The difference here is that we have one group of goods
for each period and aggregate each into one composite commodity (so we are
repeating step 1 for each group). Concavity of $v_t$ can be obtained by assuming
that $\psi_t$ is concave (the proof is left as an exercise).
An alternative formulation is to assume that preferences are biased
towards present consumption. To introduce such a bias, we modify the
subjective discount factors in the utility function. On top of $\frac{1}{1+\delta}$, there
is an additional discount factor $\beta$ for consumption today vs. consumption
in the future:
\[
u(c_t, \dots, c_T) = v(c_t) + \beta \sum_{\tau=t+1}^{T} \frac{v(c_\tau)}{(1+\delta)^{\tau-t}},
\]
where $\beta < 1$.
This type of discounting is called “quasi-hyperbolic discounting”
as opposed to “exponential discounting” studied above. Note that the
additional discount factor $\beta$ does not play a role if the consumer thinks
about optimal consumption levels in future periods. If $t < \tau$, the
marginal rate of substitution between $c_\tau$ and $c_{\tau+1}$ is given by
\[
\frac{\beta\, v'(c_\tau)}{(1+\delta)^{\tau-t}} \cdot \frac{(1+\delta)^{\tau+1-t}}{\beta\, v'(c_{\tau+1})}
= \frac{v'(c_\tau)\,(1+\delta)}{v'(c_{\tau+1})},
\]
as before.
For example, if we consider the initial period $t = 0$, and $\delta = r$,
this leads to an optimal consumption stream $c_1^* = c_2^* = \dots = c_T^*$. The
planned consumption is constant after the present period, because
the present bias does not affect the preferences over consumption in two
future periods. Note however that the marginal rate of substitution
between the present period and the next period contains the additional
discount factor, so it differs from the marginal rate of substitution
between two future periods. Here this implies $c_0^* > c_1^*$, which reflects
the “present bias” of the consumer, who down-weights future
consumption compared to current consumption.
Now let us see what the consumer will choose if she has already
consumed $c_0^*$ and reaches period 1. Will she stick to the original plan
to consume a constant amount over all remaining periods or will she
change and reoptimize her plans? The marginal rate of substitution

between $c_1$ and $c_2$ is now given by
\[
v'(c_1) \cdot \frac{(1+\delta)}{\beta\, v'(c_2)} = \frac{v'(c_1)\,(1+\delta)}{\beta\, v'(c_2)},
\]
whereas it remains unchanged for $c_\tau$ and $c_{\tau+1}$ if $\tau > 1$. Therefore,
the consumer will now prefer a higher consumption level than $c_1^*$ in
the present period and lower (constant) consumption levels after that.
She changes her original plan of a constant consumption stream even
though this was the optimal plan from the perspective of period $t = 0$.
Such choices are called time inconsistent.
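The revision of plans can be illustrated numerically. The following is a minimal sketch, assuming log utility $v(c) = \ln c$ (chosen here only because it yields a closed-form plan), prices normalised to one, and hypothetical parameter values. The function names are illustrative and do not come from the notes.

```python
def plan(wealth, periods, beta, delta, r):
    """Optimal plan for the remaining `periods` periods under log utility.

    The current period has weight 1; a period k steps ahead has weight
    beta / (1 + delta)**k. With log utility, the present value of spending
    in each period is proportional to its weight.
    """
    weights = [1.0] + [beta / (1 + delta) ** k for k in range(1, periods)]
    total = sum(weights)
    return [w / total * wealth * (1 + r) ** k for k, w in enumerate(weights)]


def revision_in_period_1(beta, delta=0.05, r=0.05, wealth=100.0, periods=4):
    """Actual period-1 consumption minus the amount planned in period 0."""
    plan_0 = plan(wealth, periods, beta, delta, r)
    wealth_1 = (wealth - plan_0[0]) * (1 + r)    # savings carried forward
    plan_1 = plan(wealth_1, periods - 1, beta, delta, r)
    return plan_1[0] - plan_0[1]


print(revision_in_period_1(beta=1.0))   # exponential discounting: zero up to rounding
print(revision_in_period_1(beta=0.7))   # present bias: positive revision
```

With `beta=1.0` the period-0 plan is carried out unchanged, while with `beta=0.7` the consumer consumes more in period 1 than she had planned.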
CHAPTER 2

Decision Theory under Uncertainty

We will study choice under uncertainty in a model with objective
uncertainty. This means that there are known outcomes that can
potentially arise and known probabilities for these outcomes. For
example, these could be the result of estimation from data about similar
choice situations.
We start by considering a model with a finite set of possible
outcomes $A = \{a_1, \dots, a_n\}$. An outcome could be a consumption
bundle, a wealth level, etc., depending on the situation that we want
to model. The standard approach used in economics is to assume
that consumers make choices in order to optimize consumption (and
other decisions like employment, investment in human capital, etc.)
over their whole lifetime. In this approach $a_i \in A$ could for example
describe a specific stream of consumption bundles and hours worked
over the lifetime of a person. There is evidence, however, that this
global approach does not deliver an accurate description of behaviour
under all circumstances. For example, some evidence suggests that
people consider decisions in isolation. This phenomenon is sometimes
called narrow bracketing. For example, the decision to buy insurance
for some risks (e.g. missing a flight, stolen luggage, etc.) may not be
taken with all possible lifetime consumption streams in mind, but a
consumer might consider it in isolation, independent of other risks she
is exposed to.
The general theory that is presented here can be applied both to
isolated decision problems or to a global optimization of all decisions a
person has to make. In the former case of narrow bracketing, however,
a theory is needed to describe how preferences depend on the individual
choice problem, which choice problems are considered in isolation and
which are considered with a broader perspective. We will briefly come
back to this later.

2.1. Choice Problem under Uncertainty


In contrast to the deterministic model considered so far, the decision
maker no longer chooses from a set of possible (affordable) outcomes
(consumption bundles, lifetime consumption plans, ...) but she is given
the choice between different lotteries or gambles.

Definition 12. A simple lottery is a probability distribution
over outcomes that we denote by
\[
L = (\pi_1 \circ a_1, \dots, \pi_n \circ a_n),
\]
where $\pi \in \Delta^n = \{\pi \in \mathbb{R}^n \mid \pi_i \in [0,1], \ \sum_{i=1}^n \pi_i = 1\}$.
In this definition, $\pi_i$ is the objective probability that outcome $a_i$
arises if the consumer chooses lottery $L$. Note that one of the outcomes
could describe the status quo if we want to allow for the possibility
that sometimes the lottery leaves the situation of the decision maker
unchanged. A simple example of a lottery is a coin-flip that yields
£1 for heads and otherwise requires a payment of £1 by the decision
maker:
\[
L = \left(\tfrac{1}{2} \circ (w + £1), \ \tfrac{1}{2} \circ (w - £1)\right),
\]
where $w$ is the initial wealth of the decision maker. If the DM has to
pay a price $p$ to play the coin-flip, the lottery of final wealth is given by
\[
L_p = \left(\tfrac{1}{2} \circ (w - p + £1), \ \tfrac{1}{2} \circ (w - p - £1)\right).
\]
In many situations, there are many random events that together
determine which final outcome will arise. To model this, we intro-
duce compound lotteries. In a compound lottery there are two (or
more) levels of uncertainty. In the first step, one of several lotteries
$L^1, L^2, L^3, \dots$ is selected with probabilities
$\tilde{\alpha}_1, \tilde{\alpha}_2, \tilde{\alpha}_3, \dots \ge 0$, $\sum_k \tilde{\alpha}_k = 1$.
In the second step, the selected lottery $L^k$ determines the final outcome
according to probabilities $\pi_i^k$. Consequently, we write a compound lot-
tery as a lottery over lotteries:
\[
\tilde{L} = \left(\tilde{\alpha}_1 \circ L^1, \ \tilde{\alpha}_2 \circ L^2, \dots\right).
\]
A simple example is a coin toss. In the case of tails, there is no
payment. In the case of heads, a die is rolled and the DM gets a
payment equal to the number that the die shows. Formally, we have
\[
\tilde{L} = \left(\tfrac{1}{2} \circ L^1, \ \tfrac{1}{2} \circ (w + £0)\right), \qquad
L^1 = \left(\tfrac{1}{6} \circ (w + £1), \dots, \tfrac{1}{6} \circ (w + £6)\right).
\]
Notice that even though a compound lottery may have a compli-
cated structure, it induces a (unique) probability distribution over final
outcomes. This defines a simple lottery which we call the simple lot-
tery $L_S$ induced by the lottery $\tilde{L}$:
\[
L_S = \left( (\tilde{\alpha}_1 \pi_1^1 + \tilde{\alpha}_2 \pi_1^2 + \dots) \circ a_1, \
(\tilde{\alpha}_1 \pi_2^1 + \tilde{\alpha}_2 \pi_2^2 + \dots) \circ a_2, \dots \right).
\]
In the above example we obtain the following simple lottery from this
formula:
\[
\tilde{L}_S = \left(\tfrac{1}{2} \circ (w + £0), \ \tfrac{1}{12} \circ (w + £1), \dots, \tfrac{1}{12} \circ (w + £6)\right).
\]
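The reduction of a compound lottery to its induced simple lottery is mechanical and can be sketched in a few lines. The representation below (a lottery as a list of probability/prize pairs, with nested lists for compound lotteries) and the name `reduce_lottery` are assumptions made for this illustration only; outcomes are recorded as gains over initial wealth $w$.

```python
from collections import defaultdict
from fractions import Fraction


def reduce_lottery(lottery):
    """Return the induced simple lottery as a dict {outcome: probability}.

    A lottery is a list of (probability, prize) pairs. A prize is either a
    final outcome or another lottery (a list), so nesting can be arbitrary.
    """
    simple = defaultdict(Fraction)
    for prob, prize in lottery:
        if isinstance(prize, list):                  # prize is itself a lottery
            for outcome, p in reduce_lottery(prize).items():
                simple[outcome] += prob * p
        else:                                        # prize is a final outcome
            simple[prize] += prob
    return dict(simple)


# Coin toss: tails pays 0, heads leads to a die roll paying 1, ..., 6.
die = [(Fraction(1, 6), k) for k in range(1, 7)]
compound = [(Fraction(1, 2), die), (Fraction(1, 2), 0)]
induced = reduce_lottery(compound)
print(induced)
```

Exact fractions are used so that the probabilities 1/2 and 1/12 of the example are reproduced without rounding. Because the function is recursive, it also handles lotteries over compound lotteries.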

The concept of a compound lottery and the induced simple lottery
can be extended to lotteries over compound lotteries so that we can
consider an arbitrary number of levels of compounding.
For any two (compound) lotteries $L^1, L^2$ we write
\[
L^1 =_S L^2 \quad \text{if} \quad L^1_S = L^2_S,
\]
that is, if the two (compound) lotteries induce the same simple lottery,
we denote that by the symbol “$=_S$”. If $L^1 =_S L^2$, the two lotteries are
not necessarily equal but they induce the same distribution over final
outcomes.
A lottery is called non-degenerate if the induced simple lottery
assigns positive probability ($\pi_i > 0$) to at least two distinct outcomes.
Conversely, a degenerate lottery assigns probability 1 to a single
outcome. For degenerate lotteries we introduce the notation:
\[
\delta_a := (1 \circ a).
\]
There is another way of representing lotteries that corresponds
closely to consumption bundles we have considered before. Suppose
we consider lotteries that all lead to outcomes in the same fixed set of
possible outcomes $A = \{a_1, \dots, a_n\}$. Then we can identify a lottery
$L$ with the vector of probabilities $(\pi_1, \dots, \pi_{n-1})$ given by the simple
lottery $L_S$:
\[
L =_S L_S = (\pi_1 \circ a_1, \dots, \pi_n \circ a_n) \simeq (\pi_1, \dots, \pi_{n-1}).
\]
Here we use the symbol “$\simeq$” to denote that a lottery $L$ corresponds to
the vector $(\pi_1, \dots, \pi_{n-1})$.
Note that the vector of probabilities has only $n-1$ elements be-
cause the requirement $\sum_{i=1}^n \pi_i = 1$ implies that $\pi_n = 1 - \sum_{j \neq n} \pi_j$ is
completely determined by the other probabilities. Therefore, instead
of considering simple lotteries, we can consider vectors of probabili-
ties $(\pi_1, \dots, \pi_{n-1}) \in \Delta^{n-1} = \{\pi \in \mathbb{R}^{n-1}_+ \mid \sum_{i=1}^{n-1} \pi_i \le 1\}$. Notice that
vectors $\pi \in \Delta^{n-1} \subset \mathbb{R}^{n-1}_+$ are similar to consumption bundles in the
sense that $\pi_i$ now stands for the probability of an outcome instead of
the quantity of a good.

2.2. Preferences
The space of all possible lotteries is called $\mathcal{L}$ and the space
of all simple lotteries is denoted by $\mathcal{L}_S$. The set of all conceivable
choices is therefore $X = \mathcal{L}$. We assume that the decision maker has
preferences captured by a binary relation $\succsim$ over the elements of $X$
which has the following properties:
Axiom 1 (Preference Relation). $\succsim$ is complete and transitive.
We have seen this property before in the context of consumer choice
under certainty.

Axiom 2 (Reduction to Simple Lotteries). For any lottery $L \in \mathcal{L}$,
with induced simple lottery $L_S$ we have
\[
L \sim L_S.
\]
This axiom implies that the decision maker only cares about the
distribution over final outcomes. Consequently
\[
L^1 =_S L^2 \ \Rightarrow \ L^1 \sim L^1_S \sim L^2_S \sim L^2.
\]
She ignores how a lottery is presented. This has a flavour of rationality.
With this Axiom it suffices to define preferences only over simple
lotteries. If we assume that Axiom 2 holds, we can therefore also
describe preferences over lotteries as preferences over the associated
vectors of probabilities. Hence if $L^1 \simeq \pi^1$ and $L^2 \simeq \pi^2$, we can write
$\pi^1 \succsim \pi^2$ instead of $L^1 \succsim L^2$.
Axiom 3 (Continuity). For any three lotteries such that $L^0 \succsim L^1 \succsim L^2$,
there exists $\alpha \in [0, 1]$ such that
\[
L^1 \sim \left(\alpha \circ L^0, \ (1-\alpha) \circ L^2\right).
\]
The statement of this axiom is the same as one of the equivalent
definitions of continuity that we saw in previous chapters. It rules
out a type of preferences that resembles lexicographic preferences. For
example, suppose that there are three possible outcomes $a_1 =$ “death”,
$a_2 = £0$, $a_3 = £10\text{m}$. Let us also assume that the decision maker
wants to avoid $a_1$ at all cost. Formally, we assume that the preference
relation $\succsim$ strictly prefers any lottery $L^1$ with $\pi_1^1 = 0$ to any lottery $L^2$
for which $\pi_1^2 > 0$. Under this assumption we have
\[
(1 \circ £10\text{m}) \succ (1 \circ £0) \succ (1 \circ \text{“death”}).
\]
Continuity requires that there is some $\alpha \in [0, 1]$ such that the mixture
between the best and the worst lottery is indifferent to $(1 \circ £0)$:
\[
(\alpha \circ \text{“death”}, \ (1-\alpha) \circ £10\text{m}) \sim (1 \circ £0).
\]
This is clearly impossible because for $\alpha = 0$ the mixture is $(1 \circ £10\text{m}) \succ (1 \circ £0)$
and for $\alpha > 0$, $(1 \circ £0) \succ (\alpha \circ \text{“death”}, (1-\alpha) \circ £10\text{m})$ by
our assumption on the preferences. Therefore preferences that have
the above property do not satisfy continuity.
Continuity requires that the decision maker is always willing to
accept some (possibly very small) probability of a bad outcome if she
is compensated by a higher probability of a good outcome.
Axiom 3 is stated in terms of lotteries. If we assume that Axiom
2 holds, then lotteries correspond to vectors of probabilities $\pi \in \Delta^{n-1}$
and $\succsim$ induces a preference relation on $\Delta^{n-1} \subset \mathbb{R}^{n-1}_+$. It is easy
to see that continuity for preferences over lotteries as defined in Axiom
3, is the same as continuity for preferences over vectors $\pi \in \Delta^{n-1}$ as
defined in 6.(3). Therefore, by Debreu’s Theorem, there exists a
continuous utility function that represents $\succsim$ if Axiom 3 is satisfied.

We call this function $U(\pi_1, \dots, \pi_{n-1})$ but also write $U(L)$ or $U(L_S)$
when $L =_S L_S \simeq (\pi_1, \dots, \pi_{n-1})$.
Note that the utility function $U(\cdot)$ is a function of lotteries as op-
posed to the utility function $u(\cdot)$ we have used in the previous chapters.
When we consider degenerate lotteries $\delta_a$, it makes sense to assume that
the utility of the lottery $\delta_a$ is equal to the utility of the outcome $a$:
\[
U(\delta_a) = u(a).
\]
Without further assumptions, however, we do not know how prefer-
ences over non-degenerate lotteries $L$ are related to the utility function
for outcomes $u(a)$. The following Axiom will allow us to establish that
$U(L)$ can be written as the expectation of $u(a)$.
Axiom 4 (Independence Axiom). For any two lotteries $L^0$ and $L^1$,
\[
L^0 \succsim L^1
\]
if and only if
\[
\forall L^2, \ \alpha \in [0, 1]: \quad (\alpha \circ L^0, (1-\alpha) \circ L^2) \succsim (\alpha \circ L^1, (1-\alpha) \circ L^2).
\]
This axiom also has a flavour of rationality. The two lotteries $\tilde{L}^0 =
(\alpha \circ L^0, (1-\alpha) \circ L^2)$ and $\tilde{L}^1 = (\alpha \circ L^1, (1-\alpha) \circ L^2)$ only differ in
the first part, which is compounded with the same lottery $L^2$ in both
cases. The axiom requires that in this case the preference over the two
lotteries $\tilde{L}^0$ and $\tilde{L}^1$ should be determined by the part where they differ.
It should be independent of $L^2$ and consistent with the preference over
$L^0$ and $L^1$.

2.3. Expected utility


Theorem 4 (Expected Utility Theorem). Let $\succsim$ be a preference
relation on $\mathcal{L}$ that satisfies Axioms 1-4. Then there exists a utility
representation that has the form
\[
U(L) = E[u(a_i)] = \sum_{i=1}^{n} \pi_i u(a_i), \tag{2.3.1}
\]
where $\pi_i$ is the probability of outcome $a_i$ in the simple lottery induced
by $L$ and $u(\cdot) \in \mathbb{R}$ is a utility function over outcomes.
As noted before, Axiom 2 allows us to identify a lottery $L$ with the
induced simple lottery $L_S$ and therefore with the corresponding vector
of probabilities $(\pi_1, \dots, \pi_{n-1})$:
\[
L =_S L_S = (\pi_1 \circ a_1, \dots, \pi_n \circ a_n) \simeq (\pi_1, \dots, \pi_{n-1}),
\]
and we therefore write (slightly abusing notation)
\[
U(L) = U(L_S) = U(\pi_1, \dots, \pi_{n-1}).
\]
The Expected Utility Theorem was proven by John von Neumann and
Oskar Morgenstern and $U(\cdot)$ is called a von Neumann-Morgenstern

utility function. The function $u(\cdot)$ is called the Bernoulli utility
function and a representation of the form (2.3.1) is called an expected
utility representation. The representation defines a utility index
over final outcomes $u(\cdot)$ and the utility of a lottery is given by the
expected value of the utility of the final outcome. Note that the utility
function $U$ is additively separable and linear in the probabilities $\pi_i$. It is
easy to check that the converse of the theorem also holds: If a preference
relation $\succsim$ on $\mathcal{L}$ has an expected utility representation, then $\succsim$ satisfies
Axioms 1-4.
The utility function $U(L)$ represents a preference relation $\succsim$ as in
the previous chapters. Therefore, any strictly increasing transforma-
tion $V(L) = \phi(U(L))$ yields a new utility function that represents the
same preferences. This is not true, however, for the Bernoulli utility
function $u(\cdot)$. Replacing $u$ by $v(\cdot) = \phi(u(\cdot))$ for an arbitrary strictly
increasing function $\phi$ does not lead to a representation of the same
preference relation over lotteries. The results in the next chapter will
illustrate that. Note however, that the expectation operator is linear.
Therefore any positive affine transformation $v(a_i) = a + b\,u(a_i)$, $a \in \mathbb{R}$,
$b > 0$ yields a new Bernoulli utility function $v(\cdot)$ with v.N.-M utility
function $V(\cdot)$, that represents the same preferences. This can be seen
as follows:
\[
V(L) = E[v(a_i)] = E[a + b\,u(a_i)] = a + b\,E[u(a_i)] = a + b\,U(L).
\]
The expected utility $V$ obtained from $v$ is an increasing transformation
of the expected utility $U$ obtained from $u$: $V(L) = \phi(U(L))$ for $\phi(x) =
a + bx$. Therefore $V$ and $U$ represent the same preferences.
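A small numerical check may help to see why affine transformations of the Bernoulli utility function are harmless while other increasing transformations are not. This is only a sketch: the two lotteries, the choice $u(a) = \sqrt{a}$ and the transformations are hypothetical examples, not taken from the notes.

```python
import math


def expected_utility(lottery, u):
    """Expected utility of a simple lottery given as (probability, outcome) pairs."""
    return sum(p * u(a) for p, a in lottery)


safe = [(1.0, 36.0)]                    # hypothetical monetary outcomes
risky = [(0.5, 0.0), (0.5, 100.0)]

u = math.sqrt                           # hypothetical Bernoulli utility
v_affine = lambda a: 3.0 + 2.0 * u(a)   # positive affine transformation of u
v_power = lambda a: u(a) ** 4           # strictly increasing in u >= 0, not affine


def prefers_safe(bernoulli):
    return expected_utility(safe, bernoulli) > expected_utility(risky, bernoulli)


print(prefers_safe(u), prefers_safe(v_affine), prefers_safe(v_power))
```

Under `u` and `v_affine` the safe lottery is preferred (6 against 5, and 15 against 13), whereas under `v_power` the ranking is reversed (1296 against 5000), so `v_power` represents different preferences over lotteries.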

The proof of the Expected Utility Theorem is quite long. To get an idea,
you may consider the following example for three outcomes. It outlines the
main steps also used in the general proof.1
To get an idea how one can obtain an expected utility representation
we consider an example with three possible outcomes $A = \{a, b, c\}$. Let $\succsim$
be a preference relation that satisfies Axioms 1-4 and assume $\delta_a \succ \delta_b \succ \delta_c$.
To construct an expected utility representation we proceed in six steps.
Step 1: Existence of a utility representation (not necessarily of the
expected utility form). By Debreu’s theorem, the continuity axiom implies
that there exists a continuous utility function $U^0(\pi_a, \pi_b)$ that represents
$\succsim$.
It can be shown using the independence axiom that reallocating prob-
ability from an outcome (e.g. $c$) to a strictly better outcome (e.g. $b$ or $a$)
leads to a strictly better lottery.

1This is not discussed in the lecture and is not relevant for the exam, but I
include it for completeness.

Lemma 11. Assume $A = \{a, b, c\}$ and let $U(\pi_a, \pi_b)$ be a utility
representation of a preference relation $\succsim$ that satisfies Axioms 1-4 and
$\delta_a \succ \delta_b \succ \delta_c$. Then
(1) $U(\pi_a, \pi_b)$ is strictly increasing in both arguments.
(2) $U(\pi_a, x - \pi_a)$ is strictly increasing in $\pi_a$, where $x \in [0, 1]$ and
$0 \le \pi_a \le x$.
Proof. The proof is left as an exercise. □
Step 2: Normalization. We now use a monotonic transformation of
$U^0$ to obtain a utility representation for which
\[
U(\delta_a) = U(1, 0) = 1, \quad U(\delta_c) = U(0, 0) = 0, \quad \text{and} \quad U(x, 0) = x, \ \forall x \in [0, 1].
\]
In words, if $\pi_b = 0$, the utility of a lottery is equal to the probability of
outcome $a$. To obtain this normalization we first set
\[
U^1(\pi_a, \pi_b) = \frac{U^0(\pi_a, \pi_b) - U^0(0, 0)}{U^0(1, 0) - U^0(0, 0)}.
\]
This satisfies $U^1(0, 0) = 0$ and $U^1(1, 0) = 1$ but not necessarily $U^1(x, 0) =
x$, $\forall x \in [0, 1]$. To achieve the latter, we set2
\[
U(\pi_a, \pi_b) = \phi^{-1}\left(U^1(\pi_a, \pi_b)\right), \quad \text{where } \phi(x) = U^1(x, 0).
\]
Note that $\phi(x) = U^1(x, 0)$ implies $\phi^{-1}(U^1(x, 0)) = x$ and hence $U(x, 0) =
x$, $\forall x \in [0, 1]$.
Step 3: Bernoulli Utility Function. Continuity implies that since
$\delta_a \succ \delta_b \succ \delta_c$, there exists $\alpha^* \in (0, 1)$ such that
\[
\delta_b \sim L^{\alpha^*} = \left(\alpha^* \circ \delta_a, \ (1-\alpha^*) \circ \delta_c\right).
\]
In terms of the utility function this is
\[
U(0, 1) = U(\alpha^*, 0) = \alpha^*,
\]
where the second equality follows from Step 2. With this step, we can
write the Bernoulli utility function as3
\[
\begin{aligned}
u(a) &= U(\delta_a) = U(1, 0) = 1, \\
u(b) &= U(\delta_b) = U(0, 1) = \alpha^*, \\
u(c) &= U(\delta_c) = U(0, 0) = 0.
\end{aligned}
\]
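Step 4: Indifference curves. Now we show that indifference curves (in
$(\pi_a, \pi_b)$-space) are straight lines. Consider first the lotteries corresponding
to points on a straight line from $(0, 1)$ to $(\alpha^*, 0)$:
\[
L^{\gamma} = \left(\gamma \circ \delta_b, \ (1-\gamma) \circ L^{\alpha^*}\right) \simeq \left(\alpha^*(1-\gamma), \ \gamma\right)
\]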
2 To see that this is well defined notice that Lemma 11 implies that for all
$(\pi_a, \pi_b)$, $0 \le U^1(\pi_a, \pi_b) \le 1$.
3 This is the first time where we use that there are only three outcomes. If
there are more than three outcomes, one could proceed in the same way to obtain
different utility levels $\alpha_i$ for each outcome.

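where $\gamma \in [0, 1]$.
Since $L^{\alpha^*} \sim \delta_b$, the independence axiom implies that
\[
L^{\gamma} = \left(\gamma \circ \delta_b, \ (1-\gamma) \circ L^{\alpha^*}\right) \sim \left(\gamma \circ \delta_b, \ (1-\gamma) \circ \delta_b\right) =_S \delta_b.
\]
Hence the straight line connecting $(0, 1)$ to $(\alpha^*, 0)$ is an indifference curve
and it has slope $-1/\alpha^*$.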
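It turns out that the other indifference curves have the same slope. We
have
\[
\delta_b \sim L^{\gamma} \sim L^{\alpha^*}
\]
and the independence axiom implies that for all $\rho \in [0, 1]$
\[
(\rho \circ \delta_b, (1-\rho) \circ \delta_c) \sim (\rho \circ L^{\gamma}, (1-\rho) \circ \delta_c) \sim (\rho \circ L^{\alpha^*}, (1-\rho) \circ \delta_c),
\]
or
\[
U(0, \rho) = U(\rho\alpha^*(1-\gamma), \rho\gamma) = U(\rho\alpha^*, 0).
\]
This implies that all indifference curves closer to the origin than the one
through $\delta_b$ are straight lines and they all have slope
\[
\frac{-\rho\gamma}{\rho\alpha^* - \rho\alpha^*(1-\gamma)} = -\frac{1}{\alpha^*}.
\]
A similar argument can be used to show that the indifference curves further
out have the same slope.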
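Step 5: MRS and marginal utility. Given that all indifference curves
are straight parallel lines with slope $-1/\alpha^*$, we have
\[
\frac{\partial U / \partial \pi_a}{\partial U / \partial \pi_b} = \frac{1}{\alpha^*}.
\]
Moreover, the fact that indifference curves are parallel implies that
\[
\frac{\partial U(\pi_a, \pi_b)}{\partial \pi_a} = \frac{\partial U(\pi_a, 0)}{\partial \pi_a} = 1.
\]
Therefore
\[
\frac{\partial U(\pi_a, \pi_b)}{\partial \pi_b} = \alpha^*.
\]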
Step 6: Expected utility. To conclude, we integrate marginal utility to
get the utility function
\[
\begin{aligned}
U(\pi_a, \pi_b) &= U(0, 0) + [U(\pi_a, 0) - U(0, 0)] + [U(\pi_a, \pi_b) - U(\pi_a, 0)] \\
&= \pi_a + \int_0^{\pi_b} \frac{\partial U(\pi_a, x)}{\partial \pi_b} \, dx \\
&= \pi_a + \pi_b \alpha^* \\
&= \pi_a u(a) + \pi_b u(b) + \pi_c u(c).
\end{aligned}
\]
The example shows that the independence axiom has very strong im-
plications. The marginal rate of substitution between $\pi_a$ and $\pi_b$ is constant
across all $(\pi_a, \pi_b)$. This is stronger than the strong separability defined in
topic 1 and it implies that utility is linear in the probabilities.

Soon after expected utility was proposed by von Neumann and Mor-
genstern, it was criticised because it sometimes leads to predictions
that are at odds with choices people make. The most famous criticism
was voiced by Maurice Allais who proposed to consider the following
two choice problems:4
Choice Problem 1: Which of the following lotteries do you prefer?
\[
L^1 = (1 \circ £500\text{K}), \qquad L^2 = (.1 \circ £2.5\text{m}, \ .89 \circ £500\text{K}, \ .01 \circ £0).
\]
Choice Problem 2: Which of the following lotteries do you prefer?
\[
\hat{L}^1 = (.11 \circ £500\text{K}, \ .89 \circ £0), \qquad \hat{L}^2 = (.1 \circ £2.5\text{m}, \ .9 \circ £0).
\]
It is very common that people reveal the following preferences:
\[
L^1 \succ L^2 \quad \text{and} \quad \hat{L}^2 \succ \hat{L}^1.
\]
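If preferences over lotteries have an expected utility representation,
the stated preference in the first choice problem implies that
\[
\begin{aligned}
& u(500\text{K}) > \tfrac{10}{100} u(2.5\text{m}) + \tfrac{89}{100} u(500\text{K}) + \tfrac{1}{100} u(0) \\
\iff \ & \tfrac{11}{100} u(500\text{K}) + \tfrac{89}{100} u(500\text{K}) > \tfrac{10}{100} u(2.5\text{m}) + \tfrac{89}{100} u(500\text{K}) + \tfrac{1}{100} u(0) \\
\iff \ & \tfrac{11}{100} u(500\text{K}) + \tfrac{89}{100} u(0) > \tfrac{10}{100} u(2.5\text{m}) + \tfrac{89}{100} u(0) + \tfrac{1}{100} u(0) \\
\iff \ & \tfrac{11}{100} u(500\text{K}) + \tfrac{89}{100} u(0) > \tfrac{10}{100} u(2.5\text{m}) + \tfrac{90}{100} u(0).
\end{aligned}
\]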
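The last line implies that $\hat{L}^1 \succ \hat{L}^2$ which contradicts the preferences
stated by many people. This observation is commonly known as the
Allais-Paradox. Therefore, the preferences stated by many people are
not consistent with Axioms 1–4. To see this more clearly, we can repeat
the derivation in terms of preferences instead of utilities. Clearly the
two lotteries can be written as more complicated compound lotteries
without changing the implied simple lotteries:
\[
L^1 =_S \left(\tfrac{11}{100} \circ 500\text{K}, \ \tfrac{89}{100} \circ 500\text{K}\right),
\]
and
\[
L^2 =_S \left(\tfrac{11}{100} \circ \left(\tfrac{10}{11} \circ 2.5\text{m}, \ \tfrac{1}{11} \circ 0\right), \ \tfrac{89}{100} \circ 500\text{K}\right).
\]
Hence Axiom 2 implies that
\[
L^1 \succ L^2 \iff
\left(\tfrac{11}{100} \circ 500\text{K}, \ \tfrac{89}{100} \circ 500\text{K}\right) \succ
\left(\tfrac{11}{100} \circ \left(\tfrac{10}{11} \circ 2.5\text{m}, \ \tfrac{1}{11} \circ 0\right), \ \tfrac{89}{100} \circ 500\text{K}\right)
\]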
4The numerical example is taken from (Mas-Colell, Whinston, and Green,
1995).

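because we have just replaced the original lotteries by more complicated
ones with the same simple lotteries.
But notice that both lotteries have the same component “$\tfrac{89}{100} \circ 500\text{K}$”.
The independence axiom therefore implies that the preference
is not changed if we replace the common component by “$\tfrac{89}{100} \circ 0$”:
\[
\left(\tfrac{11}{100} \circ 500\text{K}, \ \tfrac{89}{100} \circ 0\right) \succ
\left(\tfrac{11}{100} \circ \left(\tfrac{10}{11} \circ 2.5\text{m}, \ \tfrac{1}{11} \circ 0\right), \ \tfrac{89}{100} \circ 0\right).
\]
Using Axiom 2 again, we get the preference for the implied simple
lotteries:
\[
\hat{L}^1 = \left(\tfrac{11}{100} \circ 500\text{K}, \ \tfrac{89}{100} \circ 0\right) \succ
\left(\tfrac{10}{100} \circ 2.5\text{m}, \ \tfrac{9}{10} \circ 0\right) = \hat{L}^2.
\]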
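The conflict between the commonly stated choices and expected utility can also be checked numerically. The sketch below draws arbitrary utility indices over the three outcomes and compares the expected utility gaps in the two choice problems. The dictionary representation of lotteries and the randomly drawn utility values are assumptions made for the illustration.

```python
import random


def expected_utility(lottery, u):
    """Expected utility of a simple lottery given as {outcome: probability}."""
    return sum(p * u[a] for a, p in lottery.items())


# Outcomes in pounds; the four lotteries of the two choice problems.
L1 = {500_000: 1.0}
L2 = {2_500_000: 0.10, 500_000: 0.89, 0: 0.01}
L1_hat = {500_000: 0.11, 0: 0.89}
L2_hat = {2_500_000: 0.10, 0: 0.90}

random.seed(0)
gaps_coincide = True
for _ in range(10_000):
    # Draw an arbitrary Bernoulli utility index over the three outcomes.
    u = {a: random.uniform(-1.0, 1.0) for a in (0, 500_000, 2_500_000)}
    gap = expected_utility(L1, u) - expected_utility(L2, u)
    gap_hat = expected_utility(L1_hat, u) - expected_utility(L2_hat, u)
    # Both gaps equal 0.11 u(500K) - 0.10 u(2.5m) - 0.01 u(0).
    gaps_coincide = gaps_coincide and abs(gap - gap_hat) < 1e-12

print(gaps_coincide)
```

Since the two gaps always coincide, no Bernoulli utility function can rank $L^1$ above $L^2$ and at the same time rank $\hat{L}^2$ above $\hat{L}^1$.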
Several alternative theories have been proposed that are consistent
with the Allais-Paradox. One example is prospect theory, which will
be mentioned briefly at the end of this chapter. One crucial element
of that theory that is used to explain the Allais-Paradox is the obser-
vation that there is a fundamental difference in the way people think
about safe outcomes and random outcomes. There is a preference for
certainty. Replacing “$\tfrac{89}{100} \circ 500\text{K}$” by “$\tfrac{89}{100} \circ 0$” turns the certain outcome
of the lottery on the left-hand side into an uncertain outcome, which
makes it disproportionately less attractive. On the right-hand side, we
already started with a random outcome. Therefore, the preference for
certainty is not so important for the evaluation on the right-hand side.
Despite the Allais-Paradox, expected utility has become the main-
stream tool to model decisions under uncertainty and is employed un-
less researchers are particularly interested in modelling deviations from
this rational benchmark. On the other hand, there has been consider-
able success in explaining behavioural “puzzles” using alternative mod-
els of behaviour. Reviewing these contributions is beyond the scope of
an introductory course and I refer you to the second term module
ECONG050 on Behavioural Economics.

2.4. Utility For Money


So far, we have considered a model with general outcomes. From
now on, we will consider the case that outcomes can be linearly ordered
and $A \subset \mathbb{R}$. For example, each outcome could represent a different
wealth level or the consumption level of a single good. We stick to the
leading example where the outcomes are monetary amounts $y_i$. Since
outcomes are in $\mathbb{R}$, we can consider the expected value of a lottery
which we denote as
\[
\mu^L := \sum_i \pi_i y_i, \qquad \text{or} \qquad \mu^L := \int_{-\infty}^{\infty} y f^L(y)\, dy.
\]
Note that we will consider both lotteries $L$ with a finite number of
possible outcomes and lotteries with an infinite number of outcomes. In

the latter case we will be assuming that the distribution over outcomes
is given by a density function denoted $f^L$ (or distribution function $F^L$).
We will also assume throughout that the expected value is finite.
Note that in the general model we considered in previous sections,
the expectation of a lottery is not well defined. With one dimensional
outcomes the expectation is well defined, so we can consider the ex-
pected amount of money the DM will have for a given lottery. This
will allow us to analyse the willingness of decision makers to take risk
by comparing a lottery to the safe amount of money that is equal to
the lottery’s expected value.
We continue to assume that the DM has preferences that have an
expected utility representation. In case of a finite number of outcomes
it is given by:
\[
U(L) = \sum_i \pi_i u(y_i).
\]
In case of an infinite number of possible outcomes we have:5
\[
U(L) = \int_{-\infty}^{\infty} u(y) f^L(y)\, dy.
\]
In the following, we will assume that the decision maker maximizes
expected utility for a given Bernoulli utility function that is strictly
increasing.
Assumption 1. If $x > y$ then $u(x) > u(y)$.
Remember from the previous chapter that general monotonic trans-
formations $v(y) = \phi(u(y))$ of $u(y)$ do not yield a v.N.-M utility function
$V(\cdot)$ that represents the same preference relation as $U(\cdot)$. We will now
see more concretely that the shape of the Bernoulli utility function
affects the risk-attitude of the decision maker. In other words, we
will show that the shape of the utility function affects behaviour in a
systematic way. We will also see how this affects investment behav-
iour, demand for insurance, and later in the last chapter, risk-sharing
between individuals.
First we introduce some terminology that describes the risk-attitude
of a decision maker.
Definition 13. A decision maker with Bernoulli utility function $u$
is called
(1) risk-averse if for all lotteries $L$: $u(\mu^L) \ge U(L)$, and strictly
risk-averse if for all non-degenerate lotteries $u(\mu^L) > U(L)$.
5 We have to make sure that we only consider lotteries where the expected
utility is finite. A sufficient condition for this is that the distribution has bounded
support, i.e., that the values where $f(y) > 0$ are contained in a bounded interval,
and the utility function is bounded. (The support of a distribution is the set where
$f(y) > 0$, or $\pi_i > 0$ in case of a discrete distribution.)

(2) risk-neutral if for all lotteries $L$: $u(\mu^L) = U(L)$.
(3) risk-loving if for all lotteries $L$: $u(\mu^L) \le U(L)$, and strictly
risk-loving if for all non-degenerate lotteries $u(\mu^L) < U(L)$.
Taking a lottery $L$ with two outcomes, risk-aversion implies
\[
U(L) = \pi u(y_0) + (1-\pi) u(y_1) \le u(\pi y_0 + (1-\pi) y_1) = u(\mu^L) \tag{2.4.1}
\]
for $0 \le \pi \le 1$, with strict inequality for strict risk-aversion if $y_0 \neq y_1$
and $L$ non-degenerate ($0 < \pi < 1$). Note that (2.4.1) shows that risk-
aversion implies that $u(\cdot)$ is concave. The converse is also true as the
following Lemma shows:
Lemma 12. A decision maker with Bernoulli utility function $u$ is
(1) (strictly) risk-averse, if and only if $u$ is (strictly) concave.
(2) risk-neutral, if and only if for all $x \in \mathbb{R}$: $u(x) = a + bx$, where
$a \in \mathbb{R}$ and $b > 0$.
(3) (strictly) risk-loving, if and only if $u$ is (strictly) convex.
Proof. We have already given the argument for lotteries with two
outcomes. This also implies necessity of concavity/convexity for
risk-aversion/risk-loving. The proof of sufficiency follows from Jensen’s
inequality which says that for any concave function $\phi$ and random vari-
able $X$ we have
\[
E[\phi(X)] \le \phi(E[X]).
\]
The inequality is strict if $X$ is non-degenerate and $\phi$ is strictly concave.

We have noted before that representations of preferences under un-
certainty are not preserved by arbitrary (non-linear) transformations of
the Bernoulli utility function. Risk aversion is a good example where
this can be seen. Concavity is not a property preserved under arbi-
trary increasing transformations of $u(\cdot)$. If $v(y) = \phi(u(y))$, for some
strictly increasing function $\phi$, $v(\cdot)$ can be convex even if $u(\cdot)$ is concave.
Assuming that $u$ and $v$ are twice differentiable, we have
\[
v''(y) = \underbrace{\phi''(u(y))}_{\text{sign}=+/-} \left(u'(y)\right)^2 + \phi'(u(y))\, u''(y) \ \overset{?}{\gtreqless} \ 0.
\]
We see that concavity of $u$ does not necessarily imply that $v$ is concave.
If $\phi''$ is sufficiently large, we can have $v'' > 0$.
For a positive affine transformation $v(y) = a + b\,u(y)$, however,
concavity and hence the risk-attitude is not affected (because $\phi'' = 0$):
\[
v''(y) = b\,u''(y).
\]
We can evaluate the degree of someone’s risk aversion by asking
how much they would be prepared to reduce the expected value of their
wealth to avoid a gamble. This is captured by the following definitions.

Definition 14. For a given Bernoulli utility function $u$ the cer-
tainty equivalent of a lottery $L$, is defined by
\[
u(CE(L)) = U(L).
\]
The risk-premium is defined as
\[
P(L) = \mu^L - CE(L).
\]
$CE(L)$ is the amount which if received with certainty would give
the same expected utility as the lottery $L$. The difference between the
expected value of the lottery and the certainty equivalent is what the
individual would pay (in terms of expected final wealth) to avoid the
gamble.
Lemma 13. For a risk-averse decision maker we have for all lot-
teries $L$:
\[
CE(L) \le \mu^L \quad \text{and} \quad P(L) \ge 0,
\]
with strict inequality in the case of strict risk-aversion if $L$ is non-
degenerate. For a risk-loving decision maker we have the opposite in-
equalities and risk-neutrality implies equalities for all lotteries.
Proof. The proof is left as an exercise. □
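As a worked illustration of Definition 14 and Lemma 13, the sketch below computes the certainty equivalent and the risk premium of a single lottery. The lottery, the utility function $u(y) = \sqrt{y}$ and the helper names are hypothetical choices for this example; the inverse of $u$ is passed in explicitly, so nothing is solved numerically.

```python
import math


def certainty_equivalent(lottery, u, u_inverse):
    """CE(L) solves u(CE) = U(L); the inverse of u is supplied by the caller."""
    expected_u = sum(p * u(y) for p, y in lottery)
    return u_inverse(expected_u)


def risk_premium(lottery, u, u_inverse):
    """P(L) = mu^L - CE(L)."""
    mean = sum(p * y for p, y in lottery)
    return mean - certainty_equivalent(lottery, u, u_inverse)


lottery = [(0.5, 0.0), (0.5, 100.0)]      # hypothetical lottery over wealth

# Strictly concave utility u(y) = sqrt(y): strictly risk-averse.
ce_sqrt = certainty_equivalent(lottery, math.sqrt, lambda z: z ** 2)
rp_sqrt = risk_premium(lottery, math.sqrt, lambda z: z ** 2)

# Linear utility u(y) = y: risk-neutral.
rp_linear = risk_premium(lottery, lambda y: y, lambda z: z)

print(ce_sqrt, rp_sqrt, rp_linear)
```

Here the expected value is 50, the expected utility under the square root is 5, so the certainty equivalent is 25 and the risk premium is 25. For the linear utility the risk premium is zero, in line with Lemma 13.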
In order to compare the risk-attitudes of two individuals with Bernoulli
utility functions $u$ and $v$, we can compare the risk premia or certainty
equivalents that individuals have for lotteries. If we observe that $u$
consistently leads to (strictly) higher risk-premia and (strictly) lower
certainty equivalents than $v$, for all (non-degenerate) lotteries, then we
call $u$ more risk-averse than $v$.
Definition 15. A decision maker with Bernoulli utility function $u$
is (strictly) more risk averse than a decision maker with Bernoulli
utility function $v$ if for all (non-degenerate) lotteries $L$,
\[
CE^u(L) \le (<)\ CE^v(L)
\]
or equivalently
\[
P^u(L) \ge (>)\ P^v(L),
\]
where the superscripts of $P$ and $CE$ indicate the Bernoulli utility
function used to obtain the certainty equivalent and risk-premium.
Comparing the risk-aversion of two decision makers by comparing
the certainty equivalents for arbitrary lotteries is cumbersome. There-
fore, we want to identify properties of $u$ and $v$ that allow us to compare
risk attitudes. We first note that $u$ is “more concave” than $v$ if the
former captures more risk aversion. What “more concave” means is
captured in the following lemma.
Lemma 14. $u$ is (strictly) more risk-averse than $v$ if and only if
there exists a (strictly) concave function $\phi$ such that $u(y) = \phi(v(y))$.

Proof. Let us define $\phi$ as $\phi(z) = u(v^{-1}(z))$. This implies that
$u(x) = \phi(v(x))$.
Suppose first that $\phi$ is concave. We want to show that this implies
$CE^u(L) \le CE^v(L)$ for any lottery $L$. Let $V$ denote the expected utility
for $v$. Let $L$ be a lottery with density $f$. We have
\[
\begin{aligned}
u(CE^u(L)) = U(L) &= \int u(x) f(x)\, dx \\
&= \int \phi(v(x)) f(x)\, dx \\
&\le \phi\left(\int v(x) f(x)\, dx\right) \\
&= \phi(V(L)) \\
&= \phi(v(CE^v(L))) \\
&= u(CE^v(L)).
\end{aligned}
\]
The first line follows from the definition of the certainty equivalent,
the second from $u(x) = \phi(v(x))$. The inequality is Jensen’s inequality
which says that for a concave function $\phi$
\[
\phi(E[X]) \ge E[\phi(X)],
\]
for any random variable $X$. The inequality is strict if $f$ is non-degenerate
and $\phi$ is strictly concave. The fifth line follows from the definition of
the certainty equivalent (for $v$) and the last line follows from $u(x) =
\phi(v(x))$. Since $u$ is strictly increasing, we have
\[
CE^u(L) \le CE^v(L),
\]
with strict inequality if $\phi$ is strictly concave and $L$ is non-degenerate.
The proof for the case that $L$ has a finite number of outcomes works
similarly.
Next we want to show that if u is more risk-averse than v, then $\phi$ is
concave. To show this, we show that if $\phi$ is not concave, then u is not
more risk-averse than v, meaning that there exists some lottery L such
that $CE^v(L) < CE^u(L)$.
So suppose that $\phi$ is not concave. Then there exist $x, y \in \mathbb{R}$, $x \neq y$,
and $\pi \in (0, 1)$ such that
\[ \phi(V(L)) = \phi(\pi v(x) + (1-\pi)v(y)) < \pi\phi(v(x)) + (1-\pi)\phi(v(y)) = U(L), \]
where the lottery L is given by $L = (\pi \circ x + (1-\pi) \circ y)$. Such
x, y and $\pi$ must exist if $\phi$ is not concave.
Using $\phi(V(L)) < U(L)$ we have
\[ \phi(v(CE^v(L))) = \phi(V(L)) < U(L) = u(CE^u(L)) = \phi(v(CE^u(L))). \]
Since $u = \phi \circ v$ is strictly increasing, this gives $CE^v(L) < CE^u(L)$.
Hence u is not more risk averse than v if $\phi$ is not concave. □
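As a numerical illustration of Lemma 14, the following sketch compares certainty equivalents for v(y) = log y and u(y) = -1/y, which is the concave transformation z -> -exp(-z) applied to v. The helper `certainty_equivalent` and the 50/150 lottery are assumptions made for this example only and do not appear elsewhere in these notes.

```python
import math

def certainty_equivalent(u, u_inv, outcomes, probs):
    """Sure amount whose utility equals the expected utility of the lottery."""
    expected_utility = sum(p * u(x) for x, p in zip(outcomes, probs))
    return u_inv(expected_utility)

# v(y) = log y; u(y) = phi(v(y)) with the concave phi(z) = -exp(-z), so u(y) = -1/y.
v, v_inv = math.log, math.exp
u = lambda y: -1.0 / y
u_inv = lambda z: -1.0 / z

outcomes, probs = [50.0, 150.0], [0.5, 0.5]   # illustrative lottery
ce_u = certainty_equivalent(u, u_inv, outcomes, probs)
ce_v = certainty_equivalent(v, v_inv, outcomes, probs)
mean = sum(p * x for x, p in zip(outcomes, probs))
print(ce_u, ce_v, mean)
```

In this example the certainty equivalent under u lies below the one under v, and both lie below the expected value, as the lemma predicts for a strictly concave transformation.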

This Lemma allows us to derive an easy test to compare the risk
aversion of two DMs. Let us differentiate
\[ u(x) = \phi(v(x)) \]
twice with respect to x. We get
\[ u'(x) = \phi'(v(x))\,v'(x) \iff \phi'(v(x)) = \frac{u'(x)}{v'(x)}. \]
Differentiating again we get
\[ \phi''(v(x))\,v'(x) = \frac{u''(x)v'(x) - u'(x)v''(x)}{(v'(x))^2}. \]
Note that $\phi''(v(x)) \le (<)\; 0$ if and only if
\[ u''(x)v'(x) \le (<)\; u'(x)v''(x). \]
Since $u'(x) > 0$ and $v'(x) > 0$, this is equivalent to
\[ -\frac{u''(x)}{u'(x)} \ge (>)\; -\frac{v''(x)}{v'(x)}. \tag{2.4.2} \]
We have shown that (2.4.2) holds (strictly), if and only if u is (strictly)
more risk-averse than v. Hence we use the ratio $-u''/u'$ to define a
measure of risk-aversion.
Definition 16. The Arrow-Pratt measure of absolute risk-
aversion is defined as
\[ R_a(y) = -\frac{u''(y)}{u'(y)}. \]
The previous derivation shows the following result.
Lemma 15. u is (strictly) more risk-averse than v if and only if
\[ \forall y : R_a^u(y) \ge (>)\; R_a^v(y). \]
Note that this measure is increasing in the magnitude of the (negative)
second derivative of u(·), so it is a measure of the concavity of the
Bernoulli utility function. An affine transformation v(y) = a + b u(y)
with b > 0 leaves it unchanged because
\[ R_a^v(y) = -\frac{b\,u''(y)}{b\,u'(y)} = R_a^u(y). \]
To summarize, we have three equivalent statements that compare the
risk-aversion of one individual to another:
. The more risk averse individual has a higher risk premium for
any gamble (this is how we defined “more risk-averse than”).
. The more risk averse individual has a Bernoulli utility function
which is an increasing concave transformation of that of the
other.
. The more risk averse individual has a higher coefficient of ab-
solute risk aversion at all y.
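The coefficient of absolute risk aversion can also be approximated numerically. The sketch below uses central finite differences; the function name `absolute_risk_aversion`, the step size and the three example utility functions are choices made for this illustration, not part of the notes.

```python
import math

def absolute_risk_aversion(u, y, h=1e-4):
    """Approximate R_a(y) = -u''(y)/u'(y) with central finite differences."""
    first = (u(y + h) - u(y - h)) / (2 * h)
    second = (u(y + h) - 2 * u(y) + u(y - h)) / (h * h)
    return -second / first

log_utility = math.log                            # exact value: R_a(y) = 1/y
cara = lambda y: -math.exp(-0.5 * y)              # exact value: R_a(y) = 0.5
affine_log = lambda y: 3.0 + 2.0 * math.log(y)    # a + b*u(y) with b > 0

for y in (1.0, 2.0, 4.0):
    print(y,
          absolute_risk_aversion(log_utility, y),
          absolute_risk_aversion(cara, y),
          absolute_risk_aversion(affine_log, y))
```

Up to approximation error, the first and third columns coincide (an increasing affine transformation does not change the measure), and the exponential utility has a constant coefficient.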

The Arrow-Pratt measure of absolute risk aversion can be used not only
to compare the risk attitudes of two different individuals; it also
tells us how risk attitudes change with the initial wealth level of an
individual. Consider a single decision maker with a concave Bernoulli
utility function u and initial wealth w. In addition to this wealth, the
decision maker faces some uncertainty. This is captured by a lottery of
the form
\[ L_w = (\pi_1 \circ (w + a_1), \ldots, \pi_n \circ (w + a_n)). \]
Note that this lottery is indexed by the initial wealth level. We can
think of it as the combination of a certain level of wealth w plus the
lottery
\[ (\pi_1 \circ a_1, \ldots, \pi_n \circ a_n). \]
The following result shows how the risk-premium changes with the
initial wealth.
Theorem 5. If $R_a^u(y)$ is decreasing (increasing) in y, then for any
w, $\Delta w > 0$, outcomes $a_1, \ldots, a_n$ and probabilities $\pi_1, \ldots, \pi_n$,
\[ P^u(L_w) > (<)\; P^u(L_{w+\Delta w}). \]
This Theorem shows that decreasing absolute risk aversion means
that the DM becomes more tolerant of additive risk if her wealth
increases. Therefore we say that the decision maker exhibits decreasing
(increasing) absolute risk aversion if the Arrow-Pratt measure decreases
(increases) with wealth. Note, however, that increasing absolute risk
aversion is empirically not very plausible.
Proof. We introduce the following auxiliary utility function, which
depends on $\Delta w$:
\[ v^{\Delta w}(x) = u(x + \Delta w). \]
Adding $\Delta w$ to the initial wealth level is equivalent to switching the
utility function from u to $v^{\Delta w}$ while keeping the wealth in $L_w$ constant:
\begin{align*}
U(L_{w+\Delta w}) &= \sum_i \pi_i\, u(w + a_i + \Delta w) \\
&= \sum_i \pi_i\, v^{\Delta w}(w + a_i) \\
&= V^{\Delta w}(L_w).
\end{align*}
Hence we have
\[ P^u(L_{w+\Delta w}) = P^{v^{\Delta w}}(L_w), \]
and
\[ R_a^u(y + \Delta w) = R_a^{v^{\Delta w}}(y). \]
This implies that if $R_a^u$ is decreasing, then $v^{\Delta w}$ captures less risk
averse preferences than u. In particular we have
\[ P^u(L_{w+\Delta w}) = P^{v^{\Delta w}}(L_w) \le P^u(L_w). \]

But that means that the risk premium of the decision maker for $L_w$ is
decreasing in w.
The argument for the opposite case of increasing absolute risk aver-
sion is very similar. □
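The effect of wealth on the risk premium can be checked numerically for logarithmic utility, whose absolute risk aversion 1/y is decreasing. This is a sketch under illustrative assumptions: the helper `risk_premium`, the additive gamble of plus or minus 10 and the three wealth levels are chosen for the example, and the risk premium is computed as expected final wealth minus the certainty equivalent.

```python
import math

def risk_premium(u, u_inv, wealth, gains, probs):
    """Expected final wealth minus certainty equivalent for the lottery L_w."""
    finals = [wealth + a for a in gains]
    mean = sum(p * y for p, y in zip(probs, finals))
    expected_utility = sum(p * u(y) for p, y in zip(probs, finals))
    return mean - u_inv(expected_utility)

gains, probs = [-10.0, 10.0], [0.5, 0.5]        # the same additive risk at every wealth level
premia = [risk_premium(math.log, math.exp, w, gains, probs)
          for w in (20.0, 50.0, 100.0)]
print(premia)
```

The premium shrinks as wealth rises, which is the decreasing case of Theorem 5.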
As we have seen, the measure of absolute risk-aversion can be used
to see how the risk attitudes of an individual change when
she becomes richer but the absolute differences in the wealth levels in
different states of the world remain the same. One might ask a related
question: How does the risk attitude change if wealth changes but the
randomness in the final wealth is proportional to the wealth level? To
formalize this, we consider lotteries of the form
\[ L_w = (\pi_1 \circ (1 + a_1)w, \ldots, \pi_n \circ (1 + a_n)w). \tag{2.4.3} \]
We want to understand if the risk-premium changes in proportion with
w. It turns out that this is related to how the coefficient of relative
risk aversion changes with wealth. It is defined as
\[ R_r(y) = -\frac{y\,u''(y)}{u'(y)}. \]
We have the following result (the proof is omitted):
Theorem 6. If $R_r^u(y)$ is decreasing (increasing, constant) in y,
then for $L_w$ defined in (2.4.3),
\[ \frac{P^u(L_w)}{w} \]
is decreasing (increasing, constant) as a function of w.
Obviously, constant (positive) relative risk-aversion implies decreasing
absolute risk-aversion since $R_r(y) = yR_a(y)$. The functional form for the
Bernoulli utility function with constant relative risk-aversion $R_r(y) = \rho$
is
\[ u(y) = \begin{cases} \log y & \text{if } \rho = 1, \\[4pt] \dfrac{y^{1-\rho}}{1-\rho} & \text{if } \rho \neq 1. \end{cases} \]
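Theorem 6 can be illustrated for constant relative risk aversion: the risk premium for a proportional gamble, divided by wealth, should not depend on wealth. The sketch below is only an illustration; the names `crra` and `relative_premium`, the parameter name `rho` for the coefficient of relative risk aversion, and the gamble of plus or minus 20 percent are assumptions of this example.

```python
import math

def crra(rho):
    """Return (u, u_inv) for CRRA utility; rho = 1 gives log utility."""
    if rho == 1.0:
        return math.log, math.exp
    u = lambda y: y ** (1.0 - rho) / (1.0 - rho)
    u_inv = lambda z: ((1.0 - rho) * z) ** (1.0 / (1.0 - rho))
    return u, u_inv

def relative_premium(rho, wealth, returns, probs):
    """Risk premium of the lottery ((1 + a_i) * wealth) divided by wealth."""
    u, u_inv = crra(rho)
    finals = [(1.0 + a) * wealth for a in returns]
    mean = sum(p * y for p, y in zip(probs, finals))
    ce = u_inv(sum(p * u(y) for p, y in zip(probs, finals)))
    return (mean - ce) / wealth

returns, probs = [-0.2, 0.2], [0.5, 0.5]
for w in (10.0, 1000.0):
    print(w, relative_premium(2.0, w, returns, probs), relative_premium(1.0, w, returns, probs))
```

For each value of `rho` the printed ratio is the same at both wealth levels, up to rounding.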

2.5. An example: Insurance


Suppose a strictly risk-averse person has initial wealth w but would
lose all of it in an event occurring with probability $\pi$. They can
purchase insurance K for a premium of $\gamma K$, where $\gamma$ is the rate at which the
insurance premium is charged. The individual's levels of wealth in good
and bad states are therefore $y_0 = w - \gamma K$ and $y_1 = K - \gamma K$. (Remark:
The budget constraint linking them is $(1-\gamma)y_0 + \gamma y_1 = w(1-\gamma)$.
The relative price of wealth in the two states $y_0$ and $y_1$ is set by the
rate of premium $\gamma$ and the structure of the choice problem is like a
demand problem where the individual has an endowment w of $y_0$ of
which they are a net seller.) Choosing an insured amount K implies
that the decision maker's risk is described by the following lottery:
\[ L_K = (\pi \circ (K - \gamma K), (1-\pi) \circ (w - \gamma K)). \]
The decision maker's problem is therefore given by
\[ \max_K U(L_K) = \left[(1-\pi)\,u(w - \gamma K) + \pi\,u(K(1-\gamma))\right]. \]

Assuming that the Bernoulli utility function is differentiable, the first
order condition for an interior solution is
\[ -\gamma(1-\pi)\,u'(w - \gamma K) + (1-\gamma)\pi\,u'(K(1-\gamma)) = 0, \]
or equivalently
\[ \frac{u'(K(1-\gamma))}{u'(w - \gamma K)} = \frac{1-\pi}{\pi}\,\frac{\gamma}{1-\gamma}. \]
The first-order condition is sufficient because u is strictly concave and
hence the objective function in the above maximization problem is
concave.
If $\pi = \gamma$ then insurance is actuarially fair in the sense that the
expected monetary return from taking out an insurance contract is
zero and hence the expected profit of the insurance company is zero. In
this case we have
\[ u'(K(1-\gamma)) = u'(w - \gamma K) \]
from the FOC. Given concavity, $u''(\cdot) < 0$, the solution to the first order
condition implies $K(1-\gamma) = w - \gamma K$, or w = K, which means that
full insurance is optimal. (All points on the budget constraint imply
the same expected wealth and the risk averse individual will choose the
unique point at which risk is eliminated.)
If $\gamma > \pi$ then insurance is less than actuarially fair but a risk
averse consumer may still be willing to pay to reduce some but not all
of the risk they would face in the uninsured state. From the first-order
condition we get
\[ u'(K(1-\gamma)) > u'(w - \gamma K). \]
Since u is strictly concave this implies that
\[ K(1-\gamma) < w - \gamma K \]
and hence $K^* < w$. We see that full insurance is no longer optimal.
Reducing risk by increasing the insured value also decreases the expected
value of the decision maker's wealth. For small amounts of risk the
decision maker is not willing to incur this decrease in expected wealth.
Instead, she is willing to tolerate a small amount of risk even if $\gamma$ is very
close to $\pi$. Intuitively, for small amounts of risk, the decision maker is
almost risk-neutral.
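The two cases can be reproduced numerically by solving the first-order condition for K. The following is a minimal sketch, assuming logarithmic utility (so marginal utility is 1/y) and a simple bisection on the interval from 0 to w; the function name `optimal_insurance`, its arguments and the numbers are hypothetical choices for this illustration.

```python
def optimal_insurance(wealth, loss_prob, rate, u_prime, tol=1e-10):
    """Insured amount K in [0, wealth] solving the first-order condition by bisection."""
    def foc(k):
        # Derivative of expected utility with respect to K.
        return (-rate * (1 - loss_prob) * u_prime(wealth - rate * k)
                + (1 - rate) * loss_prob * u_prime(k * (1 - rate)))

    lo, hi = 1e-9, wealth
    if foc(hi) >= 0:          # expected utility still non-decreasing at full cover
        return hi
    while hi - lo > tol:      # foc is decreasing in K because u is concave
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

log_marginal_utility = lambda y: 1.0 / y      # u(y) = log y, assumed for the example

fair = optimal_insurance(100.0, 0.10, 0.10, log_marginal_utility)
unfair = optimal_insurance(100.0, 0.10, 0.15, log_marginal_utility)
print(fair, unfair)
```

With a fair premium rate the solver returns full cover, K = w. With the higher rate the insured amount falls to about two thirds of wealth in this example.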

Now let us check how the insured amount depends on the initial
wealth. Let $\alpha = K/w$ denote the fraction of insured wealth. We have
\begin{align*}
& \frac{u'(K(1-\gamma))}{u'(w - \gamma K)} = \frac{1-\pi}{\pi}\,\frac{\gamma}{1-\gamma} \\
\iff\; & \frac{u'(\alpha w(1-\gamma))}{u'(w(1-\gamma\alpha))} = \frac{1-\pi}{\pi}\,\frac{\gamma}{1-\gamma}.
\end{align*}
We want to compute $\partial\alpha/\partial w$. To shorten the expressions, write
$y_1 = \alpha w(1-\gamma)$ and $y_0 = w(1-\gamma\alpha)$ for wealth in the bad and the
good state. Using the implicit function theorem we have
\begin{align*}
\frac{\partial\alpha}{\partial w}
&= -\frac{\dfrac{\alpha(1-\gamma)u''(y_1)u'(y_0) - (1-\gamma\alpha)u'(y_1)u''(y_0)}{(u'(y_0))^2}}
         {\dfrac{w(1-\gamma)u''(y_1)u'(y_0) + w\gamma u'(y_1)u''(y_0)}{(u'(y_0))^2}} \\
&= -\frac{\alpha(1-\gamma)u''(y_1)u'(y_0) - (1-\gamma\alpha)u'(y_1)u''(y_0)}
         {w(1-\gamma)u''(y_1)u'(y_0) + w\gamma u'(y_1)u''(y_0)} \\
&= -\frac{\alpha(1-\gamma)\dfrac{u''(y_1)}{u'(y_1)} - (1-\gamma\alpha)\dfrac{u''(y_0)}{u'(y_0)}}
         {w(1-\gamma)\dfrac{u''(y_1)}{u'(y_1)} + w\gamma\dfrac{u''(y_0)}{u'(y_0)}} \\
&= \frac{1}{w^2}\,\frac{-w\alpha(1-\gamma)R_a(y_1) + w(1-\gamma\alpha)R_a(y_0)}
                      {(1-\gamma)R_a(y_1) + \gamma R_a(y_0)} \\
&= -\frac{1}{w^2}\,\frac{R_r(y_1) - R_r(y_0)}
                       {(1-\gamma)R_a(y_1) + \gamma R_a(y_0)}.
\end{align*}
The denominator is positive. Less than full insurance ($\alpha < 1$) implies
that
\[ \alpha w(1-\gamma) < w(1-\gamma\alpha). \]
Hence if $R_r(y)$ is decreasing, the numerator $R_r(y_1) - R_r(y_0)$ is positive
and $\partial\alpha/\partial w < 0$.
Decreasing relative risk aversion implies that the fraction of insured
wealth is decreasing in w. Conversely, increasing relative risk aversion
implies that the fraction of insured wealth is increasing in w.
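The borderline case of constant relative risk aversion can be worked out in closed form. The derivation below is an illustration under the assumption that marginal utility is $u'(y) = y^{-\rho}$ with $\rho > 0$; here $\gamma$ denotes the premium rate, $\pi$ the loss probability and $\alpha = K/w$ the insured fraction, while m and k are shorthand introduced only for this example.

```latex
% Assumption for this example only: u'(y) = y^{-\rho} with \rho > 0 (CRRA).
\begin{align*}
\left(\frac{\alpha w(1-\gamma)}{w(1-\gamma\alpha)}\right)^{-\rho}
  &= \frac{1-\pi}{\pi}\,\frac{\gamma}{1-\gamma} =: m
  && \text{(first-order condition)} \\
\frac{\alpha(1-\gamma)}{1-\gamma\alpha}
  &= m^{-1/\rho} =: k
  && \text{(wealth $w$ cancels)} \\
\alpha &= \frac{k}{1-\gamma+\gamma k}
  && \text{(solve for $\alpha$)}
\end{align*}
```

The insured fraction does not depend on w, so the derivative with respect to wealth is zero, in line with a constant coefficient of relative risk aversion. With a fair premium, m = k = 1 and the formula gives full insurance.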

2.6. Concluding Remarks


There is some evidence that people are risk-averse even for gambles
that are very small compared to their lifetime wealth. It is hard to
explain this using a differentiable utility function because the implied
concavity would be so high that it leads to implausible risk-premia
for larger gambles (see Rabin (2000)). Instead, risk aversion for small
gambles could be explained by a Bernoulli utility function with a (con-
cave) kink. Around the kink a decision maker is willing to reduce the
expected value of final wealth to avoid risk. The problem with this ex-
planation is that it does not work very well if we observe risk-aversion
for small gambles at many di↵erent wealth levels. In this case we would

have to assume that there are many kinks which is mathematically im-
possible if we want to maintain that u0 > 0. Instead one could abandon
the assumption that the decision maker globally optimizes all decisions
in order to maximize a single utility function. Instead one could as-
sume that she looks at narrow choice situations individually (narrow
bracketing). The leading theory using this feature is prospect theory
proposed by Kahneman and Tversky (1979). In this theory, a decision
maker has a reference point y0 and evaluates losses, i.e., y < y0, differently
from gains y > y0. This leads to a kink in the utility function at
y0. Various theories have been proposed for how the reference point y0 is
formed. Quite recently, Kőszegi and Rabin (2006) proposed a theory
where reference points are based on expectations of future outcomes, which
has found many applications.
CHAPTER 3

Producer Theory

3.1. Technology
The neoclassical model of production that we study here treats
a firm as a black box that employs a given technology in order to
maximize profits. This abstraction ignores that a firm is composed of
many individuals. In reality, the owners of a firm could be assumed
to be interested in profit maximization, but the management may have
different interests and aligning the incentives of employed managers
with those of the owners often involves additional costs. Likewise,
workers need to be provided with incentives to work. These issues will
be studied in the second part of this course and form a central part of
modern economic theory. Here we abstract from all this and briefly
introduce a simple model of production.
We look at the somewhat special case in which there is a single
output good and n input goods and the roles of inputs and outputs do
not change across production plans. We denote output by $y \ge 0$ and
the input vector by $x = (x_1, \ldots, x_n)$, $x_i \ge 0$. The technology of the
firm is described by a set $Y \subset \mathbb{R}^{n+1}_+$ with elements $(y, x_1, \ldots, x_n)$. The
elements of Y are the feasible production plans. If $(y, x) \notin Y$ then it is
technologically infeasible to produce an amount y of the output with
input quantities given by x. It is more convenient to work with the
production function, which denotes the maximal output that can be
produced for any vector of inputs x:
\[ f(x) = \max y \quad \text{s.t.} \quad (y, x) \in Y. \]

We assume that f satisfies f(0) = 0 and that it is continuous,
strictly increasing and strictly quasi-concave.¹ There is a lot of
similarity between consumer theory and producer theory. The production
function f(x) can be viewed as the analogous object to the utility
function u(x).
The set of input combinations producing exactly the same output
is known as an isoquant, $Q(y) = \{x \ge 0 \mid f(x) = y\}$. Its slope (in
absolute value) is known as the marginal rate of technical substitution
\[ MRTS_{ij} = -\left.\frac{dx_j}{dx_i}\right|_{f(x)=y} = \frac{\partial f/\partial x_i}{\partial f/\partial x_j}. \]
(Note the similarity of an isoquant to an indifference curve in consumer
theory.)

¹ In order to model fixed costs of production, we may sometimes assume that
f(x) = 0 for a neighbourhood of x = 0 and otherwise strictly increasing. Later we
will also sometimes assume that f(x) is strictly concave instead of strictly
quasi-concave.
One difference between a utility function and a production function
is that the utility function can be transformed by arbitrary monotonic
transformations without changing the preference relation, whereas a
transformation of f changes the production technology. y = f(x) is
measured in units of output which have a concrete meaning as opposed
to utility which is ordinal and the units are meaningless.
Returns to scale are concerned with the feasibility of scaling up and
down production plans. A production function that is homogeneous of
degree one has constant returns to scale (CRS). If a proportional
increase in all inputs leads to a more than proportional increase in
output we have increasing returns to scale (IRS):
f (tx) > tf (x) for all t > 1, x.
In the opposite case we have decreasing returns to scale (DRS):
f (tx) < tf (x) for all t > 1, x.
Note that these concepts are defined here as global conditions (for
all x). Of course there can also be production technologies that have
increasing returns at some scales and decreasing returns at other scales.
For example, scaling up the size of an operation could initially increase
efficiency which leads to increasing returns to scale at small scale. But
these gains may be diminishing once the firm reaches a certain size.
Increasing the scale further may lead to organizational problems inside
the firm which could lead to decreasing returns to scale for very large
operations. There are many other possibilities and for the purpose of
the general theory we do not want to impose restrictions on the shape
of the production function.
It is clear that returns to scale are affected by a transformation of
f. There is no analogous concept of returns to scale for utility.
Note that homogeneity of degree larger (smaller) than one implies
IRS (DRS) but the converse is not necessarily true. Similarly, if there
is a single input and f(0) = 0, strict concavity of f implies DRS and
strict convexity implies IRS.
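The definitions can be tried out on a Cobb-Douglas production function, which is homogeneous of degree equal to the sum of its exponents. The sketch below compares f(tx) with t f(x) at a single input vector and a single scaling factor, so it is only a local check and not a proof of the global property; the function names and the numbers are assumptions of this example.

```python
def cobb_douglas(x, exponents):
    """f(x) = product of x_i ** a_i; homogeneous of degree sum(a_i)."""
    out = 1.0
    for xi, ai in zip(x, exponents):
        out *= xi ** ai
    return out

def returns_to_scale(f, x, t=2.0, tol=1e-9):
    """Compare f(t x) with t f(x) at one input vector x and one t > 1."""
    scaled = f([t * xi for xi in x])
    if abs(scaled - t * f(x)) <= tol:
        return "CRS"
    return "IRS" if scaled > t * f(x) else "DRS"

x = [2.0, 8.0]
for exponents in [(0.5, 0.5), (0.3, 0.3), (0.8, 0.8)]:
    f = lambda inputs, e=exponents: cobb_douglas(inputs, e)
    print(exponents, returns_to_scale(f, x))
```

Exponents summing to one give constant returns, a sum below one gives decreasing returns and a sum above one gives increasing returns.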

3.2. Cost Minimization


The assumption that firms maximize profits implies that they
produce a given level of output at minimal cost. We assume that the firm
buys inputs in competitive markets, i.e., it acts as a price taker with
respect to input markets. Considering cost minimization is useful to
study competitive firms, but also firms with competitive input markets
that have market power in the output market.
We denote input prices or factor prices by $w \gg 0$.² For given
output y, the optimal inputs solve the cost minimization problem:
\[ c(y, w) = \min_x \; w^\top x \quad \text{s.t.} \quad f(x) = y. \]
First order conditions require $\lambda\,\partial f/\partial x_i = w_i$, where $\lambda$ is the
Lagrange multiplier on the output constraint, and therefore imply
equality between the marginal rate of technical substitution and input price
ratio, $MRTS_{ij} = w_i/w_j$. The cost minimising input quantities are
known as the conditional factor demands, z(y, w), and the function
giving minimum cost given y and w is known as the cost function,
$c(y, w) = w^\top z(y, w)$.
The cost minimisation problem is formally identical to the con-
sumer's problem of minimising the expenditure required to reach a
given utility (if inputs are identified with commodities consumed, out-
put with utility and input prices with commodity prices). Results
derived for that problem can therefore be adapted to the current con-
text. Strict quasi-concavity of the production function guarantees that
conditional factor demands are unique because the set $\{x \mid f(x) \ge y\}$
is strictly convex. The cost function is strictly increasing in y, and
(weakly) increasing, homogeneous of degree one, and concave in w.
Furthermore, Shephard's lemma states that its derivatives equal the
conditional factor demands, $\partial c(y, w)/\partial w_i = z_i(y, w)$. The conditional
factor demands are therefore homogeneous of degree zero in w and
obey symmetry and negativity conditions.
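These properties can be verified numerically for a two-input Cobb-Douglas technology. The sketch below assumes the production function x1^a x2^b with a = 0.3 and b = 0.5, for which the conditional factor demands follow in closed form from MRTS = w1/w2 and the output constraint; the function names and parameter values are hypothetical and serve only as an illustration.

```python
def conditional_demands(y, w1, w2, a=0.3, b=0.5):
    """Cost-minimising inputs for f(x) = x1**a * x2**b at output y."""
    s = a + b
    z1 = y ** (1 / s) * (a * w2 / (b * w1)) ** (b / s)
    z2 = y ** (1 / s) * (b * w1 / (a * w2)) ** (a / s)
    return z1, z2

def cost(y, w1, w2, a=0.3, b=0.5):
    """Cost function c(y, w) = w1 * z1 + w2 * z2."""
    z1, z2 = conditional_demands(y, w1, w2, a, b)
    return w1 * z1 + w2 * z2

y, w1, w2, h = 2.0, 1.0, 2.0, 1e-6
z1, z2 = conditional_demands(y, w1, w2)
shephard_1 = (cost(y, w1 + h, w2) - cost(y, w1 - h, w2)) / (2 * h)   # numerical dc/dw1
print(z1 ** 0.3 * z2 ** 0.5)                      # reproduces the output level y
print(shephard_1, z1)                             # Shephard's lemma: these agree
print(cost(y, 2 * w1, 2 * w2), 2 * cost(y, w1, w2))   # homogeneity of degree one in w
```

The numerical derivative of the cost function with respect to the first input price matches the first conditional factor demand, and doubling both input prices doubles cost.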
3.2.1. Cost functions. In the case of constant returns to scale,
the cost function and conditional factor demands are proportional to
y, $c(y, w) = y\,\kappa(w)$. Marginal and average cost are equal at all y to
the unit cost function $\kappa(w)$.
For non-increasing returns to scale, it is always possible to scale
down production so average cost cannot be higher at lower levels of
output. Average cost c(y)/y is therefore non-decreasing, which, by
simple calculus, implies that marginal cost $dc/dy$ is at least as large
as average cost $c/y$.
For non-decreasing returns to scale, average cost c(y)/y is non-
increasing, which implies that marginal cost $dc/dy$ is no larger than
average cost $c/y$.
Other technologies can give various shapes to average and marginal
cost curves. Simple calculus establishes the useful result that the av-
erage cost curve is always flat where average cost equals marginal cost
since
\[ \frac{d(c/y)}{dy} = \frac{1}{y}\left(\frac{dc}{dy} - \frac{c}{y}\right). \]
² Remember that this means that $w_i > 0$ for all i.

The lowest output level y at which average cost reaches a minimum is
known as the minimum efficient scale.
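A simple worked example may help. Suppose, purely for illustration, that the cost of producing y > 0 is a fixed cost F > 0 plus a quadratic variable cost; this cost function is an assumption of the example and is not derived from a production function in these notes.

```latex
% Hypothetical cost function for illustration: c(y) = F + y^2 for y > 0.
\begin{align*}
AC(y) &= \frac{c(y)}{y} = \frac{F}{y} + y, \qquad MC(y) = \frac{dc}{dy} = 2y, \\
\frac{d\,AC(y)}{dy} &= 1 - \frac{F}{y^2}
   = \frac{1}{y}\left(2y - \frac{F}{y} - y\right)
   = \frac{1}{y}\bigl(MC(y) - AC(y)\bigr), \\
\frac{d\,AC(y)}{dy} &= 0 \iff y = \sqrt{F},
   \qquad AC(\sqrt{F}) = MC(\sqrt{F}) = 2\sqrt{F}.
\end{align*}
```

Average cost falls for output below the square root of F and rises above it, so in this example the minimum efficient scale is the square root of F, where marginal and average cost coincide.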
In the following, we want to see how the efficient scale of production
for a given good determines the number of firms in a simple model of
market entry.
Consider the market for a single output y: Market demand for the
output is given by the demand function D(p) > 0. D(p) is the quantity
demanded at price p. We assume that
. D(p) is strictly decreasing,
. $D(p) \to 0$ as $p \to \infty$,
. D(p) ! as p ! 0.
We denote the inverse demand curve by $p(y) = D^{-1}(y)$. p(y) is the
price that leads to a demand equal to y.
On the producer side, we assume that there is a freely available
production technology with cost function c(y, w) that can be used by
any firm that wants to start producing a given output y. Let us suppose
that there are no costs associated with market entry or that these costs
are small. All factor prices are assumed to be fixed and given by the
vector w. Let us also assume for simplicity that there is a unique
efficient scale $y^{\text{eff}}(w)$ and denote the average cost at the efficient scale
by
\[ AC^{\text{eff}}(w) = \frac{c(y^{\text{eff}}(w), w)}{y^{\text{eff}}(w)}. \]
To simplify the analysis, let us assume that there is some integer $N^* > 0$
such that
\[ D\left(AC^{\text{eff}}(w)\right) = N^* \times y^{\text{eff}}(w). \]
We want to study a competitive equilibrium in this market.
Definition 17. Consider any price p > 0, number of active firms
N, and production levels $y_1, \ldots, y_N$. $(p, N, y_1, \ldots, y_N)$ is a competitive
equilibrium if the following three conditions hold:
(1) Market Clearing:
\[ \sum_{i=1}^{N} y_i = D(p). \]
(2) Profit Maximization: each active firm maximizes profits taking
the equilibrium price as given:
\[ y_i \in \arg\max_y \; py - c(y, w). \]
(3) No entry: No inactive firm can become active, produce
y > 0, and make a positive profit:
\[ py - c(y, w) \le 0 \quad \forall y > 0. \]
The following result identifies a unique competitive equilibrium.

Theorem 7. There is a unique competitive equilibrium. In this
equilibrium $N^*$ firms are active, each produces at the efficient scale,
$y_1 = \ldots = y_{N^*} = y^{\text{eff}}(w)$, and the equilibrium price is $p = AC^{\text{eff}}(w)$.
Proof. We first show that $(p = AC^{\text{eff}}(w), N^*, y^{\text{eff}}(w), \ldots, y^{\text{eff}}(w))$
is a competitive equilibrium. Market clearing holds since $N^*$ is defined
above as the number of firms for which
\[ D\left(AC^{\text{eff}}(w)\right) = N^* \times y^{\text{eff}}(w). \]
Profit Maximization: The average cost at the minimum efficient
scale is
\[ AC^{\text{eff}}(w) = \frac{c(y^{\text{eff}}(w), w)}{y^{\text{eff}}(w)}. \]
Rearranging we get
\[ AC^{\text{eff}}(w)\,y^{\text{eff}}(w) - c(y^{\text{eff}}(w), w) = 0, \]
which means that the profit from producing $y = y^{\text{eff}}(w)$ when the price
is $p = AC^{\text{eff}}(w)$ is zero. Since there is a unique minimum efficient
scale (by assumption), any $\hat{y} > 0$ with $\hat{y} \neq y^{\text{eff}}(w)$ leads to a higher
average cost $\widehat{AC} > AC^{\text{eff}}(w) = p$. But if the average cost is higher than
the price, profits are negative. Therefore $y = y^{\text{eff}}(w)$ maximizes profits.
No entry: We have already shown that with price $p = AC^{\text{eff}}(w)$,
no firm can achieve a positive profit with any output level y > 0. Hence
the no-entry condition is satisfied.
Next we show that the equilibrium we found is unique. First, let
us verify that the equilibrium price has to be $p = AC^{\text{eff}}(w)$. For
$p > AC^{\text{eff}}(w)$, producing $y^{\text{eff}}(w)$ leads to a positive profit
\[ y^{\text{eff}}(w)\,p - c(y^{\text{eff}}(w), w) = y^{\text{eff}}(w)\left(p - AC^{\text{eff}}(w)\right) > 0. \]
Hence the no-entry condition is violated and $p > AC^{\text{eff}}(w)$ cannot
hold for an equilibrium. Conversely, $p < AC^{\text{eff}}(w)$ implies that any
production level has an average cost above the price, hence profit max-
imization implies that all firms choose y = 0, i.e. they are inactive.
But since D(p) > 0 by assumption, this violates market-clearing.
Hence the equilibrium price must be $p = AC^{\text{eff}}(w)$. We have
shown above that at this price, the unique profit maximizing quantity
for an active firm is $y^{\text{eff}}(w)$. Hence all active firms must produce
$y^{\text{eff}}(w)$. Finally, market clearing requires that exactly $N^*$ firms are
active. □
This simple model tells us that the number of firms active in a
market should be inversely related to the minimum efficient scale. In
industries where efficient production requires a large scale, it is more
likely to see just a few firms in the market. In this case, the assumption
of price-taking is no longer reasonable and we should use a model of
oligopoly which will be introduced in the second part of the course.
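The inverse relation between efficient scale and the number of firms can be seen in a small numerical sketch. It assumes, for illustration only, the cost function c(y) = F + y^2 for y > 0 and the demand function D(p) = A/p; the names `free_entry_equilibrium` and `best_entrant_profit` and the parameter values (chosen so that the number of firms is an integer) are hypothetical.

```python
import math

def free_entry_equilibrium(fixed_cost, demand_scale):
    """Price, output per firm and number of firms for c(y) = F + y**2, D(p) = A / p."""
    y_eff = math.sqrt(fixed_cost)             # minimises average cost F/y + y
    price = fixed_cost / y_eff + y_eff        # p equals average cost at the efficient scale
    firms = (demand_scale / price) / y_eff    # market clearing: N * y_eff = D(p)
    return price, y_eff, firms

def best_entrant_profit(price, fixed_cost, grid_size=2000, y_max=10.0):
    """Largest profit p*y - c(y) over a grid of positive outputs (no-entry check)."""
    outputs = (y_max * k / grid_size for k in range(1, grid_size + 1))
    return max(price * y - (fixed_cost + y * y) for y in outputs)

price, y_eff, firms = free_entry_equilibrium(fixed_cost=4.0, demand_scale=80.0)
print(price, y_eff, firms)
print(best_entrant_profit(price, 4.0))        # no output level earns a positive profit

_, _, firms_large_scale = free_entry_equilibrium(fixed_cost=16.0, demand_scale=80.0)
print(firms_large_scale)                      # a larger efficient scale supports fewer firms
```

Raising the fixed cost raises the efficient scale and the equilibrium price, and both effects reduce the number of firms the market can support.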

3.2.2. Profit maximisation. Next, we consider the optimal scale
of production, which depends on the price of output p. We look at the
case that the output market is competitive and the firm acts as a price
taker. The firm maximizes profits:
\[ \max_{(y,x)} \; py - w^\top x \quad \text{s.t.} \quad y \le f(x). \]
Given that f is strictly increasing, the constraint will be satisfied with
equality at the optimal solution. Therefore we have:
\[ \max_x \; pf(x) - w^\top x. \]
If we substitute the optimal input choice for every y we obtain instead:
\[ \max_y \; py - c(y, w). \]
The optimal solution yields the output supply function y(p, w) and
input demand functions x(p, w), and the profit function is given by
\[ \pi(p, w) = p\,y(p, w) - w^\top x(p, w). \]
Note that the profit maximization problem differs from the utility
maximization problem. The objective is not to maximize output (which
would correspond to utility in the UMP) but profit. Also there is no
budget constraint. We will see later that this eliminates income effects
so that, for example, the substitution matrix for input demand func-
tions is symmetric (remember, the substitution matrix of Marshallian
demand is not symmetric).
In the context of utility maximization, we used continuity of the
utility function to establish the existence of a solution and quasi-concavity
to show that the optimal solution is unique. For profit maximization,
neither result follows from continuity and quasi-concavity. Let us
consider the uniqueness of the solution first. Notice that strict quasi-
concavity of f does not guarantee that $pf(x) - w^\top x$ is strictly quasi-
concave. Therefore, in the following, we assume strict concavity of
f to ensure uniqueness.
Assuming a solution exists with finite $y \neq 0$ and $x \gg 0$, then it
satisfies the first order conditions
\[ p\,\frac{\partial f}{\partial x_i} - w_i = 0. \]
Hence, the marginal rate of technical substitution between any two inputs
must be equal to the input price ratio,
\[ MRTS_{ij} = \frac{w_i}{w_j}, \]
and the marginal product $\partial f/\partial x_i$ of each input must equal the ratio
between the input price and the output price:
\[ \frac{\partial f}{\partial x_i} = \frac{w_i}{p}. \]

Using similar arguments as in the optimization problems in con-


sumer theory we can show that the profit function has the properties
that it is
. increasing in p,
. decreasing in w,
. homogeneous of degree one:
\[ \pi(\lambda p, \lambda w) = \lambda\,\pi(p, w), \]
. convex in (p, w),
. Hotelling's Lemma holds:
\[ \frac{\partial\pi(p, w)}{\partial p} = y(p, w) \quad \text{and} \quad \frac{\partial\pi(p, w)}{\partial w_i} = -x_i(p, w) \]
for all i = 1, . . . , n.
The last property implies that output supply and input demands are
(up to sign) the derivatives of the profit function. (Note the difference
to utility maximization and Roy's identity.)
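Hotelling's lemma and homogeneity can be checked numerically in a one-input example. The sketch assumes the production function f(x) = x^a with a = 0.5, for which the first-order condition gives the input demand in closed form; the function names, prices and step size are choices made for this illustration.

```python
def input_demand(p, w, a=0.5):
    """x(p, w) from the first-order condition p * a * x**(a - 1) = w."""
    return (a * p / w) ** (1.0 / (1.0 - a))

def supply(p, w, a=0.5):
    """y(p, w) = f(x(p, w)) for f(x) = x**a."""
    return input_demand(p, w, a) ** a

def profit(p, w, a=0.5):
    """Profit function pi(p, w) = p * y(p, w) - w * x(p, w)."""
    return p * supply(p, w, a) - w * input_demand(p, w, a)

p, w, h = 4.0, 1.0, 1e-6
dpi_dp = (profit(p + h, w) - profit(p - h, w)) / (2 * h)   # numerical derivative in p
dpi_dw = (profit(p, w + h) - profit(p, w - h)) / (2 * h)   # numerical derivative in w
print(dpi_dp, supply(p, w))            # derivative in p agrees with output supply
print(dpi_dw, -input_demand(p, w))     # derivative in w agrees with minus input demand
print(profit(2 * p, 2 * w), 2 * profit(p, w))   # homogeneity of degree one
```

The derivative of profit with respect to the output price reproduces supply, and the derivative with respect to the input price reproduces input demand with a negative sign.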
For supply and demand functions we obtain
. homogeneity of degree zero:
\[ y(tp, tw) = y(p, w) \quad \text{and} \quad x(tp, tw) = x(p, w), \]
. and symmetry and positive semi-definiteness of the sub-
stitution matrix:
\[
\begin{pmatrix}
\dfrac{\partial y(p,w)}{\partial p} & \dfrac{\partial y(p,w)}{\partial w_1} & \cdots & \dfrac{\partial y(p,w)}{\partial w_n} \\[6pt]
-\dfrac{\partial x_1(p,w)}{\partial p} & -\dfrac{\partial x_1(p,w)}{\partial w_1} & \cdots & -\dfrac{\partial x_1(p,w)}{\partial w_n} \\
\vdots & \vdots & \ddots & \vdots \\
-\dfrac{\partial x_n(p,w)}{\partial p} & -\dfrac{\partial x_n(p,w)}{\partial w_1} & \cdots & -\dfrac{\partial x_n(p,w)}{\partial w_n}
\end{pmatrix}.
\]
Note that this is an $(n+1) \times (n+1)$ matrix that also includes derivatives
of y and derivatives w.r.t. the output price. The latter result follows
from convexity of the profit function and Hotelling's lemma (using the
same argument that is used to show the corresponding properties of
the Slutsky matrix).
As a particular implication of this, $\partial y/\partial p \ge 0$, so supply functions
must slope up in own prices (the law of supply), and conversely,
$\partial x_i(p, w)/\partial w_i \le 0$. An increase in the price of an output increases its
supply and an increase in an input price reduces its demand. (Note that this
result needs no qualification regarding compensation since there is no
budget constraint associated with the producer problem so there are
no income effects.)
Now let us turn to the problem of existence of an optimal solution
to the profit maximization problem. So far we have assumed that an
optimal solution exists. This is not always guaranteed because profits
may grow without bound as y approaches infinity. Since there is no budget
constraint, the scale of the firm is not limited and if returns to scale
are increasing it may be the case that there is no optimal finite scale. To
see this most clearly, let us consider the first-order condition for y if
we substitute the cost function in the profit function:
\[ \max_y \; py - c(y, w). \]
The first order condition is that the output price equals marginal cost:
\[ p = \frac{dc(y, w)}{dy}. \]
Under constant returns to scale we have
\[ c(y, w) = y\,\kappa(w), \]
that is, the marginal cost $\kappa(w)$ is independent of y. If $\kappa(w) < p$,
this means that there is no optimal production plan because any pos-
itive output yields strictly positive profit and increasing y lets profits
increase without bound.
In the knife-edge case that the marginal cost satisfies $\kappa(w) = p$, profits
are equal to zero for every y and there is an infinite number of optimal
production plans. The optimal scale of production is not determined.
(Notice that constant returns to scale violate strict concavity of the
production function, which we used above to argue that an optimal
solution (if it exists) must be unique.)
If there are increasing returns to scale and there exists a produc-
tion level y such that average costs c(y, w)/y are smaller than p, then
producing y yields positive profits and since average costs are decreas-
ing with IRS, profits increase without bound if y is increased further.
Again, there is no optimal production plan.
The possibility of increasing profits without limit will
be avoided if either
. at sufficiently high output levels there are decreasing returns
to scale and marginal costs rise above p,
. at sufficiently high output levels the assumption of price-taking
behaviour does not hold. In this case our model of profit-
maximization simply does not apply and we need to study the
case of monopoly or oligopoly, or
. there are economic forces (such as free entry) which drive input
or output prices to the point where positive profits cannot be
earned.
CHAPTER 4

General Equilibrium

4.1. Exchange economies


Consider a population of N individuals indexed by $i \in I = \{1, \ldots, N\}$,
with individual endowments of goods given by vectors $\omega^i \in \mathbb{R}^n_+$, $i \in I$.
We denote by $\omega = (\omega^1, \ldots, \omega^N)$ the vector of all initial endowment
vectors, and by
\[ \bar\omega = \sum_{i \in I} \omega^i \]
the total (or aggregate) endowment in the whole economy and
assume that $\bar\omega \gg 0$. An allocation x is a vector of consumption
bundles $x^i \in \mathbb{R}^n_+$, $i \in I$, so we have $x \in \mathbb{R}^{nN}_+$. An allocation x is said
to be feasible if the aggregate endowments are sufficient to cover the
total consumption for each good:
\[ \sum_{i \in I} x^i \le \bar\omega. \]
The set of feasible allocations is called $F(\bar\omega)$. The initial endowments
trivially constitute a feasible allocation.
We assume that each individual i has preferences represented by
a utility function $u^i$ that is continuous, strictly increasing, and
strictly quasi-concave.¹

¹ A function f(x) is strictly increasing if $f(x) \ge f(y)$ for $x \ge y$ and $f(x) > f(y)$
for $x \gg y$. Together with strict quasi-concavity, this implies that f(x) > f(y) if
$x \ge y$ and $x \neq y$. This property is called strongly increasing and in the textbook
it is directly assumed that the utility functions $u^i$ are strongly increasing. Strong
monotonicity implies that the function is strictly increasing in each component.

4.2. Barter
Before we consider trade in markets coordinated by a price system,
let us look at outcomes that may arise if individuals meet and exchange
their endowments. If all individual endowments and preferences are
commonly known by everybody, and we are in an ideal situation where
people can meet and bargain at no cost, it would be plausible to expect
that all potential gains from trade are realized. We would expect that a
final allocation would be implemented that cannot be improved upon
unless some individuals are made worse off. Moreover, no individual
or group of individuals could be forced to accept an allocation that
they could improve upon by themselves. The set of allocations that
satisfy these requirements is called the core. To define this formally,
we first define what it means that an allocation can be blocked by a
coalition (that is, a group) of individuals.
Definition 18. A coalition $S \subseteq I$ blocks allocation $x \in F(\bar\omega)$
if there exists an allocation $y \in \mathbb{R}^{nN}_+$ that is
(1) feasible for S:
\[ y^S = \sum_{i \in S} y^i \le \sum_{i \in S} \omega^i = \omega^S, \]
(2) and makes no-one in S worse off and at least one individual in S
better off. Formally:
\[ u^i(y^i) \ge u^i(x^i) \]
for all $i \in S$ with at least one strict inequality.
Now we can define the core:
Definition 19. The core of an exchange economy, denoted
$C(\omega)$, is the set of feasible allocations that are unblocked.
One particular implication of the core is that all allocations x 2
C(!) are Pareto efficient:
Definition 20. An allocation x is Pareto efficient, denoted x 2
P E(!), if x is feasible and there is no other feasible allocation z such
that ui (z i ) ui (xi ) for all i 2 I with strict inequality for at least one
i.
If there exists such an allocation z we say that x is Pareto-dominated
by z and z is called a Pareto-improvement for x. So in other words,
a Pareto efficient allocation is Pareto-undominated.
To see that
(4.2.1) x 2 C(!) ) x 2 P E(!),
note that x 2 / P E(!) implies that x is Pareto-dominated by some
allocation y. But that means that the grand coalition S = I blocks
x, because y satisfies conditions 1 and 2 above for S = I. Hence
x2 / P E(!) implies x 2/ C(!).
Obviously the converse of (4.2.1) is not true. For example, Pareto
efficiency does not guarantee that individuals are at least as well off as
with their initial endowments, which is required by the core. Compare
the contract curve in the Edgeworth box with the core, which is the
part of the contract curve that is contained in the lens-shaped region
between the indifference curves that go through the initial endowments.
Note also that the set of Pareto efficient allocations $PE(\omega)$ only
depends on the aggregate endowment $\bar\omega$ available in the economy. By
contrast, the core $C(\omega)$ depends on $\omega = (\omega^1, \dots, \omega^N)$, i.e., the
individual endowment vectors. Depending on how $\bar\omega$ is allocated across
individuals initially, different Pareto-efficient allocations satisfy the stronger
restrictions of the core.
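
The distinction between the core and the set of Pareto efficient allocations can be checked numerically in a small Edgeworth box. The following is a minimal sketch under assumptions that are not part of the notes: two consumers with Cobb-Douglas utility, hypothetical parameter values, and only interior allocations that use up all resources. With two individuals, the only coalitions are the two singletons (which block if someone prefers the own endowment) and the grand coalition (which blocks if the allocation is not Pareto efficient).

```python
# Hypothetical Edgeworth-box economy (all numbers are illustrative assumptions):
# two consumers with Cobb-Douglas utility u^i(x) = x_1^{a_i} * x_2^{1 - a_i}.
ALPHAS = (0.3, 0.6)
ENDOWMENTS = ((4.0, 1.0), (1.0, 3.0))
TOTAL = (ENDOWMENTS[0][0] + ENDOWMENTS[1][0],
         ENDOWMENTS[0][1] + ENDOWMENTS[1][1])


def utility(alpha, x):
    return x[0] ** alpha * x[1] ** (1.0 - alpha)


def mrs(alpha, x):
    """Marginal rate of substitution between good 1 and good 2."""
    return alpha * x[1] / ((1.0 - alpha) * x[0])


def in_core(x1, tol=1e-9):
    """Core membership of an interior allocation that uses all resources.

    x1 is consumer 1's bundle; consumer 2 receives the remainder.
    """
    x2 = (TOTAL[0] - x1[0], TOTAL[1] - x1[1])
    # A singleton coalition {i} blocks if i strictly prefers the own endowment.
    for alpha, x, omega in zip(ALPHAS, (x1, x2), ENDOWMENTS):
        if utility(alpha, x) < utility(alpha, omega) - tol:
            return False
    # The grand coalition blocks unless the allocation is Pareto efficient,
    # which for interior allocations means equal marginal rates of substitution.
    m1, m2 = mrs(ALPHAS[0], x1), mrs(ALPHAS[1], x2)
    return abs(m1 - m2) <= tol * max(m1, m2)


def contract_curve(x11):
    """Consumer 1's amount of good 2 on the contract curve, given x^1_1."""
    c1 = ALPHAS[0] / (1.0 - ALPHAS[0])
    c2 = ALPHAS[1] / (1.0 - ALPHAS[1])
    return c2 * x11 * TOTAL[1] / (c1 * (TOTAL[0] - x11) + c2 * x11)


print(in_core(ENDOWMENTS[0]))               # the endowment itself
print(in_core((2.0, contract_curve(2.0))))  # efficient point inside the lens
print(in_core((0.2, contract_curve(0.2))))  # efficient point outside the lens
```

The last two allocations both lie on the contract curve, but only the one inside the lens-shaped region passes the individual checks, which is the sense in which the core is a subset of the Pareto efficient allocations.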

4.3. Competitive Trading


The core arises as the set of possible allocations in a thought exper-
iment where all members of the economy can get together and have the
necessary information to reach an outcome that cannot be improved
upon by further exchange (bilateral or multilateral). For a small num-
ber of individuals, the core may seem plausible as the set of potential
solutions but in a large society, it seems infeasible that all individu-
als meet and agree on an allocation that cannot be improved upon or
blocked by any coalition.
Next, we want to investigate allocations that arise if trade is or-
ganized in a decentralised way and the only coordination that takes
place between individuals is achieved through a price vector that is
the same for everybody. If we assume that individuals act as price
takers, i.e., they choose their optimal consumption bundles assuming
that they have no influence on the price, then the decision problem of
an individual is exactly the same as the consumer choice problem with
endowments studied previously in Chapter 1.
Gross demands are given by $f^i(p^\top \omega^i, p)$, and net demands or excess
demands [2] are therefore
$$z^i(p) = f^i(p^\top \omega^i, p) - \omega^i.$$
Aggregate excess demands are given by
$$z_k(p) = \sum_{i \in I} z^i_k(p) = \sum_{i \in I} \left[ f^i_k(p^\top \omega^i, p) - \omega^i_k \right],$$
for each good $k$, and the vector of aggregate excess demand is denoted
$z(p)$.
Definition 21. A price vector $p \gg 0$ is a Walrasian equilibrium
if $z(p) = 0$.
A Walrasian equilibrium is also called competitive equilibrium,
general equilibrium, market equilibrium or price-taking equilibrium.
The definition requires that aggregate excess demand for all
goods is equal to zero. This is equivalent to the condition that gross
demands at those prices constitute a feasible allocation. If trade takes
place at equilibrium prices, perfect coordination is achieved because
every unit of excess demand by some individual finds a matching unit
of excess supply by another individual.

[2] These demands are sometimes referred to as notional demands to convey the
fact that they are the demands that would be expressed by individuals who are
able to realise demands on all markets. If failure of aggregate market clearing
requires some individuals to be rationed on some markets, then this will cause
demands expressed on other markets to deviate from these notional demands. This
distinction is important for the analysis of disequilibrium situations.
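
To make the definitions of gross demand, excess demand and Walrasian equilibrium concrete, here is a minimal sketch for a hypothetical economy that is not taken from the notes: two consumers, two goods, and Cobb-Douglas utility, for which gross demand has a simple closed form. The parameter values and function names are illustrative assumptions.

```python
# Hypothetical economy: u^i(x) = x_1^{a_i} * x_2^{1 - a_i}; numbers are made up.
ALPHAS = [0.3, 0.6]                     # preference parameters a_i
ENDOWMENTS = [(4.0, 1.0), (1.0, 3.0)]   # omega^i = (omega^i_1, omega^i_2)


def gross_demand(alpha, income, p):
    """Cobb-Douglas demand f^i(income, p): budget share alpha goes to good 1."""
    return (alpha * income / p[0], (1.0 - alpha) * income / p[1])


def aggregate_excess_demand(p):
    """z(p) = sum_i [ f^i(p'omega^i, p) - omega^i ], one entry per good."""
    z = [0.0, 0.0]
    for alpha, omega in zip(ALPHAS, ENDOWMENTS):
        income = p[0] * omega[0] + p[1] * omega[1]   # value of the endowment
        x = gross_demand(alpha, income, p)
        z[0] += x[0] - omega[0]
        z[1] += x[1] - omega[1]
    return z


# With Cobb-Douglas utility and p_2 = 1, the condition z_1(p) = 0 can be solved
# by hand: p_1 = sum_i a_i omega^i_2 / sum_i (1 - a_i) omega^i_1.
numerator = sum(a * w[1] for a, w in zip(ALPHAS, ENDOWMENTS))
denominator = sum((1.0 - a) * w[0] for a, w in zip(ALPHAS, ENDOWMENTS))
p_star = (numerator / denominator, 1.0)

print("z at (1, 1):", aggregate_excess_demand((1.0, 1.0)))
print("p* =", p_star, " z(p*) =", aggregate_excess_demand(p_star))
```

At the price vector (1, 1) one market has excess supply and the other excess demand, whereas at the candidate equilibrium price both entries of the aggregate excess demand vector vanish up to rounding error.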

4.4. Walras’ Law


The assumptions made about the utility functions guarantee that
aggregate excess demand is continuous and homogeneous of degree zero,
$z(p) = z(\lambda p)$ for all $\lambda > 0$. [3] Hence, if any price vector $p$ constitutes a Walrasian
equilibrium, then so does any multiple $\lambda p$, $\lambda > 0$. Finding an equilibrium
therefore involves finding a vector of $n - 1$ relative prices which
ensures satisfaction of the $n$ conditions $z_j(p) = 0$, $j = 1, \dots, n$.
At first it might seem that this gives an inadequate number of free
prices to solve the required number of equalities, but this is not so
because of Walras’ Law. Since each individual is on their individual budget
constraint, the value of their excess demand is zero, $p^\top z^i(p) = 0$.
But then the value of aggregate excess demand must also be zero,
$$p^\top z(p) = \sum_i p^\top z^i(p) = 0,$$
which is Walras’ Law. Hence, if prices
can be found to clear only $n - 1$ of the markets, then the clearing of
the remaining market is guaranteed:
$$z_j(p) = 0,\ j = 1, \dots, n - 1 \;\Rightarrow\; z_n(p) = -\frac{1}{p_n} \sum_{k<n} p_k z_k(p) = 0.$$
Note that Walras’ Law holds both
in and out of equilibrium because it follows from the fact that all
consumers spend their entire budget (since preferences are monotonic).
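
Homogeneity of degree zero and Walras’ Law are easy to verify numerically at arbitrary (non-equilibrium) prices. The sketch below assumes a hypothetical two-good Cobb-Douglas economy with made-up parameters; it is an illustration, not part of the formal argument.

```python
# Hypothetical two-consumer, two-good Cobb-Douglas economy (illustrative values).
ALPHAS = [0.3, 0.6]
ENDOWMENTS = [(4.0, 1.0), (1.0, 3.0)]


def excess_demand(p):
    """Aggregate excess demand vector z(p)."""
    z = [0.0, 0.0]
    for alpha, omega in zip(ALPHAS, ENDOWMENTS):
        income = p[0] * omega[0] + p[1] * omega[1]
        z[0] += alpha * income / p[0] - omega[0]
        z[1] += (1.0 - alpha) * income / p[1] - omega[1]
    return z


def value_of_excess_demand(p):
    """p'z(p), which Walras' Law says is zero at every price vector."""
    z = excess_demand(p)
    return p[0] * z[0] + p[1] * z[1]


def implied_last_market(p):
    """z_n(p) recovered from the other markets: -(1/p_n) * sum_{k<n} p_k z_k(p)."""
    z = excess_demand(p)
    return -(p[0] * z[0]) / p[1]


for p in [(0.5, 1.0), (1.0, 1.0), (2.0, 3.0)]:
    scaled = (7.0 * p[0], 7.0 * p[1])       # any lambda > 0 would do
    print(p, excess_demand(p), excess_demand(scaled), value_of_excess_demand(p))
```

None of the three price vectors clears the markets, yet the value of excess demand is zero at each of them, and scaling all prices by a common factor leaves excess demand unchanged.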

4.5. Existence of Equilibrium


We can show that a Walrasian equilibrium exists if aggregate excess
demand is continuous, homogeneous of degree zero, satisfies Walras’
law and satisfies the following condition:
Assumption 2 (Unboundedness of Aggregate Excess Demand).
For any sequence of price vectors $(p^m)$ that converges to a vector of
prices where some prices are positive and at least one is zero, i.e., $p^m \to
\bar p \neq 0$ and $\bar p_k = 0$ for some $k$, there is some good $k'$ with $\bar p_{k'} = 0$ such
that $z_{k'}(p^m)$ is unbounded above.
To see that these conditions, which are fulfilled under our assump-
tions on the utility function, are sufficient to guarantee existence of
equilibria, we first look at the case that n = 2. The proof consists of
five steps:
1. Normalise $p_2 = 1$. As mentioned before, we only need to find
$n - 1$ relative prices and can set one price equal to 1 because
aggregate excess demand is homogeneous of degree zero. (If
$z(p_1, p_2) = 0$ then $z(p_1/p_2, 1) = 0$, so if we can find an equilibrium
with $p_2 \neq 1$ we can also find one where $p_2 = 1$.)
[3] You can find a proof of continuity and homogeneity of degree zero in the textbook.

2. Walras’ law: It is sufficient to find $p_1^* > 0$ such that $z_1(p_1^*, 1) =
0$. Walras’ law guarantees that the market for the second good
is in equilibrium if the aggregate excess demand for the first
good is zero. Hence $z_1(p_1^*, 1) = 0$ implies that $(p_1^*, 1)$ is an
equilibrium.
The remaining steps show that there exists $p_1^*$ such that
$z_1(p_1^*, 1) = 0$.
3. First we show that $z_1(\underline{p}_1, 1) > 0$ for some price $\underline{p}_1 > 0$. This
follows from the unboundedness assumption on aggregate excess
demand. Consider a sequence of prices $p_1^m \to 0$ (where
we hold $p_2 = 1$ constant). By Assumption 2, if some but not
all prices converge to zero, aggregate excess demand must become
arbitrarily large for one of the commodities whose prices
converge to zero. Here, there is only one such good, good 1.
Therefore $z_1(p_1^m, 1)$ becomes arbitrarily large, and we can find
a price in the sequence $p_1^m$ for which $z_1(p_1^m, 1) > 0$ and we can
set $\underline{p}_1 = p_1^m$.
4. Next we show that $z_1(\bar p_1, 1) < 0$ for some price $\bar p_1 > 0$. Again
we use the unboundedness assumption. This time we consider
a sequence of prices for good one that diverges to infinity,
$p_1^m \to \infty$. Consider aggregate excess demand for good two.
By homogeneity we have
$$z_2(p_1^m, 1) = z_2(1, 1/p_1^m),$$
and $1/p_1^m \to 0$. Setting $\tilde p_2^m = 1/p_1^m$ and $\tilde p_1^m = 1$, we can
argue as in step 3 with the roles of good one and two reversed.
Now we have that $z_2(1, \tilde p_2^m)$ becomes arbitrarily large as $m \to
\infty$ (and $\tilde p_2^m \to 0$). Therefore $z_2(p_1^m, 1)$ also becomes
arbitrarily large as $p_1^m \to \infty$, and we can find some $\bar p_1 > 0$ such
that $z_2(\bar p_1, 1) = z_2(1, 1/\bar p_1) > 0$. But if $z_2(\bar p_1, 1) > 0$, Walras’
law implies that $z_1(\bar p_1, 1) < 0$, because otherwise $\bar p_1 z_1(\bar p_1, 1) +
1 \cdot z_2(\bar p_1, 1) > 0$.
5. Finally, since $z_1$ is continuous and takes positive and negative
values, there exists some $p_1^*$ between $\underline{p}_1$ and $\bar p_1$ such that
$z_1(p_1^*, 1) = 0$. This follows from the intermediate value theorem.
To prove existence for general $n$ we cannot reduce the problem to
searching for a single price. The proof becomes much more involved
and makes use of a fixed point theorem. The complete proof can be
found in the textbook by Jehle/Reny (Theorem 5.3).
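
The five steps of the two-good argument translate directly into a bisection search: find a low price with positive excess demand for good 1, a high price with negative excess demand, and then shrink the bracket. The following is a minimal sketch for a hypothetical Cobb-Douglas economy (parameters and function names are assumptions for illustration); it relies on continuity of $z_1$ exactly as the intermediate value theorem does.

```python
# Hypothetical two-consumer, two-good Cobb-Douglas economy (illustrative values).
ALPHAS = [0.3, 0.6]
ENDOWMENTS = [(4.0, 1.0), (1.0, 3.0)]


def z1(p1):
    """Aggregate excess demand for good 1 at prices (p1, 1); step 1 sets p2 = 1."""
    total = 0.0
    for alpha, omega in zip(ALPHAS, ENDOWMENTS):
        income = p1 * omega[0] + omega[1]
        total += alpha * income / p1 - omega[0]
    return total


def find_equilibrium_price(tol=1e-12):
    # Step 3: lower the price until excess demand for good 1 is positive.
    low = 1.0
    while z1(low) <= 0.0:
        low /= 2.0
    # Step 4: raise the price until excess demand for good 1 is negative.
    high = 1.0
    while z1(high) >= 0.0:
        high *= 2.0
    # Step 5: a continuous function with a sign change has a zero in between.
    while high - low > tol:
        mid = 0.5 * (low + high)
        if z1(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


p1_star = find_equilibrium_price()
print("p1* =", p1_star, " z1(p1*, 1) =", z1(p1_star))
```

By step 2 (Walras’ law), clearing the market for good 1 is enough; the market for good 2 then clears automatically. For general $n$ no such one-dimensional search is available, which is why the general proof needs a fixed point theorem.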

4.6. Welfare
Simply looking at the representation of equilibrium in an Edgeworth
box makes clear that neither individual can be made better off without
making the other worse off. [4] Moreover, in the Edgeworth box, the
equilibrium allocation, which is given by
$$x^i(p^*) = z^i(p^*) + \omega^i, \quad i \in I,$$
is in the core. This is true more generally for any number of goods,
and the Pareto-efficiency property of Walrasian equilibria is called the
First Fundamental Theorem of Welfare Economics:
Theorem 8. In the model of an exchange economy considered here,
any Walrasian equilibrium is Pareto efficient.
To see this, consider any $x \in W(\omega)$, where $W(\omega)$ denotes the set of
all Walrasian equilibrium allocations (there may be many), which
depends on the vector of initial endowments $\omega = (\omega^1, \dots, \omega^N)$. Clearly,
$x$ is feasible. To show that it is in the core, we have to show that it is
unblocked; Pareto efficiency then follows from (4.2.1). If there were a
blocking coalition $S \subseteq I$, then the coalition
must propose $y$ such that $y^S$ is feasible for $S$:
$$y^S = \sum_{i \in S} y^i \le \sum_{i \in S} \omega^i = \omega^S.$$
But this means that $y^S$ must also be affordable for $S$ at any price
vector $p \gg 0$, because we can multiply both sides of the inequality by
$p^\top$. In particular, if $p^*$ is the equilibrium price vector that corresponds
to $x$, we must have
$$p^{*\top} y^S \le p^{*\top} \omega^S. \tag{4.6.1}$$
[4] The budget line separates the upper contour sets of the two individuals.
Therefore, moving to an allocation that makes one individual better off must make the other
individual worse off.

Secondly, in order for $S$ to be a blocking coalition, the proposal $y$
must also make no-one in $S$ worse off and at least one individual better
off:
$$u^i(y^i) \ge u^i(x^i)$$
for all $i \in S$, with strict inequality for at least one $i$. But if $u^i(y^i) \ge u^i(x^i)$,
then we cannot have
$$p^{*\top} y^i < p^{*\top} \omega^i. \tag{4.6.2}$$
If (4.6.2) were fulfilled, we could increase the consumption level of any
good in $y^i$ and make $i$ strictly better off. If the increase is small, the
new bundle would still be affordable for $i$ at prices $p^*$. This is not
possible because $x^i$ is $i$’s equilibrium consumption; in particular, it is the
optimal consumption level given prices $p^*$ and the initial endowment
$\omega^i$. So there cannot be a strictly better bundle that is affordable, and
we must have for all $i \in S$:
$$p^{*\top} y^i \ge p^{*\top} \omega^i.$$
Moreover, for consumers $i \in S$ where
$$u^i(y^i) > u^i(x^i),$$
we must have
$$p^{*\top} y^i > p^{*\top} \omega^i,$$
because WARP implies that any bundle $y^i$ that is strictly better than
$i$’s optimal bundle $x^i$ for prices $p^*$ cannot be affordable at prices $p^*$.
These inequalities together imply (adding up over all $i \in S$) that
$$p^{*\top} y^S > p^{*\top} \omega^S.$$
But this contradicts the requirement (4.6.1) that $y$ must be affordable
for $S$. So it is impossible for any coalition to block $x$ and hence
$x \in C(\omega)$.
4.6.1. Finding Pareto-Efficient Allocations and Walrasian
Equilibria. In a Pareto-efficient allocation, the utility of an individual
$i$ must be maximized given the utility levels of all other individuals.
Conversely, if we fix the utility levels for $j \neq i$ and maximize $u^i$, we
obtain a Pareto efficient allocation.
Hence $x \in PE(\omega)$ is equivalent to $x$ being an optimal solution to
$$\max_{x \in F(\omega)} u^i(x^i) \quad \text{s.t.} \quad u^j(x^j) \ge \hat u^j, \ \forall j \neq i,$$
where $\hat u^j$ are feasible utility levels for $j \neq i$. This means that there
exists a feasible allocation $\hat x \in F(\omega)$ such that $u^j(\hat x^j) = \hat u^j$ for all
$j \neq i$. The Lagrangian to this problem is
$$L = u^i(x^i) + \sum_{j \neq i} \mu^j \left( u^j(x^j) - \hat u^j \right) + \sum_{k=1}^{n} \lambda_k \Big( \bar\omega_k - \sum_{j=1}^{N} x^j_k \Big),$$
where $\mu^j$ are the multipliers for the utility constraints and $\lambda_k$ are the
multipliers of the feasibility constraints. Therefore (defining $\mu^i = 1$),
$x$ is also the solution to
$$\max_{x \in F(\omega)} \sum_{i=1}^{N} \mu^i u^i(x^i).$$
Conversely, if we fix arbitrary weights $\mu^1, \dots, \mu^N \ge 0$ (not all zero),
then any solution to the second maximization problem is again a
Pareto-efficient allocation.
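If we consider the last maximization problem:
$$\max_{x \in \mathbb{R}^{nN}_+} \sum_{i=1}^{N} \mu^i u^i(x^i) \quad \text{s.t.} \quad \sum_{i=1}^{N} x^i_k \le \bar\omega_k, \ \forall k,$$
with multipliers $\lambda_k$ for the feasibility constraints, the first-order conditions
are
$$\mu^i \frac{\partial u^i(x^i)}{\partial x^i_k} \le \lambda_k \quad \text{with ``='' if } x^i_k > 0.$$
If utility functions are strictly quasi-concave, first order conditions
are sufficient and the following set of equations (together with the
feasibility constraints) characterize the set of interior solutions: for all
individuals $i, j \in I$ and goods $k, \ell$:
$$MRS^i_{k\ell}(x^{*i}) = \frac{\partial u^i(x^{*i}) / \partial x^i_k}{\partial u^i(x^{*i}) / \partial x^i_\ell} = \frac{\lambda_k}{\lambda_\ell} = \frac{\partial u^j(x^{*j}) / \partial x^j_k}{\partial u^j(x^{*j}) / \partial x^j_\ell} = MRS^j_{k\ell}(x^{*j}).$$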

For example, for the Edgeworth box ($N = n = 2$), we obtain one
equality
$$MRS^1_{12} = MRS^2_{12}$$
and two equalities from the binding feasibility constraints. Since an
allocation is described by four numbers $x^i_k$, this leaves one degree of
freedom, and indeed the contract curve, which describes the
Pareto-efficient allocations, is a one-dimensional object.
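
The weighted-sum problem can be solved in closed form for a simple functional form, which makes it easy to trace out the contract curve by varying the weights. The sketch below assumes log-linear utility $u^i(x) = a_i \ln x_1 + (1 - a_i) \ln x_2$ and strictly positive weights; the utility specification, the numbers and the function names are illustrative assumptions and not part of the notes. With this utility, the first-order condition $\mu^i a^i_k / x^i_k = \lambda_k$ and the binding feasibility constraints give $\lambda_k = \sum_j \mu^j a^j_k / \bar\omega_k$.

```python
# Hypothetical Edgeworth box with log-linear utility (illustrative values).
ALPHAS = [0.3, 0.6]      # weight a_i on good 1 in u^i = a_i ln x_1 + (1 - a_i) ln x_2
TOTAL = (5.0, 4.0)       # aggregate endowment of the two goods


def weighted_optimum(mu):
    """Maximise sum_i mu_i u^i(x^i) subject to the feasibility constraints.

    Returns the allocation and the multipliers lambda_k of the constraints.
    Requires strictly positive weights so that the solution is interior.
    """
    shares = [(a, 1.0 - a) for a in ALPHAS]
    lam = [sum(m * s[k] for m, s in zip(mu, shares)) / TOTAL[k] for k in range(2)]
    allocation = [tuple(m * s[k] / lam[k] for k in range(2))
                  for m, s in zip(mu, shares)]
    return allocation, lam


def mrs(alpha, x):
    """Marginal rate of substitution between good 1 and good 2."""
    return alpha * x[1] / ((1.0 - alpha) * x[0])


# Varying the weights moves along the contract curve.
for mu in [(1.0, 1.0), (1.0, 3.0), (2.0, 0.5)]:
    allocation, lam = weighted_optimum(mu)
    rates = [mrs(a, x) for a, x in zip(ALPHAS, allocation)]
    print(mu, allocation, rates, lam[0] / lam[1])
```

For every choice of weights the two marginal rates of substitution coincide with the ratio of the multipliers, and the allocation exhausts the aggregate endowment, so each solution is one point on the one-dimensional contract curve.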
If we take any (interior) Pareto-efficient allocation $x^*$, we can use
the condition
$$MRS^i_{k\ell}(x^{*i}) = \frac{\lambda_k}{\lambda_\ell} = MRS^j_{k\ell}(x^{*j}) \tag{4.6.3}$$
to define a price vector $p^*$. If we knew the Lagrange multipliers for
the resource constraints we could simply set $p^*_k = \lambda_k$, but since we can
normalize one price to one we can set $p^*_n = 1$ and $p^*_k = \lambda_k / \lambda_n =
MRS^i_{kn}(x^{*i})$.
This construction of prices can be used in two ways:
(1) For given $x^* \in PE(\omega)$ we can find initial individual
endowments $\omega^i$ such that $p^*$ is a Walrasian equilibrium and the
equilibrium allocation is given by $x^*$. The easiest way to construct
such endowments is to set $\omega^i = x^{*i}$ for all $i$. More generally
we can take any endowments $\omega^i$ such that
$$p^{*\top} x^{*i} = p^{*\top} \omega^i \tag{4.6.4}$$
for all $i$. The only additional restriction that we need here is
that the individual endowments add up to the total endowment:
$\sum_{i \in I} x^{*i} = \sum_{i \in I} \omega^i$.
To see that $p^*$ is an equilibrium for these endowments, note
the following: If an individual faces prices $p^*$, then the optimal
consumption bundle satisfies
$$MRS^i_{k\ell}(x^i) = \frac{p^*_k}{p^*_\ell} \left( = \frac{\lambda_k}{\lambda_\ell} \right).$$
Moreover, the budget must be exhausted at the optimal bundle.
Therefore the optimal bundle is the solution to (4.6.3)
and (4.6.4), which is $x^{*i}$. Since this is true for all $i$ and
$\sum_{i \in I} x^{*i} = \sum_{i \in I} \omega^i$, all markets clear at $p^*$ and hence this
is a Walrasian equilibrium.
This proves the Second Welfare Theorem, which says
that any $x \in PE(\omega)$ can be achieved as a competitive equilibrium
$x \in W(\hat\omega^1, \dots, \hat\omega^N)$ for some endowments $\hat\omega^1, \dots, \hat\omega^N$
with the same aggregate endowment, $\sum_{i \in I} \hat\omega^i = \bar\omega$, if utility functions
satisfy the assumptions made
in the beginning (in particular strict quasi-concavity). In other
words, we can redistribute initial endowments so that for the
new endowments $\hat\omega^1, \dots, \hat\omega^N$ we have $x \in W(\hat\omega^1, \dots, \hat\omega^N)$.
(2) Secondly, we can use the construction of prices to find an
equilibrium for given endowments. We must find a Pareto efficient
allocation $x^*$ such that the constructed prices $p^*$ satisfy
$$p^{*\top} x^{*i} = p^{*\top} \omega^i$$
for all $i$. (This is sometimes very easy, for example if the
MRS of one individual is constant for all $x \in PE(\omega)$. If
the MRS varies with $x$, finding a solution may be a bit more
complicated.)
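
The first use of the price construction (the Second Welfare Theorem) can be illustrated with a small numerical example. The sketch below assumes a hypothetical Edgeworth box with Cobb-Douglas utility; all numbers, the closed-form contract curve and the helper names are illustrative assumptions. It picks a Pareto efficient allocation, builds supporting prices from the common MRS, redistributes endowments so that each bundle is just affordable, and then checks that utility-maximising demand at these prices reproduces the chosen allocation.

```python
# Hypothetical Edgeworth box: u^i(x) = x_1^{a_i} * x_2^{1 - a_i} (illustrative values).
ALPHAS = (0.3, 0.6)
TOTAL = (5.0, 4.0)       # aggregate endowment


def contract_curve(x11):
    """Consumer 1's amount of good 2 on the contract curve, given x^1_1."""
    c1 = ALPHAS[0] / (1.0 - ALPHAS[0])
    c2 = ALPHAS[1] / (1.0 - ALPHAS[1])
    return c2 * x11 * TOTAL[1] / (c1 * (TOTAL[0] - x11) + c2 * x11)


def mrs(alpha, x):
    return alpha * x[1] / ((1.0 - alpha) * x[0])


def demand(alpha, omega, p):
    """Cobb-Douglas demand of a price taker with endowment omega."""
    income = p[0] * omega[0] + p[1] * omega[1]
    return (alpha * income / p[0], (1.0 - alpha) * income / p[1])


# Step 1: choose a Pareto efficient allocation x*.
x_star_1 = (2.0, contract_curve(2.0))
x_star_2 = (TOTAL[0] - x_star_1[0], TOTAL[1] - x_star_1[1])

# Step 2: supporting prices, normalised so that the price of good 2 is one.
p_star = (mrs(ALPHAS[0], x_star_1), 1.0)

# Step 3: endowments with the same value as x* at p* (condition (4.6.4)).
# Consumer 1 gets d more units of good 1 and p*_1 * d fewer units of good 2.
d = 1.0
omega_hat_1 = (x_star_1[0] + d, x_star_1[1] - p_star[0] * d)
omega_hat_2 = (x_star_2[0] - d, x_star_2[1] + p_star[0] * d)

# Step 4: demand at p* from the new endowments reproduces x*.
demand_1 = demand(ALPHAS[0], omega_hat_1, p_star)
demand_2 = demand(ALPHAS[1], omega_hat_2, p_star)
print("x*:", x_star_1, x_star_2)
print("demand at p*:", demand_1, demand_2)
```

The second use of the construction runs in the opposite direction: endowments are given, and one searches along the contract curve for the allocation whose supporting prices satisfy the budget conditions.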

4.7. Time
So far, we have considered a static model without uncertainty. If
we want to introduce time, we can keep the same framework but index
consumption by the time period. Instead of a single consumption level
$x^i_k$ for each consumer and each good $k$, we introduce consumption levels
in each period, $x^i_{tk}$. Similarly, in each period each consumer has an
endowment of each good, $\omega^i_{tk}$.
To formulate the feasibility constraints, we need to distinguish
perishable from durable goods. If all goods are perishable, i.e., they cannot
be stored and consumed in a later period, a feasibility constraint has
to be imposed for each $t$ and $k$:
$$\forall t, k: \quad \sum_i x^i_{tk} \le \sum_i \omega^i_{tk}.$$

Without uncertainty about future endowments, and if all individuals
have time-consistent preferences, then it is enough to consider a
static model with an extended set of (time-indexed) commodities. In
period one, individuals can trade commodities for the current period.
For future periods, they trade claims on future endowments. This re-
quires that individuals can enter binding agreements that require them
to deliver part of their endowments to another individual in a future
period. Otherwise, individuals might not be willing to give up con-
sumption today for the promise of higher future consumption, because
they must expect that the promises of future consumption will be bro-
ken.

4.8. Uncertainty
Uncertainty can be incorporated by introducing different states of
the world $s \in S$ with (known objective) probabilities $\pi_s$. We then
proceed as in the case of time and index commodities by the state of
the world. Hence, we have consumption levels $x^i_{sk}$ and endowments
$\omega^i_{sk}$. Once the uncertainty has been resolved and we know the
state of the world, the feasibility constraint is the same as in the world
without uncertainty. Hence, we have
$$\forall s, k: \quad \sum_i x^i_{sk} \le \sum_i \omega^i_{sk}.$$
Before uncertainty is resolved, agents can trade assets that guarantee
consumption contingent on the state of the world. This is useful
because they can transform an uneven distribution of endowments in
different states of the world, which exposes everybody to a lot of risk,
into a distribution where individual consumption levels are similar in
all states of the world. A “risk-averse” individual will be willing to
engage in such exchanges. The following example will illustrate this.

4.9. Application: Risk-Sharing


Consider an economy with two individuals $i = 1, 2$. Both are
expected utility maximizers and, despite the fact that there are only
two individuals, we consider the case that they are both price-takers.
There are two states of the world $s = 1, 2$, and the probability of state
one, which will be “the good state”, is $\pi$. The probability of the “bad
state” is $1 - \pi$. There is a single good, which we could interpret as money
or a basket of consumption goods. In each state $s$ each consumer $i$ has
an endowment of the good $\omega^i_s \ge 0$, and total endowments are given by
$$\bar\omega_s = \omega^1_s + \omega^2_s,$$
for $s = 1, 2$. We assume that $\bar\omega_1 \ge \bar\omega_2$ so that the “better state” one
has the higher total endowment. An allocation $x = (x^1_1, x^1_2, x^2_1, x^2_2)$,
$x^i_s \ge 0$, specifies a consumption level for each individual in each state
of the world, and an allocation is feasible ($x \in F(\omega)$) if for both states
$s = 1, 2$:
$$x_s = x^1_s + x^2_s \le \bar\omega_s.$$
The individuals have preferences with an expected utility
representation and we assume that both have a Bernoulli utility function
$$u^i(x) = -e^{-\rho^i x},$$
where $\rho^i > 0$ is an individual-specific preference parameter. Note that
since there is only a single consumption good (or money), we can talk
about risk-aversion in a well-defined way. [5] It turns out that for this
Bernoulli utility function each individual has constant absolute
risk-aversion:
$$R^i_a(x) = \frac{(\rho^i)^2 e^{-\rho^i x}}{\rho^i e^{-\rho^i x}} = \rho^i.$$
Hence it will be easy to analyse how equilibria and efficient allocations
depend on how risk-averse the individuals are. [6]
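
The claim that the coefficient of absolute risk aversion, $-u''(x)/u'(x)$, is constant and equal to $\rho^i$ for the exponential Bernoulli utility function can be double-checked numerically. The following minimal sketch approximates the derivatives by central finite differences; the function names, step size and test values are illustrative assumptions.

```python
import math


def bernoulli(x, rho):
    """Exponential (CARA) Bernoulli utility u(x) = -exp(-rho * x)."""
    return -math.exp(-rho * x)


def absolute_risk_aversion(x, rho, h=1e-4):
    """Approximate -u''(x) / u'(x) with central finite differences."""
    u_plus = bernoulli(x + h, rho)
    u_mid = bernoulli(x, rho)
    u_minus = bernoulli(x - h, rho)
    first = (u_plus - u_minus) / (2.0 * h)
    second = (u_plus - 2.0 * u_mid + u_minus) / (h * h)
    return -second / first


# The approximation should be close to rho at every consumption level x.
for rho in (0.5, 1.0, 3.0):
    for x in (0.0, 1.0, 2.0):
        print(rho, x, absolute_risk_aversion(x, rho))
```

The printed values depend on $\rho$ but not on $x$ (up to approximation error), which is what "constant absolute risk aversion" means.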
4.9.1. Efficiency. I want to start by characterizing Pareto
efficient allocations. There are two possibilities to extend our definition
to a model with uncertainty. The first, ex-post (Pareto) efficiency,
demands that for every state of the world $s$, the allocation $(x^1_s, \dots, x^N_s)$
is Pareto efficient; that is, there is no alternative feasible allocation
that makes no-one worse off in state $s$ and makes at least one individual
strictly better off in state $s$. If this condition holds for every state of
the world, then an allocation is ex-post efficient.
In our simple model with one commodity, ex-post efficiency is
equivalent to requiring that in every state of the world $s$:
$$x^1_s + x^2_s = \bar\omega_s. \tag{4.9.1}$$
Given that Bernoulli utility functions are strictly increasing, if this
condition holds for an allocation $x_s$ for state $s$, it is impossible to
make someone better off in this state without reducing the utility of
the other individual. Conversely, if the condition is violated for some
state $s$, then there are unallocated resources and we can assign them to
one of the two individuals, who will strictly prefer the new allocation
while the other is not worse off. Therefore, for the case of a single
commodity, the condition is equivalent to ex-post efficiency. [7]

[5] If consumption could not be ordered linearly, e.g., because there are
multi-dimensional consumption bundles, our definition of risk-aversion would not apply.
[6] The assumption of constant absolute risk aversion is not very plausible
empirically. I only use it here to illustrate some features of risk-sharing in a framework
with simple explicit solutions.

The second concept of efficiency is ex-ante Pareto efficiency.
For this concept we consider the expected utility of an individual for a
given allocation $x$, which is given by
$$U^i(x^i) = E\left[ u^i(x^i_s) \right] = \pi u^i(x^i_1) + (1 - \pi) u^i(x^i_2).$$
A feasible allocation $x \in F(\omega)$ is ex-ante efficient if there is no
alternative feasible allocation $y \in F(\omega)$ that ex-ante Pareto dominates
$x$; that is, there is no $y \in F(\omega)$ such that for all $i$,
$$U^i(y^i) \ge U^i(x^i),$$
with strict inequality for at least one $i$.
Ex-ante efficiency implies ex-post efficiency: Clearly, if (4.9.1) is
violated for one state of the world, it is possible to increase expected
utility of one individual without hurting the other. The converse is not
true: In the simple example considered here, all allocations where the
feasibility constraints are binding in both states are ex-post efficient.
But we will see now that only a subset of these are ex-ante efficient. In
the example, we see that with risk-averse individuals, ex-ante efficiency
also requires optimal risk-sharing.
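To obtain all ex-ante efficient allocations we can solve the following
maximization problem:
$$\max_{x \in \mathbb{R}^4_+} \ \mu^1 U^1(x^1) + \mu^2 U^2(x^2) \quad \text{s.t.} \quad \forall s: \ x^1_s + x^2_s \le \bar\omega_s,$$
where $\mu^i \ge 0$ with at least one positive $\mu^i$. The Lagrangian for this
problem is
$$L = \mu^1 \left[ \pi u^1(x^1_1) + (1 - \pi) u^1(x^1_2) \right] + \mu^2 \left[ \pi u^2(x^2_1) + (1 - \pi) u^2(x^2_2) \right] + \lambda_1 \left[ \bar\omega_1 - (x^1_1 + x^2_1) \right] + \lambda_2 \left[ \bar\omega_2 - (x^1_2 + x^2_2) \right].$$
The first order conditions for an interior solution are
$$MRS^1_{12} = \frac{\lambda_1}{\lambda_2} = MRS^2_{12},$$
where the marginal rates of substitution are given by
$$MRS^i_{12} = \frac{\partial U^i(x^i) / \partial x^i_1}{\partial U^i(x^i) / \partial x^i_2} = \frac{\pi \rho^i e^{-\rho^i x^i_1}}{(1 - \pi) \rho^i e^{-\rho^i x^i_2}} = \frac{\pi}{1 - \pi} e^{\rho^i (x^i_2 - x^i_1)}.$$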

[7] If $n > 1$, ex-post Pareto efficiency does not only rule out unallocated resources
but also requires an efficient allocation of resources in each state.

Note that given that the Bernoulli utility functions are strictly concave,
the first-order conditions are also sufficient. Given the first order
condition, we obtain that ex-ante efficiency holds for an interior solution
if and only if
$$\rho^1 (x^1_2 - x^1_1) = \rho^2 (x^2_2 - x^2_1).$$
Moreover, the allocation must also be ex-post efficient, so the feasibility
constraints hold with equality. Inserting $x^2_s = \bar\omega_s - x^1_s$ and rearranging,
we therefore obtain
$$x^1_1 = x^1_2 + \frac{\rho^2}{\rho^1 + \rho^2} \underbrace{(\bar\omega_1 - \bar\omega_2)}_{\ge 0}.$$
If we vary $x^1_2$ on the interval $(0, \bar\omega_2)$, we obtain all interior allocations
that are ex-ante efficient (the allocation for the second individual is of
course obtained from $x^2_s = \bar\omega_s - x^1_s$).
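
The closed-form expression for ex-ante efficient allocations can be checked against the condition that both individuals have the same marginal rate of substitution between the two states. The sketch below uses made-up parameter values ($\pi = 0.5$, $\rho^1 = 2$, $\rho^2 = 1$, $\bar\omega_1 = 10$, $\bar\omega_2 = 6$) and hypothetical function names; it is an illustration only.

```python
import math


def efficient_allocation(x12, rho1, rho2, total1, total2):
    """Ex-ante efficient interior allocation, indexed by x^1_2.

    Individual 1 receives the share rho2 / (rho1 + rho2) of the extra
    resources total1 - total2 that are available in the good state.
    """
    share = rho2 / (rho1 + rho2)
    x11 = x12 + share * (total1 - total2)
    return {"x11": x11, "x12": x12, "x21": total1 - x11, "x22": total2 - x12}


def mrs(pi, rho, x_state1, x_state2):
    """MRS between state-1 and state-2 consumption under CARA utility."""
    return pi / (1.0 - pi) * math.exp(rho * (x_state2 - x_state1))


PI, RHO1, RHO2 = 0.5, 2.0, 1.0       # individual 1 is the more risk averse one
TOTAL1, TOTAL2 = 10.0, 6.0           # aggregate endowments, good and bad state

alloc = efficient_allocation(2.0, RHO1, RHO2, TOTAL1, TOTAL2)
mrs1 = mrs(PI, RHO1, alloc["x11"], alloc["x12"])
mrs2 = mrs(PI, RHO2, alloc["x21"], alloc["x22"])
print(alloc)
print("MRS of individual 1:", mrs1, " MRS of individual 2:", mrs2)
```

In this example the more risk-averse individual 1 absorbs one third of the difference between the aggregate endowments and the less risk-averse individual 2 absorbs two thirds, while the two marginal rates of substitution coincide.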
Now we can analyse the implications of ex-ante efficiency. Note first
that $x^1_1 \ge x^1_2$ (and also $x^2_1 \ge x^2_2$). For a given allocation of consumption
in state 2, individual $i$’s consumption in state one is equal to $x^i_2$ plus a share of the
extra resources available in the good state, $\bar\omega_1 - \bar\omega_2$. The share depends
on the risk-aversion parameters of the two individuals. In the special
case that $\bar\omega_1 = \bar\omega_2$, both individuals have the same consumption level
in both states of the world (one may consume more than the other
in both states, but all risk is eliminated in an efficient allocation). If
$\bar\omega_1 > \bar\omega_2$, it is impossible to have constant consumption across states
for both individuals. In an efficient allocation, the individual with
the lower risk aversion parameter $\rho^i$ is allocated a larger share of the
additional endowment in the good state. In the extreme case where
one individual, say $i = 1$, becomes extremely risk averse ($\rho^1 \to \infty$) or
the other becomes risk-neutral ($\rho^2 \to 0$), we have $\frac{\rho^2}{\rho^1 + \rho^2} \to 0$, so that
$x^1_1 = x^1_2$ and $x^2_1 = x^2_2 + (\bar\omega_1 - \bar\omega_2)$. This result can be easily understood
by looking at how the marginal rates of substitution depend on $\rho^i$. If
$\rho^i \to \infty$ we have
$$MRS^i_{12} = \frac{\pi}{1 - \pi} e^{\rho^i (x^i_2 - x^i_1)} \to \begin{cases} 0, & \text{if } x^i_2 < x^i_1, \\ \infty, & \text{if } x^i_2 > x^i_1. \end{cases}$$
This means that the indifference curves in the Edgeworth box become
L-shaped. A point where two indifference curves touch can therefore
only be at the kink of the L-shaped indifference curve, i.e., at a point
where $x^i_1 = x^i_2$. On the other hand, if $\rho^j \to 0$, we have
$$MRS^j_{12} = \frac{\pi}{1 - \pi} e^{\rho^j (x^j_2 - x^j_1)} \to \frac{\pi}{1 - \pi}.$$
In the limit, $j$ has the same marginal rate of substitution as if she were
maximizing the expected value $\pi x^j_1 + (1 - \pi) x^j_2$ of her consumption
(rather than expected utility). In this sense, $j$ becomes risk-neutral.
She is willing to tolerate any risk (i.e., $j$ does not care if $x^j_1 \neq x^j_2$) as long
as the expected value of consumption remains constant. Therefore,
if we start with an allocation where $x^i_1 > x^i_2$, we can always make
individual $i$, who is strictly risk averse, strictly better off by decreasing
$x^i_1$ and increasing $x^i_2$ in a way that the expected value remains constant.
By holding the expected value of $i$’s consumption constant we are also
holding the expected value of $j$’s consumption constant. So we are
moving from an initial allocation with $x^i_1 > x^i_2$ along an indifference
curve of $j$ and thus we obtain a Pareto improvement. Only when
we reach $x^i_1 = x^i_2$ do we have a Pareto efficient allocation. As long as
one individual is extremely risk-averse or the other is risk-neutral, the
more risk-averse individual has the same consumption level in both
states and the less risk-averse individual absorbs all the variation in
the aggregate endowment.
If both risk-aversion parameters are finite and strictly positive, both
individuals bear some risk, but the larger share of the extra endowment
in the good state is consumed by the less risk-averse individual.

4.9.2. Equilibrium Prices. As discussed previously, for a given
(interior) allocation $x$ that is Pareto efficient, we can use the marginal
rates of substitution to construct equilibrium prices. In the present
example, this is particularly simple because we have
$$\frac{p^*_1}{p^*_2} = \frac{\pi}{1 - \pi} e^{\rho^1 (x^1_2 - x^1_1)} = \frac{\pi}{1 - \pi} e^{-\frac{\rho^1 \rho^2}{\rho^1 + \rho^2} (\bar\omega_1 - \bar\omega_2)},$$
where the second equality is obtained by inserting $x^1_2 - x^1_1 = -\frac{\rho^2}{\rho^1 + \rho^2} (\bar\omega_1 -
\bar\omega_2)$. So we can set $p^*_2 = 1$ and $p^*_1 = \frac{\pi}{1 - \pi} e^{-\frac{\rho^1 \rho^2}{\rho^1 + \rho^2} (\bar\omega_1 - \bar\omega_2)}$. Note that the
equilibrium prices only depend on the risk-aversion parameters and
the aggregate initial endowments. This is not always the case and
depends on the particular choice of Bernoulli utility functions in this
example. Note also that these prices need not hold in equilibrium if the
equilibrium allocation is not interior. Only if we have initial
endowments for which the optimal choices satisfy $0 < x^i_s(p^*) < \bar\omega_s$ can we
conclude that equilibrium prices are given by $p^*$. Otherwise we have
to deal with corner solutions. To conclude the discussion of equilibrium
prices, note that $\bar\omega_1 = \bar\omega_2$ implies $\frac{p^*_1}{p^*_2} = \frac{\pi}{1 - \pi}$. This implies that
the equilibrium allocation depends on the relative probabilities of the
different states. For $\bar\omega_1 > \bar\omega_2$ we can look at extreme cases again. If
one individual becomes risk-neutral, $\rho^i \to 0$, we have $\frac{p^*_1}{p^*_2} = \frac{\pi}{1 - \pi}$ as well.
This must hold because the indifference curves of this individual
become straight lines with slope $\frac{\pi}{1 - \pi}$; therefore this is the only price ratio
that can prevail in equilibrium. If both individuals become extremely
risk-averse ($\rho^i \to \infty$), then the price ratio converges to zero:
$$\frac{p^*_1}{p^*_2} \to 0.$$
Intuitively, both individuals are willing to sell consumption in the good
state at a very low price because they are not willing to tolerate highly
variable consumption across states.
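
The price formula can be combined with the characterisation of efficient allocations to compute an interior equilibrium for given endowments: among the ex-ante efficient allocations, pick the one at which individual 1's budget constraint holds at the supporting prices. The sketch below is a minimal illustration with made-up numbers and hypothetical function names; it raises an error if the resulting allocation is not interior, in which case the formula does not apply.

```python
import math


def price_ratio(pi, rho1, rho2, total1, total2):
    """Equilibrium p1/p2 for an interior equilibrium with CARA utility."""
    rho_joint = rho1 * rho2 / (rho1 + rho2)
    return pi / (1.0 - pi) * math.exp(-rho_joint * (total1 - total2))


def interior_equilibrium(pi, rho1, rho2, omega1, omega2):
    """Return (p1, allocation) with p2 = 1; omega_i = (good state, bad state)."""
    total1 = omega1[0] + omega2[0]
    total2 = omega1[1] + omega2[1]
    p1 = price_ratio(pi, rho1, rho2, total1, total2)
    # Efficient risk sharing: x^1_1 = x^1_2 + extra.
    extra = rho2 / (rho1 + rho2) * (total1 - total2)
    # Individual 1's budget: p1 * (x12 + extra) + x12 = p1 * omega1[0] + omega1[1].
    x12 = (p1 * (omega1[0] - extra) + omega1[1]) / (1.0 + p1)
    x11 = x12 + extra
    allocation = ((x11, x12), (total1 - x11, total2 - x12))
    for bundle in allocation:
        if not (0.0 < bundle[0] < total1 and 0.0 < bundle[1] < total2):
            raise ValueError("allocation is not interior; formula does not apply")
    return p1, allocation


p1, allocation = interior_equilibrium(0.5, 2.0, 1.0, (6.0, 2.0), (4.0, 4.0))
print("p1* =", p1, " allocation =", allocation)

# Limiting cases discussed in the text.
print(price_ratio(0.3, 2.0, 1.0, 5.0, 5.0))     # no aggregate risk
print(price_ratio(0.5, 1e-9, 1.0, 10.0, 6.0))   # one nearly risk-neutral individual
print(price_ratio(0.5, 50.0, 50.0, 10.0, 6.0))  # both very risk averse
```

By Walras' Law it is enough to impose the budget constraint of one individual; the budget constraint of the other then holds automatically at the computed allocation.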
Bibliography

Deaton, A., and J. Muellbauer (1980): Economics and Consumer


Behavior. Cambridge University Press.
Debreu, G. (1959): Theory of Value. Yale University Press.
Kahneman, D., and A. Tversky (1979): “Prospect Theory: An
Analysis of Decision under Risk,” Econometrica, 47(2), 263–292.
Kőszegi, B., and M. Rabin (2006): “A Model of Reference-
Dependent Preferences,” Quarterly Journal of Economics, 121(4),
1133–1165.
Kreps, D. M. (1988): Notes on the Theory of Choice. Westview Press.
Mas-Colell, A., M. D. Whinston, and J. R. Green (1995):
Microeconomic Theory. Oxford University Press.
Rabin, M. (2000): “Risk Aversion and Expected-Utility Theory: A
Calibration Theorem,” Econometrica, 68(5), 1281–1292.
