NBER WORKING PAPER SERIES

MEASURING PRICES AND PRICE COMPETITION ONLINE:


AMAZON AND BARNES AND NOBLE

Austan Goolsbee
Judith Chevalier

Working Paper 9085


http://www.nber.org/papers/w9085

NATIONAL BUREAU OF ECONOMIC RESEARCH


1050 Massachusetts Avenue
Cambridge, MA 02138
July 2002

We would like to thank Michael Smith, Scott Schaefer, and participants at the Yale applied micro lunch for
helpful comments. We benefited greatly from expert research assistance by Chip Hunter, Patrik
Gaggenberger, and Tina Lam. We thank Madeline Schnapp for assistance and rely heavily on her 6/21/2001
presentation at the UCB/SIMS Web Mining workshop for data. Goolsbee would like to thank the National
Science Foundation (SES 9984567), the Sloan Foundation and the American Bar Foundation for financial
support. We have no reason to thank the University of Chicago Press. The views expressed herein are those
of the authors and not necessarily those of the National Bureau of Economic Research.

© 2002 by Austan Goolsbee and Judith Chevalier. All rights reserved. Short sections of text, not to exceed
two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is
given to the source.
Measuring Prices and Price Competition Online: Amazon and Barnes and Noble
Austan Goolsbee and Judith Chevalier
NBER Working Paper No. 9085
July 2002
JEL No. L8

ABSTRACT

Despite the interest in measuring price sensitivity of online consumers, most academic work on
Internet commerce is hindered by a lack of data on quantity. In this paper we use publicly available data
on the sales ranks of about 20,000 books to derive quantity proxies at the two leading online booksellers.
Matching this information to prices, we can directly estimate the elasticities of demand facing both
merchants as well as create a consumer price index for online books. The results show significant price
sensitivity at both merchants but demand at Barnes and Noble is much more price-elastic than is demand
at Amazon. The data also allow us to estimate the magnitude of retail outlet substitution bias in the CPI
due to the rise of Internet sales. The estimates suggest that prices online are much more variable than the
CPI, which understates inflation by more than double in one period and gets the sign wrong in another.

Austan Goolsbee
Graduate School of Business
University of Chicago
1101 E. 58th St.
Chicago, IL 60637
and NBER
goolsbee@gsb.uchicago.edu

Judith Chevalier
Yale School of Management
135 Prospect Street
New Haven, CT 06520
and NBER
judith.chevalier@yale.edu
In the earliest days of Internet commerce, many economists and media observers

predicted that competition among Internet retailers would quickly resemble perfect

competition.1 After all, the Internet already reduces search costs relative to visiting physical

stores and shopbots and comparison sites could be expected to lower search costs still

further.

Two strands of research have addressed the question of price competition on the

Internet. The first set of papers examines patterns of prices for homogeneous goods. Price

dispersion has been used extensively to measure the extent of competition in

traditional bricks and mortar retail settings (see Sorensen, 2000; Milyo and Waldfogel, 1999,

for example). Researchers have examined the degree of price dispersion amongst Internet

retailers, as well as between Internet retailers and bricks-and-mortar retailers. The general

consensus of these papers is that price dispersion amongst Internet retailers is large, and that

online retailers charge prices that are either modestly lower or actually higher than their

offline counterparts.2 These results seem incompatible with the idea that the Internet has

completely eliminated consumer search costs. An important advantage of this strand of

research is that these studies require only publicly available price data. However, a concern

with these findings is that, while relatively high prices are posted at some Internet sites, few

or no transactions may be taking place at those relatively high prices. Without quantity data,

it is impossible to know.

1 See, for example, Kuttner (1998).


2 Work by Lee (1997) for cars and Bailey (1998) for books, CDs, and software suggests that prices were actually
higher online than in retail stores. More recent work by Brynjolfsson and Smith (2000) for books and CDs and
by Clay et al. (2000) for books has found prices the same or lower online but that online price dispersion is
quite high, perhaps greater than in retail stores. Carlton and Chevalier (2001) show, among other things, the
existence of price dispersion among online fragrance retailers.
A second strand of research attempts more direct measures of consumer price

sensitivity.3 The general consensus from this work seems to be that Internet markets do

seem competitive in the sense that demand for a seller appears to be quite elastic to the

seller’s own price or to competitors’ prices. One important drawback of this research is that

all of the papers rely on proprietary information on firm sales or consumer buying patterns.

In general, there has been little overlap in the industries studied by the two approaches.

We examine online books, in part because this is the most-studied Internet retail

category, but also because it is one of the largest online sales categories. We develop a

method to estimate directly the own- and cross-price elasticities of demand at Amazon and

Barnes and Noble.com (hereafter, BN.com). We also compute a Fisher-ideal price index for

online books. To do these things we need only two sources of data: publicly available

information on prices and sales ranks at the two leading sites and data from simple

experiments which anyone can conduct for less than $50.

Our results show several things about prices and competition in the online book

industry. First, having sales data matters for the results. The prices of online books, for

example, look dramatically different when books are weighted by sales compared to when all

books are weighted equally (as assumed in the conventional literature). Also, it is clear that

online inflation behaves quite differently in this period than does the CPI for recreational

books. Indeed, our best estimates suggest that the CPI misstates the true inflation rate by

almost a factor of three in one part of our sample and gets the sign wrong in another.

Second, we show that there is significant price sensitivity for online book purchases at both

3 Goolsbee (2000; 2001) finds a large cross-price elasticity of online retail and online computers with respect to
physical retail prices. Ellison and Ellison (2001) find large elasticities for computer memory and motherboards
from data on a private computer parts retailer. Brown and Goolsbee (2002) and Scott Morton and Zettlemeyer
(2002) examine the impact of Internet shop-bots on prices of life insurance and for cars and find that the
Internet leads to significantly lower prices. Smith and Brynjolfsson (2001) examine customer behavior at a
book price comparison site but find that brand still matters a lot for consumers' click through probabilities.
sites. The demand at BN.com, however, is much more price sensitive, both to its own and

to the rival's price, than is demand at Amazon. Third, looking across different time periods,

our results show that using measured price dispersion to infer the degree of price

competition, as is commonly done in the literature, can be misleading.

The question of how pricing impacts consumer purchasing online is interesting in

and of itself, but also has implications for public policy questions. In this paper, we discuss

one such application, measuring the potential magnitude of retail outlet substitution bias in

the consumer price index arising from Internet commerce. The potential importance of

retail outlet substitution bias has been highlighted in other literature (see Schultze and Mackie,

2002; Boskin et al., 1996; Reinsdorf, 1993) but this is one of the first pieces of direct

evidence on the subject and the only one relating to the rise of the Internet.

The paper proceeds as follows. Section I provides the background and describes the

data. Section II presents the methodology for translating sales ranks into sales quantities.

Section III presents price indices for Amazon and Barnes and Noble and assesses the impact

of price movements at Amazon and BN.com on retail outlet substitution bias in the CPI.

Section IV provides evidence on the demand elasticities. Section V briefly describes some

robustness checks. Section VI concludes.

I. Background and Data

Amazon, one of the first electronic commerce firms, began selling books online in

1995. By 1999, books were the second largest retail segment (after computers) sold over the

Internet (BCG, 2000). Online book sales grew from essentially nothing in 1995 to more

than $2 billion in 2000 (Forrester, 2001). Today such sales make up between 7.5% and 10%

of total book sales in the U.S. (American Booksellers Association, 2002; Cader, 2001).
Within the online bookstore industry, the two dominant players are Amazon and Barnes and

Noble (BN.com). These two firms account for more than 85% of online book sales and

Amazon sells between 75 and 90 percent of that (New York Times, 2001; NetRatings, 2001;

Brynjolfsson & Smith, 2001).

More detail about the business operations of the online merchants can be found in

Rayport (1998). For purposes of this paper, what is relevant is that online bookstores tend

to have a much larger selection of titles than even the largest physical bookstores. A large

superstore might have as many as 150,000 titles whereas Amazon and BN.com claim to have

millions of titles available (although for books outside of the top 200,000, this may involve

waiting two weeks or more to actually receive the book).

A customer visiting one of the sites and looking for a book would typically face a

screen giving the price of the book, the relative sales ranking at the site, information on the

shipping time/availability, a brief description of the book, customer reviews of the book and

other books and authors that are popular among people interested in the book, and the price

for a used version of the book (if available).

We collected data during three different weeks in 2001 on about 18,000 different

books from the websites of Amazon and BN.com. We did this by ISBN number. Since a

tiny fraction of books in print account for most book sales, building a large and

representative sample is not easy. To get books across several parts of the sales distribution,

we combined ISBN numbers from three sources. First, we included all books that appeared

on any Publishers' Weekly best-seller list from 1996 to March 2000 (predating our sample).

Second, we include all books that were searched for at Dealtime.com from August 25 to

November 1, 1999 as compiled in Smith and Brynjolfsson (2001).4 Third, we took a random

4 We thank Michael Smith and Erik Brynjolfsson for providing this list to us.
sample of about 3,000 books from Books in Print (2000). In total, these three methodologies give

us approximately 26,000 ISBN numbers of which typically about 18,000 had price and rank

data at Amazon and about 13,000 had price and rank data at BN.com. The difference arises

because BN.com does not report values for books with rankings greater than about 630,000

whereas Amazon's are not censored (and go to over 2,000,000). We will address this

asymmetric censoring of rankings in our results below.

Our three samples were taken during the weeks of April, June and August of 2001.

During this period there were major price changes by both sellers. We do not look at price

changes over very short time horizons because of the way the ranks are updated at the sites.

Amazon claims that for books in the top 10,000 ranks, the rankings are based on the last 24-

hours and updated hourly. For books ranked 10,001-100,000, the ranks are updated once

per day. For books ranked greater than 100,000, the sales ranks are updated once per month

(Amazon, 2001). Many hundreds of thousands of books, however, have a rank but almost

certainly have less than one sale per month. The Chicago Sun-Times (2001) claims that for

these rarely purchased books, Amazon bases the rank on the total sales since Amazon's

inception. BN.com claims to update all the rankings daily (BN.com, 2001).5

In the first period of our sample, taken during the week of April 13, 2001, prices had

been quite stable for some time. The general price structure at Amazon and BN.com was to

discount hardback books at 20% off their retail price, paperback books at 10% off, New

York Times bestsellers at 40% off and textbooks at no discount (some other types of books

were also sold with no discounts and there are periodic editor picks and the like that receive

5 Since BN.com provides rankings on tens of thousands of books that average far less than one sale per day,
this statement cannot be completely accurate. They would not provide us any more detail on their ranking
system (despite repeated requests).
further discounts). The sites do differ in their classifications of some of the books and

Amazon tends to use New York Times bestseller lists with a lag whereas BN.com does not.

Starting June 20th, 2001, Amazon conducted a two-week pricing experiment in

which it raised the prices of many of its books. Our second data collection occurred during

the week of June 23rd, 2001. Amazon announced the launch of free shipping for all buyers

purchasing more than two books while simultaneously increasing overall prices rather

significantly.6 During this period, they eliminated all discounts for most paperback books,

maintained no discount for textbooks, and reduced the discount on hardback books to 10%.

BN.com generally maintained their previous pricing structure.

The pricing regime of June did not last long. On July 3rd, BN.com launched free

shipping with the purchase of two items. At that time, BN.com vice chairman Steve Riggio

contrasted the BN.com strategy to Amazon’s by noting “we’re offering free shipping

without changing our prices or making any fine-print exceptions.” On July 4th, Amazon.com

removed the free shipping offer and changed prices again. The company claimed that the

two-week price change was merely an experiment and that it was intended to be short-lived.

In the third period of our sample (conducted during the week of August 3, 2001),

Amazon had reinstated the 20% discount but now applied it only to books over $20. Books

under $20 generally received no discount, nor did textbooks. In this new period, BN.com’s

policy was not explicitly stated but they appeared to move away from the standard discounts

of 10 and 20 percent for paper and hardback books.

Importantly for our estimation, the pricing at these sites is set at a general level. That

is, broad categories of books all receive the same discount off of the manufacturer’s

6 Amazon.com made the following statement during this time: "We've also changed our pricing on some
books, CDs, DVDs, and videos: for some products prices have stayed the same, for some products prices are
lower, and for some products we've reduced our discounts." (www.internetnews.com, July 2, 2001). However,
our observation from the data is that prices mainly increased.
suggested retail price. Individual book pricing appears to be done only for a very small

number of editor's picks. The sales of a particular book (relative to the book’s broad

category) do not seem to impact the book’s pricing. During this time period, price

differences between the sites mostly reflect differences in the prices charged for a particular

category of book or differences in the categorization scheme (for example, whether or not

the Chicago Manual of Style is classified as a textbook).

II. Computing Sales Quantities

Our basic approach is to translate the observed sales ranking of each book into a

measure of quantity. To do so, we need to know the probability distribution of book sales.

A standard distributional assumption for this type of rank data is a Pareto distribution (i.e., a

power law).7 In the Pareto distribution, the probability that an observation, s, exceeds some

level, S, is a power function

$$\Pr(s > S) = (k / S)^{\theta}$$

where k and θ are the parameters of the distribution. The most important parameter

is θ, the shape parameter that indicates the relative frequency of large observations. If θ is 2,

for example, the probability of an event decreases in the square of the size. With a value of

1, it decreases linearly.
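As a small illustration of the role of θ, the sketch below evaluates the Pareto tail probability Pr(s > S) = (k/S)^θ for the two values mentioned above. The function name and the choice k = 1 are illustrative assumptions, not part of the original analysis.

```python
def pareto_tail(S, k, theta):
    """Tail probability Pr(s > S) = (k / S) ** theta of a Pareto distribution.

    Below the scale parameter k the tail probability is 1 by definition.
    """
    if S <= 0 or k <= 0 or theta <= 0:
        raise ValueError("S, k and theta must all be positive")
    return 1.0 if S < k else (k / S) ** theta


# Doubling the threshold S (from k to 2k) with an assumed scale k = 1:
tail_theta_2 = pareto_tail(2.0, 1.0, 2.0)  # falls with the square of S
tail_theta_1 = pareto_tail(2.0, 1.0, 1.0)  # falls in proportion to S
print(tail_theta_2, tail_theta_1)
```

With θ = 2, doubling S cuts the tail probability to one quarter; with θ = 1 it cuts it to one half.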

If there are a sufficient number of books to eliminate discreteness problems, the

probability that a book's sales exceed some level S can be approximated as (Rank-1)/(Total

Number of Books). Taking logs, we can translate sales into ranks according to

ln( Rank − 1) = c − θ ln(Sales) . (1)

7 Marden (1995) shows that rank data is approximated well by a Pareto distribution. More details on the Pareto
and its application can be found in Johnson and Kotz (1970) or Goolsbee (1999).
Evidence that the Pareto distribution fits well for books can be found using the

weekly Wall Street Journal book sales index which, unlike other bestseller lists, gives an index

of the actual quantity sold. This index is constructed by surveying Amazon.com, BN.com,

and several large brick and mortar book chains. Using the data from April to August 2001,

we regress log(rank – 1) on log sales for each book-week observation in the data set, as well

as weekly dummies. The regression specification fits very well. The R-squared of this

regression is 0.94. The estimated value of θ is 1.49.

We could use this estimate of θ to translate sales ranks into quantities in our main

sample, but sales online may have a different distribution than sales in stores. We are able to

get several independent estimates of θ strictly for online books (as described below) and they

are all quite close.

The first estimate of θ comes from a non-linear least squares regression of the form

$\text{Rank} = A \times (\text{Sales})^{\theta}$ conducted for us by a single publisher on their own book sales at

Amazon. This was done for the subsample of this publisher’s books that had sales ranks at

Amazon in the top 15,000 over the course of one week. They conducted two regressions

giving estimates of 0.9 and 1.3. They did not provide us with standard errors on either

estimate.

For the second estimate, we conducted our own experiment. A publisher who

would not give us direct information on rankings and sales of their books did tell us of a title

they had with steady sales at Amazon.com of 14 copies per week (i.e., about two copies per

day). We observed this book to be ranked 14,468. We then purchased 6 copies of the book

in a 10-minute period and observed its rank rise to 2,854. Assuming 2 sales in a day

corresponded to the first ranking and 8 sales in a day corresponded to the second ranking,
we can solve for the implied Pareto shape parameter, θ. In this case it is equal to

1.17.

Third, an author, Gene Weingarten, did a similar experiment to ours with his own

book (see Weingarten, 2001). According to the author, his new book’s ranking was

1,484,129. Purchases of 20 copies in an hour sent the book to rank 1,297. Purchases of

another 5 copies moved the book to rank 1,025. Assuming that the daily sales at a rank of

1,484,129 is very close to zero, this implies a θ of 1.05.
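The two purchase experiments can be reproduced with a few lines of arithmetic. The sketch below solves equation (1) for θ from two (rank, sales) points. The function name is hypothetical, and pairing the Weingarten ranks of 1,297 and 1,025 with 20 and 25 copies is our reading of the description above rather than a documented calculation.

```python
import math


def implied_theta(rank_a, sales_a, rank_b, sales_b, offset=1):
    """Solve ln(Rank - offset) = c - theta * ln(Sales) for theta from two points.

    offset=1 follows equation (1); offset=0 uses the raw rank instead.
    """
    d_log_rank = math.log(rank_a - offset) - math.log(rank_b - offset)
    d_log_sales = math.log(sales_b) - math.log(sales_a)
    return d_log_rank / d_log_sales


# Own experiment: rank 14,468 at 2 sales per day, rank 2,854 at 8 sales per day.
theta_own = implied_theta(14468, 2, 2854, 8)

# Weingarten experiment (assumed pairing): rank 1,297 after 20 copies,
# rank 1,025 after 25 copies.
theta_weingarten = implied_theta(1297, 20, 1025, 25)

print(theta_own, theta_weingarten)
```

Under these assumptions both values come out close to the 1.17 and 1.05 reported in the text. Whether Rank or Rank − 1 is used changes the result only in the third decimal place.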

Finally, Poynter (2000) gives an estimate of actual sales in seven different rank

ranges (e.g., ranks 450 to 750 average 90 sales per week). Taking the mid-point of his ranges

and regressing the log rank on the log sales yields an estimate of θ of 1.199 (with a standard

error of 0.102 and an R-squared of 0.97).
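A log-log regression of this kind can be sketched as follows. The data here are synthetic and purely hypothetical (they are not Poynter's figures): ranks are generated from an exact power law, following the sign convention of equation (1), so that ordinary least squares should recover the assumed θ.

```python
import math


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical weekly sales at seven points, with ranks generated from
# Rank = A * Sales ** (-theta) using assumed values theta = 1.2, A = 50,000.
true_theta, scale_a = 1.2, 50000.0
weekly_sales = [5.0, 15.0, 40.0, 90.0, 250.0, 700.0, 2000.0]
ranks = [scale_a * s ** (-true_theta) for s in weekly_sales]

slope, intercept = ols_slope_intercept(
    [math.log(s) for s in weekly_sales],
    [math.log(r) for r in ranks],
)
theta_hat = -slope  # log rank falls as log sales rise, so theta is minus the slope
print(theta_hat)
```

With real data the fit would not be exact, and the standard error and R-squared would have to be computed from the residuals.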

Thus all of these experiments suggest fairly consistent estimates of θ in the relatively

tight range of 0.9 to 1.3 and can be used to translate ranks into sales. We will use 1.2 as the

basic estimate.

III. Price Indices and Price Dispersion

Given this estimate of the shape parameter, we can compute the implied sales for

any book that has a sales rank. Using these sales weights, we can also compute a price index

to compare prices across sites within a time period or across time within a site. Since the

BN.com books are censored at ranks of approximately 600,000, we restrict the sample here

to books with data on ranks in all periods at both sites.8 The price indices will allow us to

8 Given that the previous section demonstrated that sales are dropping almost linearly with the sales rank, there
is little impact of the cutoff rule on the results.
determine, with proper weighting, the rate of inflation at the online book stores as well as to

know which of the sites is more expensive.

We start, in Table 1, by showing that it can be somewhat misleading to use

unweighted prices. Previous work has, by necessity, not had sales weights and thus has

calculated equally weighted price indices. The first two panels show that the unweighted

prices are notably different from the weighted (where the sales weights here are just the

actual sales of the given book in the period as estimated using a Pareto shape parameter θ of

1.2 and normalizing the sales of the highest selling book to 1). Sales of inexpensive books

are much greater than the sales of expensive books, leading the sales-weighted average prices

to be less than half of the raw averages.9 The percent discount from the manufacturer's list

price is given in parentheses and shows the same major difference between the weighted and

unweighted data.
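To make the weighting concrete, the sketch below inverts equation (1) to obtain sales relative to the best-ranked book in a sample and compares an unweighted with a sales-weighted average price. The three (rank, price) pairs are hypothetical and the function name is illustrative. Only the relationship Sales ∝ (Rank − 1)^(−1/θ) and the value θ = 1.2 come from the text.

```python
def relative_sales(rank, top_rank, theta=1.2):
    """Sales relative to the best-ranked book in the sample.

    Inverts ln(Rank - 1) = c - theta * ln(Sales), so the book with rank
    equal to top_rank is normalized to sales of 1.
    """
    if rank <= 1 or top_rank <= 1:
        raise ValueError("ranks must exceed 1 so that Rank - 1 is positive")
    return ((rank - 1) / (top_rank - 1)) ** (-1.0 / theta)


# Hypothetical sample: (sales rank, price in dollars).
books = [(11, 12.0), (1001, 20.0), (100001, 60.0)]
top = min(rank for rank, _ in books)
weights = [relative_sales(rank, top) for rank, _ in books]

unweighted_price = sum(price for _, price in books) / len(books)
weighted_price = sum(w * price for w, (_, price) in zip(weights, books)) / sum(weights)
print(unweighted_price, weighted_price)
```

Because the low-ranked (high-selling) book dominates the weights, the weighted average in this toy sample sits close to its price, well below the unweighted mean.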

To construct a proper price index we use the sales quantities at the two sites and

compute a Fisher ideal index in each of the three periods (with BN.com providing the base

of 100.0 in each). This is reported in the middle panel of Table 1. In the first period,

Amazon has lower prices than BN.com. In the second period, with Amazon's price increase

experiment, Amazon's prices are about 3.5% higher than BN.com's. In the week of August

3rd, Amazon's prices are again lower than at BN.com (note that these indices show only

9 These prices do not include shipping charges. As we discuss later, it is not obvious how best to include
shipping prices since the shipping charge schedule is non-linear and there are multiple shipping options
available at each site. Here, the inclusion of any shipping charges would raise BN.com and Amazon.com’s
prices symmetrically in period 1, as the two sites had identical shipping price schedules at that time. In period
2, Amazon’s price would fall relative to BN.com’s price, as Amazon was offering free standard shipping with
the purchase of two items. For a benchmark, the marginal price of shipping a third book at Amazon would be
zero, versus 99 cents at BN.com. In period 3, BN.com’s relative price would be lower, as it was offering free
standard shipping with the purchase of two items, while Amazon.com had reverted to its previous shipping
schedule.
prices at Amazon relative to BN.com at each point in time, not the prices within Amazon

across time).
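For readers unfamiliar with the Fisher ideal index, it is the geometric mean of the Laspeyres and Paasche indices. The sketch below computes it with the base set to 100. The prices and quantities are hypothetical, and treating BN.com as the base "period" for a cross-site comparison is our reading of the description above.

```python
import math


def fisher_index(p_base, q_base, p_comp, q_comp):
    """Fisher ideal price index of the comparison relative to the base (base = 100)."""
    laspeyres = (sum(p * q for p, q in zip(p_comp, q_base))
                 / sum(p * q for p, q in zip(p_base, q_base)))
    paasche = (sum(p * q for p, q in zip(p_comp, q_comp))
               / sum(p * q for p, q in zip(p_base, q_comp)))
    return 100.0 * math.sqrt(laspeyres * paasche)


# Hypothetical prices and implied quantities for the same three books.
bn_prices, bn_quantities = [10.0, 20.0, 8.0], [5.0, 1.0, 10.0]
amazon_prices, amazon_quantities = [9.0, 22.0, 8.0], [7.0, 1.0, 9.0]

amazon_vs_bn = fisher_index(bn_prices, bn_quantities, amazon_prices, amazon_quantities)
print(amazon_vs_bn)
```

A value below 100 would indicate that the comparison site is cheaper than the base site for the sales-weighted basket. The same function applied to two periods at one site gives one link of a chain-weighted index.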

To examine the extent of inflation in the same site over time, the bottom panel of

the table creates a Fisher Ideal, chain-weighted price index where the base period at each site

is 100.0 in the week of April 14th (note that the price indices here are not comparable across

sites, only across time for a given site). The indices show modest inflation at BN.com from

April to June followed by noticeable deflation from June to August. Prices at Amazon

behaved even more dramatically in this time period, showing almost 10 percent inflation in a

period of only 2 months followed by significant deflation over the next month and a half.

These dramatic price movements for online books could have potentially important

consequences for consumers. However, online book vendors are not sampled in

constructing the Consumer Price Index (CPI) for recreational books. As outlined by the

Bureau of Labor Statistics (Cage, 1996), the CPI relies on the Point of Purchase Survey to

determine what types of retail outlets the BLS should survey when including price quotes in

the CPI. These weights enter the sampling only with a lag of several years. In the case of

books, the weights in the CPI during our period are based on purchase patterns from 1995-

1998. At the start of that time, online books sales were virtually nonexistent. Even in 1998,

online books sales were significantly smaller than at the time of our sample. Since today

online sales account for close to 10% of book purchases, this could mean a serious retail

outlet bias.

In Table 2, we show the inflation rate at the two online sites over the period and the

inflation rate as given by the official CPI for recreational books. In the final column of the

table, we approximate the true inflation rate assuming that the CPI accurately reflects the

behavior of bricks and mortar booksellers and that it accurately reflects the behavior of
online booksellers other than Amazon and BN.com. We then recalculate a CPI for

recreational books, using our estimated price series for Amazon and BN.com and giving them

a share weight of 8% of total books (with Amazon having 75 percent of that). As the table

indicates, the retail outlet bias caused by neglecting the Internet merchants is extreme. Our

best estimate of the inflation rate suggests that the CPI’s inflation rate for retail books is

mistaken by more than a factor of two in the first period and gets the wrong sign in the

second period. As the first (to our knowledge) micro estimate of the magnitude of retail

outlet substitution bias arising from the Internet, it suggests further examination.10
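The share-weighted adjustment can be approximated with simple arithmetic. The sketch below blends inflation rates linearly using the 8 percent online share and the 75 percent Amazon share stated above. The linear blending of rates is a simplifying assumption (the exact recalculation is not spelled out here), and the input inflation rates are hypothetical placeholders, not the values in Table 2.

```python
def adjusted_book_inflation(cpi_rate, amazon_rate, bn_rate,
                            online_share=0.08, amazon_share=0.75):
    """Blend the official CPI rate with online inflation rates by sales share.

    Assumes the CPI rate applies to all outlets other than Amazon and BN.com.
    """
    online_rate = amazon_share * amazon_rate + (1.0 - amazon_share) * bn_rate
    return (1.0 - online_share) * cpi_rate + online_share * online_rate


# Hypothetical inflation rates in percent over one period.
example = adjusted_book_inflation(cpi_rate=1.0, amazon_rate=10.0, bn_rate=2.0)
print(example)
```

Even a small online share can move the blended rate noticeably when online prices change by much more than the offline prices the CPI tracks.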

In addition to a price index, we can also compute the standard deviation of the

discount from the list price of books at each site to get a measure of price dispersion (which

is the measure typically used in the literature as a proxy for market power). Previous papers

have had only unweighted dispersion data when making their calculations. In Table 3, the

weighted dispersion measures show similar dispersion across sites in periods one and three

and considerably more dispersion at Amazon in period two. In the unweighted data, there is

a bit more dispersion at Amazon in the first and third period than at BN.com and the

unweighted dispersion falls significantly in period two. In the framework of the existing

literature, which interprets price dispersion as evidence of market power, this would be

interpreted as approximately equal market power at the two sites with a significant increase

(or decrease depending which data one used) in Amazon's market power in period two.

Using actual quantity data, we will be able to estimate the amount of competition in the

different periods and thus to check the existing approach.
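A sales-weighted dispersion measure of this kind can be sketched as a weighted standard deviation of the discount from list price. The exact weighting formula used for Table 3 is not given in the text, so the population-style formula below is an assumption, and the discounts and weights are hypothetical.

```python
import math


def weighted_std(values, weights):
    """Weighted (population-style) standard deviation of values."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


# Hypothetical discounts from list price and relative sales weights.
discounts = [0.40, 0.20, 0.10, 0.00]
sales_weights = [1.0, 0.3, 0.1, 0.02]

unweighted_dispersion = weighted_std(discounts, [1.0] * len(discounts))
weighted_dispersion = weighted_std(discounts, sales_weights)
print(unweighted_dispersion, weighted_dispersion)
```

Setting all weights equal reproduces the unweighted dispersion used in earlier work, which makes the two measures directly comparable.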

10 Notice that our calculations assume that the fraction of consumers purchasing books online is independent
of the ratio of online to bricks and mortar prices. Obviously, if the ratio of online to offline sales responds to
the price ratios, the outlet substitution bias that we identify could be magnified.
The data on price differences across sites and across time also motivate such

empirical analysis, since they imply considerable variation across sites in the

price of the same book. Another way to think about the price variation is to note that,

pooling all the periods together, about 36 percent of the books have identical prices across

the two sites. Amazon's prices are higher for about 28 percent of the books and lower for

about 35 percent. Of the two-thirds of books where prices differ across sites, about nine out

of ten of them have price differences of more than 5 percent.

IV. Estimating the Demand System

A. Empirical Approaches

Given the price variation across sites and across time and the measures of

quantity each period coming from the sales ranks, we can consider the price sensitivity of

online book sales. We start with each cross-section separately, and ask, essentially, whether

relative sales across sites are lower when the relative price is higher.

Calling the total sales of book b at site s during week t $Q_{bt}^{s}$, and log sales $q_{bt}^{s}$, we

assume that the log sales of a book depends on book and site dummies, the characteristics of

the book at a site, denoted x, including things like the shipping time and the customer

reviews, and the log of the price at the site, $p_{bt}^{s}$, and at the competitor site, $p_{bt}^{-s}$, according to

$$q_{bt}^{s} = f_{b} + f_{s} + \beta_{s}\, p_{bt}^{s} + \alpha_{s}\, p_{bt}^{-s} + \Gamma' x_{bt}^{s} + \varepsilon_{bt}^{s}. \tag{2}$$

With cross-sectional book data across sites, we can estimate a relative elasticity of

substitution by estimating the relative demand as a function of relative prices

$$(q_{bt}^{s} - q_{bt}^{-s}) = (f_{s} - f_{-s}) + (\beta_{s} - \alpha_{-s})\, p_{bt}^{s} + (\beta_{-s} - \alpha_{s})\, p_{bt}^{-s} + \Gamma' (x_{bt}^{s} - x_{bt}^{-s}) + \eta_{bt}. \tag{3}$$
As the equation makes clear, the coefficients on log price are not the true elasticities

but rather a combination of the own and cross-price elasticities of demand because a change

in the price at one site affects relative demand in two ways. One is by reducing demand via

the negative own price elasticity. The other is by raising the demand at the other site via the

cross-price elasticity.

With more than one time period and with price changes over time, we can estimate

the own- and cross-price elasticities directly. Differencing the data across time, we see that

$$(q_{bt}^{s} - q_{b,t-1}^{s}) = (f_{t} - f_{t-1}) + \beta_{s}\,(p_{bt}^{s} - p_{b,t-1}^{s}) - \alpha_{s}\,(p_{bt}^{-s} - p_{b,t-1}^{-s}) + \Gamma' (x_{bt}^{s} - x_{b,t-1}^{s}) + \omega_{bt}, \tag{4}$$

which gives the demand coefficients α and β separately.11

To translate this model into one that can be estimated with rank data, we need only

substitute the log sales rank for the log quantity in equation (4) according to the Pareto

relationship in (1):12

$$(r_{bt}^{s} - r_{b,t-1}^{s}) = \theta (f_{t} - f_{t-1}) + \theta \beta_{s}\,(p_{bt}^{s} - p_{b,t-1}^{s}) - \theta \alpha_{s}\,(p_{bt}^{-s} - p_{b,t-1}^{-s}) + \theta \Gamma' (x_{bt}^{s} - x_{b,t-1}^{s}) + \theta \omega_{bt}.$$

In other words, estimating the equations using log ranks, r, rather than actual quantities,

yields the correct elasticity but scaled up by the Pareto shape parameter, θ, which we

estimated above.

Results

A. Cross-Sectional Results

Before estimating the parametric model, we present suggestive evidence about price

sensitivity that does not rely on the Pareto assumption. Table 4 presents the mean

difference in log ranks for books where the prices are lower at Amazon than at BN.com, the

11 Of course, it is equivalent here or in (3) to include book and time period or book and site (respectively) fixed
effects rather than differencing the data.
12 For simplicity we refer to the dependent variable as the log rank rather than being more precise and calling it
the log of the sales rank minus one which we use in the empirical work.
same as at BN.com, and greater than at BN.com. In the pooled sample and in every period

individually, when the relative price is lower at Amazon, the sales ranks also tend to be lower

(meaning greater sales). In August of 2001, for example, among books whose prices were

higher at Amazon than at BN.com, Amazon's sales ranks averaged about 31 percent higher

than at BN.com. For books whose prices were the same at the two sites, Amazon ranks

were less than 1 percent lower than BN.com's. Among books whose prices were lower at

Amazon, the sales ranks at Amazon were about 61 percent lower than at BN.com.

In column 1 of Table 5, we present a probit specification. The dependent variable

takes the value of one if the sales rank at Amazon is higher than the sales rank at BN.com

(note: higher ranks correspond to lower sales). The independent variables are the relative

price at Amazon (PA/PB) and time dummies. The coefficient on the relative price is

large, positive, and statistically significant. That is, when Amazon.com has

relatively higher prices, it has relatively higher ranks (lower sales).

To compute a price elasticity, however, we invoke the Pareto assumption. Because

the sales rank data are censored at BN.com, we estimate equation (3) for that site using the

trimmed least absolute deviations (LAD) panel Tobit estimator of Honore (1992). Because

we don’t know the exact censoring point, we used the highest observed rank at BN.com in

the sample. We tried censoring all Amazon books at this same level but it had no impact on

the results so we report the OLS results for Amazon.

The results are reported in columns (2)-(4) of Table 5. The dependent variable is

the log of the sales rank. The explanatory variables included in the regression are the log price, a site

dummy, availability dummies (i.e., ships in 24 hours, 2-3 days, and so on) and dummies for

each individual book title. Including dummies for individual book titles is crucial. This
ensures that our identification derives from the differences in prices of the same book across

sites.

The top panel uses prices without shipping charges. As there are numerous shipping

choices (ground, air, priority, FedEx) and the prices depend on how many books one orders

according to a two-part tariff, it isn't clear what shipping price to use. The lower panel uses

prices that include the incremental shipping charge if a person was adding this as a second

book to their order. This was 99 cents at both sites in period 1, zero at Amazon and 99

cents at BN.com in period 2, and then zero at BN.com and 99 cents at Amazon in period

3. The results do not differ much between the two (nor with other choices of the shipping

price), so we will not include shipping in the subsequent regressions.

All price coefficients are highly significant and in the range of 2.1 to 2.5. With our

mean estimate of the Pareto shape parameter, this indicates an elasticity between -2.5 and -3. In

words, this says that a one percent increase in the price of a book at site A relative to site B

reduces sales at A by 2.5 to 3 percent relative to sales of that book at site B.
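The arithmetic behind the quoted range can be sketched as follows. This is an illustration only: multiplying the rank-regression coefficient by the shape parameter is an assumption chosen because it reproduces the figures reported in the text (coefficients of 2.1 to 2.5 and a Pareto parameter of 1.2), and the function name is hypothetical.

```python
# Illustrative arithmetic only; rank_coef_to_elasticity is a hypothetical helper.
# Assumption (inferred from the reported figures, not given as a formula in the text):
# the implied elasticity is the price coefficient from the log-rank regression
# times the Pareto shape parameter, with the sign flipped because a higher
# rank means lower sales.

PARETO_SHAPE = 1.2  # shape parameter value quoted in the paper


def rank_coef_to_elasticity(coef, shape=PARETO_SHAPE):
    """Convert a coefficient on log price in a log-rank regression to an elasticity."""
    return -coef * shape


for coef in (2.1, 2.5):
    elasticity = rank_coef_to_elasticity(coef)
    print(f"coefficient {coef:.1f} -> elasticity {elasticity:.2f}")
```

Under this assumption the two endpoints come out at roughly -2.5 and -3.0, matching the range stated above.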

B. Panel Results

The relative price elasticity above is notably large, suggesting the importance of

competition. To break out the own- and cross-price components of this number, however,

requires variation in prices across time. To get such variation, we use pairs of time periods

for each site. One reason to do this is that we can then use trimmed LAD estimation to

allow for the censoring problem with the Barnes and Noble data.13 A second reason to do

this is that it gives us a closer look at Amazon’s pricing changes in the summer of 2001. We

present these results in Table 6 for three cases: the change in log ranks between periods 1

and 2, the change in log ranks between periods 2 and 3, and the “long-difference” estimator

comparing the change in log ranks between periods 1 and 3. The BN.com results use the

trimmed LAD.

13 Trimmed LAD panel estimation procedures for data sets with more than two time periods are not
well-developed. The survey of Chay and Powell (2001), for example, presents trimmed LAD results only for pairs of
time periods rather than for the entire panel.

Table 6 presents the results. Interestingly, the sum of the own-price elasticity

at each site and the cross-price elasticity at the other site is approximately the same for

both sites (as assumed in our specification of equation (3)). However, this conceals an extreme

difference in the source of the relative price sensitivity across the two sites. BN.com has a

large own-price elasticity, while its price has only a small cross-price effect on Amazon. Amazon has the reverse.

With the Pareto parameter of 1.2, BN.com’s own-price elasticity of demand is around -3.5.

At Amazon, on the other hand, it is actually less than one in absolute value, at -0.45.
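As a rough check on the symmetry claim, the sketch below adds each site's own-price coefficient to the magnitude of the cross-price coefficient in the other site's equation, using the Table 6 point estimates. The variable names are illustrative, and comparing raw coefficient sums (rather than running a formal test) is an assumption about how the comparison is meant.

```python
# Point estimates copied from Table 6 (coefficients on log prices in the
# log-rank regressions). Keys and names are illustrative, not from the paper.
table6 = {
    "t2,t1": {"bn_own": 4.396, "bn_cross": -3.825, "amzn_own": 0.262, "amzn_cross": -0.047},
    "t3,t2": {"bn_own": 2.985, "bn_cross": -2.403, "amzn_own": 0.256, "amzn_cross": -0.131},
    "t3,t1": {"bn_own": 2.894, "bn_cross": -2.696, "amzn_own": 0.371, "amzn_cross": -0.189},
}


def relative_price_effects(c):
    """Return the combined effect of each site's price on relative log ranks.

    Amazon price: Amazon own-price coefficient plus the magnitude of the
    cross-price coefficient in the BN.com equation; symmetrically for BN.com.
    """
    amazon_price_effect = c["amzn_own"] + abs(c["bn_cross"])
    bn_price_effect = c["bn_own"] + abs(c["amzn_cross"])
    return amazon_price_effect, bn_price_effect


for periods, coefs in table6.items():
    amazon_effect, bn_effect = relative_price_effects(coefs)
    print(f"{periods}: Amazon price {amazon_effect:.3f}, BN.com price {bn_effect:.3f}")
```

For the long difference (t3, t1) the two sums are about 3.07 and 3.08, while the other two period pairs show somewhat larger gaps.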

The low price elasticity of demand at Amazon is important and puzzling. Of course,

standard calculations for static imperfectly competitive markets suggest that a firm should

choose prices such that the elasticity of demand exceeds 1 in absolute value. However, we

are not the first to obtain estimates of relatively inelastic demand for retail

establishments. For example, Hoch et al. (1995) estimate store-level price elasticities of

demand of less than 1 in absolute value for many stores in the Dominick’s supermarket

chain. Note, however, that a firm maximizing dynamic profits might choose a price below

this static profit-maximizing level.14 Prices below the single-period profit-maximizing level

would be attractive in a growing market with consumer switching costs, for example. This

possibility has been raised in the popular press, where speculation abounds as to whether

Amazon's prices are sustainable or are artificially low (see, for example, Hansell, 2001).

When Amazon's growth stops, we may see prices rise substantially.

14 See Klemperer (1987) and Chevalier and Scharfstein (1996) for discussion.
A second factor to consider is that a one percent increase in the price at Amazon

reduces quantity by about 0.5 percent at Amazon but raises quantity at BN.com by 3.5

percent. Given that Amazon sells somewhere between 3 and 10 times as many books as

BN.com, this is very close to the same number of books, implying that every customer lost

by Amazon instead buys the book at BN.com. This is likely to be an unrealistically high

degree of switching but the data, at the least, seem to suggest that the cross-price effect is

important. The reverse is not true, however. Raising prices by one percent at BN.com

reduces sales about 4 percent but increases sales at Amazon by only about 0.2 percent.

Many of the lost customers from BN.com evidently do not just go buy the book from

Amazon.
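The switching calculation in this paragraph can be worked through numerically. The sketch below takes the rounded responses quoted in the text (0.5 and 3.5 percent for an Amazon price increase, 4 and 0.2 percent for a BN.com price increase) and the stated range of 3 to 10 for Amazon's size relative to BN.com. The function name and the normalization of BN.com's sales to one are assumptions made for illustration.

```python
def diversion_ratio(own_loss_pct, rival_gain_pct, own_size, rival_size):
    """Books gained by the rival per book lost by the site that raises its price 1 percent."""
    books_lost = own_loss_pct / 100 * own_size
    books_gained = rival_gain_pct / 100 * rival_size
    return books_gained / books_lost


BN_SALES = 1.0  # normalization (assumption); only the size ratio matters

for size_ratio in (3, 7, 10):  # Amazon sales as a multiple of BN.com sales
    amazon_sales = size_ratio * BN_SALES
    # Amazon raises its price: it loses 0.5 percent, BN.com gains 3.5 percent.
    amazon_raises = diversion_ratio(0.5, 3.5, amazon_sales, BN_SALES)
    # BN.com raises its price: it loses 4 percent, Amazon gains 0.2 percent.
    bn_raises = diversion_ratio(4.0, 0.2, BN_SALES, amazon_sales)
    print(f"size ratio {size_ratio}: Amazon raises -> {amazon_raises:.2f}, "
          f"BN.com raises -> {bn_raises:.2f}")
```

A ratio of one means every book lost at one site is picked up by the other. With these rounded inputs that happens for an Amazon price increase when Amazon is seven times BN.com's size, whereas for a BN.com price increase the ratio stays at or below one half across the whole 3 to 10 range.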

Recall that previous research on Internet bookselling has used price dispersion as a

proxy for market power. As Table 3 showed, price dispersion between sellers

is quite volatile over the three time periods. Using the price dispersion data to infer market

power would lead one to have differing conclusions about the degree of market power for

each of the three periods. However, when we use our “quantity” data to estimate price

elasticities, we observe very little change in the price sensitivity/market power of the two

merchants across the different time periods, despite large shifts in the measured dispersion.

We checked the robustness of our results across books of different types and found that

including category-price interaction terms made little difference in the panel specifications.

Table 7 shows the “long difference” panel specifications for both Amazon and Barnes and

Noble, allowing fiction and non-fiction books to have different own- and cross-price

elasticities of demand. While fiction books appear slightly more own- and cross-price elastic

in both sites’ specifications, the coefficients for fiction and non-fiction books

are not statistically different from one another. Across our specifications, there was little
evidence that the price elasticity of demand varied systematically by type of book. We tried

many categories of books as well as formats. All showed the same basic results.

VI. Conclusion

This paper has used publicly available data on the prices and sales ranks of more

than 18,000 different books at Amazon and BN.com to estimate price indices and the

amount of price competition online. To do this, we develop a method of converting sales

ranks into actual quantity measures for every book. The results using such data indicate that

prices were much more variable online than in retail stores during the time period of this

sample and point to an important outlet substitution bias in the CPI for recreational books

over this time period. Second, the results show that there is significant price sensitivity of

online customers both to a site's own price and to its leading rival’s price. This is much

more true at BN.com, however, where the own price elasticity of demand is close to -4 and

the cross-price elasticity very high, than at Amazon where the price elasticity is around -0.6,

and the cross-price elasticity is relatively small. The results also show that using price

dispersion as a proxy for market power is not appropriate in our data.

Taken together, our results point to Amazon as a clear market leader in the online

book business with BN.com serving as more of a price-taking fringe. The usefulness of the

sales rank data in allowing us to actually estimate the degree of market power in markets with

little publicly available quantity data raises the question of whether similar information could

be gathered for other industries.


Table 1: Prices and Price Indices

                                        BN.com Price       Amazon Price

Books Equally Weighted (% discount)
  period 1                              $24.23 (11.3%)     $23.58 (15.4%)
  period 2                              $24.39 (10.9%)     $26.75 (2.1%)
  period 3                              $24.51 (10.6%)     $22.26 (19.2%)

Books Weighted by Sales (% discount)
  period 1                              $13.09 (34.9%)     $14.32 (26.1%)
  period 2                              $13.24 (34.0%)     $15.72 (18.4%)
  period 3                              $13.12 (33.4%)     $15.25 (29.7%)

Price Index within Period, Across Site
  period 1 (April 14, 2001)             100.0              96.7
  period 2 (June 23rd, 2001)            100.0              103.4
  period 3 (August 3rd, 2001)           100.0              95.3

Price Index within Site, Over Time
  period 1 (April 14, 2001)             100.0              100.0
  period 2 (June 23rd, 2001)            100.7              109.5
  period 3 (August 3rd, 2001)           98.6               105.3

Notes: Authors' calculations as described in the text. When shipping is included, it is marginal shipping
assuming the customer is buying this as their second book. Sales weights in the second panel are current
period sales estimated using a Pareto shape parameter of 1.2 as described in the text.

Table 2: Inflation Using Different Price Indices (in percent)

Period            CPI     BN.com    Amazon    “True” Inflation
Period 1 to 2     0.4     0.7       9.5       1.0
Period 2 to 3     0.1     -2.1      -3.8      -0.2
Period 1 to 3     0.5     -1.4      5.3       0.8

Notes: The inflation calculation uses the prices without shipping costs.
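The site-specific columns of Table 2 appear to follow directly from the within-site index in Table 1. The sketch below recomputes them as simple percentage changes in the index levels; that this is how the columns were built is an inference from the numbers, which agree to one decimal place, and the names used are illustrative.

```python
# Index levels copied from the "Price Index within Site, Over Time" panel of Table 1.
index_levels = {
    "BN.com": [100.0, 100.7, 98.6],
    "Amazon": [100.0, 109.5, 105.3],
}


def pct_change(start, end):
    """Percentage change between two index levels."""
    return (end / start - 1.0) * 100.0


def site_inflation(levels):
    """Return inflation for periods 1 to 2, 2 to 3, and 1 to 3, rounded to one decimal."""
    p1, p2, p3 = levels
    return [round(pct_change(a, b), 1) for a, b in ((p1, p2), (p2, p3), (p1, p3))]


for site, levels in index_levels.items():
    print(site, site_inflation(levels))
```

The CPI and “True” inflation columns rely on information not reproduced in the tables here, so they are not recomputed.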

Table 3: Standard Deviation of Discount off of Suggested Retail Price (in percent)

                      Amazon      BN          Amazon        BN
                      Weighted    Weighted    Unweighted    Unweighted

April 14th, 2001      .119        .112        .128          .091
June 23rd, 2001       .162        .115        .067          .092
August 3rd, 2001      .114        .110        .131          .091

Notes: Authors' calculations.


Table 4: Relative Sales Ranks Across Sites as a Function of Relative Prices

Compares ranks as a function of the relative prices of observations at Amazon.com and BN.com.

                                     PAMZN > PBN    PAMZN = PBN    PAMZN < PBN

Number of Books                      10,089         13,297         12,602

Avg. ln (RankAMZN) - ln (RankBN)
  Pooled:                            -.017          -.225          -.627
  By Week:
    April 14th, 2001                 .232           -.142          -.662
    June 23rd, 2001                  -.151          -.569          -.580
    August 3rd, 2001                 .313           -.008          -.607

Notes: Authors' calculations. Negative values indicate lower ranks (i.e., higher sales) at Amazon.
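The entries in Table 4 are differences in log ranks, which can be read as approximate percentage differences when they are small. For larger magnitudes the exact conversion, exp(d) - 1, is noticeably smaller in absolute value. The sketch below applies it to the column for books priced lower at Amazon; the helper name is illustrative.

```python
import math

# Log-rank differences from the PAMZN < PBN column of Table 4.
log_rank_diffs = {
    "Pooled": -0.627,
    "April 14th, 2001": -0.662,
    "June 23rd, 2001": -0.580,
    "August 3rd, 2001": -0.607,
}


def log_diff_to_pct(diff):
    """Exact percentage difference implied by a difference in logs."""
    return (math.exp(diff) - 1.0) * 100.0


for label, diff in log_rank_diffs.items():
    print(f"{label}: log difference {diff:.3f} -> {log_diff_to_pct(diff):.1f} percent")
```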

Table 5: Cross-Sectional Evidence on Price Sensitivity of Demand

                     (1)                  (2)            (3)            (4)
                     Probit of            Trimmed LAD    Trimmed LAD    Trimmed LAD
                     RankAMZN>RankBN      Period 1       Period 2       Period 3
                     Pooled

Prices without shipping
ln (PAMZN/PBN)       2.573
                     (.055)
ln (P)                                    2.135          2.550          2.124
                                          (0.080)        (0.105)        (.068)

Shipping Dums        No                   Yes            Yes            Yes
Amazon Dum.          No                   Yes            Yes            Yes
Time Dummies         Yes                  No             No             No
ISBN Dummies         No                   Yes            Yes            Yes
n                    34,156               33,650         27,094         30,638

Prices including shipping
ln (P+Shipping)                           2.276          1.919          2.458
                                          (0.085)        (0.095)        (.069)

Shipping Dums                             Yes            Yes            Yes
Amazon Dum.                               Yes            Yes            Yes
Time Dummies                              No             No             No
ISBN Dummies                              Yes            Yes            Yes
n                                         33,650         27,094         30,638

Notes: Dependent variable in column 1 is the {0,1} variable of whether the rank is higher at Amazon than at
BN.com. Dependent variable in columns 2-4 is the log of the rank. This is censored as described in the text.
Standard errors are in parentheses.
Table 6: Two period panel estimates of online book demand system

Dep Var.: ln(Rank)          (1)         (2)         (3)
                            t2, t1      t3, t2      t3, t1

BN.COM (trimmed LAD)
ln (Pown)                   4.396       2.985       2.894
                            (0.182)     (0.128)     (0.128)
ln (Pcross)                 -3.825      -2.403      -2.696
                            (0.181)     (0.128)     (0.114)

n                           24738       24738       24738
Shipping dummies            Yes         Yes         Yes
Time x age dummies          Yes         Yes         Yes
ISBN dummies                Yes         Yes         Yes

AMAZON
ln (Pown)                   0.262       0.256       0.371
                            (.032)      (.048)      (0.050)
ln (Pcross)                 -0.047      -0.131      -0.189
                            (.090)      (.081)      (0.073)

n                           24738       24738       24738
R-squared                   0.97        0.97        0.96
Shipping dummies            Yes         Yes         Yes
Time x age dummies          Yes         Yes         Yes
ISBN dummies                Yes         Yes         Yes

Notes: The dependent variable is the log of the sales rank. This is censored as described in the text. Standard
errors are in parentheses. The cross price is the price for the same book at the competitor's site.

Table 7: Two period panel estimates of online book demand system by book type

Dep Var.: ln(Rank)          (1)         (2)
                            Amazon      BN.com
Time periods                t3, t1      t3, t1

ln (Pown) Fiction           0.491       3.732
                            (0.129)     (0.471)
ln (Pown) Nonfiction        0.3469      3.142
                            (0.054)     (0.150)
ln (Pcross) Fiction         -0.419      -2.982
                            (0.232)     (0.489)
ln (Pcross) Nonfiction      -0.160      -2.482
                            (0.078)     (0.145)

Notes: The dependent variable is the log of the sales rank. This is censored as described in the text. Standard
errors are in parentheses. The cross price is the price for the same book at the competitor's site.
Bibliography

Amazon, e-mail correspondence with customer service agent John Armstrong, May 15,
2000.

American Booksellers Association, Industry News, "Overall Book Sales Up Slightly for First
Six Months of '01," November 1, 2001, <http://www.bookweb.org/home/news/btw/5182.html>,
accessed 5/23/02.

Bailey, J. “Intermediation and Electronic Markets: Aggregation and Pricing in Internet
Commerce.” Ph.D Dissertation, Department of Electrical Engineering and Computer
Science, MIT, May 20, 1998.

Bakos, J. Yannis. “Reducing Buyer Search Costs: Implications for Electronic Marketplaces.”
Management Science 43 (1997): 1676-1692.

BN.com, e-mail correspondence with customer service agent Charlie, January 14, 2000.

Baye, Michael, and Morgan, John. “Information Gatekeepers on the Internet and the
Competitiveness of Homogenous Product Markets.” forthcoming, American Economic Review.

Boston Consulting Group, The State of Online Retailing 3.0, with Shop.org November 2000.

Brynjolfsson, E., and Smith, M. “Frictionless Commerce? A Comparison of Internet and
Conventional Retailers.” Management Science, 46 (April 2000): 563-585.

Cader, Michael, "Online Bookselling: The New Mythology in the Making," Publisher's Lunch
e-mail newsletter, April 16, 2001.

Cage, Richard, "New Methodology for Selecting CPI Outlet Samples," Monthly Labor
Review, December 1996, <http://www.bls.gov/cpi/cpirc001.htm>, accessed 7/3/2002.

Carlton, D. and J. Chevalier, 2001, “Free Riding and Sales Strategies for the Internet”, The
Journal of Industrial Economics, XLIX, p. 441-462.

Chevalier, J. and D. Scharfstein, 1996, “Capital-Market Imperfections and Countercyclical
Markups: Theory and Evidence,” American Economic Review 86, p. 703-725.

Clay, K., R. Krishnan, and E. Wolff, “Prices and Price Dispersion on the Web: Evidence
from the Online Book Industry,” The Journal of Industrial Economics, XLIX, December
2001, p. 521-540.

Clemons, E.; Hann, I-H.; and Hitt, L. “The Nature of Competition in Electronic Markets:
An Empirical Investigation of Online Travel Agent Offerings.” Working Paper, June 1998,
available at http://grace.wharton.upenn.edu/~lhitt/e-travel.pdf.

Ellison, G. and S. F. Ellison, 2001, “Search, Obfuscation, and Price Elasticities on the
Internet,” MIT working paper.

Goolsbee, Austan, "Evidence on the High Income Laffer Curve from Six Decades of Tax
Reform" Brookings Papers on Economic Activity, 1999(2), pp. 1-47.

Goolsbee, Austan. “In a World Without Borders: The Impact of Taxes on Internet
Commerce.” Quarterly Journal of Economics 115 (May 2000): 561-576.

Hansell, Saul, "Listen Up! It's Time for a Profit, A Front-Row Seat as Amazon Gets
Serious," New York Times, May 20, 2001.

Hoch, S., Kim, B., Montgomery, A., and Rossi, P., 1995, "Determinants of Store-level Price
Elasticity," Journal of Marketing Research, 32, 17-29.

Italie, Hillel, "Amazon's Bottom 10: Not Exactly Page Turners," Chicago Sun-Times, August
17, 2001.

Kuttner, R. 1998, “The Net: A Market Too Perfect for Profits,” Business Week 20, May 11,
1998.

Lee, H. G. "Do Electronic Marketplaces Lower the Price of Goods?" Communications
of the ACM 41 (1997): 73-80.

Marden, John, Analyzing and Modeling Rank Data, Chapman and Hall, (London, England).

Poynter, Daniel, "Publishing Poynters," April-June 2000,
<http://parapub.com/getpage.cfm?file=newsletter/News0400.html&userid=1035600>, accessed June 5, 2002.

Rayport, Jeffrey F.; Knoop, Carin-Isabel; Reavis, Cate, "Selling Books Online in Mid-1998,"
HBS Case 9-899-038, 8/17/1998.

Reinsdorf, M., "Price Dispersion, Seller Substitution, and the U.S. CPI," Bureau of Labor
Statistics Working Paper 252, 1993.

Schultze, Charles and Christopher Mackie, Editors, At What Price? Conceptualizing and
Measuring Cost-of-Living and Price Indexes, Committee on National Statistics, National Research
Council, National Academy Press (Washington, D.C.), 2002.

Smith, M. and E. Brynjolfsson 2001, “Consumer Decision-making at an Internet Shopbot:
Brand Still Matters,” The Journal of Industrial Economics, XLIX, December 2001, p. 541-558.

Smith, M.D.; Bailey, J.; and Brynjolfsson, E. “Understanding Digital Markets: Review and
Assessment.” In Erik Brynjolfsson and Brian Kahin, eds. Understanding The Digital Economy.
Cambridge, MA: MIT Press, 2000.

Smith, Michael D and Brynjolfsson, Erik, Consumer Decision-Making at an Internet
Shopbot, MIT Working Paper, July 2001.

Sorensen, A., 2000, “Equilibrium Price Dispersion in Retail Markets for Prescription
Drugs”, Journal of Political Economy 108, p. 833-850.

Weingarten, Gene, "Below the Beltway," Washington Post, June 17, 2001.

White, E. 2000, “No Comparison. Shopping Bots were Supposed to Unleash Brutal Price
Wars. Why Haven’t They?” Wall Street Journal, October 23 2000, R18.
