Homework 4
Prof. Gavin Feng
Due Mar. 1st at 5pm
The first line in the script above extracts all rows in mayo_DF for which the two conditions
product == "Hellmans Mayo 32oz" and market == "Jewel" are both TRUE. Note that & is the logical AND operator (for
other purposes you may use |, the logical OR). All of the rows satisfying these conditions are then assigned
to the new data frame mayo_Hellmans_Jewel. You can now summarize the newly created data frame.
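To see the difference between & and | in isolation, here is a minimal sketch on a toy data frame. The name toy_DF, the product label "Kraft Mayo 32oz", the market label "Central", and all sales numbers are assumptions made up for illustration; they are not taken from the homework data.

```r
# Toy data frame standing in for mayo_DF (hypothetical labels and values).
toy_DF = data.frame(
  product     = c("Hellmans Mayo 32oz", "Hellmans Mayo 32oz",
                  "Kraft Mayo 32oz", "Kraft Mayo 32oz"),
  market      = c("Jewel", "Central", "Jewel", "Central"),
  sales_units = c(120, 900, 80, 1100)
)

# AND: a row is kept only if both conditions are TRUE.
hellmans_and_jewel = subset(toy_DF, product == "Hellmans Mayo 32oz" & market == "Jewel")

# OR: a row is kept if at least one condition is TRUE.
hellmans_or_jewel = subset(toy_DF, product == "Hellmans Mayo 32oz" | market == "Jewel")

nrow(hellmans_and_jewel)  # 1 row
nrow(hellmans_or_jewel)   # 3 rows

# Summarize the extracted rows.
summary(hellmans_and_jewel$sales_units)
```

The OR version also keeps Hellman's in the other market and the other product at Jewel, which is why it returns three of the four rows.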
Now calculate the summary statistics for all four product/market combinations. You’ll see that we don’t
have the same number of observations in the two markets (the first 16 weeks at Jewel-Osco are missing), and
that total unit and dollar sales are higher in the Kraft Central region compared to Jewel (as expected).
Create a price variable
Construct an average price variable by dividing dollar sales by unit sales:
mayo_DF$price = mayo_DF$sales_dollars/mayo_DF$sales_units
1. Explain how to interpret this price variable — how does it differ from a product price at the store level?
2. Provide summary statistics (mean, median, and standard deviation) for the product prices, separately
for each product/market combination. Report the statistics in a table. Are the means of prices similar
across the Kraft Central region and Jewel-Osco? Is there more price variation at Jewel or in the Central
Region? Why? What is the implication for our ability to estimate price elasticities with either
account-level data or data in a large geographic market?
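One possible way to build such a table is with aggregate(). The sketch below runs on simulated data: toy_DF, the product label "Kraft Mayo 32oz", the market label "Central", and all numbers are assumptions for illustration only, so you would replace toy_DF with mayo_DF and check the labels against your data.

```r
# Simulated stand-in for mayo_DF; all values are made up.
set.seed(1)
toy_DF = expand.grid(week    = 1:10,
                     product = c("Hellmans Mayo 32oz", "Kraft Mayo 32oz"),
                     market  = c("Jewel", "Central"),
                     stringsAsFactors = FALSE)
toy_DF$sales_units   = round(runif(nrow(toy_DF), 100, 200))
toy_DF$sales_dollars = toy_DF$sales_units * runif(nrow(toy_DF), 0.9, 1.2)

# Average price per unit sold, as defined in the text.
toy_DF$price = toy_DF$sales_dollars / toy_DF$sales_units

# Mean, median, and standard deviation of price for each product/market pair.
price_stats = aggregate(price ~ product + market, data = toy_DF,
                        FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x)))

# aggregate() stores the three statistics in one matrix column; flatten it
# into ordinary columns price.mean, price.median, and price.sd.
price_stats = do.call(data.frame, price_stats)
print(price_stats)
```

The result has one row per product/market combination, which can be copied directly into a table.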
[Figure: prices plotted against mayo_DF$week for all products and markets]
Note the type = "o" option, which stands for "overplotted points and lines" and tells R to connect the
displayed data points with lines. The default for the type option is "p", and then only data points are plotted,
while "l" produces lines without data points.
The graph created above is messy. Why? Because you plotted the price data for both products in both
markets on the same graph. Instead, we will create time-series plots separately for different product/market
combinations.
mayo_Hellmans_Jewel = subset(mayo_DF, product=="Hellmans Mayo 32oz" & market=="Jewel")
plot(mayo_Hellmans_Jewel$week, mayo_Hellmans_Jewel$price,
type = "o", pch = 21, lwd = 0.4, bg = "limegreen",
main = "Prices of Hellman's Mayo at Jewel-Osco", xlab = "Week", ylab = "Price")
[Figure: Prices of Hellman's Mayo at Jewel-Osco (x-axis: Week, y-axis: Price)]
You can now repeat this process for all product/market combinations.
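Rather than copying the code four times, one option is to loop over the distinct product/market pairs. This is only a sketch on simulated prices: toy_DF, the labels "Kraft Mayo 32oz" and "Central", and the 2 x 2 panel layout are assumptions, and any other plotting approach works just as well.

```r
# Simulated stand-in for mayo_DF; prices are made up.
set.seed(2)
toy_DF = expand.grid(week    = 1:20,
                     product = c("Hellmans Mayo 32oz", "Kraft Mayo 32oz"),
                     market  = c("Jewel", "Central"),
                     stringsAsFactors = FALSE)
toy_DF$price = round(runif(nrow(toy_DF), 0.9, 1.2), 2)

# Every distinct product/market pair found in the data.
combos = unique(toy_DF[, c("product", "market")])

par(mfrow = c(2, 2))  # one panel per combination
for (i in seq_len(nrow(combos))) {
  # Rows belonging to the i-th product/market pair.
  one_combo = subset(toy_DF, product == combos$product[i] & market == combos$market[i])
  plot(one_combo$week, one_combo$price,
       type = "o", pch = 21, lwd = 0.4, bg = "limegreen",
       main = paste(combos$product[i], "at", combos$market[i]),
       xlab = "Week", ylab = "Price")
}
```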
3. Provide time-series plots for all product/market combinations using your favorite method. There are
some visible differences between the prices at Jewel and the prices in the Kraft Central region. Why?
5. Estimate the log-linear demand model and provide the regression results separately for all four
product/market combinations. Discuss the results. Is demand more elastic at Jewel-Osco or in the Kraft
Central region? What is the reason for the observed difference in the elasticities?
6. Using the regression results for the log-linear demand model, calculate the percentage change in unit
sales for a simultaneous 10 percent increase in the price of Kraft and Hellman’s mayo at Jewel-Osco.
Use the exact formula.
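As a reminder of the mechanics, the sketch below fits a log-linear demand model on simulated data and compares the exact percentage change with the usual first-order approximation. It assumes the model has the form log(units) = alpha + beta * log(price) + error with own price only; the simulated elasticity of -2 and all variable names are made up for illustration and say nothing about the mayo data.

```r
# Simulated data with a known own-price elasticity of -2 (hypothetical).
set.seed(3)
price       = runif(100, 0.9, 1.2)
sales_units = exp(5 - 2 * log(price) + rnorm(100, sd = 0.05))

# Log-linear demand: the coefficient on log(price) is the price elasticity.
fit  = lm(log(sales_units) ~ log(price))
beta = unname(coef(fit)["log(price)"])

# Exact percentage change in unit sales for a 10 percent price increase:
# new units / old units = (1.10)^beta, so the change is (1.10)^beta - 1.
exact_change = 1.10^beta - 1

# First-order approximation, shown only for comparison.
approx_change = beta * 0.10

print(c(beta = beta, exact = exact_change, approx = approx_change))
```

With elastic demand the approximation overstates the drop in unit sales, which is why the question asks for the exact formula.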
Cross-price elasticities
Now we allow for both own and cross-price effects in the log-linear demand model. We first need to reshape
the data such that we have columns with unit sales and price information for both products.
The first step is to extract only the data that we need to create the final data frame used for estimation:
mayo_DF_extract = mayo_DF[, c("market", "product", "week", "sales_units", "price")]
head(mayo_DF_extract,3)
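To make the target layout concrete, here is one possible way to get from this long format to one row per market and week, using merge(). It is a sketch on a toy data frame, not necessarily the approach used in the rest of this assignment: toy_extract, the label "Kraft Mayo 32oz", the column suffixes, and all numbers are assumptions.

```r
# Toy stand-in for mayo_DF_extract (hypothetical values).
toy_extract = data.frame(
  market      = "Jewel",
  product     = rep(c("Hellmans Mayo 32oz", "Kraft Mayo 32oz"), each = 3),
  week        = rep(1:3, times = 2),
  sales_units = c(100, 110, 90, 80, 85, 95),
  price       = c(1.10, 1.05, 1.15, 0.99, 1.02, 0.95)
)

# Split by product and drop the product column, which is now redundant.
hellmans = subset(toy_extract, product == "Hellmans Mayo 32oz", select = -product)
kraft    = subset(toy_extract, product == "Kraft Mayo 32oz",    select = -product)

# Join on market and week; the suffixes mark which product each column belongs to.
toy_wide = merge(hellmans, kraft, by = c("market", "week"),
                 suffixes = c("_hellmans", "_kraft"))
print(toy_wide)
```

Each row of toy_wide now holds unit sales and price for both products in the same market and week, which is the shape needed to include a cross-price term in the regression.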