Class-02-IndividualAssignment-Questions

Uploaded by

ayushvats08

Decision 610: Probability and Statistics

Class 2 Individual Assignment

General instructions
• This assignment has a total of 10 questions for a total of 10 points.
• This assignment is covered under the Honor Code of the Fuqua School of Business at Duke University.
• This assignment is due by the deadline listed on Canvas. Except for the submission deadline there is
no time limit to complete the assignment.
• You are allowed to discuss this assignment within your team. However, the assignment requires
individual submissions, and each team member must submit their own answers. No communication
with other teams is allowed.
• This assignment is open book and open notes. You may only access the material that is intended
for this course and that is made available through the course website and the course packet, as well
as your notes. You are not permitted to obtain any course materials (including handouts, readings,
assignments, etc.) or any solutions from other Fuqua students or any other source.
• You may use a computer in performing calculations. However, when requested by the question, you
are expected to include intermediate steps as well as references to the formulas used.
• When you are done working on your assignment, make sure to submit your responses by clicking on
the “Submit” button; unsubmitted answers are not recorded.
• You can submit your answers as many times as you wish before the deadline. Only your last submission
will be graded. If you take the assignment more than once, your previous answers will be deleted and
no answer will be recorded until you submit your new responses.
• Late or missing submissions will receive no credit.
• There are multiple question types on this assignment. Details for each question type are:
– Multiple choice questions: only one answer is correct, i.e., select the single best answer from those
provided.
– Multiple answer questions: only select correct answers from those provided (there may be more
than one). You will earn points for each correct selection and negative points for each incorrect
selection.
– Numeric answer questions: make sure to enter only numeric values with sufficient precision. For
non-integer numbers, please report either two significant digits (i.e., two non-zero digits after the
last zero) or three digits after the decimal, whichever is more informative. Please use a period
(not a comma) for the decimal separator.
– Essay (short answer) questions: answer each question concisely but thoroughly and within the
prespecified limit on the number of words.
– File upload questions: the number and type of files for upload might be limited as specified by
each question. No other upload is allowed. Your uploaded file(s) must detail your work and the
process you followed to reach your answer.
• Questions that are worth zero points are optional. No credit is given or lost. These questions
simply provide an opportunity for extra practice and/or for thinking about more advanced topics.

S&P1500 CEO annual compensation data: summarizing data


The data file CEOCompensation Data.xlsx contains data on the 2022 total annual compensation of CEOs
of S&P1500 companies, as reported in SEC filings. The compensation is recorded in millions of US dollars,
rounded to the nearest million. The following questions will help guide your analysis of the CEO compensation
dataset.
1. CEO Data (1 point)
Is the average CEO compensation smaller than, equal to, or larger than the median CEO compensation?
⃝ The average CEO compensation is smaller than the median CEO compensation
⃝ The average CEO compensation is equal to the median CEO compensation
⃝ The average CEO compensation is larger than the median CEO compensation
2. CEO Data (1 point)
What is the standard deviation of the CEO compensation (in millions of US dollars)?
3. CEO Data (1 point)
What is the proportion of CEOs whose annual compensation is $5M? (Hint. First compute how many
CEOs in the dataset have annual compensation equal to $5M.)
⃝ Less than or equal to 0.15 (15%)
⃝ Greater than 0.15 but less than or equal to 0.30 (15% – 30%)
⃝ Greater than 0.30 but less than or equal to 0.70 (30% – 70%)
⃝ Greater than 0.70 but less than or equal to 0.85 (70% – 85%)
⃝ Greater than 0.85 (85%)
4. CEO Data (1 point)
What is the proportion of CEOs whose compensation is less than or equal to $5 million?
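The summary measures asked about above (average, median, standard deviation, and proportions) can also be computed outside of Excel. The following is a minimal Python sketch, assuming a short made-up list of compensation values; the list `compensation` is hypothetical and is not the content of CEOCompensation Data.xlsx.

```python
import statistics

# Hypothetical compensation values in millions of US dollars.
# These are illustrative only, NOT the real CEO dataset.
compensation = [1, 2, 2, 3, 5, 5, 8, 40]

# Average and median of the data.
mean = statistics.mean(compensation)
median = statistics.median(compensation)

# Sample standard deviation (divides by n - 1).
sample_sd = statistics.stdev(compensation)

# Proportion of entries exactly equal to 5, and at most 5.
n = len(compensation)
prop_equal_5 = sum(1 for v in compensation if v == 5) / n
prop_at_most_5 = sum(1 for v in compensation if v <= 5) / n

print(mean, median, sample_sd, prop_equal_5, prop_at_most_5)
```

In this toy list a single large value pulls the average well above the median, which is the kind of comparison the first question asks you to make on the real data.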

S&P1500 CEO annual compensation data: from data to a random variable


A succinct way to represent a large dataset is with a frequency table, which reports the frequencies of each
value in the dataset. The following instructions will help you build a frequency table for the 2022 CEO
compensation data in the file CEOCompensation Data.xlsx and connect it with probability distributions.
To create the frequency table, follow these steps.

(i) Identify the range of CEO compensation.


Note that the CEO compensation values are integers ranging from 0 to 226 (0, 1, ..., 226), and enter
these values in a new worksheet in, say, column A. (There are multiple ways to do this. For
instance, you could type in 0 in cell A2, type 1 in cell A3, highlight both cells A2 and A3 and
then drag the bottom right corner until row 228.)
(ii) Use the COUNTIF function to calculate the frequency of each value.
In cell B2 write =COUNTIF(CEOData!C$2:C$1412,A2) and then double-click on the bottom
right corner to compute the frequency for the rest of the values. You can check that all
data entries are properly accounted for by computing the sum of all of the frequencies and
making sure that it equals 1411. To do this you can type =SUM(B2:B228) in a new cell and
observe whether it matches the number of data entries.

(iii) Compute the proportion of each unique value.
To obtain the proportion of data entries for each unique value, note that there are 1411
entries in the dataset and write =B2/1411 in cell C2. You can then double-click on the bottom
right corner to compute the proportion for the rest of the values. Calculate =SUM(C2:C228)
in a new cell to check that all of the proportions add up to 1.
(iv) Visualize the data.
To visualize the data, you can now add a chart by selecting columns A and C (or just selecting
cells A2:A228 and C2:C228), and then choosing from the Column Chart icon on the Insert
ribbon.
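Steps (i) through (iii) can also be sketched in Python. This is only an illustration under stated assumptions: the list `compensation` is a hypothetical stand-in for the data column, and `Counter` plays the role of the COUNTIF formula.

```python
from collections import Counter

# Hypothetical compensation values (millions of US dollars), NOT the real data.
compensation = [0, 2, 2, 3, 5, 5, 5, 8]
n = len(compensation)

# (i) Range of values: every integer from 0 up to the largest observed value.
values = range(0, max(compensation) + 1)

# (ii) Frequency of each value (the role COUNTIF plays in the worksheet).
counts = Counter(compensation)
frequency = {v: counts.get(v, 0) for v in values}

# Sanity check: the frequencies must add up to the number of data entries.
assert sum(frequency.values()) == n

# (iii) Proportion of data entries for each value.
proportion = {v: frequency[v] / n for v in values}

for v in values:
    print(v, frequency[v], proportion[v])
```

Values that never occur in the data get a frequency of 0, just as in the worksheet, so the table covers the whole range rather than only the observed values.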

A top business school will invite one of the S&P 1500 CEOs as next graduation speaker. Since the
identity of that CEO is not known, their annual compensation is a random variable which we denote with
X:
X = Annual compensation of the graduation speaker CEO.
Note that the distribution of the random variable X is derived from the frequency table constructed using
the CEO compensation data found in CEOCompensation Data.xlsx.
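Once the proportions are read as probabilities, the expected value, variance, standard deviation, and tail probabilities of X follow directly from the table. The sketch below assumes a small hypothetical distribution `pmf` (value to probability), not the one derived from the actual file.

```python
import math

# Hypothetical probability distribution of X: value -> probability.
# Illustrative only; the real table has one row per value from 0 to 226.
pmf = {0: 0.125, 2: 0.25, 3: 0.125, 5: 0.375, 8: 0.125}

# Expected value: probability-weighted average of the outcomes.
expected = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average squared deviation from the mean.
variance = sum((x - expected) ** 2 * p for x, p in pmf.items())
sd = math.sqrt(variance)

# Tail probability P(X > 5): add the probabilities of all values above 5.
tail = sum(p for x, p in pmf.items() if x > 5)

print(expected, variance, sd, tail)
```

Because each probability is a frequency divided by the total number of entries, this standard deviation weights by n rather than n - 1. It corresponds to Excel's STDEV.P rather than STDEV.S, so it can differ slightly from a sample standard deviation computed on the raw data.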
5. CEO Data (1 point)
Calculate the standard deviation of the random variable X (in millions of US dollars).
6. CEO Data (2 points)
What is P(X > 5)?

S&P1500 CEO annual compensation data: sum of random variables


Random variables allow for the easy computation of summary measures of linear transformations and sums.
For instance, if X is a given random variable and a is a multiplicative scaling factor, then the quantity
aX is a random variable that transforms X by multiplying each of its outcomes by a. The scaling factor a
affects both the expected value and the variance of aX, and one has that

\[ E[aX] = a\,E[X] \quad\text{and}\quad \mathrm{Var}[aX] = a^2\,\mathrm{Var}[X]. \tag{1} \]

Instead, if c is an additive (shift) constant, then the quantity X + c is a random variable that transforms X
by shifting each of its outcomes by c. In this case, the additive constant c affects only the expected value of
X + c:

\[ E[X + c] = E[X] + c \quad\text{and}\quad \mathrm{Var}[X + c] = \mathrm{Var}[X]. \tag{2} \]

By combining (1) and (2), we also have that the random variable aX + c has expected value and variance
respectively given by

\[ E[aX + c] = a\,E[X] + c \quad\text{and}\quad \mathrm{Var}[aX + c] = a^2\,\mathrm{Var}[X]. \tag{3} \]

If X and Y are two given random variables, a, b are two multiplicative scaling factors, and c is an additive
(shift) constant, then the expected value of aX + bY + c is given by

\[ E[aX + bY + c] = a\,E[X] + b\,E[Y] + c. \]

Furthermore, if the random variables X and Y are also independent, then

\[ \mathrm{Var}[aX + bY + c] = a^2\,\mathrm{Var}[X] + b^2\,\mathrm{Var}[Y]. \]
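The rules above can be checked numerically on a small example. The sketch below assumes two hypothetical independent distributions `pmf_x` and `pmf_y` and arbitrary constants `a`, `b`, `c`. It builds the exact distribution of aX + bY + c by enumerating every pair of outcomes, then compares its mean and variance with the formulas.

```python
from itertools import product

def mean_var(pmf):
    """Return (expected value, variance) of a distribution given as value -> probability."""
    m = sum(x * p for x, p in pmf.items())
    v = sum((x - m) ** 2 * p for x, p in pmf.items())
    return m, v

# Hypothetical independent random variables X and Y (illustrative only).
pmf_x = {0: 0.5, 2: 0.5}
pmf_y = {1: 0.25, 3: 0.75}
a, b, c = 3, -2, 10  # arbitrary scaling factors and shift

# Exact distribution of Z = aX + bY + c. Independence means the joint
# probability of a pair (x, y) is the product of the two probabilities.
pmf_z = {}
for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
    z = a * x + b * y + c
    pmf_z[z] = pmf_z.get(z, 0.0) + px * py

ex, vx = mean_var(pmf_x)
ey, vy = mean_var(pmf_y)
ez, vz = mean_var(pmf_z)

print(ez, a * ex + b * ey + c)        # expected value: direct vs formula
print(vz, a ** 2 * vx + b ** 2 * vy)  # variance: direct vs formula
```

Note that the shift c changes the expected value but not the variance, and that a negative factor such as b = -2 still contributes positively to the variance because it enters squared.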

7. CEO Data (0 points)


Four business schools independently plan to invite an S&P1500 CEO to be their graduation speaker. What
is the expected value of the average of the annual compensations of the CEOs speaking at graduation
at these four schools?
(Hint. Let X1 be the random variable that denotes the compensation of the CEO speaking at the first
school, X2 be the random variable that denotes the compensation of the CEO speaking at the second
school, and X3 and X4, respectively, be the random variables of the CEOs speaking at the third and
fourth school. The question is asking to consider the random variable

\[ \bar{X} = \frac{X_1 + X_2 + X_3 + X_4}{4}, \]

which represents the average compensation of four CEO graduation speakers, and compute its expected
value.)

8. CEO Data (0 points)


What is the standard deviation of that average? (Hint. What is the standard deviation of the random
variable X defined earlier?)
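One way to build intuition for the average of four independent speakers is to enumerate every combination of four draws from a small distribution. The sketch below assumes a hypothetical distribution `pmf` (not the real CEO table) and compares the mean and standard deviation of the average with those of a single draw.

```python
import math
from itertools import product

def mean_sd(dist):
    """Return (expected value, standard deviation) of a value -> probability table."""
    m = sum(x * p for x, p in dist.items())
    v = sum((x - m) ** 2 * p for x, p in dist.items())
    return m, math.sqrt(v)

# Hypothetical distribution of one speaker's compensation (illustrative only).
pmf = {0: 0.125, 2: 0.25, 3: 0.125, 5: 0.375, 8: 0.125}

# Exact distribution of the average of four independent draws:
# enumerate all combinations and multiply the individual probabilities.
pmf_avg = {}
for draws in product(pmf.items(), repeat=4):
    avg = sum(x for x, _ in draws) / 4
    prob = math.prod(p for _, p in draws)
    pmf_avg[avg] = pmf_avg.get(avg, 0.0) + prob

mean_one, sd_one = mean_sd(pmf)
mean_avg, sd_avg = mean_sd(pmf_avg)

print(mean_one, mean_avg)  # compare the expected values
print(sd_one, sd_avg)      # compare the standard deviations
```

Applying the sum rules stated earlier with a = b = 1/4 for each term gives the same relationship between the two standard deviations without any enumeration.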

Sampling New York City yellow cab data


Open the data file YellowCab TripData Sample.xlsx to obtain a sample of n = 100 New York City Yellow
cab trip records. Your sample is selected at random from the dataset of all New York City Yellow cab trip
annual records maintained by the New York City Taxi & Limousine Commission.

To visualize your sample, enter your unique sample code in cell D1 of the worksheet
Sample. Your unique sample code is listed among your Canvas grades for this course.

9. Sampling NYC yellow cab data (1 point)


What is your unique sample code? (Hint. Your unique sample code is listed among your Canvas grades
for this course.)

10. Sampling NYC yellow cab data (2 points)


What is the average price of a yellow cab ride in your sample?
