Assignment #1 Descriptive Statistics Data Analysis Plan
Assignment Steps:
Step #1: Review the provided STAT200 data set file. (Note: This data set will be used for all
three of this term’s written assignments.)
The provided data set is a subsample of 30 data points from the US Department of Labor’s
Consumer Expenditure Surveys (CE) and provides information about the composition of
households and their annual expenditures (https://www.bls.gov/cex/). Detailed information on
the sample and variables is included with the data set file; please review this information
carefully to familiarize yourself with the data. (Note: This information will be used in
Assignment #2 to describe the data set.)
The "scenario" you describe is just to explain why you might be motivated to do this analysis. It
does NOT have to be correct and/or true. As an example:
*Note that this is an example and that the data variables may not be the same as in our data
set. You MUST create the scenario using the variables in our data set.*
"I am a 35-year-old married parent (head of household) with one child. I earn $97,000 per year,
and I spend $8,200 per year on food (including dining out): $1,800 of it is on meat, $0 on bakery
items, and $150 on fruits (I'm on the Keto diet, so I spend more on protein than carbs). I want
to determine how my income and expenditures relate to other people in the United States."
You could also do something like this (but realize that this statement uses variables that are
NOT in our data set):
"I'm a 16-year-old medical doctor (child prodigy) who makes $248,000 a year. I have no kids,
but I do have some college debt to pay off ($2,648 per month--med school is expensive!). I also
This statement is probably not true, but it will make me smile while reading it. :) The last
statement is important, though. No one analyzes data sets for fun (even I don't do that), so
briefly explain why you are undertaking this process (other than being told to do it for this
class).
(NOTE: You do not need to put your actual income or any expenditure values in here--that is
personal information that I do not need to know).
A key point here is to ensure you use variables from the data set that accompanies our section.
The variables described above may not actually apply to our section (they are “general” or
“generic”).
➢ Task 2: Select variables for analysis that match the scenario developed in Task 1. The data
set provides information on household consumption; it contains socioeconomic variables and
expenditure variables. The socioeconomic variable names start with “SE-” and the expenditure
variable names start with “USD”; all expenditures are in US dollars. All students must use
income as one variable. Select two additional socioeconomic variables (one qualitative and
one quantitative) and two expenditures for your analysis that match the scenario you
developed for Task 1.
For instance, using the example scenario of a 35-year-old single parent with a high school
diploma and one child, you could select “income,” “education,” and “number of children” as
socioeconomic variables and then pick two household expenditure items to show the
distribution of costs and compare that with your income. For this assignment, though, only
select variables that are included with our section’s data set (and income, education, and
number of children may or may not be in our data set; these are just example variables included
to aid in understanding).
When selecting variables, think about the following three questions:
Answer these questions in the section on the template labeled: “Reason(s) for Selecting the
Variables and Expected Outcome(s):”.
➢ Task 4: Determine appropriate graph and/or table for each of the selected variables. Select
one graph or table for each variable (see the table below for the list of graphs and tables). When
determining the graphs and tables, think about what is appropriate given the level of
measurement and type of variable. We recommend referring to the text and the information posted
in our LEO classroom to help with this task. (Note: You will use this information to provide a
rationale for your choice of graphs and/or tables.)
Types of Graphs                                      Types of Tables
● Pie Chart                                          ● Frequency Table
● Bar Chart                                          ● Relative Frequency Table
● Histogram                                          ● Grouped Frequency Table
● Box Plot (also known as a Box-and-Whisker Plot)
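To make the three table types concrete, here is a minimal Python sketch of how each could be built. It is only an illustration: the variable names and values below are hypothetical and are NOT taken from our data set, the helper function names are made up for this example, and the bin width of 25,000 is an arbitrary choice. You are not required to use Python for this assignment.

```python
from collections import Counter

def frequency_table(values):
    """Count how often each category occurs (suits qualitative variables)."""
    return dict(sorted(Counter(values).items()))

def relative_frequency_table(values):
    """Express each category count as a proportion of all observations."""
    n = len(values)
    return {category: count / n
            for category, count in frequency_table(values).items()}

def grouped_frequency_table(values, width, start=0):
    """Count values falling in bins [lower, lower + width) (suits quantitative variables)."""
    bins = Counter()
    for v in values:
        lower = start + ((v - start) // width) * width
        bins[(lower, lower + width)] += 1
    return dict(sorted(bins.items()))

# Hypothetical values for illustration only (not from the STAT200 data set).
marital_status = ["Married", "Not Married", "Married", "Married", "Not Married"]
income = [97000, 45000, 62000, 88000, 51000]

print(frequency_table(marital_status))
print(relative_frequency_table(marital_status))
print(grouped_frequency_table(income, width=25000))
```

A frequency or relative frequency table pairs naturally with a pie chart or bar chart for a qualitative variable, while a grouped frequency table uses the same kind of bins as a histogram for a quantitative variable.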
Here are the main sections for this assignment (i.e., completing the plan template):
✓ Identifying Information. Fill in information on name, class, instructor, and date.
✓ Scenario. In this section, briefly (2-3 sentences) describe the scenario you developed in
Step #2, Task 1.
✓ Complete Table 1: Variables Selected for the Analysis. Enter information on the variables
selected for analysis in Step #2, Task 2. For each selected variable, be sure to include its
name as listed in the data set, description, and variable type.
✓ Reason(s) for Selecting the Variables and Expected Outcome(s): In this section, for each
selected variable, please answer the following questions:
    ✓ Why did I choose this variable?
    ✓ What interests me about this variable?
    ✓ What do I think will be the outcome?