Day 1 Notesheet
Population
It’s hard to measure everyone (population), so we take a sample
Sample and make ____________________ about the population.
Biased Sampling
SW Tennessee Community College: The homepage of their website boasts: “Our overall graduation
placement rate is ______, with 91% working in their field of study.” - www.southwest.tn.edu (6/9/2020)
Population
Sample → _________
Undercoverage: When part of the population has a __________________ of being included in a sample.
-Leads to bias.
-Ex: excluding the students who didn’t graduate.
Rogers State University (Oklahoma): In a recent report*, the University found that about 75% of
graduates were pursuing another degree or had found full-time employment by their final semester.
The same report shows that the response rate to the University’s questions/surveys was only 20%.
Population
Sample → _______________
*Employment and Continuing Education for Graduating Students 2017-2019 AY 3-Year Aggregation
(downloaded 6/9/2020 from https://www.rsu.edu/about/accountability-academics/student-outcomes/)
Question: How could bias in the sampling method have affected the graduate study/employment rate
estimate from Rogers State University?
Model Response:
“Graduates who didn’t find post-grad employment may be ashamed, making them ______________ to
respond to the survey. Therefore, this sampling method may include a lower proportion of unemployed
graduates than in the full population. This produces an _________________ of the true percentage of
_____ graduates who are actually starting full-time work.”
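A minimal simulation sketch of the reasoning in the model response. It shows how unequal response rates can shift a survey estimate away from the population value. Every number here is a made-up assumption for illustration (the true employment rate and the two response probabilities are hypothetical, chosen only so the overall response rate lands near 20%); none of them come from the Rogers State report.

```python
import random

def simulate_survey(n_grads=10_000, true_rate=0.60,
                    p_respond_employed=0.28, p_respond_unemployed=0.08,
                    seed=0):
    """Simulate a graduate survey where response depends on employment.

    All default values are hypothetical, not taken from any real report.
    """
    rng = random.Random(seed)
    # True status of every graduate in the (simulated) population.
    employed = [rng.random() < true_rate for _ in range(n_grads)]
    # Each graduate decides whether to respond; the chance depends on status.
    responders = [
        e for e in employed
        if rng.random() < (p_respond_employed if e else p_respond_unemployed)
    ]
    pop_rate = sum(employed) / n_grads               # true employment rate
    sample_rate = sum(responders) / len(responders)  # rate among responders
    response_rate = len(responders) / n_grads        # share who responded
    return pop_rate, sample_rate, response_rate

pop_rate, sample_rate, response_rate = simulate_survey()
print(f"population: {pop_rate:.2f}  survey: {sample_rate:.2f}  "
      f"response rate: {response_rate:.2f}")
```

Comparing pop_rate with sample_rate shows the direction and size of the bias under these assumed rates. Setting the two response probabilities equal makes the gap shrink to random noise.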
Simple Random Sample (SRS): a sampling method in which every possible group of individuals in the
population has an __________________ of being selected.
Instead: We could have ______________________ the NYC population, tested those who were
sampled, and gotten an __________________ of the number of people infected.
Question: Describe how you would implement a simple random sample (SRS) of 1,000 NYC residents to
test for COVID.
Model Response:
“Assign every individual in NYC an integer _______ (where N is the population size of NYC). Use a
random number generator to obtain 1,000 integers between 1 and N, __________________________.
Administer the COVID test to the 1,000 individuals whose numbers were selected.”
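The procedure in the model response can be sketched in a few lines of Python. This is one possible implementation, not the only one. The function name simple_random_sample and the population size N below are hypothetical placeholders (the value of N is a rough stand-in, not an official count).

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct labels drawn from 1..population_size."""
    rng = random.Random(seed)
    # range(1, N + 1) labels individuals 1 through N.
    # rng.sample() never returns the same label twice, and every
    # group of sample_size labels is equally likely to be chosen.
    return rng.sample(range(1, population_size + 1), sample_size)

N = 8_400_000  # assumed placeholder for the NYC population size
chosen = simple_random_sample(N, 1_000, seed=42)
print(len(chosen), "individuals selected; first five labels:", chosen[:5])
```

The individuals whose labels appear in chosen are the ones who would be tested. Fixing the seed only makes the draw reproducible; leaving it as None gives a fresh sample each run.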
Image courtesy of Professor Joseph Blitzstein (i.e. the best stats prof in the country).
See his “Harvard Thinks Big” talk on this problem: https://youtu.be/dzFf3r1yph8