Lecture
Objectives
Learn basic vocabulary of statistics.
Distinguish between population and sample.
Distinguish between the two types of populations.
“The goal of the branch of mathematics(?) called statistics is to provide information so that
informed decisions can be made.” Statistics is not a branch of mathematics. Certainly, we use
numbers and formulas, but those come from physics, not from mathematics.
Statistics is a science: it proceeds by inductive reasoning, trying to learn about the
larger whole based on a smaller sample. Mathematics, on the other hand, works from the general
down to the specific, which is deductive reasoning.
Statistics is a branch of science that enables you to filter the information you encounter so that
you are better prepared for the decisions you make in your daily life.
STATISTICS
Target Population
Sampled Population
Sample
Population Sample
Read each of the shortened survey reports below. For each report:
a. Identify the population.
b. Identify the sample.
c. Determine whether the highlighted value is a parameter or statistic.
1. After an airplane security scare on Christmas day, 2009, the Gallup organization
interviewed 542 American air travelers about increased security measures at airports. The
report stated that 78% of American air travelers are in favor of United States airports
using full-body-scan imaging on airline passengers.
2. Rasmussen Reports also conducted a survey in response to the airport security scare on
Christmas day, 2009. The national telephone survey of 1000 adult Americans found that
59% of Americans surveyed favor racial profiling as a means of determining which
passengers to search at airport security checkpoints.
Two Branches of Statistics
The branch of descriptive statistics, as a science, gathers, sorts, summarizes, and displays
the data.
The branch of inferential statistics, as a science, involves using descriptive statistics to
estimate population parameters.
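As a rough illustration of the two branches, the sketch below builds a made-up population of yes/no answers, draws a random sample, summarizes the sample (descriptive statistics), and then treats the sample proportion as an estimate of the population proportion (inferential statistics). The population size, the 60% figure, the sample size of 500, and every variable name are assumptions chosen only for illustration; none of them come from the surveys in these notes.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 people, 6,000 of whom answer "yes" (coded 1).
population = [1] * 6000 + [0] * 4000

# Parameter: a number that describes the whole population.
parameter = sum(population) / len(population)

# Sample: the subset of the population that is actually observed.
sample = random.sample(population, 500)

# Statistic: a number that describes the sample (descriptive statistics).
statistic = sum(sample) / len(sample)

# Inference: the statistic serves as an estimate of the parameter,
# which in a real study would be unknown.
print(f"population proportion (parameter): {parameter:.3f}")
print(f"sample proportion (statistic):     {statistic:.3f}")
```

In practice only the sample is available, so the parameter line could not be computed; it is shown here only so the two values can be compared.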
Source: Rosenstiel, Tom and Amy Mitchell. “Overview.” The State of the News Media: An Annual Report on
American Journalism, Pew Research Center’s Project for Excellence in Journalism. 2011.
http://stateofthemedia.org/2011/overview-2/ (12 Dec. 2011).
Identify the descriptive and inferential statistics used in this excerpt from their article.
SECTION 1.2
Data Classification
Objectives:
o Classify data as
Qualitative or quantitative;
Discrete, continuous, or neither; and
Nominal, ordinal, interval, or ratio.
Qualitative vs. Quantitative Data
Qualitative Quantitative
Classify the following as either qualitative or quantitative… What are you assuming about the
described variables?
a. Shades of red paint in a home improvement store
b. Rankings of the most popular paint colors for the season
c. Amount of red primary dye necessary to make one gallon of each shade of red paint.
d. Numbers of paint choices available at several stores
Q1. Is the variable “number of toes” a qualitative (categorical) or a quantitative (numeric)
variable?
Continuous vs. Discrete Data
Discrete data are quantitative data that can take on only particular values and are usually counts.
Continuous data are quantitative data that can take on any value in a given interval and are
usually measurements.
Example 1.5: Classifying Data as Continuous or Discrete
a. Suppose all students in a statistics class were asked what pizza topping is their favorite.
Explain why these data are at the nominal level of measurement.
b. Suppose instead that you wish to know the number of students whose favorite pizza
topping is sausage. Explain why this data value is not at the nominal level of
measurement.
Data at the ordinal level of measurement are qualitative data that can be arranged in a
meaningful order, but calculations such as addition or division do not make sense.
Example 1.7: Classifying Data as Nominal or Ordinal
The birth years of your classmates are collected. What level of measurement are these data?
Data at the ratio level of measurement are quantitative data that can be ordered, differences
between data entries are meaningful, and the zero point indicates the absence of something.
Example 1.9: Classifying Data by the Level of Measurement
Consider the ages in whole years of US presidents when they were inaugurated. What level of
measurement are these data?
Q3: Give an example of a ratio-level variable not provided in the slides or text.
Example 1.10: Classifying Data
Determine the following classifications for the given data sets: qualitative or quantitative;
discrete, continuous, or neither; and level of measurement.
a. Finishing times for runners in the Labor Day 10k race.
b. Colors contained in a box of crayons.
c. Boiling points (on the Celsius scale) for various caramel candies.
d. The top ten Spring Break destinations as ranked by MTV.
Section 1.3
The Process of a Statistical Study
Objectives:
o Describe the process of a statistical study.
o Understand the primary sampling schemes.
o Identify various types of studies.
Neurologists want to study the effect of vitamin C on nerve disorders. The goal of the study
is to see if taking an intravenous dose of vitamin C will reduce the amount of nerve pain
reported by patients. Identify the population of interest and the variables in this study.
An observational study observes data that already exist.
An experiment generates data to help identify cause-and-effect relationships.
Note: These are the “proper” definitions as used by scientists. A statistician will refer to any “theoretical” data
collection as an experiment. This difference in terminology comes from the fact that statisticians will experiment to
better understand their field of study.
Here is an interesting question: How do you know if a sample is representative of the population?
Convenience Sample
-the sample is convenient for the researcher to select.
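To make the representativeness question concrete, the sketch below contrasts a convenience sample with a simple random sample drawn from the same list. The roster of 200 student ID numbers, the sample size of 10, and the choice of "the first 10 on the list" as the convenient group are all assumptions made for illustration; the simple random sample is included only as a point of comparison.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: ID numbers for 200 students.
roster = list(range(1, 201))

# Convenience sample: whoever is easiest to reach, here the first 10 on the list.
convenience = roster[:10]

# Simple random sample: every group of 10 students is equally likely to be chosen.
simple_random = random.sample(roster, 10)

print("convenience:  ", convenience)
print("simple random:", sorted(simple_random))
```

The convenience sample can only ever contain the first ten IDs, so any pattern tied to position on the list (for example, enrollment order) would be overrepresented.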
Placebo – is a substance that appears identical to the actual treatment but contains no intrinsic
beneficial elements.
Placebo Effect – is a response to the power of suggestion, rather than the treatment itself, by
participants of an experiment.
Single-Blind experiment – subjects do not know if they are in the control group or the treatment
group, but the people interacting with the subjects in the experiment know in which group each
subject has been placed.
Double-Blind experiment – neither the subjects nor the people interacting with the subjects
know to which group each subject belongs.
Example 1.16: Analyzing an Experiment
Consider the study from Example 1.11, in which neurologists want to determine if taking an
intravenous dose of vitamin C will reduce the amount of nerve pain reported by patients. Suppose that
the study was narrowed to focus only on patients with the nerve disorder, multiple sclerosis (MS).
After study approval, the neurologists solicit volunteers who are patients with MS who are reporting
nerve pain. The participants are then randomly assigned to two groups, each having 20 participants.
Participants in Group A are administered intravenous doses of vitamin C, and their nerve pain is
tracked. Participants in Group B are administered intravenous doses of saline (which has no active
ingredients) and their pain levels are also tracked. The patients are not told which of the two groups
they are in; however, the nurses administering the IVs are aware of the group assignments. After a
predetermined length of time, the amounts of pain reported by the separate groups are compared to
determine if an intravenous dose of vitamin C will reduce the amount of nerve pain.
a. Identify the explanatory and response variables.
b. What is the treatment?
c. Which group is the treatment group and which group is the control group?
d. What is the purpose of administering saline to Group B?
e. Is this a single-blind or double-blind study?
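The random assignment step in this example could be carried out along the following lines. This is only a sketch: the participant labels P01 to P40 are hypothetical, and shuffling then splitting the list is one common way to randomize, not necessarily the procedure the neurologists used.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical labels for the 40 volunteers (two groups of 20).
participants = [f"P{i:02d}" for i in range(1, 41)]

# Shuffle a copy, then split it in half so that chance alone decides membership.
shuffled = participants[:]
random.shuffle(shuffled)
group_a = shuffled[:20]   # Group A: intravenous vitamin C
group_b = shuffled[20:]   # Group B: intravenous saline

print("Group A:", sorted(group_a))
print("Group B:", sorted(group_b))
```

Because the split follows a shuffle, neither the researchers nor the participants influence who ends up in which group.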
Institutional Review Boards
An Institutional Review Board (IRB) is a group of people who review the design of the study to make
sure that it is appropriate and that no unnecessary harm will come to the subjects involved.
Informed Consent involves completely disclosing to participants the goals and procedures involved in a
study and obtaining their agreement to participate.
SECTION 2.1
FREQUENCY DISTRIBUTIONS
Objectives
o Construct a frequency distribution.
o Create an ungrouped frequency distribution.
o Create a grouped frequency distribution.
Color Frequency
Blue 2
Brown 5
Green 1
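An ungrouped frequency distribution like the one above simply counts how often each distinct value occurs. The sketch below shows one way to do that count; the list of raw observations is reconstructed to be consistent with the table (its order is made up), and the variable names are illustrative.

```python
from collections import Counter

# Raw observations reconstructed to match the table above; the order is made up.
colors = ["Brown", "Blue", "Brown", "Green", "Brown", "Blue", "Brown", "Brown"]

# Count how many times each distinct color appears.
frequency = Counter(colors)

print(f"{'Color':<8}Frequency")
for color in sorted(frequency):
    print(f"{color:<8}{frequency[color]}")
```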
Q1: What is the difference between a grouped and an ungrouped frequency distribution?
Create a frequency distribution using five classes for the list of 3-D TV prices given in Table 2.2.
Solution
Because we were told how many classes to include, we will begin by deciding on a class width.
Subtract the lowest data value from the highest and divide the result by the number of classes, as
shown below.
(1999 - 1595) / 5 = 404 / 5 = 80.8 ≈ 81
This would give us a class width of $81.
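The same arithmetic can be checked with a few lines of code. In this sketch the result is rounded up to the next whole dollar, which is a common convention assumed here so that five classes are guaranteed to cover the full range; the variable names are illustrative.

```python
import math

lowest, highest = 1595, 1999   # extreme prices used in the calculation above
num_classes = 5

raw_width = (highest - lowest) / num_classes
class_width = math.ceil(raw_width)   # round up so the classes cover every value

print(raw_width, class_width)
```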
We will stop here and consider some options. Choosing a class width of $81 does seem perfectly
reasonable from a theoretical point of view. However, one should consider the impression
created by having TV prices grouped in intervals of $81. Can you imagine presenting this data to
a client? Instead, it would be more reasonable to group TV prices by intervals of $100.
Therefore, we will choose our class width to be $100.
Now let’s continue building the class limits. We take $1500 as the first lower class limit, a round
value below the lowest price of $1595. Adding the class width of $100 to $1500, we obtain
a second lower class limit of $1600. We continue in this fashion until we have five lower class
limits, one for each of our five classes.
Finally, we need to determine appropriate upper class limits. Again, be reasonable. Remember,
too, that the classes are not allowed to overlap.
Because the data are in whole dollar amounts, it makes sense to choose upper class limits that are
one dollar less than the next lower limit. The classes we have come up with are as follows.
3-D TV Prices
Class Frequency
$1500-$1599
$1600-$1699
$1700-$1799
$1800-$1899
$1900-$1999
Note that the last upper class limit is also the maximum value in the data set. This will not
necessarily occur in every frequency table. However, we have included all the data values in our
range of classes, so no adjustments to the classes are necessary.
Tabulating the number of data values that occur in each class produces the following frequency
table.
3-D TV Prices
Class Frequency
$1500-$1599 2
$1600-$1699 5
$1700-$1799 4
$1800-$1899 5
$1900-$1999 4
Note that the sum of the frequency column should equal the number of data values in the set.
Check for yourself that this is true.
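The tabulation step can be sketched as follows. The prices in this example are hypothetical stand-ins, NOT the Table 2.2 data (which is not reproduced in these notes), so the printed frequencies will differ from the table above; only the class limits and the class width of $100 come from the example.

```python
# Hypothetical prices in whole dollars; NOT the Table 2.2 data.
prices = [1595, 1610, 1649, 1725, 1799, 1800, 1850, 1999]

first_lower_limit = 1500
class_width = 100
num_classes = 5

# Build (lower limit, upper limit) pairs; each upper limit is $1 below the next lower limit.
classes = [
    (first_lower_limit + i * class_width,
     first_lower_limit + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Count how many prices fall inside each class.
frequencies = [sum(lower <= p <= upper for p in prices) for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"${lower}-${upper}  {f}")

# The frequencies should account for every data value exactly once.
assert sum(frequencies) == len(prices)
```

The final assertion is the same check suggested above: the frequency column must sum to the number of data values.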
Class Boundary – the value that lies halfway between the upper limit of one class and the
lower limit of the next class. After finding one class boundary, add (or subtract) the class width to
find the next class boundary. The boundaries of a class are typically given in interval form:
lower boundary-upper boundary.
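Applied to the 3-D TV classes, the definition works out as in the sketch below: one boundary is found by averaging an upper limit with the next lower limit, and the rest follow by stepping in units of the class width. The variable names are illustrative.

```python
# Class limits from the 3-D TV example: (lower limit, upper limit).
classes = [(1500, 1599), (1600, 1699), (1700, 1799), (1800, 1899), (1900, 1999)]
class_width = 100

# One boundary: halfway between an upper limit and the next lower limit.
first_boundary = (classes[0][1] + classes[1][0]) / 2   # (1599 + 1600) / 2

# Remaining boundaries: subtract or add the class width, one step at a time.
boundaries = [first_boundary + (i - 1) * class_width for i in range(len(classes) + 1)]

# Print each class in interval form: lower boundary-upper boundary.
for lower, upper in zip(boundaries, boundaries[1:]):
    print(f"{lower}-{upper}")
```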