MATH 121 (Chapter 1) - Nature of Statistics
1 Nature of Statistics
Learning Objectives
After completing this chapter, the students will be able to:
Define Statistics.
Distinguish between descriptive statistics and inferential statistics.
Differentiate parameter and statistic.
Compare and contrast the sources of data.
Differentiate constant and variable.
Identify and explain the types of data.
Differentiate experimental and mathematical variables.
Classify variables as discrete and continuous.
List and describe the four levels of measurements.
Identify and explain the sampling techniques.
Discuss the methods of collecting and presenting data.
Evaluate expressions in summation notation.
Chapter Outline
1.1 Introduction
1.2 Division of Statistics
1.3 Parameter and Statistic
1.4 Sources of Data
1.5 Constant and variable
1.6 Types of Data
1.7 Classification of Variables
1.8 Levels of Measurements
1.9 Sampling Techniques
1.10 Methods of Collecting Data
1.11 Methods of Presenting Data
1.12 Summation Notation, Sigma ∑
1.1 Introduction
Every day we encounter statistics. Some company advertisements use statistics to claim
that more customers prefer their product over those of competitors. For example, a certain
petroleum company claims that 60% of fuel consumers prefer its products to those of
other fuel companies.
Statistics is also used to show the quality of a product, as in the claim of Safeguard
soap: the company advertises that its soap can kill 99.99% of germs. Statistics is widely
applied in different fields such as astronomy, business, education, the sciences,
etc.
The origin of modern statistics may be traced to two areas of interest which, on
the surface, have very little in common: games of chance and what we now call political
science. In the eighteenth century, studies in probability led to the mathematical treatment
of errors of measurement and to the theory which now forms the foundation of statistics.
In the same century, interest in the numerical description of political units led to the
development of methods which nowadays come under the heading of descriptive statistics.
Although descriptive statistics is an important branch of statistics and it continues
to be widely used, statistical information usually arises from samples, and this means that
its analysis will require generalizations which go beyond the data. As a result, the most
important feature of the recent growth of statistics has been a shift in emphasis from the
methods which merely describe to methods which serve to make generalizations: that is,
a shift in emphasis from descriptive statistics to the methods of inferential statistics.
Descriptive Statistics is the totality of methods and treatments employed in the
collection, description, and analysis of numerical data. The purpose of descriptive
statistics is to tell something about a particular group of observations. On the other hand,
Inferential Statistics is the logical process of moving from sample analysis to a
generalization or conclusion about a population. It is also called statistical inference or
inductive statistics.
A population consists of all members of the group about which we want to draw a
conclusion, while a sample is a portion, or part, of the population of interest selected for
analysis.
The relation between a sample and a population is portrayed in Figure 1.1.
[Figure 1.1: Relation between Population and Sample]
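The distinction between a population and a sample, and between describing and inferring, can be sketched in a few lines of Python. This is only an illustration: the scores, the sample size of 5, and the seed are made-up assumptions and do not come from the chapter.

```python
import random
import statistics

# Hypothetical population: exam scores of 20 students (made-up values).
population = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77,
              69, 92, 84, 73, 80, 87, 75, 91, 68, 83]

random.seed(121)  # fixed seed so the sketch is reproducible
sample = random.sample(population, k=5)  # a portion of the population

# Descriptive statistics: summarize the group that was actually observed.
sample_mean = statistics.mean(sample)

# Inferential statistics: treat the sample mean as an estimate of the
# population mean, which in practice is usually unknown.
population_mean = statistics.mean(population)

print("sample:", sample)
print("sample mean:", sample_mean)
print("population mean:", population_mean)
```

The sample mean will generally differ from the population mean; judging how far such an estimate can be trusted is the concern of inferential statistics.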
1.3 Parameter and Statistic
The major advantage of descriptive statistics is that they permit researchers to describe
the information contained in many scores with just a few indices.
There are basically two types of random variables yielding two types of data:
qualitative and quantitative.
Qualitative Variable. A variable that is conceptualized and analyzed as distinct
categories, with no continuum implied. It is also termed a categorical variable: observations
are put in the same or different classes, each class being considered as possessing some
common characteristic that is not shared by those in other classes.
Example: eye color, gender, occupation, religious preference etc.
[Figure: classification of variables into Qualitative and Quantitative, with Discrete and Continuous subtypes]
In the broadest sense, all collected data are “measured” in some form. For example,
even discrete quantitative data can be thought of as arising by a process of “measurement
through counting.” The four widely recognized levels of measurement are the nominal,
ordinal, interval, and ratio levels.
A. Nominal level of measurement is used to differentiate classes or categories purely for
classification or identification purposes. Its categories are mutually exclusive and
exhaustive. It is the weakest form of measurement because no attempt can be made to
account for differences within a particular category or to specify any ordering or
direction across the various categories. Nominal data are discrete variables.
Mutually exclusive means that each individual or object falls into only one category.
Exhaustive is a property of a set of categories such that each individual or object must
appear in a category.
Example:
Qualitative Variable            Categories
Gender                          Male, Female
Automobile Ownership            Yes, No
Type of Life Insurance Owned    Term, Endowment, Straight-Life, Others, None
B. Ordinal level of measurement is used in ranking. It is a somewhat stronger form of
measurement because an observed value classified into one category is said to possess
more of the property being scaled than does an observed value classified into another
category. Nevertheless, within a particular category no attempt is made to account for
differences between the classified values. Moreover, ordinal scaling is still a weak form
of measurement, because no meaningful numerical statements can be made about
differences between categories. That is, the ordering implies only which category is
“greater” or “lesser,” not how much “greater” or “lesser.” Ordinal data are discrete
variables.
Example:
Qualitative Variable         Categories
Student class designation    Freshman, Sophomore, Junior, Senior
Product satisfaction         Unsatisfied, Neutral, Satisfied, Very Satisfied
Movie classification         G, PG, PG-13, R-18, X
Faculty Rank                 Professor, Associate Prof., Assistant Prof., Instructor
Hotel Ratings                ★, ★★, ★★★, ★★★★, ★★★★★
Student Grades               1.0, 1.25, 1.50, 1.75, 2.00, …
C. Interval level of measurement is used to classify, order, and differentiate between
classes or categories in terms of degrees of difference. Interval data are either discrete or
continuous variables.
Example:
Quantitative Variable
Temperature (in degrees ℃ or ℉)
Calendar Time (Gregorian, Hebrew, or Islamic)
D. Ratio level of measurement differs from interval measurement in only one aspect: it
has a true zero point (complete absence of the attribute being measured). With an absolute
zero point, it can be said that one observation is “twice as fast” or “half as long” as
another. Ratio data are either discrete or continuous variables.
Example:
Quantitative Variable
Weight (in pounds or kilograms)
Age (in years or days)
Salary (in Philippine pesos)
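The difference between the interval and ratio levels can be checked numerically. The short Python sketch below uses illustrative values that are not from the chapter (20 ℃ versus 10 ℃, and 80 kg versus 40 kg); the conversion to kelvin is added only to show what happens on a temperature scale with a true zero.

```python
# Interval scale: Celsius has an arbitrary zero, so ratios are not meaningful.
temp_a_c, temp_b_c = 20.0, 10.0
naive_ratio = temp_a_c / temp_b_c  # 2.0, yet 20 °C is not "twice as hot"

# On the kelvin scale, which has a true zero, the ratio is close to 1.
kelvin_ratio = (temp_a_c + 273.15) / (temp_b_c + 273.15)

# Ratio scale: weight has a true zero, so the ratio is meaningful.
weight_a_kg, weight_b_kg = 80.0, 40.0
weight_ratio = weight_a_kg / weight_b_kg  # 80 kg is twice as heavy as 40 kg

print("Celsius ratio:", naive_ratio)
print("Kelvin ratio:", round(kelvin_ratio, 4))
print("Weight ratio:", weight_ratio)
```

Differences remain meaningful on both scales (20 ℃ is 10 degrees warmer than 10 ℃), but only ratio-level data support statements such as "twice as much."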
One of the most important steps in the research process is to select the sample of
individuals who will participate as a part of the study. Sampling refers to the process of
selecting these individuals.
Example: Suppose we have the data shown below and we want to select
every 5th value on the list.
23 34 12 14 13 23 24 39 27 23
12 15 16 23 26 28 23 22 19 34
25 22 18 30 23 24 17 18 15 12
Reading from left to right, row by row, every 5th value gives the sample 13, 23, 26, 34, 23, and 12.
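The selection of every 5th value in the example above can be reproduced with a short Python sketch. The variable names are illustrative; the data are the 30 values listed in the example, read row by row.

```python
# The 30 values from the example, read left to right, row by row.
data = [23, 34, 12, 14, 13, 23, 24, 39, 27, 23,
        12, 15, 16, 23, 26, 28, 23, 22, 19, 34,
        25, 22, 18, 30, 23, 24, 17, 18, 15, 12]

k = 5  # sampling interval: take every 5th value

# Python lists are 0-indexed, so the 5th value is at index k - 1;
# the slice then steps forward k positions at a time.
systematic_sample = data[k - 1::k]

print(systematic_sample)  # [13, 23, 26, 34, 23, 12]
```

Here the starting point is fixed at the 5th value to match the example; in practice the starting point is usually chosen at random between 1 and k.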
Field of Specialization    Population
Nursing                         6,000
Accountancy                       500
Management                      2,000
Marketing                       1,000
Education                       2,500
Total                          12,000
To determine the sample size in each subgroup, we simply multiply the total sample
size by each subgroup's percentage of the population. The
computation is shown in the last column of the table below.
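The proportional allocation described here can be sketched in Python using the population figures from the table of fields of specialization. The total sample size of 240 is a hypothetical value chosen only for illustration; it is an assumption, not the sample size used in the chapter.

```python
# Population of each subgroup (stratum), taken from the table above.
strata = {
    "Nursing": 6000,
    "Accountancy": 500,
    "Management": 2000,
    "Marketing": 1000,
    "Education": 2500,
}

total_population = sum(strata.values())  # 12,000
sample_size = 240  # hypothetical total sample size (assumption)

# Each subgroup's share of the sample equals its share of the population.
allocation = {
    field: round(sample_size * size / total_population)
    for field, size in strata.items()
}

for field, n in allocation.items():
    share = strata[field] / total_population
    print(f"{field:<12} {share:6.2%} {n:4d}")
```

With other sample sizes the products may not be whole numbers, in which case the rounded subgroup sizes should be checked so that they still add up to the intended total.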
2. Purposive sampling is a process of selecting, based on judgment, a
sample which the researcher believes, based on prior information, will
provide the data needed. The disadvantage of purposive sampling is that the
researcher’s judgment may be in error: he or she may not be correct in
estimating the representativeness of a sample or the respondents’ expertise
regarding the information needed. It is also called judgment sampling.
Example: Imagine attempting to obtain a frame that includes all the
homeless people in Metro Manila. To obtain a sample of homeless
individuals, the researcher will interview individuals on the street
or at a homeless shelter.
1.10 Methods of Collecting Data
After the research problem has been laid out, the next step is to determine the methods of
collecting data. Here are the five basic methods of collecting data.
Observation Method. This method is used for data pertaining to the behaviors of an
individual or a group of individuals at the time a given situation occurs; such data are best
obtained by observation. One limitation of this method is that observation can be made
only at the time the appropriate events occur.
Experiment Method. This is used to determine the cause-and-effect relationship of
certain phenomena under controlled conditions. This method is usually employed by
scientific researchers.
There are different ways of presenting data. Three of them are as follows:
Textual Method. This method presents the collected data in narrative and paragraph
forms.
Tabular Method. This method presents the collected data in tables, which are arranged
in an orderly manner in rows and columns for an easier and more comprehensive
comparison of figures.
Graphical Method. This method presents the collected data in visual or pictorial form to
get a clear view of the data (e.g., histogram, pie chart, Pareto chart, pictograph, etc.).
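As a small sketch of the tabular method, the 30 values from the systematic sampling example earlier in the chapter can be tallied into a frequency table with Python's standard library. The layout of the printed table is an illustrative choice, not a format prescribed by the chapter.

```python
from collections import Counter

# The 30 values from the systematic sampling example.
data = [23, 34, 12, 14, 13, 23, 24, 39, 27, 23,
        12, 15, 16, 23, 26, 28, 23, 22, 19, 34,
        25, 22, 18, 30, 23, 24, 17, 18, 15, 12]

# Count how many times each distinct value occurs.
frequency = Counter(data)

# Present the counts in rows and columns, ordered by value.
print(f"{'Value':>5}  {'Frequency':>9}")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

The same counts could then be passed to a plotting tool to produce a graphical presentation such as a bar chart.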
1.12 Summation Notation, Sigma ∑
The symbol
$$\sum_{i=1}^{n} X_i$$
is used to denote the sum of all the $X_i$’s from $i=1$ to $i=n$; by definition,
$$\sum_{i=1}^{n} X_i = X_1 + X_2 + X_3 + \dots + X_n$$
We often denote this sum simply by $\sum X$ or $\sum X_i$. The symbol $\sum$ is the Greek
capital letter sigma, denoting sum.
Solution:
1. $\sum_{i=1}^{4} X_i^3 = X_1^3 + X_2^3 + X_3^3 + X_4^3$
2. $\sum_{i=1}^{3} (X_i + 2) = (X_1 + 2) + (X_2 + 2) + (X_3 + 2)$
3. $\sum_{i=1}^{2} (X_i + Y_i)^3 = (X_1 + Y_1)^3 + (X_2 + Y_2)^3$
Solution:
1. $\sum_{i=1}^{4} 2X_iY_i = 2X_1Y_1 + 2X_2Y_2 + 2X_3Y_3 + 2X_4Y_4$
   $= 2(1)(0) + 2(3)(8) + 2(2)(1) + 2(5)(6)$
   $= 0 + 48 + 4 + 60$
   $= 112$
2. $\sum_{i=1}^{4} Z_i(Y_i - X_i) = Z_1(Y_1 - X_1) + Z_2(Y_2 - X_2) + Z_3(Y_3 - X_3) + Z_4(Y_4 - X_4)$
   $= 4(0 - 1) + 7(8 - 3) + (-2)(1 - 2) + 3(6 - 5)$
   $= 4(-1) + 7(5) + (-2)(-1) + 3(1)$
   $= (-4) + 35 + 2 + 3$
   $= 36$
3. $\sum_{i=1}^{3} (X_i + Z_i)^2 = (X_1 + Z_1)^2 + (X_2 + Z_2)^2 + (X_3 + Z_3)^2$
   $= (1 + 4)^2 + (3 + 7)^2 + [2 + (-2)]^2$
   $= 5^2 + 10^2 + 0^2$
   $= 25 + 100 + 0$
   $= 125$
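The three evaluations above can be checked with a short Python sketch. The values of X, Y, and Z are read off the substitution lines of the solutions; the variable names are illustrative. The third sum runs over i = 1 to 3 only, as its upper limit states, so the fourth pair of values is not included.

```python
# Values read off the substitution steps in the solutions above.
X = [1, 3, 2, 5]
Y = [0, 8, 1, 6]
Z = [4, 7, -2, 3]

# 1. Sum of 2 * X_i * Y_i for i = 1 to 4.
sum_1 = sum(2 * x * y for x, y in zip(X, Y))

# 2. Sum of Z_i * (Y_i - X_i) for i = 1 to 4.
sum_2 = sum(z * (y - x) for x, y, z in zip(X, Y, Z))

# 3. Sum of (X_i + Z_i)^2 for i = 1 to 3 (first three terms only).
sum_3 = sum((x + z) ** 2 for x, z in zip(X[:3], Z[:3]))

print(sum_1, sum_2, sum_3)  # 112 36 125
```

The slice `[:3]` plays the role of the upper limit of summation: changing it changes how many terms are added.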
Name:_________________________________ Date:_______________ Score:________
In each of these statements, tell whether descriptive or inferential statistics have been
used.
13. D Birth rate of selected urban areas in the Philippines.
______________
14. I The political views of the youth in the rural areas with respect
to inflation rate in the Philippines. _____________
15. I The forecast of PAG-ASA in the number of typhoons that will
hit the Philippines in the upcoming years due to global warming.
17. Number of pairs of women’s gloves owned. __________________
18. Primary type of transportation used by students. __________________
19. Type of telephone. __________________
20. Number of local calls made per month. __________________
4. f(x) = x + 3 __________________
13. The number of VCDs and DVDs rented each day in Video City. _____________
21. Amount of time spent surfing the internet per week. __________________
Determine whether each of the following is nominal, ordinal, interval, or ratio data.
A. Write the following expressions in expanded form.
1. $\sum_{i=1}^{3} (4X_i + 1)$
2. $\sum_{i=1}^{4} 5(X_i + 2Y_i)$
3. $\sum_{i=1}^{5} (X_i^2 - 3Y_i)$
4. $\sum_{i=1}^{3} (X_i Y_i^2 + 2Z_i)$
X1= -2 X2 = 5 X3 = 1 X4 = 0
Y1 = 1 Y2 = 1 Y3 = 2 Y4 = 7
Z1 = 3 Z2 = 6 Z3 = -3 Z4 = -1
1. $\sum_{i=1}^{3} (7X_i + Y_i)$
2. $\sum_{i=1}^{4} (X_i^2 - 2Y_i)$
3. $\sum_{i=1}^{2} (X_i + 3Y_i + 2Z_i)$
______ 6. The data that can be classified according to color are measured on nominal
scale.
______ 7. The number of absences per year that a congressman has is an example of
discrete data.
______ 8. The National Statistics Office (NSO) reported that there are 20,750,000
Filipinos currently employed in private institutions. This figure is called
a statistic.
______ 9. The nominal is considered the “highest” level of data, and the data must be
mutually exclusive.
______ 10. Private statistical agencies such as Social Weather Station (SWS) seldom
employ sampling methods because the populations they work with are so
large.
______ 11. A sample of consumers tested a new flavored pizza and rated it excellent,
very good, fair, or poor. The level of measurement for this market
research problem is interval.
______ 12. A method used to find out something about the market behavior of the population in
the National Capital Region (NCR) based on a sample of 200 respondents is
called inferential statistics.
______ 14. The Department of Trade and Industry (DTI) asked a sample of persons
shopping in SM Mall of Asia if they live in Parañaque City, outside the city,
or in a foreign country. This survey involved nominal level data.
______ 15. When the population of employees in government is divided into groups
according to their departments or agencies and then several are selected
from each group to make up a sample, the sampling is called stratified
sampling.
______ 16. A sample of 5,400 poor Filipinos in Metro Manila was selected to find out if
they will go on a rally on Friday. Since over 50% of those in the sample said they
would go out and rally, we can assume that the majority of all poor Filipinos
in Metro Manila favor a rally.
6. Which of the following terms best describes data that were originally collected by
a different person for a different purpose?
B. Quota sampling D. Systematic sampling
9. Which of the following would generally require the largest sample size?
15. Which of the following would require the smallest sample size because of its
efficiency?
A. Simple random sampling C. Cluster sampling
B. Systematic sampling D. Stratified sampling
A. cannot be negative. C. can assume only whole number values.
B. is an example of a qualitative variable. D. can assume only certain separated values.
18. A sampling method in which each member of a population has an equally likely chance
of being selected.
A. Purposive sampling C. Nonrandom sampling
B. Quota sampling D. Random sampling
19. In a study, the following variables were measured on each respondent: gender,
weight, height. The scales of these three variables are:
A. nominal, ordinal, interval. C. nominal, interval, interval.
B. nominal, ordinal, ratio. D. nominal, ratio, ratio.
20. Determining the sampling interval (represented by k), randomly selecting a number
between 1 and k, and including every kth element in your sample are the steps for which
form of sampling?
A. Simple random sampling. C. Cluster sampling.
B. Systematic sampling. D. Stratified sampling.
21. If a researcher took the 300 government employees, divided them by gender, and then
took a random sample of the males and a random sample of the females, the variable on
which we would divide the population is called the __________.
A. dependent variable. C. sampling variable.
B. independent variable D. stratification variable
22. Which of the following will give a more “accurate” representation of the population from
which a sample has been taken?
A. A small cluster sampling
B. A small sample based on simple random sampling
C. A large sample based on simple random sampling
D. A large sample based on purposive sampling