Statistics


 HISTORY & DEFINITION

 The word statistics means different things to different people. To a college
student, statistics are scores on all quizzes, seatwork, assignments and
recitations made in his subject. To a biological researcher investigating the
effects of pollution on our environment, statistics are evidence of the success of
research efforts. To a school president, statistics are information on faculty and
employee salaries, tardiness and absenteeism, and increases or decreases in
enrollment. To a manager of a food chain, statistics may be the kinds of food
frequently served to customers, and to the president of a country, statistics are
information on jobs created, housing projects, improvement or decline in the
economic situation, etc.
 They are all using statistics correctly, yet they use it in different ways and for different purposes.
 The word statistik comes from the Italian word statista, which means
“statesman”. The word was first used by Gottfried Achenwall (1719-1772), a
professor at Marburg and Göttingen, while Dr. E.A.W. Zimmerman
introduced it in England. Its use was popularized by Sir John Sinclair in his work,
Statistical Account of Scotland (1791-1799).
However, people had been recording and using data long before the 18th century.
 Presently, Statistics is defined as the branch of scientific methodology which
deals with the collection, classification, description and interpretation of data
obtained through survey or experiment.

Statistics is a scientific body of knowledge that deals with the collection, organization
or presentation, analysis and interpretation of data.
Collection refers to the gathering of information or data.
Organization or presentation involves summarizing data or information in textual,
graphical or tabular forms.
Analysis involves describing the data by using statistical methods and procedures.
Interpretation refers to the process of making conclusions based on the analyzed data.
Functions of Statistics
 To provide investigators means of measuring scientifically the conditions that may
be involved in a given problem and assessing the way in which they are related.
 To show the laws underlying facts and events that cannot be determined by
individual observations.
 To show relations of cause and effect that otherwise may remain unknown.
 To find the trends and behavior in related conditions which otherwise may remain
ambiguous.
Importance of Statistics to Research
 It gives the most exact kind of description.
 It provides the most definite and exact procedures in analyzing data.
 It summarizes results in a meaningful and convenient form.
 It draws a general conclusion.
 It predicts possible outcomes under certain conditions.

ORIGIN AND DEVELOPMENT OF STATISTICS


The history of statistics can be traced back at least to Biblical times in Ancient
Egypt, Babylon and Rome. As early as 3,500 years before the birth of Christ, statistics had
been used in Egypt in the form of recording the number of sheep or cattle owned, the
amount of grain produced, and the number of people living in a particular city. In 3800
B.C., the Babylonian government used statistics to measure the number of men under the
king’s rule and the vast territory that he occupied. It was his belief that the more men
under his command and the more lands he conquered, the more powerful his kingdom
would become. In 700 B.C., the Roman Empire used statistics by conducting registration to
record population for the purpose of collecting taxes.
In modern times, statistical methods have been used to record and predict such
things as birth and death rates, employment and inflation rates, sports achievement, and
other economic and social trends. They have even been used to assess opinions from polls and
unlock secret codes from games of chance.
Modern Statistics is said to have begun with John Graunt (1620-1674), an English
tradesman. Graunt collected published records called “bills of mortality” that included
information about the numbers and causes of deaths in the city of London. Graunt
analyzed more than fifty years of data and created the first mortality table, a table that
shows how long a person may be expected to live after reaching a certain age.
Many other great men made important contributions to statistics.
One of them was Karl Friedrich Gauss (1777-1855), the brilliant German mathematician
who used statistical methods in making predictions about the positions of the planets in
our solar system. Adolphe Quetelet (1796-1874), a Belgian astronomer, developed the
idea of the “average man” from his studies of the Belgian census. He was also known as
the “Father of Modern Statistics”. Karl Pearson (1857-1936), an English mathematician,
made important links between probability and statistics. In the 20th century, the British
statistician Sir Ronald Aylmer Fisher developed the F-test in inferential statistics (the F is
derived from his name). This tool has been very useful in testing improvements of production from
agricultural experiments and improvement of precision of results from medical, biological
and industrial experimentation. The American George Gallup (1901-1984) was
instrumental in making statistical polling a common tool in political campaigns.
In this age of information technology, many computer programs such as Microstat,
Soritec Sampler, SPSS, and others are available on diskette or from websites, and they perform
more than the manual calculations in statistics. People working in some government
agencies, in laboratories, in media, and in business generally use these programs
to easily access data, improve graphics, and obtain ready-made analyses and interpretations
of the data.

APPLICATION OF STATISTICS
In Education
Through statistical tools, a teacher can determine the effectiveness of a particular
teaching method by analyzing test scores obtained by his or her students. Results of such a study
may be used to improve teaching-learning activities.
In Business
A business firm collects and gathers data or information from its everyday operation.
Statistics is used to summarize and describe those data such as the amount of sales,
expenditures, and production to enable the management to understand and determine
the status of the firm. Data that have been organized and analyzed provide the
management a baseline for making wise decisions pertaining to the operation of the business.
In Psychology
Psychologists are able to meaningfully interpret aptitude tests, IQ tests and other
psychological tests using statistical procedures or tools.
In Politics & Government
Public Opinion and election polls are commonly used to assess the opinions or
preferences of the public for issues or candidates of interest. Statistics plays an important
role in conducting surveys or interviews for that purpose.
In Medicine
Statistics is also used in determining the effectiveness of new drug products in
treating a particular type of disease. To illustrate, a drug company wants to test the
effectiveness of its new drug product in treating tuberculosis. An experiment or a clinical
trial is conducted. Ten tuberculosis patients are treated using the new drug product and
another group is treated using the existing drug. The results are analyzed statistically to find
out if the new product is more effective in treating tuberculosis.
In Agriculture
Through statistical tools, an agriculturist can determine the effectiveness of a new
fertilizer in the growth of plants or crops. Moreover, crop production and yield can be
better analyzed through the use of statistical methods.
In Industry
The most popular actresses and actors can be determined by using surveys. Ratings
given by the members of the board of judges in a beauty contest are statistically analyzed.
Interviews are used to determine the most widely viewed television show. The top-grossing
movies for the year are reported based on statistical records of movie houses. All these
activities involve the use of statistics.
In everyday life
The number of cars passing through streets or a highway is recorded to enable traffic
enforcers to manage traffic efficiently. Even the number of pedestrians crossing the street, the
number of people entering a warehouse or a department store, and the number of people
engaged in video games involve the use of statistics. In short, statistics is found and used
in everyday life.
BRANCHES OF STATISTICS
1. Descriptive Statistics
 is a statistical procedure concerned with describing the characteristics and
properties of a group of persons, places or things.
 For example, we may describe a collection of persons by stating how many are poor
and how many are rich, how many are literate and how many are illiterate, how
many fall into various categories of age, height, civil status, IQ, and many more. We
may also describe a particular barangay in terms of the number of families it has,
the number of grade-schoolers, the number of professionals, the number of
households with certain kinds of appliances, the number of siblings in each
household, or the rate of unemployment.
 Generally, descriptive statistics involve gathering, organizing, presenting and
describing data.

2. Inferential Statistics
 is a statistical procedure that is used to draw inferences or information about the
properties or characteristics of a large group of people, places, or things on the basis
of the information obtained from a small portion of that large group.
 Suppose we want to know the favorite brand of toothpaste in a certain
barangay, but we do not have enough time and money to interview all the residents
of that barangay. We may just ask selected residents. With the data obtained from
the interviews, we shall draw or make conclusions as to the barangay’s favorite brand
of toothpaste. This example involves the use of inferential statistics.

TERMINOLOGIES IN STATISTICS
Some important terms are commonly used in the study of Statistics. These terms should
be understood fully in order to facilitate the study of statistics.
1. Population refers to a large collection of people, objects, places or things. To illustrate this,
suppose a researcher wants to determine the average income of the residents of a certain
barangay and there are 1500 residents in the barangay. Then all of these residents
comprise the population. The size of a population is usually denoted or represented by N. Hence, in this
case, N = 1500.
2. Sample is a small portion or part of a population. It could also be defined as a sub-group,
subset, or representative of a population. For instance, suppose the above-mentioned
researcher does not have enough time and money to conduct the study using the whole
population and he wants to use only 200 residents. These 200 residents comprise the
sample. The size of a sample is usually denoted by n, thus n = 200.
3. Parameter is any numerical or nominal characteristic of a population. It is a value or
measurement obtained from a population. It is usually referred to as the true or actual
value. If in the preceding illustration, the researcher uses the whole population (N = 1500),
then the average income obtained is called a parameter.
4. Statistic is an estimate of a parameter. It is a value or measurement obtained from the
sample. If the researcher in the preceding illustration makes use of the sample (n = 200),
then the average income obtained is called a statistic.
5. Data (singular form is datum) are facts, or a set of information or observations under
study. More specifically, data are gathered by the researcher from a population or from a
sample. Data may be classified into two categories, qualitative or quantitative.
 Qualitative data are data which can assume values that manifest the concepts of
attributes. These are sometimes called categorical data. Data falling in this category
cannot be subjected to meaningful arithmetic. They cannot be added, subtracted
or divided. Gender and nationality are qualitative data.
 Gender is a qualitative dichotomous variable since an individual may take one
of the two values “male or female”. In an opinion poll, the response of an
individual towards an issue whether to “go” for it, “against” it or “undecided”
is an example of qualitative trichotomous variable. Smoking habits of an
individual in different situations may be classified as “Always/Very Often”,
“often”, “Seldom”, “Very Seldom”, or “Never”. This set of qualitative values is
called multinomous variable.
 Quantitative Data are data which are numerical in nature. These are data obtained
from counting or measuring. In addition, meaningful arithmetic operations can be
done with this type of data. Test scores and height are quantitative data.
6. A Variable is a characteristic or property of a population or sample which makes the
members different from each other. If a class consists of boys and girls, then gender is a
variable in this class. Height is also a variable because different people have different
heights. Variables may be classified on the basis of whether they are discrete or
continuous and whether they are dependent or independent.
 Discrete Variable
 A discrete variable is one that can assume a finite number of values. In other
words, it can assume specific values only. The values of a discrete variable are
obtained through the process of counting. The number of students in a class
is a discrete variable. If there are 40 students in a class, it cannot be reported
that there are 40.2 students or 40.5 students, because it is impossible for a
fractional part of a student to be in the class.
 Continuous Variable
 A continuous variable is one that can assume infinitely many values within a specified
interval. The values of a continuous variable are obtained through measuring.
Height is a continuous variable. If one reports that the height of a building is
15 m, it is also possible that another person reports that the height of the
same building is 15.1m or 15.12m, depending on the precision of the
measuring device used. In other words, height of the building can assume
several values.
 Dependent Variable
 A dependent variable is a variable which is affected or influenced by another
variable.
 Independent Variable
 An independent variable is one which affects or influences the dependent
variable. To illustrate independent and dependent variables, consider the
problem entitled, The Effect of Computer-Assisted Instruction on the Students’
Achievement in Mathematics. Here the independent variable is the computer-
assisted instruction while the dependent variable is the achievement of
students in mathematics.
7. Constant refers to a fundamental quantity that does not change in value. Fixed costs
and acceleration due to gravity are examples of constants.

SCALES OF MEASUREMENT
 Nominal Scale
- This is the most primitive level of measurement. The nominal level of
measurement is used when we want to distinguish one object from another
for identification purposes. In this level, we can only say that one object is
different from another, but the amount of difference between them
cannot be determined. We cannot tell that one is better or worse than
the other. Gender, nationality and civil status are of nominal scale.
 Ordinal scale
- In the ordinal level of measurement, data are arranged in some specified
order or rank. When objects are measured in this level, we can say that
one is better or greater than the other. But we cannot tell how much more
or how much less of the characteristic one object has than the other. The
ranking of contestants in a beauty contest, of siblings in the family, or of
honor students in the class are of ordinal scale.
 Interval Scale
- If data are measured in the interval level, we can say not only that one
object is greater or less than another, but we can also specify the amount
of difference. The scores in an examination are of interval scale of
measurement. To illustrate, suppose Kensly Kyle got 50 in a Math
examination while Kwenn Anne got 40. We can say that Kensly Kyle got a
higher score than Kwenn Anne by 10 points.
 Ratio Scale
- The ratio level of measurement is like the interval level. The only
difference is that the ratio level always starts from an absolute or true zero
point. In addition, in the ratio level, there is always the presence of units
of measure. If data are measured in this level, we can say that one object
is so many times as large or as small as the other. For example, suppose
Mrs. Reyes weighs 50 kg, while her daughter weighs 25 kg. We can say
that Mrs. Reyes is twice as heavy as her daughter. Thus, weight is an example
of data measured in the ratio scale.

DATA GATHERING TECHNIQUES


Sources of Data
 There are two sources of data. One is called the primary source, from
which first-hand information is obtained, usually by means of personal
interview and actual observation. On the other hand, the secondary source
of information is taken from others’ works, news reports, readings, journals,
magazines, and records that are kept by the National Statistics Office, Securities
and Exchange Commission, Social Security System and other government and
private agencies.

 Data are said to be an asset of a company if they are accurate, updated and
available when needed. Hence, any institution or business organization must
have a database called a Management Information System where all
information about its business is made available in order to facilitate
verification of claims and to come up with wise management decisions.
METHODS OF COLLECTING DATA: Their Advantages and Disadvantages
 Direct or Interview Method
- is a person-to-person interaction between an interviewer and an
interviewee. A tape-recorded or written interview will help the researcher
obtain exact information from the interviewee.
- Advantages: Precise and consistent answers can be obtained by modifying
or rephrasing the questions, especially for illiterate respondents or children under
study.
- Disadvantages: It consumes time, money and effort, and it is
applicable only to small populations, except when conducting a census.

 Indirect or Questionnaire Method


- is an alternative to the interview method. Written responses are
obtained by distributing questionnaires to the respondents by mail or by hand.
A questionnaire is a list of questions intended to elicit answers to a given
problem; the questions must be given in a logical order and must not be too personal.
- Advantages: Less time, money, and effort are consumed.
- Disadvantages: Many responses may not be consistent due to the poor
construction of the questionnaire. The meaning of the questions may be
different for each respondent. Inconsistent responses can no longer be
modified, which reduces the number of valid respondents.

 Registration Method
- is enforced by private organizations or government agencies for recording
purposes.
- Advantages: Organized data from an institution can serve as ready
references for future study or for personal claims on people’s records.
- Disadvantages: Problems arise when an agency does not have a
Management Information System or if the system or process of
registration is not implemented well.

 Observation Method
- is a scientific method of investigation that makes possible use of all senses
to measure or obtain outcomes/responses from the object of study
- Advantages: Observation method is usually applied to respondents that
cannot be asked or need not speak especially when behaviors of
persons/culture of organization/performance outcomes of
employees/students are to be considered.
- Disadvantages: Subjectivity of information sought cannot be avoided.

 Experimentation
- is used when the objective is to determine the cause-and-effect of a
certain phenomenon under some controlled conditions.
- Advantages: There is objectivity of information since a scientific method of
inquiry is used. An equal number of respondents with relatively similar
characteristics are being examined to obtain the different effects of
something applied to the experimental group.
- Disadvantages: It is difficult to find respondents with closely similar
characteristics. The whole method must be repeated if the desired
outcome is not reached.

Data that are collected by these methods are usually referred to as raw data. Responses
from taped interviews, answered questionnaires, filled-out registration forms,
recorded observations, and results from an experiment are considered raw data since they
are not yet organized and presented in a form ready for interpretation.
CLASSIFICATION OF VARIABLES AND DATA

VARIABLE
 Qualitative: Dichotomous, Trichotomous, Multinomous
 Quantitative: Discrete, Continuous
 Dependent
 Independent

DATA
 Sources: Primary, Secondary
 Methods: Interview, Questionnaire, Registration, Observation, Experimentation
 Scales of Measurement: Nominal, Ordinal, Interval, Ratio
 Presentation: Textual, Tabular, Graphical/Chart (Line Graph, Bar Graph, Pie Graph,
Pictograph, Map/Cartogram, Scatter Point Diagram)

SLOVIN’S FORMULA IN DETERMINING THE SAMPLE SIZE


In research, we seldom use the entire population because of the cost and time involved.
In fact, most researchers do not use the population in their study. Instead, a sample,
which is a small representative part of a population, is used. The characteristics of the
entire population are described using the characteristics observed from the sample.

The sample size can be obtained by the formula


$$n = \frac{N}{1 + Ne^{2}}$$

where n is the sample size, N is the population size, and e is the margin of error.

Observe that there is a margin of error. When we use a sample, we do not get the actual
value but just an estimate of the parameter. Hence, there is an error associated when
using the sample.
To illustrate, suppose we want to find out the average age of the students in Manila.
However, due to insufficient time, only the students in three particular schools were used
to estimate the average age. Obviously, the result is not the actual average age but just an
estimate and thus, there is really an error when we use the sample instead of the
population.
Study the examples below in finding the sample size.
Example 1. A group of researchers will conduct a survey to find out the opinion of residents
of a particular community regarding the oil price hike. If there are 10,000 residents in the
community and the researchers plan to use a sample using a 10% margin of error, what
should the sample size be?
Solution: N = 10,000, e = 10% or .10

$$n = \frac{10{,}000}{1 + 10{,}000(.10)^{2}} = \frac{10{,}000}{1 + 10{,}000(.01)} = \frac{10{,}000}{1 + 100} = \frac{10{,}000}{101} = 99.01 \text{ or } 99$$

Hence, the researchers will conduct the survey using just 99 residents. A 10% margin of
error means that the researcher is 90% confident that the result obtained using the sample
will closely approximate the result had he used the population.
Example 2. Suppose that in example 1, the researcher would like to use a 5% margin of error. What
should be the size of the sample?

Solution: N = 10,000, e = 5% or .05

$$n = \frac{10{,}000}{1 + 10{,}000(.05)^{2}} = \frac{10{,}000}{1 + 10{,}000(.0025)} = \frac{10{,}000}{1 + 25} = \frac{10{,}000}{26} = 384.62 \text{ or } 385$$

Observe from Examples 1 and 2 that as we reduce the margin of error, the sample size gets larger. Hence, if
we want to have a more accurate result, we have to use a larger sample.

3. A researcher plans to conduct a survey. If the population size is 18,000, find the sample size if the
desired margin of error is
a. 10% b. 5% c. 1% d. 3%
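
The computation in Examples 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function name slovin_sample_size is made up for this sketch, and rounding to the nearest whole number is assumed because it matches the answers given in the two examples.

```python
def slovin_sample_size(population_size, margin_of_error):
    """Return the unrounded sample size n = N / (1 + N * e^2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    return population_size / (1 + population_size * margin_of_error ** 2)


# Examples 1 and 2 from the text: N = 10,000 with e = 10% and e = 5%
n_10 = slovin_sample_size(10_000, 0.10)
n_05 = slovin_sample_size(10_000, 0.05)

# Rounding to the nearest whole number is an assumption of this sketch
print(round(n_10), round(n_05))  # 99 385
```

The function returns the unrounded value so that the rounding step stays visible, as in the worked solutions.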

SAMPLING TECHNIQUES
Sampling Technique is a procedure used to determine the individuals or members
of a sample.
A. PROBABILITY OR RANDOM SAMPLING TECHNIQUE is a sampling technique wherein
each member or element of the population has an equal chance of being selected as
a member of the sample.
1.) Simple Random Sampling
a.) Lottery Method
Suppose Mrs. Cruz wants to send five students to attend a 2-day training or seminar in
basic computer programming. To avoid bias in selecting these five students from her 40
students, she can use lottery sampling. This is done by assigning a number to
each student and then writing these numbers on pieces of paper. Then, these pieces of
paper will be rolled or folded and placed in a box called a lottery box. The lottery box should
be thoroughly shaken and then five pieces of paper will be picked or drawn from the box.
The students who were assigned the numbers chosen will be sent to the training. In this
case, the selection of the students is done without bias. Note that we can simply assign 1
to the first student, 2 to the second student and so on.
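
The lottery draw can be imitated on a computer. The sketch below assumes the students are numbered 1 to 40 as described and uses Python's standard random module in place of the lottery box; the function name lottery_sample and the seed value are illustrative choices only.

```python
import random


def lottery_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct numbers from 1..population_size, like slips from a box."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    numbers = range(1, population_size + 1)
    return sorted(rng.sample(numbers, sample_size))


# Mrs. Cruz picks 5 of her 40 students
chosen = lottery_sample(40, 5, seed=2024)
print(chosen)
```

Because the draw is made without replacement, no student can be picked twice, just as a slip of paper cannot be drawn twice from the box.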
b.) Sampling with the Use of a Table of Random Numbers
Below is a portion of the table of random numbers.
Let us illustrate how these random numbers are used to select the members of the sample.
Let us consider the preceding example wherein Mrs. Cruz wants to select 5 students from
her 40 students. Again, we will assign a number to each student, say from 1 to 40.

Since there are 40 students, we will use two-digit numbers from the table of random
numbers when selecting the members of the sample. This is because the students have
been assigned the numbers 01, 02, 03, . . . up to 40. Looking at the first column of the table
of random numbers above, we see that the number formed by the first two digits is 31;
hence, the student assigned to number 31 is chosen as a member of the sample. If we
proceed down the column, we see that the number formed is 87, which cannot be used
because we have only 40 members. In a similar manner, the third number is 06, so the
student assigned to number 6 is chosen. Notice that the next two numbers from the table
are 95 and 44, numbers we cannot use for the same reason as before. When we get to the
bottom of the column, we move up the column and merely shift one digit to the right for
the next random number. Thus, we will have 18 as our next number. This is one of the
many alternatives. We can have other ways of selecting the members of the sample until
we complete the 5 students.
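
The selection rule described above (read two-digit numbers in order, skip any number outside 01 to 40, stop once enough students are chosen) can be sketched as follows. The list of readings contains only the numbers quoted in the text (31, 87, 06, 95, 44, 18), since the table itself is not reproduced here. The function name and the skipping of repeated numbers are assumptions of this sketch.

```python
def pick_from_random_digits(two_digit_numbers, population_size, sample_size):
    """Keep readings that fall in 1..population_size until sample_size are chosen."""
    chosen = []
    for number in two_digit_numbers:
        # Skip readings with no matching student, and skip repeats (an assumption)
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Readings quoted in the text, in the order they are read from the table
readings = [31, 87, 6, 95, 44, 18]
selected = pick_from_random_digits(readings, 40, 5)
print(selected)  # [31, 6, 18]
```

Only three students are chosen from these six readings, so reading would continue along the table until five students are selected.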
2.) Systematic Sampling
Let us use the example wherein Mrs. Cruz wants to select 5 students from her 40
students. First, we find the sampling interval. This is done by dividing the number of
members in the population by the number of members in the sample. Hence, in our
case we shall have i = 40 ÷ 5 = 8. The next step is to select a random starting point: write
the numbers 1, 2, 3, 4, 5, 6, 7, and 8 on pieces of paper and draw one number by lottery.
If we draw 5, this means that we start with the 5th student and then select every 8th
student after that. Therefore, the 5th, 13th, 21st, 29th, and 37th students shall be the
members of the sample. If, for instance, we draw the number 6, then the members of the
sample will be the 6th, 14th, 22nd, 30th and 38th students.
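
A minimal sketch of systematic sampling in its usual textbook form is given below: the interval is the population size divided by the sample size (40 ÷ 5 = 8), a starting point is drawn at random from 1 up to the interval, and every 8th member is taken from there. The function name and the optional start argument are illustrative assumptions.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th member after a random start, where k = N // n."""
    interval = population_size // sample_size
    if start is None:
        # Random starting point between 1 and the interval, like the lottery draw
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return [start + i * interval for i in range(sample_size)]


print(systematic_sample(40, 5, start=5))  # [5, 13, 21, 29, 37]
print(systematic_sample(40, 5, start=6))  # [6, 14, 22, 30, 38]
```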
3.) Stratified Random Sampling
There are some instances whereby the members of the population do not belong to
the same category, class, or group. To illustrate this, let us suppose that we want to
determine the average income of the families in a certain community or barangay. In a
typical barangay, different families belong to different income brackets. If we draw
or select members of the sample using simple random sampling, there is a possibility
or chance that none of the families or a disproportionate number of the families from
the low-income, average-income, or high-income group will be included in the sample.
In this case, the result of the study would turn out to be biased. For example, if the
sample comes only from the high-income families, then we will conclude that the
average income of the families living in this barangay is high. This suggests that the
sample should be drawn proportionally from
each group or category: the high-, the average-, and the low-income families.
To do this, we will use stratified random sampling. The word stratified comes
from the root word strata, which means groups or categories (singular form is stratum).
When we use this method, we are actually dividing the elements of the population into
different categories or subpopulations, and then the members of the sample are drawn or
selected proportionally from each subpopulation.
Example. Suppose a community consists of 5000 families belonging to different income
brackets. We will draw 200 families as our random sample using stratified random
sampling. Below are the subpopulations and corresponding number of families belonging
to each subpopulation or stratum.

Solution: The first step is to find the percentage of each stratum. This is done by
dividing the number of families in each stratum by the total number of families. Then, we multiply
each percentage by the desired number of families in the sample.
From the above table, we see that if we are going to draw 200 members from the
population of 5000, we should draw 40 families belonging to the high-income, 100 from
the average, and 60 from the low-income groups. Observe that the number of families
drawn as sample in each stratum is proportional to the number of families from the
population.
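
The proportional allocation can be sketched in Python. The table of stratum sizes is not reproduced here, so the sizes below (1000 high-income, 2500 average-income and 1500 low-income families) are an assumption inferred from the stated result of 40, 100 and 60 families out of a sample of 200 from 5000. The function name is also illustrative.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate sample_size across strata in proportion to each stratum's size."""
    total = sum(strata_sizes.values())
    # share of stratum = stratum size / total; allocation = share * sample size
    return {name: round(size * sample_size / total)
            for name, size in strata_sizes.items()}


# Assumed stratum sizes, inferred from the 40 / 100 / 60 allocation in the text
strata = {"high-income": 1000, "average-income": 2500, "low-income": 1500}
allocation = proportional_allocation(strata, 200)
print(allocation)  # {'high-income': 40, 'average-income': 100, 'low-income': 60}
```

With other stratum sizes the rounded allocations may not add up exactly to the desired sample size, in which case one stratum is usually adjusted by one or two members.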
4.) Cluster Sampling
Cluster sampling is sampling wherein groups or clusters instead of individuals are randomly
chosen. Recall that in simple random sampling we select members of the sample
individually. In cluster sampling, we first select groups or clusters at random,
and then we select a sample of elements from each chosen cluster or group randomly. Cluster
sampling is sometimes called area sampling because it is usually applied when the
population is large.
To illustrate the use of this sampling method, let us suppose that we want to determine the
average income of the families in Manila. Let us assume there are 250 barangays in Manila.
We can draw a random sample of 20 barangays using simple random sampling, and then
a certain number of families from each of the 20 barangays may be chosen.
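
The two steps of the Manila illustration can be sketched as follows. The data are entirely hypothetical: 250 barangays with 30 made-up family labels each, and 10 families drawn per chosen barangay (the text only says "a certain number"). The function name cluster_sample is also an assumption.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick n_clusters clusters at random, then n_per_cluster members from each."""
    rng = random.Random(seed)
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen_clusters
    }


# Hypothetical population: 250 barangays, each with 30 families
barangays = {
    f"Barangay {i}": [f"B{i}-Family {j}" for j in range(1, 31)]
    for i in range(1, 251)
}

sample = cluster_sample(barangays, n_clusters=20, n_per_cluster=10, seed=1)
print(len(sample), sum(len(families) for families in sample.values()))  # 20 200
```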
5.) Multi-Stage Sampling
Multi-stage sampling is a combination of several sampling techniques. This method
is usually used by researchers who are interested in studying a very large population,
say the whole island of Luzon or even the Philippines. This is done by starting the selection
of the members of the sample using cluster sampling and then dividing each cluster or
group into strata. Then, from each stratum individuals are drawn using simple random
sampling.
B. Non-Probability or Non- Random Sampling Techniques
Non-probability sampling is a sampling technique wherein members of the
sample are drawn from the population based on the judgment of the researchers. The
results of a study using this sampling technique are relatively biased. This technique lacks
objectivity of selection; hence, it is sometimes called subjective sampling. Inferences made
based on a sample obtained using this technique are not very reliable.
Non-probability sampling techniques are used because they are convenient and
economical. Researchers use these methods because they are inexpensive and easy to
conduct.
1.) Convenience Sampling
As the name implies, convenience sampling is used because of the convenience it
offers to the researcher. For example, a researcher who wishes to investigate the most
popular noontime show may just interview the respondents through the telephone. The
result of this interview will be biased because the opinions of those without telephone will
not be included. Although convenience sampling may be used occasionally, we cannot
depend on it in making inferences about a population.
2.) Quota Sampling
In this type of sampling, the proportions of the various subgroups in the population
are determined and the sample is drawn to have the same percentages in it. This is very
similar to stratified random sampling; the only difference is that the selection of the
members of the sample using quota sampling is not done randomly. To illustrate this, let
us suppose that we want to determine the teenagers’ favorite brand of T-shirt. If
there are 1000 female and 1000 male teenagers in the population and we want to draw
150 members for our sample, we can select 75 female and 75 male teenagers from the
population without using randomization. This is quota sampling.
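The quota computation in this example is simple proportional arithmetic. Below is a minimal Python sketch, assuming a hypothetical helper named quota_allocation (it is not part of the original notes). It only computes how many members to take from each subgroup; the non-random selection of those members is still left to the researcher.

```python
def quota_allocation(subgroup_sizes, sample_size):
    """Split sample_size across subgroups in proportion to their population sizes.

    subgroup_sizes is a dict such as {"female": 1000, "male": 1000}.
    Each quota is rounded to the nearest whole person.
    """
    total = sum(subgroup_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in subgroup_sizes.items()}


# Figures from the T-shirt example: 1000 female and 1000 male teenagers, sample of 150.
quotas = quota_allocation({"female": 1000, "male": 1000}, 150)
print(quotas)  # {'female': 75, 'male': 75}
```

When the subgroups are uneven, the rounded quotas may not add up exactly to the sample size, so one of them may need to be adjusted by hand.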
3.) Judgment or Purposive Sampling
Another method of drawing the members of the sample using non-probability sampling is
purposive sampling, in which the researcher selects the respondents judged suitable for the
purpose of the study. Let us suppose that the target is to find out the effectiveness of
a certain kind of shampoo. Of course, bald fellows will not be included in the sample.
4.) Incidental Sampling
This design is applied to those samples which are taken because they are the most
available. The investigator simply takes the nearest individuals as subjects of the study until
the sample reaches the desired size. In an interview, for instance, an interviewer can simply
choose to ask the people around him or those in a coffee shop where he is taking a break.
Exercise 1
1.) A researcher would like to investigate the perception of the students in Mathematics.
He divided the population into sub-populations as shown below. Use stratified random
sampling if the sample to be drawn consists of 500 students.

2.) A TV journalist would like to know the favorite noontime show for this month. He
decided to conduct a survey in 5 barangays. The table below shows the list of barangays and
the number of residents in each barangay. Use stratified random sampling to draw 10000
residents who will be included in the survey.

ORGANIZATION & PRESENTATION OF DATA


Ungrouped data are data that are not organized or, if arranged, are only ordered from
highest to lowest or lowest to highest.
Grouped data are data that are organized and arranged into different classes or
categories.
Forms of Presentation of Data
• Textual - this form of presentation combines text and numerical facts in a
statistical report.

Arranging the scores from the lowest to highest will facilitate the enumeration of
important characteristics of the data. The test scores of the 50 students in Calculus
arranged from lowest to highest are shown below:

The highest score obtained is 50 and the lowest is 3. Ten students got a score of 40 and
above, while only 4 got 10 and below. Generally, the students performed well in the test,
with 33 students or 66% getting a score of 25 and above.

• Stem-and-leaf plot - this sorts data according to a certain pattern. It involves
separating a number into two parts. In a two-digit number, the stem consists of the
first digit, and the leaf consists of the second digit. In a three-digit number,
the stem consists of the first two digits, and the leaf consists of the last digit. In a
one-digit number, the stem is zero.
Table 1.1 Stem-and-Leaf Plot of the Arranged Test Scores in Calculus of 50 Students

By looking at the stem-and-leaf plot, we can easily rank the data or put them in order.
Thus, the ten lowest scores are 3, 9, 10, 10, 12, 13, 13, 14, 15 and 16, while the ten highest
scores are 40, 40, 40, 41, 42, 43, 46, 48, 50 and 50.
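The stem-and-leaf rule described above can be sketched in a few lines of Python. This is only an illustration: the function name stem_and_leaf is hypothetical, and the data are just the twenty lowest and highest scores quoted in the text, not the full set of 50 scores.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Return {stem: [leaves]}; the leaf is the last digit, the stem is the rest."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # a one-digit score gets stem 0
        plot[stem].append(leaf)
    return dict(plot)


# Only the ten lowest and ten highest scores quoted in the text.
scores = [3, 9, 10, 10, 12, 13, 13, 14, 15, 16,
          40, 40, 40, 41, 42, 43, 46, 48, 50, 50]

for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because the scores are sorted before they are split, the leaves in each row come out in increasing order, which is what makes ranking from the plot easy.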
• Tabular - this form of presentation is better than the textual form because it provides
numerical facts in a more concise and systematic manner. Statistical tables are
constructed to facilitate the analysis of relationships. Each class/subclass is assigned
to a particular row or column and figures for various classifications are noted in
appropriate cells.
Advantages of Tabular Presentation
o It is brief; it reduces the matter to the minimum.
o It provides the reader a good grasp of the meaning of the quantitative
relationship indicated in the report.
o It tells the whole story without the necessity of mixing textual matter with
figures.
o The systematic arrangement of columns and rows makes the data easy to read and
understand.
o The columns and rows make comparison easier.
The table has the following parts:
o Table number: This is for easy reference to the table.
o Table Title: It briefly explains the content of the table.
o Column header: It describes the data in each column.
o Row classifier: It shows the classes and categories.
o Body: This is the main part of the table.
o Source note: This is placed below the table when the data written are not original.
 FREQUENCY DISTRIBUTION
A frequency distribution table is a table which shows the data arranged into different
classes and the number of cases which fall into each class.

Parts of Frequency Table


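1. Class limits/class interval – groupings or categories defined by lower and upper limits
Example: 16-20
         21-25
         26-30
2. Class size – width of each class interval, that is, the difference between the lower
limits (or the upper limits) of two consecutive classes
Lower limit (L.L.)    Upper limit (U.L.)
16                    20
21                    25
Class size = 21 - 16 = 5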
3. Class boundaries, also called real or exact class limits, are the numbers used to separate
the classes without the gaps created by the class limits. The number to be added or subtracted
is half the difference between the lower limit of one class and the upper limit of the
preceding class. For the classes 16-20 and 21-25, this is (21 - 20)/2 = 0.5.

Example:
Class interval (L.L. – U.L.)    Class boundaries (L.C.B. – U.C.B.)
16 – 20                         15.5 – 20.5
21 – 25                         20.5 – 25.5
26 – 30                         25.5 – 30.5
4. Class marks are the midpoints of the classes. They are obtained by adding the lower
and upper limits and then dividing by 2.
Example:
Class interval class mark/midpoint (X)
16-20 18
21-25 23
26-30 28
STEPS IN CONSTRUCTING A FREQUENCY DISTRIBUTION TABLE
1. Decide on the number of class intervals. There should not be too many, to avoid many
empty classes, and there should not be too few, to avoid losing detail. Use the formula
suggested by Sturges:
k = 1 + 3.3 log N
where k is the number of class intervals, N is the total number of scores, and log is the
common (base-10) logarithm.
2. Compute the range. The range R, is defined as the difference between the highest score
and the lowest score.
3. Divide the range R by the number of class intervals (k) to obtain the size of the class
interval: i = R / k or c = R / k
4. Starting from the largest integer less than or equal to the minimum score, construct class
intervals of size c until the maximum score is reached.
5. Set up the class boundaries.
6. Tally the scores in appropriate classes and then add tallies for each class in order to
obtain the frequency.
7. Solve for the class mark or midpoint of each class. This is obtained by adding the lower
class limit and the upper class limit, then dividing by 2.
• Example 1. The following are the entrance examination scores of 60 students.
Range = 57 – 18 = 39
k = 1 + 3.3 log N
  = 1 + 3.3 log 60
  = 1 + 3.3 (1.77815)
  = 6.868 ≈ 7
c = Range / k
  = 39 / 7
  = 5.57 ≈ 6

Table 1.2
Grouped Frequency Distribution for the Entrance Examination Scores of 60 Students
Class Limits   Class Boundaries   Tally                Frequency   Class Mark/Midpoint
18-23          17.5 - 23.5        1111-1               6           20.5
24-29          23.5 - 29.5        1111-1111-1          11          26.5
30-35          29.5 - 35.5        1111-1111-1111-11    17          32.5
36-41          35.5 - 41.5        1111-1111-1111       14          38.5
42-47          41.5 - 47.5        1111-111             8           44.5
48-53          47.5 - 53.5        111                  3           50.5
54-59          53.5 - 59.5        1                    1           56.5
                                                       N = 60
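The first steps of the construction (number of classes, class size, class limits, boundaries and class marks) can be checked with a short Python sketch. The function name build_classes is hypothetical, and rounding k and c with Python's round() is an assumption made here to mirror the rounding in the worked example; only the lowest score (18), highest score (57) and N = 60 from Example 1 are used, since the raw scores are needed only for the tally.

```python
import math


def build_classes(lowest, highest, n):
    """Return (k, c, classes) following the steps in the text.

    k uses k = 1 + 3.3 log N (base-10 log); c = range / k.
    Both are rounded to the nearest integer (an assumption of this sketch).
    """
    k = round(1 + 3.3 * math.log10(n))
    c = round((highest - lowest) / k)
    classes = []
    lower = lowest
    while lower <= highest:            # keep adding classes until the maximum is covered
        upper = lower + c - 1
        classes.append((lower, upper))
        lower = upper + 1
    return k, c, classes


k, c, classes = build_classes(18, 57, 60)
print("k =", k, " c =", c)
for lower, upper in classes:
    boundaries = (lower - 0.5, upper + 0.5)   # class boundaries
    class_mark = (lower + upper) / 2          # midpoint of the class
    print(f"{lower}-{upper}", boundaries, class_mark)
```

The loop stops only when the highest score is covered, so it may produce one class more than k, which is the situation described in Example 2.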
Quiz # 3
Construct a frequency distribution table using the suggested steps. (50 Points)
Example 2. The following are the entrance examination scores of 60 students

Construct a frequency distribution table of 13 classes. In this problem, c = 39/13 = 3.

Notice that the class interval 54-56 is already the 13th class interval, but the highest value,
which is 57, has not been reached. Adding one more class is allowed in cases like this
to accommodate all the values in the set.
Cumulative Frequency Distribution
The “less than” cumulative frequency distribution (<cf) is obtained by adding the
frequencies successively from the lowest to the highest class interval, while the “greater
than” cumulative frequency distribution (>cf) is obtained by adding the frequencies
successively from the highest class interval to the lowest class interval.

Relative Frequency Distribution


The relative frequency of a class is its frequency divided by the total
frequency of all classes and is generally expressed as a percentage.
\[ \text{Relative frequency} = \frac{\text{frequency of the class interval}}{\text{total number of observations}} \times 100\% \]
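As an illustration, the cumulative and relative frequencies can be computed from the frequency column of Table 1.2. This is a minimal Python sketch; the variable names are chosen here for readability and are not from the original notes.

```python
from itertools import accumulate

# Frequencies of Table 1.2, from the lowest class (18-23) to the highest (54-59).
frequencies = [6, 11, 17, 14, 8, 3, 1]
n = sum(frequencies)

# "Less than" cf: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# "Greater than" cf: running total from the highest class downward.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Relative frequency of each class, as a percentage.
relative = [round(100 * f / n, 2) for f in frequencies]

print(less_than_cf)     # [6, 17, 34, 48, 56, 59, 60]
print(greater_than_cf)  # [60, 54, 43, 26, 12, 4, 1]
print(relative)         # [10.0, 18.33, 28.33, 23.33, 13.33, 5.0, 1.67]
```

The last "less than" value and the first "greater than" value both equal N = 60, which is a quick check that no class was skipped.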
• Graphical - this form is the most effective means of organizing and
presenting statistical data because the important relationships are brought out more
clearly and creatively in visual, colorful figures.

GRAPHICAL REPRESENTATION OF THE FREQUENCY DISTRIBUTION


The following graphs can be constructed to represent a frequency
distribution:
A histogram is a graph made up of vertical or horizontal rectangles whose bases are centered
on the class marks and span the class boundaries, and whose heights are the frequencies.
A frequency polygon is a line graph whose points are plotted at the class marks, with
heights equal to the frequencies.
An ogive is the graph of a cumulative frequency distribution. It is obtained by plotting
the “less than” or “greater than” cumulative frequencies against the class boundaries and
connecting the points.
A pie chart is a circle graph showing the proportion of each class through either the relative
or percentage frequency.
A bar chart is a graph represented by either vertical or horizontal rectangles whose bases
represent the class intervals and whose heights represent the frequencies.

Quiz # 3:
A group of second year Business Administration students took the qualifying examination
for the admittance to the course Bachelor of Science in Accounting. The results are as
follows:

Prepare a frequency distribution using 7 classes starting with 34.


Include the columns of % relative frequency, less than cumulative frequency and greater
than cumulative frequency.
Measures of Central Tendency
• Measures of central tendency are numerical descriptive measures which indicate or
locate the center of the distribution or data set.
• In layman’s terms, a measure of central tendency is the average.

MEAN
The mean of a set of values or measurements is the sum of all the measurements divided
by the number of measurements in the set.
SAMPLE MEAN FOR UNGROUPED DATA
\[ \bar{x} = \frac{\sum x}{n} \]
where: $\bar{x}$ = mean
$\sum x$ = sum of the measurements or values
$n$ = number of measurements
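Example: Below are the travel times in minutes spent by Kenneth in going to school last
week.

\[ \bar{x} = \frac{\sum x}{n} = \frac{60 + 45 + 50 + 53 + 47}{5} = \frac{255}{5} = 51 \text{ minutes} \]

Weighted Average Mean
\[ \bar{x} = \frac{\sum xW}{\sum W} \]
where $W$ is the weight attached to each value $x$.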

• Example: To the right are Dona’s subjects and the corresponding number of units
and grades she got for the first grading period. Compute her grade point average
(GPA).
\[ \bar{x} = \frac{\sum xW}{\sum W} = \frac{80(1) + 82(1) + 83(1) + 81(2) + 80(1) + 85(1.5) + 82(2)}{1 + 1 + 1 + 2 + 1 + 1.5 + 2} = \frac{778.5}{9.5} = 81.95 \]
Therefore, Dona has a GPA of 81.95 for the first grading period.
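The weighted mean can be verified with a few lines of Python. This is a minimal sketch; the grades and units are taken from the computation above, and the function name weighted_mean is only illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(x * W) / sum(W)."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


grades = [80, 82, 83, 81, 80, 85, 82]   # x: grade in each subject
units = [1, 1, 1, 2, 1, 1.5, 2]         # W: number of units of each subject

gpa = weighted_mean(grades, units)
print(round(gpa, 2))  # 81.95
```

Subjects with more units pull the average toward their grades, which is why the result differs from the simple mean of the seven grades.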

MEAN FOR GROUPED DATA


To compute the mean for grouped data, we can use two formulas, namely:
1. The class mark formula and
2. The coded formula
THE CLASS MARK FORMULA
\[ \bar{x} = \frac{\sum fx}{n} \]
where
$\bar{x}$ = mean
$f$ = frequency of the class
$x$ = class mark
$n$ = total frequency
Example: Below is the frequency distribution of the scores of 40 students in Mathematics.

Solution:
\[ \bar{x} = \frac{\sum fx}{n} = \frac{1828}{40} = 45.7 \]
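Coded Formula
\[ \bar{x} = x_{am} + \left( \frac{\sum fd}{n} \right) i \]
where:
$x_{am}$ = assumed mean (the class mark of the class chosen as origin)
$f$ = frequency
$d$ = coded deviation (number of class intervals between a class and the class of the assumed mean)
$n$ = total frequency
$i$ = class size
Example:

Solution:
\[ \bar{x} = x_{am} + \left( \frac{\sum fd}{n} \right) i = 43.5 + \left( \frac{11}{40} \right) 8 = 43.5 + 2.2 = 45.7 \]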
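Both formulas can be compared side by side in Python. Since the frequency table itself is not reproduced in these notes, the distribution below is an assumption: it is one table that is consistent with the figures quoted in the text (n = 40, class size 8, Σfx = 1828, Σfd = 11, assumed mean 43.5), and it may differ from the original table.

```python
# Hypothetical distribution (lower limit, upper limit, frequency), lowest class first.
# It is only one table consistent with the totals quoted in the text.
classes = [(16, 23, 1), (24, 31, 3), (32, 39, 6),
           (40, 47, 12), (48, 55, 10), (56, 63, 8)]

n = sum(f for _, _, f in classes)
marks = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

# Class mark formula: mean = sum(f * x) / n
sum_fx = sum(f * x for f, x in zip(freqs, marks))
mean_classmark = sum_fx / n

# Coded formula: mean = assumed mean + (sum(f * d) / n) * i
assumed_mean = 43.5   # class mark of the class 40-47
size = 8              # class size i
sum_fd = sum(f * (x - assumed_mean) / size for f, x in zip(freqs, marks))
mean_coded = assumed_mean + (sum_fd / n) * size

print(sum_fx, sum_fd)              # 1828.0 11.0
print(round(mean_classmark, 2))    # 45.7
print(round(mean_coded, 2))        # 45.7
```

The two formulas always agree; the coded formula simply keeps the intermediate numbers small, which helps when computing by hand.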

Characteristics of the mean


1.) The mean is the most appropriate measure of central tendency when the data are in
the interval or ratio scale.
2.) The mean lies between the largest and the smallest values or measurements.
3.) There is only one value for the mean for a given set of values or measurements.
4.) The mean is easily influenced by extreme values because all values contribute to the
average. If there are high values, the mean tends to be high also. If there are extremely
low values, the mean tends to be low also.
Seatwork:
The following are the test scores obtained by III-1 students in Statistics. Compute the mean
using the:
a. Class mark formula and b. Coded formula. What is the average score
obtained by the students?

MEDIAN
Median is the middle value of a given set of measurements, provided that the values
or measurements are arranged in an array. An array is an arrangement of values in
increasing or decreasing order.
MEDIAN FOR UNGROUPED DATA
Example 1. The following are the ages of the Mathematics teachers in Pontevedra North
Elementary School: 21, 23, 32, 28, 25, 50, 48. Compute the median.
Arrange the data in an array. 21, 23, 25, 28, 32, 48, 50. The median is 28.
Example 2: In an English test, eight students obtained the following scores: 10, 15, 12, 18,
16, 20, 12, 14. Find the median.
Arrange the scores in an array, that is, 10, 12, 12, 14, 15, 16, 18, 20
The median is (14+15)/2= 14.5
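The two cases above (odd and even number of values) can be sketched in Python as follows. The function name median_of is illustrative; Python's standard statistics.median gives the same results.

```python
def median_of(values):
    """Return the middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)          # arrange the data in an array
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([21, 23, 32, 28, 25, 50, 48]))        # 28
print(median_of([10, 15, 12, 18, 16, 20, 12, 14]))    # 14.5
```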
MEDIAN FOR GROUPED DATA
For grouped data, we have the following formula in finding the median:
\[ \text{Median} = l_b + \left( \frac{\frac{n}{2} - {<}cf}{f} \right) i \]
where lb = lower class boundary of the median class
n = total frequency
<cf = less than cumulative frequency of the class immediately below the median class
i = size of the class interval
f = frequency of the median class
Let us illustrate how to compute the median of grouped data using the distribution of the
test scores of 40 students in Mathematics given in the example of the preceding section.

Example:

\[ \text{Median} = l_b + \left( \frac{\frac{n}{2} - {<}cf}{f} \right) i = 39.5 + \left( \frac{20 - 10}{12} \right) 8 = 39.5 + 6.67 = 46.17 \]
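The substitution above can be reproduced with a small Python function. This is a sketch with illustrative parameter names; the values passed in are the ones used in the worked example.

```python
def grouped_median(lower_boundary, n, cf_before, f, class_size):
    """Median = lb + ((n/2 - <cf) / f) * i for grouped data."""
    return lower_boundary + ((n / 2 - cf_before) / f) * class_size


# Values from the example: lb = 39.5, n = 40, <cf = 10, f = 12, i = 8
print(round(grouped_median(39.5, 40, 10, 12, 8), 2))  # 46.17
```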
CHARACTERISTICS OF THE MEDIAN
1.) The median is the most appropriate measure of central tendency for ordinal data.
2.) The median lies between the highest and lowest measurements.
3.) There is only one value for the median in a given set of measurements.
4.) The median is not influenced by extreme values.
5.) The median is used when the middle value is desired. It is the value where 50% or half
of the distribution lies above it and 50% lies below it.

MODE
Mode is the value which occurs most frequently in a set of measurements or values.

MODE FOR UNGROUPED DATA


The mode for ungrouped data is fairly easy to find. It is just the value or measurement
which occurs the greatest number of times. In other words, it is the most popular value.
A distribution may have only one mode. In this case, the distribution is said to be
unimodal. Data that have two values for the mode are said to be bimodal. It is also
possible that the set of data is multimodal if there are more than two values for the
mode. If all the scores in a set of data occur only once, then the set of data has no mode.
Example 1:
The data on the number of times 10 mothers go to market every week are shown below.

Find the mode.


Solution: The mode is 3. This means that three times a week is the most common number
of market trips among the mothers.
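The rules for unimodal, bimodal and no-mode data can be sketched in Python with a frequency count. The function name find_modes is illustrative, and the list of market trips below is hypothetical data made up for this sketch (the original table is not reproduced here); it is chosen so that the mode is 3.

```python
from collections import Counter


def find_modes(values):
    """Return the list of modes; an empty list means the data have no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # every value occurs only once
        return []
    return sorted(v for v, count in counts.items() if count == highest)


# Hypothetical number of market trips per week of 10 mothers.
trips = [2, 3, 3, 1, 3, 4, 2, 3, 5, 3]
print(find_modes(trips))             # [3]        unimodal
print(find_modes([1, 2, 2, 3, 3]))   # [2, 3]     bimodal
print(find_modes([1, 2, 3]))         # []         no mode
```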
MODE FOR GROUPED DATA
For grouped data, we have the formula to find the mode:
\[ \text{mode} = l_m + \frac{(f_m - f_b)\, i}{2f_m - f_a - f_b} \]
where lm = lower class boundary of the modal class
fm = frequency of the modal class
fa = frequency of the class above (after) the modal class
fb = frequency of the class below (before) the modal class
i = size of the class interval
Below are the steps in finding the mode of grouped data:
1.)Find the modal class. This is the class interval with the highest frequency.
2.)Use the formula to find the mode.
It is important to note that the formula for the mode given above holds only for unimodal
distribution. For multimodal distribution, the rough mode is given by the formula
Mode = 3(Median) – 2(Mean)
Let us use the distribution of scores of 40 students in Mathematics to illustrate how to
compute the mode for grouped data.
Example: Find the mode of the data whose frequency distribution is given below.

Notice that the class intervals are arranged from lowest to highest group. The modal class
is the class interval 40-47. The lower class boundary of the modal class is 39.5, the
frequency of the modal class is 12, the frequency below the modal class is 6, the frequency
above the modal class is 10, and the size of the class interval is 8. Substituting these values
in the formula, we have
\[ \text{mode} = l_m + \frac{(f_m - f_b)\, i}{2f_m - f_a - f_b} = 39.5 + \frac{(12 - 6)\, 8}{2(12) - 10 - 6} = 39.5 + \frac{48}{8} = 45.5 \]
In case a distribution has at least 2 modes, a rough mode can be computed as follows:
Rough mode = 3 Mdn – 2 Mn
where Mdn = median
Mn = mean
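The grouped-mode substitution can be checked with a short Python function. This is a sketch; the parameter names f_lower and f_higher are chosen here to make explicit which neighbouring class each frequency belongs to.

```python
def grouped_mode(lower_boundary, f_modal, f_lower, f_higher, class_size):
    """Mode of grouped data for a unimodal distribution.

    f_lower  = frequency of the class just below the modal class (lower scores)
    f_higher = frequency of the class just above the modal class (higher scores)
    """
    numerator = (f_modal - f_lower) * class_size
    denominator = 2 * f_modal - f_lower - f_higher
    return lower_boundary + numerator / denominator


# Values from the example: lm = 39.5, fm = 12, below = 6, above = 10, i = 8
print(grouped_mode(39.5, 12, 6, 10, 8))  # 45.5
```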

CHARACTERISTICS OF THE MODE


1.) The mode is the most appropriate measure of central tendency when the data are
nominal in scale.
2.) The mode is the least reliable among the three measures of central tendency because
its value is undefined in some distributions.
3.) The mode is used when we want to find the value which occurs most often.
4.) The mode is a quick approximation of the average. The mode is sometimes referred to
as an inspection average.

Seatwork:
The following data give the time (in minutes) taken to commute from home to school for
20 students of LCCC.
10 50 64 33 48 5 11 23 37 26
26 32 17 7 13 19 29 43 21 22
Construct a grouped data frequency distribution table with 5 classes.
Find the mean, median & mode
OTHER MEASURES OF LOCATION
A measure of central tendency is a measure of location. It indicates the center of
a given data set. Other descriptive measures which are used to locate the position of values
or scores in the distribution are quartiles, deciles, and percentiles. In this book, we shall
not discuss these measures comprehensively, but instead we shall just define them.
Recall that the median is the value where 50% of the distribution falls or lies above it while
50% of the distribution lies below it. The midpoint of the line segment is the median.
In other words, the median is the value which divides the distribution into two equal
parts. We define the quartiles, deciles, and percentiles in a similar manner. These
descriptive measures – quartiles, deciles, and percentiles – are called fractiles.
Quartiles are values which divide the distribution into four equal parts.
Deciles are values which divide the distribution into ten equal parts.
Percentiles are values which divide the distribution into 100 equal parts.
The first quartile (Q1) is the value where 25% of the distribution lies below it while 75% of
the distribution lies above it. The third quartile (Q3) is the value where the 75% of the
distribution lies below it while 25% of the distribution lies above it. Observe that the
median is equal to the second quartile (Q2).
Computation of the Quantiles for Ungrouped Data
To determine any quantile, change it first to a percentile and follow the steps below.
Step 1. Arrange the scores according to magnitude or size.
Step 2. Find the position of the given percentile/decile/quartile in the distribution using
the formula x(n + 1)/100, where x is the percentile number and n is the number of scores.
Step 3. Locate the score corresponding to the obtained position in the distribution starting
from the lowest score.
Step 4. Interpolate to get the score if the obtained position from step 2 is not exact.
Formula (position of the quantile among the ordered scores):
Px = x(n + 1)/100
Dx = x(n + 1)/10
Qx = x(n + 1)/4
Example 1. Find the 20th percentile or P20 of the following scores.
25, 22, 20, 16, 17, 12, 8, 6, 5

Steps:
1. Locate the position of the score corresponding to the 20th percentile using
Px = x(n + 1)/100
Solution: P20 = 20 (9 + 1)/100 = 2
2. Locate the 2nd score from the lowest. The answer is 6.
3. Hence, the 20th percentile or P20 is 6.
This means that 20% of the cases scored below 6.

Example 2. Find the 60th percentile or P60 of the following scores.


99, 95, 80, 75, 70, 60, 40
Steps:
1. Compute for the position of the 60th percentile using Px = x(n + 1)/100
Solution: P60 = 60 (7 + 1)/100 = 4.8
2. Since 4.8 is between the 4th and the 5th score from the bottom, we have to interpolate
to find the answer.
Take the 4th score from the bottom which is 75 and the 5th score which is 80.
3. Solve the difference between these two scores, 80-75, which is 5.
4. Multiply the difference 5 by the decimal part obtained in step 1 which is 0.8. The product
is 4.
5. Finally, add this product to the lower score, 75. Thus, P60 = 75 + 4 = 79.
This means that 60 % of the cases fall below the score 79.
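The four steps (sort, find the position, locate the score, interpolate) can be sketched in Python. The function name quantile_by_position is illustrative, and clamping positions that fall before the first or after the last score is an assumption of this sketch, since the text does not cover that case.

```python
def quantile_by_position(scores, x, parts=100):
    """Quantile of ungrouped data using the position formula x(n + 1)/parts.

    parts = 100 for percentiles, 10 for deciles, 4 for quartiles.
    """
    ordered = sorted(scores)                       # Step 1: arrange the scores
    position = x * (len(ordered) + 1) / parts      # Step 2: position from the lowest
    whole = int(position)
    fraction = position - whole
    if whole < 1:                                  # assumption: clamp to the lowest score
        return ordered[0]
    if whole >= len(ordered):                      # assumption: clamp to the highest score
        return ordered[-1]
    lower = ordered[whole - 1]                     # Step 3: score at the position
    upper = ordered[whole]
    return lower + fraction * (upper - lower)      # Step 4: interpolate


print(round(quantile_by_position([25, 22, 20, 16, 17, 12, 8, 6, 5], 20), 2))  # 6.0
print(round(quantile_by_position([99, 95, 80, 75, 70, 60, 40], 60), 2))       # 79.0
```

Statistical software often uses slightly different position rules, so results from other tools may differ a little from this hand method.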

Computations of the Quantiles for Grouped Data


The computation of any quantile for grouped data is similar to that of the median.
The formula is
\[ P_x = l_b + \left( \frac{\frac{xn}{100} - cf}{f} \right) i \]
where: Px = the desired percentile
lb = exact lower limit of the class interval containing Px
n = number of cases
cf = cumulative frequency immediately below the class interval containing Px
f = frequency of the class interval containing Px
i = class size

The corresponding formulas for deciles and quartiles are
\[ D_x = l_b + \left( \frac{\frac{xn}{10} - cf}{f} \right) i \]
\[ Q_x = l_b + \left( \frac{\frac{xn}{4} - cf}{f} \right) i \]
Computations of Some Quantiles

Compute for Q3 or P75


Solution:
Step 1: Compute for xn/4
3(90)/4 = 67.5
Step 2: Locate 67.5 under the <cf column. The 67.5th score is contained in the <cf 70, which
corresponds to the class interval 52-55. Therefore, Q3 or P75 lies within this class interval.
Step 3: Find the exact lower limit of 52 which is 51.5. So, lb = 51.5
Step 4: Find the <cf value immediately below the class interval, 52-55. Thus <cf = 65.
Step 5: Find f value or frequency of the class interval, 52-55. So, f = 5.
Step 6: Get the class size which is 4. So, i = 4.
Step 7: Compute for Q3 or P75 by substituting all the needed values in the formula.
\[ P_{75} = l_b + \left( \frac{\frac{xn}{100} - {<}cf}{f} \right) i = 51.5 + \left( \frac{67.5 - 65}{5} \right) 4 = 51.5 + 2 = 53.5 \]

This means that 75% of the cases lie below the score 53.5.
Solve for D4 or P40
Solution:
xn/10 = 4(90) / 10 = 36
<cf = 31
f = 10
i =4
lb = 39.5
Substitute the values in the formula:
\[ D_4 = l_b + \left( \frac{\frac{xn}{10} - {<}cf}{f} \right) i = 39.5 + \left( \frac{36 - 31}{10} \right) 4 = 39.5 + 2 = 41.5 \]

Find the Q1, P36, & D8 of the data whose frequency distribution is given below.

1.) Q1
Solution:
xn/4 = 1(40) / 4 = 10
<cf = 4
f =6
i =8
lb = 31.5
Substitute the values in the formula:
\[ Q_1 = l_b + \left( \frac{\frac{xn}{4} - {<}cf}{f} \right) i = 31.5 + \left( \frac{10 - 4}{6} \right) 8 = 31.5 + 8 = 39.5 \]

2.) P36
Solution:
xn/100 = 36(40) / 100 = 14.4
<cf = 10
f = 12
i =8
lb = 39.5
Substitute the values in the formula:
\[ P_{36} = l_b + \left( \frac{\frac{xn}{100} - {<}cf}{f} \right) i = 39.5 + \left( \frac{14.4 - 10}{12} \right) 8 = 39.5 + 2.9 = 42.4 \]

3.) D8
Solution:
xn/10 = 8(40) / 10 = 32
<cf = 22
f = 10
i =8
lb = 47.5
Substitute the values in the formula:
\[ D_8 = l_b + \left( \frac{\frac{xn}{10} - {<}cf}{f} \right) i = 47.5 + \left( \frac{32 - 22}{10} \right) 8 = 47.5 + 8 = 55.5 \]
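The three computations above follow the same pattern, so they can be sketched as one Python function that finds the class containing the quantile and then applies the formula. The function name grouped_quantile is illustrative. The frequency table is an assumption: the original table is not reproduced in these notes, so the one below is a distribution consistent with the values used in the solutions (n = 40, i = 8, and the quoted <cf and f values). The sketch also assumes integer class limits, so that the lower boundary is the lower limit minus 0.5.

```python
def grouped_quantile(classes, x, parts):
    """Quantile of grouped data: lb + ((x*n/parts - cf) / f) * i.

    classes: list of (lower_limit, upper_limit, frequency), lowest class first.
    parts: 100 for percentiles, 10 for deciles, 4 for quartiles.
    """
    n = sum(f for _, _, f in classes)
    target = x * n / parts
    cf = 0                                   # cumulative frequency below the current class
    for lower, upper, f in classes:
        if cf + f >= target:                 # this class contains the quantile
            lower_boundary = lower - 0.5     # assumes integer class limits
            class_size = upper - lower + 1
            return lower_boundary + ((target - cf) / f) * class_size
        cf += f
    raise ValueError("quantile lies beyond the last class")


# Hypothetical table consistent with the <cf and f values quoted in the solutions.
classes = [(16, 23, 1), (24, 31, 3), (32, 39, 6),
           (40, 47, 12), (48, 55, 10), (56, 63, 8)]

print(round(grouped_quantile(classes, 1, 4), 2))     # Q1  -> 39.5
print(round(grouped_quantile(classes, 36, 100), 2))  # P36 -> 42.43
print(round(grouped_quantile(classes, 8, 10), 2))    # D8  -> 55.5
```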
Characteristics and uses of the quartile
There are actually three quartiles: the first, the second, and the third. The
second quartile is the median. The quartiles are computed and used when a group is to
be divided into four equal subgroups according to some traits or characteristics such as
ability. The quartiles are also used in the computation of the quartile deviation, a measure
of variability. The quartiles are also used to determine which members of a group belong to
the lower quartile, the middle 50%, or the upper quartile.

USES OF PERCENTILE
The percentile is computed and used when:
1. a scaled group of scores is to be divided into 100 equal parts or subgroups.
2. percentile bands are needed, that is, when a group of scores is to be divided into a
number of subgroups with equal or unequal number of scores in the subgroups. This is
used especially in the transmutation of raw scores into school marks or grades.
3. percentile ranks of scores are desired. When there is a large number of scores,
percentile ranks are more useful than ordinal ranks.

MEASURES OF VARIABILITY OR DISPERSION


Measures of Variability or Dispersion are measures of the average distance of each
observation from the center of the distribution. They measure the homogeneity or
heterogeneity of a particular group.
• A small measure of variability would indicate that the data are
o clustered closely around the mean;
o more homogeneous;
o less variable;
o more consistent; and
o more uniformly distributed.
Consider the following sets of grades in Mathematics of two groups of 5 students each:
THE RANGE
The range refers to the difference between the highest and the lowest score. If the
highest score is 25 and the lowest score in a distribution is 10, then the range is equal to
25-10 = 15. This range is specifically called the exclusive range. If the difference between
the exact lower limit of the lowest score and the exact upper limit of the highest score is
solved, the result is called the inclusive range. Using the same example, we get 16 from
25.5-9.5. The inclusive range is simply determined by adding 1 to the exclusive range.
Generally, the exclusive range is used for ungrouped data while the inclusive range is used
for grouped data.
The range is the easiest and simplest to determine among the measures of variability
because it depends only on the pair of extreme values. However, it is also the most
unstable because its value easily fluctuates with the change in either of the highest or
lowest scores. It is also considered the most unreliable because it does not give the
dispersion or spread of the scores in between the two extreme values. A more reliable
measure should involve all the values in a distribution to provide us an adequate spread of
all the scores from the average.
Example 1. Find the range of the grades in Math of the two groups of students in the
preceding example.
Male: 100-60 = 40
Female: 83-79=4
The range of grades of the male group is 40 while that of female group is 4. This shows
that the grades of the males are scattered while the grades of the female group are close
to each other. It shows further that females are more homogeneous than the males in
their math ability.
Uses of the Range
The range may be computed and used when
• the spread or scatter of the scores is all that is wanted;
• the scores are very few;
• there are extremely low and extremely high scores, so that there are big gaps between
the scores; and
• a rough comparison of the variability of two or more groups is desired. A group
whose range is smaller is more homogeneous than a group with a bigger range.

THE INTERQUARTILE RANGE AND QUARTILE DEVIATION/SEMI-INTERQUARTILE RANGE

The interquartile range is the distance between the quartiles Q1 and Q3. It includes
the middle 50% of the scores. But it is the semi-interquartile range called quartile deviation
which is more commonly used as a measure of dispersion or variability.

\[ IQR = Q_3 - Q_1 \qquad SIQR = \frac{Q_3 - Q_1}{2} \]

COMPUTATION OF THE IQR & SIQR (QUARTILE DEVIATION)

The class frequency distribution of test scores in Math 2.
Solve for IQR:
IQR = Q3 – Q1 = 84.86 – 71.06 = 13.8
Solve for SIQR:
\[ SIQR = \frac{Q_3 - Q_1}{2} = \frac{84.86 - 71.06}{2} = 6.9 \]
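The two measures can be checked in Python from the quartiles quoted above (Q1 = 71.06, Q3 = 84.86). This is a minimal sketch; the function names are illustrative.

```python
def interquartile_range(q1, q3):
    """IQR = Q3 - Q1, the spread of the middle 50% of the scores."""
    return q3 - q1


def quartile_deviation(q1, q3):
    """SIQR (quartile deviation) = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


print(round(interquartile_range(71.06, 84.86), 2))  # 13.8
print(round(quartile_deviation(71.06, 84.86), 2))   # 6.9
```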

USES OF THE QUARTILE DEVIATION


The quartile deviation/semi-interquartile range may be computed and used under
the following conditions:
• When the measure of central tendency is the median. The quartile deviation is used with
the median when classifying students.
• When the homogeneity or heterogeneity of a group is to be determined. When the quartile
deviation is small, the group is more or less homogeneous, but when the quartile deviation
is large, the group is more or less heterogeneous.
• When there are extremely high and low scores, especially when there are big gaps between
scores. This is because other measures of variability, especially the standard deviation, are
seriously distorted under this condition.
• When the main concern is the concentration of the middle 50% of the scores around the
median.
COEFFICIENT OF VARIATION
To relate the measure of dispersion to its average and to convert it to percent form, the
standard deviation is divided by the arithmetic mean. Stating this measure in percentage
form solves the problem presented by differing units. The resulting measure developed
by Pearson is known as the Coefficient of variation.

CV = (SD / mean) x 100%

Other comparative coefficients of dispersion may be computed when using the other
measures of dispersion:
Quartile Coefficient of Dispersion
CVq = (Q3 - Q1) / (Q3 + Q1) x 100%
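Both coefficients are simple ratios, as the following Python sketch shows. The function names are illustrative. The standard deviation and mean passed to the first call are hypothetical numbers chosen only to show the arithmetic; the quartiles in the second call are the Q1 = 71.06 and Q3 = 84.86 used in the quartile deviation example.

```python
def coefficient_of_variation(sd, mean):
    """CV = (SD / mean) x 100%."""
    return sd / mean * 100


def quartile_coefficient(q1, q3):
    """CVq = (Q3 - Q1) / (Q3 + Q1) x 100%."""
    return (q3 - q1) / (q3 + q1) * 100


# Hypothetical values: SD = 5, mean = 50.
print(round(coefficient_of_variation(5, 50), 2))   # 10.0

# Quartiles from the quartile deviation example.
print(round(quartile_coefficient(71.06, 84.86), 2))  # 8.85
```

Because both results are percentages with no units, they can be used to compare groups measured on different scales.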

Example 1. Below is the summary of sales of 100 sales representatives.


Compute the coefficient of variation.
Answer the following items. Show your solution.

2. The NSAT scores of 12 students in a certain college were taken and are shown below.
86, 95, 84, 87, 91, 90, 99, 84, 83, 88, 96, 92
Determine:
a.Mean
b.Median
c. Q3
d.P4
