Lecture Note 1
Chapter 1
Introduction to Statistics
Definition of Statistics:
Statistics is a science dealing with the collection, classification, analysis, and interpretation of
numerical data for drawing conclusions on the basis of their probability.
Population
A population is the entire set of all organisms, objects or events that belong to a study.
Each member of the population must be clearly defined so it can be known with certainty whether
or not any given individual or event belongs to that population.
Sample
A sample is any subgroup or subset of a population. Most populations in real situations are
so large that it is impractical to observe all of their members, and scientists therefore resort to observing
a relatively small number, termed a sample, which serves to represent that population. The
characteristics of many populations can never be known in the sense of having been directly
observed; rather, they are inferred from measures taken on samples.
[Figure: a sample shown as any subgroup of the population]
DSCS 1 2022
STAT 11613 Fundamentals of Statistics
Parameter
A parameter is a numerical term that summarizes or describes a population.
e.g. Population mean
Statistic
A statistic is a numerical term that summarizes or describes a sample. Statistics are obtained from
samples and are used to estimate population parameters. A parameter is a purely descriptive term,
but a statistic is both a descriptive term (because it describes a sample characteristic), and an
estimate of the corresponding population characteristic.
e.g. sample mean
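The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of heights, its size, and the sample size of 100 are hypothetical values chosen for illustration, not data from the course.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 10,000 individuals.
population = [random.gauss(165, 8) for _ in range(10_000)]

# Parameter: summarizes the whole population (usually unknown in practice).
population_mean = statistics.mean(population)

# Statistic: computed from a sample and used to estimate the parameter.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

In a real study only the sample mean would be available; the population mean is computed here solely to show what the statistic is estimating.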
[Figure: sampling leads from the population to the sample; inference leads from the sample back to the population]
Exercise:
1. Identify the population and the parameter of interest in the following research studies.
(a) Finding the average z-score of STAT 11514 students.
(b) Finding the proportion of dengue patients who died in 2017.
(c) Finding the average GPA of undergraduates who received university colors in 2019.
2. An election will be held next week and, by polling a sample of the voting population, we
are trying to predict whether the JVP or the Good Governance candidate will win the
Provincial Election. Which of the following methods of selection is likely to yield a
representative sample?
(a) Poll all people of voting age leaving a dinner dance at a five-star hotel in Galle Face.
(b) Obtain a copy of the voter registration list, randomly choose 100 names, and
question them.
(c) Use the results of a television call-in poll, in which the station asked its viewers to
call in and name their choice.
(d) Choose names from the telephone directory and call these people.
(e) Poll people who participate in a rally organized by the JVP.
Example: To understand the nature of a simple random sample, think of a lottery. Imagine that
10,000 lottery tickets have been sold and that 5 winners are to be chosen. What is the
fairest way to choose the winners? The fairest way is to put the 10,000 tickets in a drum,
mix them thoroughly, and then reach in and one by one draw 5 tickets out. These 5
winning tickets are a simple random sample from the population of 10,000 lottery
tickets. Each ticket is equally likely to be one of the 5 tickets drawn. More importantly,
each collection of 5 tickets that can be formed from the 10,000 is equally likely to
comprise the group of 5 that is drawn. It is this idea that forms the basis for the definition
of a simple random sample.
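The lottery draw can be sketched in Python as follows. This is an illustration only: the ticket numbering from 1 to 10,000 is an assumption, and `random.sample` stands in for mixing the drum and drawing tickets without replacement.

```python
import random
from math import comb

# Assumed numbering: tickets are labelled 1 to 10,000.
tickets = range(1, 10_001)

# Draw 5 tickets without replacement; every ticket is equally likely
# to be chosen, and so is every possible group of 5 tickets.
winners = random.sample(tickets, 5)

# Number of distinct groups of 5 that can be formed from 10,000 tickets.
n_possible_groups = comb(10_000, 5)

print("winning tickets:", sorted(winners))
print("possible groups of 5:", n_possible_groups)
```

Because each of the possible groups has the same chance of being drawn, the 5 winners form a simple random sample.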
Exercise:
(1) A physical education professor wants to study the physical fitness levels of students at her
university. There are 20,000 students enrolled at the university, and she wants to draw a
sample of size 100 to take a physical fitness test. She obtains a list of all 20,000 students,
numbered from 1 to 20,000. She uses a computer random number generator to generate
100 random integers between 1 and 20,000 and then invites the 100 students corresponding
to those numbers to participate in the study. Is this a simple random sample?
(2) A quality engineer wants to inspect rolls of wallpaper in order to obtain information on the
rate at which flaws in the printing are occurring. She decides to draw a sample of 50 rolls
of wallpaper from a day’s production. Each hour for 5 hours, she takes the 10 most recently
produced rolls and counts the number of flaws on each. Is this a simple random sample?
(3) Suppose there are 850 students in a school from which a sample of 10 students is to be
selected. The students are numbered from 1 to 850. Since the population size runs into three
digits, use random numbers that contain three digits from a random number table. All
numbers exceeding 850 are ignored because they do not correspond to any serial number
in the population. If the same number occurs again, the repetition is ignored. Following
these rules, select 10 students for the sample.
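The selection rules in exercise (3) can be sketched in code. This is a minimal illustration, not a replacement for working the exercise with a printed table: the function name `select_sample` is hypothetical, `random.randint(0, 999)` stands in for reading a three-digit entry from a random number table, and skipping 000 is an added assumption because no student has that serial number.

```python
import random

def select_sample(population_size=850, sample_size=10, rng=random):
    """Select serial numbers using the rules described in the exercise."""
    chosen = []
    while len(chosen) < sample_size:
        # Stand-in for the next three-digit table entry (000 to 999).
        number = rng.randint(0, 999)
        # Ignore numbers that match no serial number in the population.
        if number == 0 or number > population_size:
            continue
        # Ignore repetitions of a number that was already selected.
        if number in chosen:
            continue
        chosen.append(number)
    return chosen

students = select_sample()
print("selected students:", students)
```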
Observational studies:
An observational study measures the characteristics of a population by studying individuals in a
sample, but does not attempt to manipulate or influence the variables of interest.
Example: surveys, average daily temperature in Colombo, the relationship between class
attendance and final exam score, studies to determine the effect of cigarette smoking
on the risk of lung cancer.
When designed and conducted properly, controlled experiments can produce reliable information
about cause-and-effect relationships between factors and responses.
Probably the biggest difference between observational studies and designed experiments is the
issue of association versus causation. Since observational studies do not control any variables, their
results can only establish associations. Because variables are controlled in a designed experiment, its
results can support conclusions of causation.
Exercise
1. A study considered a random sample of adults and asked them about their bedtime habits.
The data showed that people who drank a cup of tea before bedtime were more likely to go
to sleep earlier than those who didn't drink tea. What type of study is this?
a) Observational study
b) Designed experiment
2. A research study considered a group of adults and randomly divided them into two groups.
One group was asked to drink tea every night for a week, while the other group was asked
not to drink tea in that week. The researcher then compared when each group fell asleep. What
type of study is this?
a) Observational study
b) Designed experiment
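The random division of participants into two groups described in question 2 might be carried out as in the sketch below. The participant labels, the group size of 20, and the function name `randomly_assign` are all hypothetical and serve only to illustrate the idea.

```python
import random

def randomly_assign(participants, rng=random):
    """Shuffle the participants and split them into two equal-sized groups."""
    shuffled = list(participants)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants.
participants = [f"adult_{i}" for i in range(1, 21)]
tea_group, no_tea_group = randomly_assign(participants)

print("tea group:   ", tea_group)
print("no-tea group:", no_tea_group)
```

Because chance alone decides who drinks tea, other characteristics of the participants tend to balance out between the two groups.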
Variable
A variable is any observable or measurable property of organisms, objects or events such that
individuals may differ in the amount or kind of this property. The behavior or property under
investigation is considered the variable of interest.
[Figure: tree diagram dividing variables into qualitative and quantitative]
Qualitative variable
A qualitative variable is a distinction of kind, not of amount. Qualitative measurement consists
of classification into categories, such as when people are classified as being male or female. The
designations male and female do not imply different amounts of the variable gender but rather
indicate different kinds or qualities of this variable. These variables are also called categorical
variables.
Nominal variable
Nominal variables have two or more categories without having any kind of natural order.
Ordinal variable
An ordinal variable is a categorical variable for which the possible values are ordered.
Quantitative variable
A quantitative variable is one in which the number derived from the measurement reflects the
amount of the property in question. Height is a quantitative measurement: it is expressed as
a number of measurement units, such as centimeters, and this numerical score corresponds to the
actual physical size of the object. Quantitative measurement is the assignment of a numerical
quantity to the variable and is what we ordinarily understand the act of measurement to mean.
Continuous variable
A continuous variable is one that may assume any value between maximum and minimum
limits. Height is an example of a continuous variable, since within limits any value is
possible.
Discrete variable
A discrete variable is one that can only assume certain numerical values such as being
restricted to whole numbers.
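The four kinds of variable defined above can be summarized in a short sketch. Gender and height come from the text; the exam grade and number of siblings are hypothetical examples added for illustration, and the names `VariableType` and `is_quantitative` are likewise assumptions of this sketch.

```python
from enum import Enum

class VariableType(Enum):
    NOMINAL = "qualitative, nominal"
    ORDINAL = "qualitative, ordinal"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"

# Example variables and their types (the middle two are hypothetical).
examples = {
    "gender": VariableType.NOMINAL,                # categories, no natural order
    "exam grade (A, B, C)": VariableType.ORDINAL,  # categories with an order
    "number of siblings": VariableType.DISCRETE,   # whole numbers only
    "height in cm": VariableType.CONTINUOUS,       # any value within limits
}

def is_quantitative(variable_type):
    """Return True when the type reflects an amount rather than a kind."""
    return variable_type in (VariableType.DISCRETE, VariableType.CONTINUOUS)

for name, variable_type in examples.items():
    print(f"{name}: {variable_type.value}")
```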
(1) The following table contains results of some students along with their Z-scores for the G.C.E.
A/L examination.
Vehicle Type  Make    Model    Colour  Fuel Type  Number of Passengers  Number of Km
Car           Toyota  Corolla  Maroon  Petrol     4                     12000.85
Van           Toyota  Dolphin  White   Diesel     16                    86254.14
Objective observation
All science may be observation, but not all observation is science. Objectivity is the special
quality of scientific observation. An objective observation is one that is not in any way affected by
the opinions, values, or biases of the observer.
Subjective observation
A subjective observation is one that reflects the observer’s personal point of view; clearly, there
can be no science if the raw data are a matter of opinion.
Descriptive Statistics
Descriptive statistics consists of the techniques for organizing, summarizing and extracting
information from numerical data.
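As a small illustration of summarizing numerical data, the sketch below uses Python's standard `statistics` module. The data values are hypothetical (weekly hours reported by 10 people) and are not taken from any study in these notes.

```python
import statistics

# Hypothetical data: weekly hours reported by 10 people.
hours = [0, 1, 1, 2, 0, 3, 1, 0, 1, 1]

summary = {
    "n": len(hours),
    "minimum": min(hours),
    "maximum": max(hours),
    "mean": statistics.mean(hours),
    "median": statistics.median(hours),
    "standard deviation": statistics.stdev(hours),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Each entry condenses the ten raw values into a single number that describes the sample.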
Inferential Statistics
Inferential statistics is the body of rules and procedures by which general statements are made
about people or events. If statements could be made only about those individuals or events that have
been directly observed, science would be impractical. Statistics provides us with procedures for
making predictions based on observed data and for interpreting the outcome of experiments
designed to test predictions.
[Figure: flow diagram with three boxes]
Obtain information on variables from the selected sample
Calculate appropriate summary statistics on each variable measured in the sample
Using the theory of probability, make inferences about the population
Exercise:
Determine the population, sample, variable/s and parameter/s of interest under study in the
following situations:
(1) Suppose you read an article in a local newspaper stating that
the average college student plays 2 hours of video games per week. To test whether this
claim is true for your school, you randomly approach 20 fellow students and ask them how
long (in hours) they play video games per week. You find that, on average, a student
plays video games for 1 hour per week among those you questioned.
(2) Small farmers in a certain village are registered with the Farmers’ Corporative Organization, which
provides agricultural assistance to them. To get a better return on their investment, the
Farmers’ Corporative Organization conducts a study on pineapple in an experimental field
to see how long it takes the fruit to mature (measured in days from the time of planting)
with a particular fertilizer.