Statistical Techniques Notes (Monitoring & Evaluation - BMEC - Level 4)
WHY STATISTICS?
Data -> STATISTICAL TOOLS -> Information
POPULATION AND SAMPLES
Sample Definition:
A Subset of a population.
Representative Sample
– Has the characteristics of the population
WHY SAMPLING?
ADVANTAGES
• Because it is secondary data it is usually cheap and is less time consuming
because someone else has compiled it
• Patterns and correlations are clear and visible
• Taken from large samples so the generalisability is high
• Can be used and re-used to check different variables
• Can be replicated to check for changes, which increases reliability and
representativeness
DISADVANTAGES
• The researcher cannot check validity or find a mechanism for a
causal theory; the data only show patterns and correlations
• Statistical data is often secondary data, which means that it can easily
be misinterpreted
• Statistical data is open to abuse: it can be manipulated and phrased to
show the point the researcher wants to make (which affects objectivity)
• Because this is often secondary data, it is hard to access and check
STATISTICAL INFERENCE
GRAPHICAL PRESENTATION OF DATA
BAR GRAPH
Type of School     No. of Schools
Government         4
Local Body         8
Private Aided      10
Private Unaided    2
BAR GRAPH (cont.)
Stacked bar graph of marks by subject (Science, Maths, History) for each student.
Student   Science   Maths   History
Anita     25        22      20
Raju      19        19      22
Sunil     28        26      21
Rani      20        23      21
PIE CHART
Pie diagrams are popularly used to denote percentage breakdown.
It is a circular statistical graphic which is divided into slices to illustrate numerical proportion.
In a pie chart, the arc length of each slice (and consequently its central angle and area) is
proportional to the quantity it represents.
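Because each slice is proportional to its quantity, a component that makes up p% of the total gets a central angle of 360 × p / 100 degrees. The sketch below illustrates that conversion; the component names and percentages are hypothetical and are not taken from these notes.

```python
def slice_angles(quantities):
    """Map each component's quantity to its central angle in degrees."""
    total = sum(quantities.values())
    return {name: 360.0 * value / total for name, value in quantities.items()}


# Hypothetical percentage breakdown (illustrative only).
budget = {"Staff": 50, "Travel": 15, "Equipment": 25, "Other": 10}
angles = slice_angles(budget)

for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Dividing by the total rather than by 100 means the same function also works for raw counts, not only for percentages.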
Pie Diagram
15%
Components Percentage
Frequency 2 2 4 7 5 3 1
SHAPE
John 18
Mike 12
Henry 21
Frank 9
George 15
MEASURES OF CENTRAL TENDENCY
INTRODUCTION TO MEASURES OF CENTRAL TENDENCY
Definition: The median is a dataset’s middle value when ordered from least to
greatest.
Formula:
Odd Number of Observations: Median is the middle value.
Even Number of Observations: The median is the average of the two middle
values.
Characteristics:
Less sensitive to extreme values compared to the mean.
Provides a better central location in skewed distributions.
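The odd and even cases above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the sample values are hypothetical.

```python
def median(values):
    """Return the middle value of the data once ordered from least to greatest."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: the single middle value.
        return ordered[mid]
    # Even number of observations: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = median([7, 1, 5])          # ordered: 1 5 7
even_example = median([4, 1, 3, 2])      # ordered: 1 2 3 4
skewed = median([2, 3, 4, 5, 100])       # the extreme value 100 does not move the median
```

In the last example the mean would be pulled up to 22.8 by the value 100, while the median stays at 4, which is the sense in which the median is less sensitive to extreme values.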
MEDIAN
MODE
The range of a data set is a measure of spread. That is, it measures how spread out the
data are.
The range of a data set is the difference between the largest and the smallest values.
San Francisco 51 54 55 56 58 60 60 61 63 62 58 52
St. Louis 30 35 44 57 66 75 79 78 70 59 45 35
Source: National Weather Service
The largest value for San Francisco is 63 and the smallest is 51.
The range for San Francisco is 63 – 51 = 12.
The largest value for St. Louis is 79 and the smallest is 30.
The range for St. Louis is 79 – 30 = 49.
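As a quick check of the arithmetic above, the two ranges can be recomputed from the listed values. The helper name `data_range` is only an illustrative choice.

```python
san_francisco = [51, 54, 55, 56, 58, 60, 60, 61, 63, 62, 58, 52]
st_louis = [30, 35, 44, 57, 66, 75, 79, 78, 70, 59, 45, 35]


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


sf_range = data_range(san_francisco)   # 63 - 51
stl_range = data_range(st_louis)       # 79 - 30
print(sf_range, stl_range)
```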
THE RANGE IS NOT OFTEN USED IN PRACTICE
Although the range is easy to compute, it is not often used in practice. The
reason is that the range involves only two values from the data set: the
largest and the smallest.
The measures of spread that are most often used are the variance and the
standard deviation, which use every value in the data set.
RANGE AND IQR
Range = maximum – minimum
Easy, but NOT as good as the…
Quartiles & Inter-Quartile Range (IQR)
Quartile 1 (Q1) cuts off bottom 25% of data (“25th percentile”)
Quartile 2 (Q2) cuts off two-quarters of data
same as the Median!
Quartile 3 (Q3) cuts off three-quarters of the data (“75th percentile”)
OBTAINING QUARTILES
Order data
Find the median
Look at the lower half of data set
Find “median” of this lower half
This is Q1
Look at the upper half of the data set.
Find “median” of this upper half
This is Q3
EXAMPLE: QUARTILES
Consider these 10 ages:
05 11 21 24 27 28 30 42 50 52
Median = (27 + 28) / 2 = 27.5
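The "median of each half" procedure can be sketched as follows, applied to the 10 ages above. Two points here are assumptions rather than statements from the notes: when the number of observations is odd, the overall median is left out of both halves, and the IQR is taken as Q3 − Q1, the usual definition. Other quartile conventions exist and can give slightly different values.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the overall median is excluded from both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


ages = [5, 11, 21, 24, 27, 28, 30, 42, 50, 52]
q1, q2, q3 = quartiles(ages)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

For these ages the lower half is 5 11 21 24 27 and the upper half is 28 30 42 50 52, so Q1 = 21, Q3 = 42 and the IQR is 21.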
Because the variance is computed using squared deviations, the units of the variance are the
squared units of the data. For example, in the Battery Lifetime example, the units of the data are
hours, and the units of the variance are squared hours. In most situations, it is better to use a
measure of spread that has the same units as the data.
We do this simply by taking the square root of the variance. This quantity is called the standard
deviation. The standard deviation of a sample is denoted by s, and the standard deviation of a
population is denoted by σ.
$s = \sqrt{s^2}$, $\sigma = \sqrt{\sigma^2}$
EXAMPLE
Recall that in the Battery Lifetime example, the sample variance was computed as $s^2 = 2$.
Find the sample standard deviation.
Battery Lifetime (hours): 3 4 6 5 4 2
Solution: $s = \sqrt{s^2} = \sqrt{2} \approx 1.414$
The sample standard deviation, s, is the square root of the sample variance.
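The Battery Lifetime figures can be re-derived step by step from the six listed lifetimes. The variable names below are illustrative choices only.

```python
import math

lifetimes = [3, 4, 6, 5, 4, 2]   # hours, from the Battery Lifetime example

n = len(lifetimes)
mean = sum(lifetimes) / n                                  # 24 / 6 = 4.0
squared_deviations = [(x - mean) ** 2 for x in lifetimes]  # 1, 0, 4, 1, 0, 4
sample_variance = sum(squared_deviations) / (n - 1)        # 10 / 5 = 2.0 squared hours
sample_sd = math.sqrt(sample_variance)                     # back in hours

print(sample_variance, round(sample_sd, 3))
```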
STANDARD DEVIATION
Standard Deviation:
$s_x = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n - 1}}$
Example: Metabolic Rates 1792 1666 1362 1614 1460 1867 1439
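A minimal sketch of the sample standard deviation formula, applied to the seven metabolic rates listed above. The function name `sample_std` is an illustrative choice; the standard library's `statistics.stdev` uses the same n − 1 divisor and is included as a cross-check.

```python
import math
import statistics

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]


def sample_std(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


mean_rate = sum(rates) / len(rates)   # 11200 / 7 = 1600.0
s_x = sample_std(rates)
print(mean_rate, round(s_x, 2))       # s_x is about 189.24
print(round(statistics.stdev(rates), 2))
```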
Step 4: Add the products (Midpoint – Mean)² × (Frequency) over all classes.
(Midpoint – Mean)² × (Frequency)
125,440
19,220
1,872
15,884
54,208
76,176
$\sum (\text{Midpoint} - \text{Mean})^2 \times \text{Frequency} = 125{,}440 + 19{,}220 + 1{,}872 + 15{,}884 + 54{,}208 + 76{,}176 = 292{,}800$
SOLUTION
Step 5: Since we are computing the sample variance, we divide the sum
obtained in Step 4 by n –1.
$s^2 = \dfrac{\sum (\text{Midpoint} - \text{Mean})^2 \times \text{Frequency}}{n - 1} = \dfrac{292{,}800}{50 - 1} = 5975.51020$
Step 6: Take the square root of the variance to obtain the standard deviation.
$s = \sqrt{s^2} = \sqrt{5975.51020} = 77.30142$
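Steps 4 to 6 can be checked numerically. The sketch below starts from the six products and n = 50 as given in the worked solution; the class midpoints and frequencies themselves are not shown in these notes, so they are not reconstructed here.

```python
import math

# (Midpoint - Mean)^2 x Frequency for each class, as listed in Step 4.
products = [125_440, 19_220, 1_872, 15_884, 54_208, 76_176]
n = 50  # total number of observations, from Step 5

total = sum(products)              # Step 4: sum over all classes
variance = total / (n - 1)         # Step 5: sample variance
std_dev = math.sqrt(variance)      # Step 6: sample standard deviation

print(total, round(variance, 5), round(std_dev, 5))
```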
DATA COLLECTION
INTRODUCTION
Decision Making
To minimize the risk of errors in decision-making, it is important that accurate data is collected so
that the researcher doesn't make uninformed decisions.
Save Cost and Time
Data collection saves the researcher time and funds that would otherwise be misspent without a
deeper understanding of the topic or subject matter.
To support a need for a new idea, change, and/or innovation
To prove the need for a change in the norm or the introduction of new information that will be
widely accepted, it is important to collect data as evidence to support these claims.
DATA COLLECTION METHODS
PRIMARY DATA COLLECTION METHODS
Barometric Method
Also known as the leading indicators approach,
this method is used by researchers to speculate about future trends
based on current developments.
When past events are used to predict future
events, they act as leading indicators.
PRIMARY DATA COLLECTION METHODS
Qualitative Methods
Qualitative methods are especially useful when historical data is not available, or when there is no need
for numbers or mathematical calculations.
Qualitative research is closely associated with words, sounds, feelings, emotions, colors, and other elements
that are non-quantifiable. These techniques are based on experience, judgment, intuition, conjecture, emotion,
etc.
Surveys
Polls
Interviews
Delphi Technique
Focus Groups
Questionnaire
QUALITATIVE METHODS
Polls
A poll consists of a single question, either single-choice or multiple-choice. When you
need a quick pulse of the audience's sentiments, you can use
polls. Because they are short, it is easier to get responses from
people.
Similar to surveys, online polls, too, can be embedded into various
platforms. Once the respondents answer the question, they can also be
shown how they stand compared to others’ responses.
QUALITATIVE METHODS
Surveys
Surveys are used to collect data from the target audience and gather insights into
their preferences, opinions, choices, and feedback related to an organization's products and
services. Most survey software offers a wide range of question types to select from.
You can also use a ready-made survey template to save on time and effort. Online
surveys can be customized as per the business’s brand by changing the theme, logo,
etc. They can be distributed through several distribution channels such as email,
website, offline app, QR code, social media, etc. Depending on the type and source of
your audience, you can select the channel.
QUALITATIVE METHODS
Interviews
In this method, the interviewer asks questions either face-to-face or
through telephone to the respondents. In face-to-face interviews, the
interviewer asks a series of questions to the interviewee in person and
notes down responses. In case it is not feasible to meet the person, the
interviewer can go for a telephonic interview.
This form of data collection is suitable when there are only a few
respondents. It is too time-consuming and tedious to repeat the same
process if there are many participants.
QUALITATIVE METHODS
Delphi Technique
In this method, market experts are provided with the estimates
and assumptions of forecasts made by other experts in the
industry. Experts may reconsider and revise their estimates and
assumptions based on the information provided by other
experts.
The consensus of all experts on demand forecasts constitutes
the final demand forecast.
QUALITATIVE METHODS
Focus Groups
In a focus group, a small group of people, around 8-10 members,
discuss the common areas of the problem. Each individual
provides his insights on the issue concerned.
A moderator regulates the discussion among the group members.
At the end of the discussion, the group reaches a consensus.
QUALITATIVE METHODS
Questionnaire
A questionnaire is a printed set of questions, either open-ended
or closed-ended. The respondents are required to answer based
on their knowledge and experience with the issue concerned.
A questionnaire is often part of a survey, but a
questionnaire's end goal need not be a survey.
SECONDARY DATA COLLECTION METHODS
Less costs
stratified: contains strata or layers; e.g. people with different levels of income (low, medium, high)
grouped by type: contains distinctive groups; e.g. apartment buildings (towers, slabs, villas, tenement blocks)
grouped by location: different groups according to where they are; e.g. animals in different habitats (desert, equatorial forest, savannah, tundra)
SAMPLING METHODS OR TYPES
SAMPLING TECHNIQUES
Probability sampling techniques give the most reliable
representation of the whole population.
Non-probability techniques, relying on the judgment
of the researcher or on accident, cannot generally be
used to make generalizations about the whole population.
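To make the idea of probability sampling concrete, the sketch below shows two common probability techniques: a simple random sample and a proportional stratified sample. Both techniques, the function names, the fixed seed and the population of 100 household IDs split by income level are illustrative assumptions, not methods prescribed by these notes.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Draw k distinct units, each unit having the same chance of selection."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(population, k)


def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 100 household IDs grouped by income level.
strata = {
    "low": list(range(0, 50)),
    "medium": list(range(50, 80)),
    "high": list(range(80, 100)),
}
population = [unit for members in strata.values() for unit in members]

srs = simple_random_sample(population, 10)
stratified = stratified_sample(strata, 0.10)
stratum_sizes = {name: len(units) for name, units in stratified.items()}
print(stratum_sizes)
```

With a 10% fraction the stratified sample keeps the population's 50/30/20 split (5, 3 and 2 units), whereas a simple random sample of 10 only matches that split on average.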
PROBABILITY SAMPLING
NON-PROBABILITY SAMPLING
Convenience
Judgemental
Snowball
Quota
CONVENIENCE SAMPLING
SNOWBALL SAMPLING
The snowball sampling method is based purely on referrals, which is how the researcher
generates a sample. This method is therefore also called the chain-referral sampling method.
This sampling technique can go on and on, just like a snowball increasing in size (in this case
the sample size), until the researcher has enough data to analyze and to draw conclusive
results that can help an organization make informed decisions.
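The chain-referral idea can be sketched as a walk over a referral list: start from a few initial participants, add the people they refer, and stop once the sample is large enough. The referral data, the function name and the target size below are hypothetical, and real studies would also handle refusals and non-response.

```python
from collections import deque


def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals until target_size is reached."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        # Each participant refers further people who have not been contacted yet.
        for referred in referrals.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sample


# Hypothetical referral chains: each person names the people they can refer.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"]}
sample = snowball_sample(referrals, ["A"], target_size=5)
print(sample)
```

Because every participant after the seeds enters the sample through someone already in it, the result depends heavily on who the seeds are, which is why this counts as a non-probability technique.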
QUOTA SAMPLING
Comparison of probability and non-probability sampling:
Population selection
  Probability sampling: The population is selected randomly.
  Non-probability sampling: The population is selected arbitrarily.
Nature
  Probability sampling: The research is conclusive.
  Non-probability sampling: The research is exploratory.
Sample
  Probability sampling: Since there is a method for deciding the sample, the population demographics are conclusively represented.
  Non-probability sampling: Since the sampling method is arbitrary, the representation of the population demographics is almost always skewed.
Time Taken
  Probability sampling: Takes longer to conduct, since the research design defines the selection parameters before the market research study begins.
  Non-probability sampling: This type of sampling is quick, since neither the sample nor its selection criteria are defined in advance.
CONCLUSION