Chapter 1 To 5
CONCEPT OF RESEARCH
1.0 INTRODUCTION
1.2 DEFINITIONS OF METHODOLOGY AND RESEARCH
METHODOLOGY
find truth and gain knowledge is the scientific method of
investigation.
The researcher must be able to give clear scientific explanation and the
logic behind them for a host of the following questions:
(3) To identify the frequency with which some phenomenon, say a stock-out, occurs, or the causes associated with that phenomenon.
During 2000 AD, all of HLL’s tea brands were facing competition, due to which growth in sales had become stagnant. India’s tea market size is Rs. 6000 crores, of which the branded tea market occupies Rs. 2500 crores and unbranded or loose tea Rs. 3500 crores. Research indicated that growth would come only from the loose tea market. HLL successfully launched the ‘A1’ brand of tea to win customers away from the loose tea market, with the punch line ‘strong cup of tea’; the market segments focused on were housewives and journalists. The philosophy used was: ‘due to a strong cup of tea, an ordinary person like a housewife or a journalist gets the courage to face difficult situations in life.’
P&G likewise designed ‘Project Dhrusti’ for ‘Whisper’ and ‘Mr. Gold’ for all products under its umbrella brand.
(5) Intention to get respectability within and outside the country.
India’s most successful IT trio is Infosys, Wipro and TCS. India and Indians are respected across the globe due to the powerful brains of Narayana Murthy, Azim Premji and Mr. S. Ramadorai, CEO, TCS. This respectability is the result of a great deal of hard work in delivering value to the stakeholders. For shareholders, the respect is due to very attractive dividends and stock prices. For the country, it is due to forex earning ability and employment generation. For countries across the globe, it is due to the trio’s intelligence, price, service delivery and outsourcing ability. I always admire the skill of this trio and hence would describe them as the ‘Real Gold of Indian IT’. (Per-employee profit after tax during 04-05 was Rs. 5,00,000 for Infosys, Rs. 4,40,000 for TCS and Rs. 3,70,000 for Wipro.) Please read article 1.35, India’s Most Admired Companies.
Significance of research: research leads to invention. The following facts highlight the importance of research:
(1) Research facilitates a logical or scientific thinking process, which leads towards flawless strategy formulation.
(2) It facilitates identification of ‘trends’, which are ultimately responsible for marketing opportunities.
(3) Decision making becomes easier for a well-researched phenomenon.
(4) Research is important in solving various operational and planning problems of business and industry.
(5) It helps in understanding the perception of society about the marketer, so that the marketing strategy can be designed accordingly.
There are many types of research, such as descriptive, analytical, basic, applied, qualitative, quantitative, conceptual, etc. Let us discuss some of these types.
[Figure: ‘Company Weights’ chart. Weights assigned to the survey parameters: Management quality 29, Financial performance 25, Growth prospects 17, Returns to shareholders 17, Ethics 13.]
APEDA (the Agricultural and Processed Food Products Export Development Authority) conducts regular research for the benefit of the agri industry.
For example: (a) Which word first comes to your mind when you hear of (Sahara) Airline services?
(i) Timely departure and arrival of flights (ii) Safety (iii) Consumer friendly
(b) Which brand first comes to your mind when you think of antiseptic liquid? (i) Dettol (ii) Savlon
(c) Which brand first comes to your mind when you think of the luxury of smoking? (i) Gold Flake (ii) Wills (iii) Four Square
(ii) Sentence completion: Respondents are asked to complete an incomplete sentence.
For example: (a) When I choose to fly by air, the most important consideration in my decision is ________________________
(b) When a graduate student thinks of postgraduate management education, the most important parameters which influence the decision are _____________
(IV) (a) Conceptual Research: It relates to abstract ideas, concepts or theory. It results in the development of new concepts or the re-interpretation of existing ones.
P&G also identified that if one common theme is used for brand promotion, consumers respond favourably. For launching and thereafter promotion, P&G used the platform of R.D. Burman’s Bollywood songs like Ek Ladki Ko Dekha To Aisa Laga, Rim Zim Rim Zim, O Haseena Zulfon Wali Jane Jahan. Example: an abstract idea or myth could be, ‘consumers do not buy products but buy brands’.
It is seen that this is only half true. For low-value products, consumers may buy the product, whereas for high-value items this may not always work. For example, when the US-based marketer Mattel launched the Barbie doll in Japan, it suffered a huge setback only because the doll did not look like a Japanese girl or woman.
hypothesis was disproved through observations as well as through
experiments.
V (a) One-Time Research: Research for the product/brand is done only once a year, or once in a few years, because the market is steady and consumer tastes and preferences do not change rapidly. For example, engineering products and industrial goods like boilers, DG sets, compressors, etc. do not require frequent changes. In such cases, it is better to buy the research from a specialized research organization.
Case Study:
(iii) To develop new scientific tools, concepts and theories which
would facilitate reliable and valid study of human behaviour
and social life.
1.72 Scope of social research
Every group of social phenomenon, every phase of human life and every
stage of past and present developments are the materials for the social
scientists.
being studied so that there can be no objective procedure for
achieving the truth.
(iv) Common problems faced by researchers are refusal by respondents, improper understanding of questions, respondents’ loss of memory, etc.
(v) The quality of research findings and conclusions depends upon the soundness of the decisions made by the social investigators in the research process, such as correct definition of the research problem, correct sample selection, and appropriate statistical techniques for data processing. Any mistake in any of these decision areas will challenge the validity of the research findings.
Reasoning forms the basis of the scientific inquiry. The thought process
of a scientist may be based on deduction, induction or a combination of
both. Let us understand in detail each of these processes of thinking
needed for conducting and drawing conclusions from marketing
research.
Deduction
A deduction is true if the premises on which it is based are true, i.e., if they agree with the real world. For example, a premise like "the world is flat" is a false premise; a deduction based on such a premise will also be false. A deduction is valid if it is impossible for the conclusion to be false when the premises it is based on are true; that is, the method of drawing the conclusion must be logical. Truth and validity together mean that a conclusion is not logically justified (even if it happens to be true) if one or more of the premises is false or if the method of deduction is incorrect; the conclusion may still be correct due to some other premises not considered. Let us look at an example.
Here both premises are true, but the argument that led to the conclusion is not valid, so the deduction is not valid. The conclusion may be true for other reasons, but not as a result of the given premises. Let us look at an example where the deduction is not true.
Here premise 1 is correct and true in general. But rabid dogs are also dogs; if we consider this, the first premise is not correct. One has to correct the first statement based on the second. This leads to an invalid conclusion.
This deduction is true and valid. Such deductions are made every moment by one and all, and look obvious.
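The truth-versus-validity distinction above can be sketched in code. The following is a minimal Python illustration; the tiny world of sets, the categories "dogs" and "animals", and the individual "rex" are all invented for the example:

```python
# Model categories as sets: a syllogism is *valid* when the conclusion
# must follow from the premises, and *true* only when the premises
# themselves match the world being modelled.
animals = {"rex", "tom", "bugs"}      # everything in our tiny world
dogs = {"rex"}

# Premise 1: all dogs are animals (true in this model).
premise1 = dogs <= animals
# Premise 2: Rex is a dog (true in this model).
premise2 = "rex" in dogs

# Valid deduction: therefore Rex is an animal.
conclusion = "rex" in animals
assert premise1 and premise2 and conclusion

# An *invalid* form: "Tom is an animal, all dogs are animals,
# therefore Tom is a dog." True premises do not justify the
# conclusion, because cats are animals too.
tom_is_animal = "tom" in animals      # True
tom_is_dog = "tom" in dogs            # False: the form does not follow
print(premise1, premise2, conclusion, tom_is_animal, tom_is_dog)
```

With true premises and a valid form the first conclusion must hold, while the second shows how true premises can still fail to support a conclusion when the form of argument is invalid.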
Induction
Induction is a conclusion drawn from one or more facts, but not necessarily from those facts alone. The conclusion explains the facts, but the facts given are not by themselves sufficient to lead to the conclusion; additional facts from previously learnt knowledge are needed. To illustrate, suppose Kiran approaches his boss Pavan with a routine problem but, to his shock, receives rude treatment for no mistake of his. Then Kiran can, based on his previous experience, conclude any of the following:
Any of these conclusions can explain the fact that Pavan treated Kiran rudely. However, the given fact alone cannot lead directly to any of these conclusions; they are based on some previous experience, i.e., on some other facts as well. Such conclusions are in reality only hypotheses, and need further verification to ascertain their correctness.
To explain an observed phenomenon, a researcher formulates, by induction, some hypotheses that need to be verified. The researcher then uses deduction to check whether each of the hypotheses can explain the given facts completely by itself. Once this is done, it is necessary to perform empirical tests with all these hypotheses and then select the hypothesis that passes these tests. This method is described as the ‘double movement of reflective thinking’ by Dewey and is adapted by Cooper and Schindler. The process the researcher should follow is as outlined below. The researcher:
These steps are interdependent. They are also not sequentially fixed. Based on the nature of the study, some of the above steps may be eliminated or new steps may be added.
Two major characteristics of the scientific method are validity and reliability. Validity is a measure of the match between what the research claims to measure and what it actually measures; in other words, it measures the effectiveness of measurement in research. For example, a people meter on a TV set is supposed to measure the viewership of a particular programme, while in reality it only measures the number of occasions the TV was tuned to that particular channel when the programme was relayed. The television may be on but there may be no one watching it; moreover, even if there are viewers, the people meter cannot count how many. Thus the assumption that the people meter measures viewership is wrong, and a research study which assumes it is measuring viewership with a people meter is not valid. Validity seems easily achievable, but, as in the above example, there may be minor deviations that can easily go unnoticed. Hence, to ensure validity, one should carefully and purposefully probe into every detail of the research.
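The people-meter gap between what is claimed and what is measured can be made concrete with a small simulation. All figures below are synthetic and invented purely for illustration:

```python
import random

random.seed(7)

# Each simulated household: was the TV tuned to the programme, and
# how many people were actually watching? (Synthetic data.)
households = [
    {"tuned_in": random.random() < 0.4,
     "viewers": random.choice([0, 0, 1, 2, 3])}
    for _ in range(1000)
]

# What the people meter records: occasions the set was tuned in.
meter_count = sum(h["tuned_in"] for h in households)

# What the research *claims* to measure: people actually viewing.
true_viewers = sum(h["viewers"] for h in households if h["tuned_in"])
empty_rooms = sum(1 for h in households
                  if h["tuned_in"] and h["viewers"] == 0)

print(f"sets tuned in: {meter_count}, actual viewers: {true_viewers}, "
      f"tuned-in sets nobody watched: {empty_rooms}")
```

The `empty_rooms` count is exactly the deviation the text describes: occasions the meter records as "viewership" although no one is watching.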
Box 1.7: Steps that make up the scientific method
Observation:
A good scientist is observant and notices things in the world around him/her. (S)he sees, hears, or in some other way notices what is going on in the world, becomes curious about what is happening, and raises a question about it.
Hypothesis:
This is a tentative answer to the question: an explanation for what was observed. The scientist tries to explain what caused what was observed (hypo = under, beneath; thesis = an arranging).
Prediction:
Testing:
Then the scientist performs the experiment to see if the predicted results are obtained. If the expected results are obtained, that supports the hypothesis.
Scientific Method in Marketing as Compared to Physical Sciences
Research conditions
Measuring Instruments
Measuring instruments in the physical sciences provide very high accuracy; for instance, physicists can measure up to 10⁻¹⁵ of a meter, which is a millionth of a billionth of a meter. In marketing, however, it is difficult to achieve such accuracy. For example, many questionnaires use a five-point scale to measure the likelihood of purchase; the scale is shown in Figure 1.7. Such a scale gives only a crude measure; moreover, the words in the scale may mean different things to different people. This affects both the validity and the reliability of the research.
Personal Interests
Influence Of Measurement
Sometimes the process of measurement may itself affect the result, i.e., the process researchers undertake for making the measurement may change the outcome. In the physical sciences, the effect of measurement on the result is not very pronounced except in fields like quantum mechanics. But in marketing research the influence of measurement on the result is very appreciable. For example, when a family has a "people meter" on its television set, it may modify its viewing habits because the members know that all their viewing is recorded. Similarly, people participating in focus groups know that they are being observed, and so they may come up with socially acceptable answers.
Re-questioning a group of respondents may also affect the results. If a group of respondents is questioned a second or third time, they may give different answers from those they would have given if questioned for the first time. For example, if a company quizzes a group of respondents about its brand before an advertisement campaign, then after this the group will start noticing the advertisements with more interest than it would have without the quizzing. This changes their responses in subsequent questioning from what they would have been had the group not been questioned before.
Time Pressure
Short-Term Goals
Difficulty In Experimentation
Facts are phenomena that we believe are true. These facts do not
change with the person who reports them. Original documents and fact-
gathering agencies are important sources of facts in marketing research.
Discrete: This type of variable takes only a fixed number of values. For example, the variable ‘occurrence of sale’ can take either of two values, ‘1’ for sale and ‘0’ for no sale, and so can be called a dichotomous variable. Similarly, ‘degree of liking’ is referred to as a polytomy because it takes multiple values: it can take the value ‘-1’ for dislike, ‘0’ for neither like nor dislike, and ‘1’ for like.
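The coding scheme described above can be sketched in a few lines of Python; the sample transactions and ratings are invented for the example:

```python
from collections import Counter

# Dichotomous variable: exactly two values.
SALE, NO_SALE = 1, 0
transactions = [SALE, NO_SALE, SALE, SALE, NO_SALE]

# Polytomous variable: multiple discrete values.
DISLIKE, NEUTRAL, LIKE = -1, 0, 1
ratings = [LIKE, NEUTRAL, DISLIKE, LIKE, LIKE]

# Because the codes are numeric, simple summaries fall out directly:
sale_rate = sum(transactions) / len(transactions)   # 0.6
liking = Counter(ratings)                           # counts per level

print(sale_rate, dict(liking))
```

Coding ‘sale’ as 1 and ‘no sale’ as 0 is a deliberate convenience: the mean of the dichotomous variable is then the sale rate itself.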
The above two types of variables are distinguished by the values they take. The other types of variables, classified by their relationship with other variables, are the following. Moderating variable (MV): The moderating variable is a second independent variable included in the study because it is believed to have a significant effect on the relationship between the main independent and dependent variables. For instance, consider the hypothesis: the introduction of a dearness allowance (IV) will lead to higher productivity (DV), especially among younger (age is the MV) workers. Here the workers’ age has a moderating effect on the original relationship.
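The IV–DV–MV hypothesis above can be dry-run on simulated data. The data-generating model below, including the effect sizes, is entirely hypothetical; it merely shows what a moderating effect looks like in numbers:

```python
import random

random.seed(1)

def productivity(has_allowance, is_young):
    """Hypothetical model: the allowance (IV) raises productivity (DV),
    and its effect is larger for younger workers, i.e. age (MV)
    moderates the IV -> DV relationship."""
    base = 50 + random.gauss(0, 2)       # baseline with noise
    effect = 10 if has_allowance else 0  # main effect of the IV
    if has_allowance and is_young:
        effect += 5                      # interaction: extra lift
    return base + effect

def mean(xs):
    return sum(xs) / len(xs)

# Compare the allowance's effect within each age group.
young_gain = (mean([productivity(True, True) for _ in range(500)])
              - mean([productivity(False, True) for _ in range(500)]))
old_gain = (mean([productivity(True, False) for _ in range(500)])
            - mean([productivity(False, False) for _ in range(500)]))

print(f"gain (young): {young_gain:.1f}, gain (older): {old_gain:.1f}")
```

Both groups gain from the allowance, but the young group gains noticeably more; that difference between the two gains is the signature of a moderating variable.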
Concepts are abstract ideas generalized from particular facts. They are characteristics associated with certain events, objects, conditions, situations and the like, and are used to classify, explain and communicate a particular set of observations. Concepts are developed out of personal or group experiences over time. The concepts developed are shared between users and thus form the basis for the development of new concepts. Concepts are also borrowed across fields; for example, the concept of distance is borrowed from physics and is used in attitude measurement to refer to the degree of difference between the attitudes of two people. Further, we keep adding new meanings to existing concepts, that is, we broaden them as we acquire more knowledge. But people tend to differ in the meanings they attribute to a concept, and this may cause problems in communication. For example, concepts like personality, leadership, motivation, social class, etc. have a variety of meanings, and so people may not perfectly understand each other when they use these words.
Constructs are highly abstract concepts. These are not directly tied with
reality but are derived on the basis of other concepts. These are
normally ideas or images specifically invented for a specific research or
theory building purpose.
they represent some idea or image to describe the qualitative
requirements of a news report.
Problems
DrinkIt, a soft drink company with a strong trendy image, now intends to
target the older generation in order to expand its market. They have
several alternatives before them. They can either shift the brand image
and project it as a soft drink for all ages or they can introduce
promotional campaigns with the message that the older generation can
project themselves younger by consuming DrinkIt. Or they can introduce
a new product for the older generation, with different packaging and a
different brand name. This is the managerial problem.
Now the research problem can be stated as, "Will the older generation
like to project themselves younger? How will the younger generation
react to each of these alternatives? Will too many brands create
confusion? "
The research problems are questions about the interaction between two
or three variables or concepts. To further analyze these problems, a
hypothesis is prepared.
Hypothesis
Types of Hypotheses
Laws
Once a hypothesis has been verified by numerous researchers, in different situations, the relationship between the variables may be considered a law. A law can be defined as a well-verified statement of an invariable association among variables. In business, we do not have many well-established laws, as the relationships are not fully invariable. We only have tentative laws that hold only to some extent. This is because of the presence of a number of IVs and MVs in real situations.
Theories And Models
1.9 STEPS IN THE RESEARCH PROCESS
But in most cases the decision-makers may not be willing to give the complete details of the problem to researchers, either because they do not see the need to do so or because they would like to maintain secrecy. Further, some decision-makers perceive problem identification as the duty of the researchers and do not see any role for themselves in it.
The reluctance of the management to discuss the problem, and the lack of initiative on the part of the researcher, often lead to either incorrect or partial problem identification. Obviously, if the research process is continued with such a defective problem identification and statement, the results will not be of any use, and the financial, human and temporal resources used in the research will have been wasted. To avoid such wastage, the researcher should identify the problem correctly in the first instance. Thus, in marketing research, problem identification is one of the most important steps; one can even say that right problem identification is research half done.
To identify the right problem and understand all its dimensions, the
researcher should ideally know the following.
But, as we have already seen, the researcher is not provided with all this information. So it is up to the researcher to use appropriate techniques to collect the needed information. For example, a researcher can analyze the problem statement given by the client word by word; this will bring out the real objectives of the client.
[Figure: a situation model relating problem P to variables r1 and r2.]
The researcher needs to understand not only the problem but also the objectives of the management; only then can the research objectives be aligned with the managerial objectives. Towards this end, the most important requirement in the research process is communication between the researcher and the decision-maker. The better the communication between them, the closer the problem statement will be to the actual problem. The problem will be further clarified if the researcher develops a situation model: a description of the variables and their relationships to the outcomes.
The researchers can ensure that the data collected is indeed relevant by asking questions about the possible findings due to each of the data elements. They should then trace the implications of each of these findings for the decision. If a finding does not affect the decision, then the data element that led to it should be dropped.
The major cost constraints in marketing research are time and money. These costs are justified only by the value of the information that results from the research. The decision-maker depends on research for additional information that can reduce the uncertainty about the situation. But the additional data has no value if it is not supplied in time; so the time factor is important, and time can be considered a major resource for marketing research.
The possible outcomes and their payoffs: When a problem is being considered, if the payoffs of the possible outcomes are not very different from each other, then it does not matter which one is chosen; in such cases, the value of the research is very low. The higher the difference in payoffs of the various outcomes, the more valuable the information becomes.
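The point about payoff spread can be made concrete with the standard expected-value-of-perfect-information calculation. The alternatives, states and payoff figures below are invented purely for illustration:

```python
# Payoffs for each (alternative, state) combination, in arbitrary units.
payoffs = {                                   # payoffs[alternative][state]
    "launch": {"market_good": 100, "market_bad": -40},
    "hold":   {"market_good": 10,  "market_bad": 10},
}
p_good = 0.5  # prior belief that the market is good

def expected(alt):
    """Expected payoff of an alternative under the prior."""
    return (p_good * payoffs[alt]["market_good"]
            + (1 - p_good) * payoffs[alt]["market_bad"])

# Without research: pick the alternative with the best expected payoff.
best_without_info = max(expected(a) for a in payoffs)

# With perfect information: pick the best alternative in each state.
best_with_info = (
    p_good * max(p["market_good"] for p in payoffs.values())
    + (1 - p_good) * max(p["market_bad"] for p in payoffs.values()))

# EVPI: an upper bound on what the research is worth.
evpi = best_with_info - best_without_info
print(evpi)  # 25.0
```

If the two alternatives had nearly identical payoffs in every state, `evpi` would shrink towards zero, which is exactly the text's point: little payoff spread means little value in researching the choice.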
Poor problem formulation will come to light during the marketing research exercise. Three main
signals indicate poor performance in this task.
Extensive Iteration
A researcher can use two types of data: primary data, data exclusively
collected for the current problem; secondary data, data collected for
some other purpose and which is useful in the present research. In an
exploratory survey, secondary data is used more regularly because of its
cost and time advantages, whereas in conclusive research the usage
depends upon the case.
The data can also be classified into survey data and experimental data
depending on the method of collection used to collect it. These methods,
i.e. the survey method and the experimental method, can be sources of
primary and secondary data.
I. Secondary Research – Utilization of data that were developed for some purpose other than solving the problem at hand.
a) Internal secondary data – Data generated within the organization itself, such as salesperson call reports, sales invoices and accounting records.
b) External secondary data – Data generated by sources outside the organization, such as government reports, trade association data and data collected by syndicated services.
II. Survey Research – Systematic collection of information directly from respondents.
a) Telephone interviews – Collection of information from respondents via telephone.
b) Mail interviews – Collection of information from respondents via mail or similar techniques.
c) Personal interviews – Collection of information in a face-to-face situation.
• Home interviews – Personal interviews in the respondent’s home or office.
• Intercept interviews – Personal interviews in a central location, generally a shopping mall.
d) Computer interviews – Respondents enter data directly into a computer in response to questions presented on the monitor.
e) Projective techniques and depth interviews – Designed to gather information that respondents are either unable or unwilling to provide in response to direct questioning.
III. Experimental Research – Researcher manipulation of independent variables in such a way that their effect on one or more other variables can be measured.
a) Laboratory experiments – Manipulation of the independent variable(s) in an artificial situation.
b) Field experiments – Manipulation of the independent variable(s) in a natural situation.
Step 4: Selection of the sample
Once the analytical methods have been selected and the proposal approved, the researcher should design the response instruments and generate dummy data. Dummy data is hypothetical data generated imaginatively by the researcher to check whether the analysis techniques are working as they should; it has all the characteristics of the original data. The dummy data should be fed into the analysis tables and checked for completeness. This analysis will expose any redundant data elements in the original plan, and will also reveal any missing essential data elements.
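The dummy-data dry run described above can be sketched as follows. The questionnaire fields, segment names and planned analysis table are all hypothetical, invented just to show the mechanics:

```python
import random

random.seed(42)

# Dummy data: hypothetical questionnaire responses generated to
# dry-run the planned analysis before any real fieldwork.
SEGMENTS = ["housewife", "student", "professional"]
dummy_responses = [
    {"segment": random.choice(SEGMENTS),
     "purchase_intent": random.randint(1, 5)}   # five-point scale
    for _ in range(200)
]

# Feed the dummy data through the planned analysis table:
# mean purchase intent by segment.
table = {}
for r in dummy_responses:
    table.setdefault(r["segment"], []).append(r["purchase_intent"])

for segment, scores in sorted(table.items()):
    print(f"{segment:12s} n={len(scores):3d} "
          f"mean={sum(scores) / len(scores):.2f}")

# If a planned data element never feeds any analysis table, it is
# redundant; if a table cannot be filled, a data element is missing.
```

Running the analysis on fabricated responses like these costs almost nothing, yet it surfaces redundant or missing data elements before the expensive fieldwork begins.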
The resources needed for research are time, finance and personnel. Time and financial requirements are inversely dependent on each other, i.e., if one tries to reduce financial expenses the research may take a longer time, and if one tries to conduct the same research in a shorter time the financial expenses may increase. Researchers need to strike a balance between these two resources.
Executive Summary – a brief statement of the major points from each of the other sections. The objective is to allow an executive to develop a basic understanding of the proposal without reading the entire proposal.
Background – A statement of the management problem and the factors that influence it.
Objectives – a description of the types of data the research project will generate and how
these data are relevant to the management problem. A statement of the value of the
information should generally be included in this section.
Time and cost requirements – an explanation of the time and costs required by the planned
methodology accompanied by a PERT chart.
Technical Appendixes – any statistical or detailed information in which only one or a few of the
potential readers may be interested.
Step 7: Prepare the Research Proposal
Data collection requires trained people to ensure the validity of the research. If a firm’s requirements do not warrant a permanent team, it can hire personnel from data collection suppliers. When the firm hires data collectors, it should take care that these hires are well trained. Moreover, data collectors, whether external or internal, should be given a complete picture of the research before they are assigned a task; this reduces errors because the data collectors’ interpretation will then be in tune with that of the researcher. Training and evaluation of field workers can also help standardize data collection methods and reduce errors.
Step 10: Reporting
However, one should remember that a written report might not really
invite action unless the management is very interested in it. Such an
interest can be generated only if the manager is involved in the research
from the beginning of the project. Also, many managers do not respond
to the written word. Some managers may respond, but may
misunderstand the written material. Hence the written report must be
supplemented with an oral report. This oral report, depending upon the
situation, can range from a briefing to a full-fledged audio-visual
presentation to an executive body.
(b) Main Text: Introduction – Company, Product / Service
Statement of findings, conclusions and recommendations
Summary / Synopsis
2.0 INTRODUCTION
Quite often managers face situations that are vague in nature. They may not be able to tell whether a situation presents an opportunity or poses a problem for them.
Usually the researcher studies the situation and identifies the main factors contributing to it. As the number of factors affecting the bottom line of an organization may be large, the researcher then converts these factors into specific hypotheses relating to possible actions. These hypotheses are then tested by conclusive research. For instance, a television company may notice a change in its sales figures but be unable to pinpoint the factors affecting sales; these may be technological changes, poor marketing efforts, changing consumer preferences, etc. In this case, the researcher may identify two factors and convert them into hypotheses that are subsequently tested by conclusive research. The process can be illustrated through the following figure.
[Fig. 2.1: Exploratory Research → Hypothesis → Decision. Source: Marketing Research, Harper W. Boyd, Jr., Ralph Westfall and Stanley F. Stasch.]
The information technology boom has made the search for data very easy. The internet is one of the largest repositories of secondary data, available at minimal or no cost at all. Several information technology techniques, such as data mining, help researchers collect the right data and also aid them in establishing connections. At times, researchers may be confused by the information glut.
It is absolutely important for the researcher to find new ideas. Hence
research is usually conducted by interviewing people who are
cooperative and have an interest in the subject being researched.
Respondents should be given the freedom to express their ideas
however radical they may be.
One major disadvantage of depth interviews is that their results cannot
be compared as interviewers have different interview styles. Another
major disadvantage is the difficulty in analysing the data. The data
available is highly subjective and varies from one analyst to another.
Association Techniques
Completion Techniques
The story completion method is similar to the sentence completion technique. Here the respondents are asked to complete a story told to them. Usually a specific situation, such as a couple visiting a shopping mall and having a disagreement over the purchase of a product, is presented to the respondent for completion. It has been found that respondents build the story using their own experiences and attitudes.
Construction Techniques
In a focus group interview, the moderator normally briefs the group about the topic to be discussed. Then the moderator puts some questions to the group. These questions are usually simple, often aimed at breaking the ice; for instance, the moderator may ask questions like, ‘What do you think about the product?’ Such questions are easy to answer, and slowly help generate a discussion among the group. Once the atmosphere is relaxed, the moderator may bring up more specific issues and carefully watch the proceedings to check whether the group is coming up with new ideas. Towards the end of the discussion, the moderator may give the group a task, then leave the room and watch the proceedings through a television (or a one-way window) to see if the discussion has caused the client to think of any more questions to ask.
Normally the researcher asks around nine to twelve questions. The moderator also informs the group that the proceedings are being watched by another group (clients/researchers); usually the discussion is watched by the organization’s staff through a one-way window.
Special care has to be taken to see that the moderator blends with the group: if the moderator is of the same age group and sex, the group members will express themselves freely. Normally a group consists of 6 to 12 people, although groups can range from one to a dozen, depending on the size of the study being conducted. Members of the group are selected on the basis of their familiarity with the product; if they are also articulate, they will contribute more effectively to the study. The sample should be dominated by the segment important to the project. It has been found that different forms of groups for different segments yield good results.
Features that are common and those that are uncommon are analysed
thoroughly to formulate hypothesis. Researchers should be careful in
selecting the cases for analysis.
Advantages
• Cases are studied comprehensively, taking into consideration all
aspects.
Focus groups can be an effective marketing research tool, but like all tools they need to be used properly in order to provide meaningful results. The most successful focus groups include the following characteristics.
Appropriate research objectives: Robert Bohle, president of Focus on Issues, a St. Louis-based marketing, consulting and research company, says the primary purpose of focus groups is to test and develop hypotheses. "Focus groups help define various customer population segments. They help companies make better judgements", says Bohle.
William Newbold, supervisor of marketing research at Detroit Edison, adds, "Focus groups are ideal for concept testing, copy testing and preliminary advertising testing. The focus group format allows the moderator to change things on the fly and retest. When you need a real fast turnaround and fast input, focus groups are appropriate", Newbold says. Bohle points out, however, that focus groups have a major limitation: they only provide directional information.
The central figure in focus groups is the moderator, who guides and leads the discussion.
This role is crucial to the overall success of the groups. However, a good moderator must
walk a tightrope between asking questions and eliciting feedback from all the respondents.
"The moderator has to be able to manage without leading (the respondents) and has to be
able to control strong personalities in the group", Newbold says. "A moderator is 70 percent
of what you get from a focus group. The moderator has to make everyone feel important,
so they will talk."
"You need a good moderator who knows the issues but isn't defensive - a moderator can't
be too close to the issues. It should be a third party", says Robert Sitkauskas, director of
communications technology for Detroit Edison's VRU system. Bohle agrees. "The
discussion guide and the moderator are key. The most important part is the ability of the
moderator to listen and probe without passing judgement."
Good recruiting: Another key to good focus groups is proper recruiting. Good
representation is crucial for achieving meaningful results. "The recruiting should be really
representative of the customer base," says Newbold. Representative and balanced focus
groups were one reason Detroit Edison's VRU groups were so successful. Sitkauskas says
focus groups are ideal for eliciting customer response from a variety of demographic groups
relatively quickly and easily.
Well-planned discussion guide : While Detroit Edison's focus groups had a clear
agenda, Detroit Edison was careful to build flexibility and fluidity into the groups. "You
should have an agenda," Sitkauskas explains, "but not a rule-based agenda." Bohle adds that
a discussion guide should be just that: a guide. Part of the success will depend upon the
ability of the moderator and the respondents to go beyond the original guide and delve into
the important underlying issues. Indeed, the accessibility issue was never a part of the
original focus group moderator’s guide. The utility thought power outages were the
problem. However, the moderator uncovered inaccessibility as an underlying problem.
Proper environment : Creating the proper environment is another key to the overall
success of focus groups. To be truly effective, the research sponsors must establish the
proper setting. "The setting has to provide the kind of environment where you can
communicate not what you want to hear, but what you ought to hear," Bohle says.
Interpretation : Focus groups are meaningless if the findings are not interpreted correctly.
You need someone insightful to draw the conclusions from the groups. Newbold says,
"People can jump to conclusions based on focus groups and can be misled by one strong
personality. We advocate holding multiple groups."
Since the researchers will be in association with the respondent for a longer period, they will
develop an informal relationship. This relationship will help in collecting more data.
Moreover, the data available will be accurate.
Some of the disadvantages of case study are :
Researchers tend to generalize the situations, though the case may not
call for such generalizations.
1. Case method
2. Statistical method
Case Method
Statistical Method
The following are some of the observations made by Forrester on online
purchases.
• Media and technology lead net shopping. More than 50% of online
shoppers buy software and more than 40% buy books. Buyers
spend an average of more than $100 over a three-month period.
Disadvantages
• Concomitant variations
• Sequence of occurrence
• Absence of other potential causal factors
Concomitant Variations
Sequence of Occurrence
2.22 Experimentation
made. Field experiments are done in a natural environment and so are
less controlled, but yield more realistic results.
Advantages
The degree to which the certainty of the causal relationship between two
variables can be established is highest in experimental studies. This is
mainly because the experimenter can manipulate the independent
variable and directly observe the effects of this manipulation on the
dependent variable. At least in theory, the ability to manipulate is
unlimited and the relationship can be established with complete certainty.
The use of a control group is useful to assess the existence and potency
of manipulation.
Disadvantages
Laboratory Experiment
Field Experiment
The time involved here is higher than in the laboratory as the observers
have to wait for a natural process to occur. Natural processes are often
slow compared to a laboratory experimental situation. The cost of the
experimentation could also be high as the number of variables, the time
and the amount of control necessary are all equally high.
Blind and Double Blind - If the test units do not know that they are a
part of an experiment, the experiment is said to be blind. If the
experimenters (as differentiated from the researchers) also do not know
that they are part of an experiment, then the experiment is known as
double blind.
Symbolic Representation
R O X O
(time →)

O X O O
X O O
O : O represents an observational instance, i.e., one complete set of
measurements made as a part of one observation. For example, pretest
is an observational instance.
A diagram with X’s and O’s is read from left to right in a temporal order
i.e., as we move from left to right in the diagram time moves forward.
X’s and O’s vertical to each other indicate that these two instances occur
simultaneously in different groups.
Parallel rows separated by a dotted line indicate that the groups were not
equalized by randomization process.
Parallel rows not separated by a line indicate that the groups have been
equalized by a randomized process.
O X O O
X O O X
environments and time periods. Let us see some factors that affect each
of these types of validity.
History
Maturation
Selection Bias
Statistical Regression
This is an unusual bias that occurs if the groups were selected on the
basis of their extreme characteristics. For instance, let us assume that an
observation (O1) was made on the frequency of purchase of an item. A
team was then selected consisting of the top 20% and the bottom 20% of
the buyers. After some time a second observation (O2) was made.
Irrespective of whatever happens between O1 and O2, it is found that
the observations tend to regress toward more average values from their
initial extremes, i.e., the high values at O1 reduce by O2, while the low
values at O1 increase by O2. This is because the units tend to move closer to
their long-run averages in the second observation. To avoid this, a
researcher should not choose a team based on such criteria of extremes.
External validity
The sample selected from a population may not be representative due to
many reasons. In such a case, results obtained by using that group
cannot be extrapolated to the population. Moreover, the artificial setting
of an experiment may affect the results.
a) Pre-experimental designs.
b) True experiments.
c) Extensions of true experimental designs.
d) Field experiments.
Pre-experimental Designs
X O
Here a single treatment is given and the observation is made. But there
are no previous observations to compare the results of these
observations with. For example, if an advertisement is released and
then the brand awareness is measured, one cannot gauge the effect of
the campaign without first assessing the pre-campaign awareness levels.
This is a better design than the previous one. The diagram can be given
as
O1 X O2
X O2
O4
In this design there are two groups – control and experimental. The
experimental group is first subjected to a treatment. After the treatment,
observations are made on both the groups. This design is useful in
situations where prior observations are not possible due to the sudden
and natural occurrence of the treatment. For instance, let us consider a
treatment like a natural disaster, say, the cyclone in Orissa. It is possible
to measure the psychological trauma of the people who underwent it.
Now another group of people who did not undergo this can be used as
the control group.
This design is better than the previous designs, but it suffers from a
major weakness : there is no way by which one can estimate whether
both these groups are equivalent. If they are not equivalent, equalizing
them is extremely difficult.
R O1 X O2
R O3 O4
In this design two randomly selected groups are pre-tested, and then the
experimental group alone is administered the treatment. After that both
groups are post tested. The history, maturation and regression, which
occur in the experimental group, occur in the control group as well. One
can account for these effects by using the difference in the control group
observations, that is, O4-O3. The final result is expressed as (O2-O1) - (O4-O3).
But problems like instrumentation, selection and mortality can affect the
results. Moreover, testing interaction with external environment is high in
this design. Other factors also can influence the internal validity. Also,
there is no guard against external validity in this design.
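The arithmetic of this design can be sketched directly: the experimental group's change is corrected by the control group's change (O4-O3). A minimal illustration; all observation values below are hypothetical, not from the text.

```python
# Pretest-posttest control group design: the treatment effect is the
# experimental group's change minus the control group's change.
# Observation values are hypothetical.
def treatment_effect(o1, o2, o3, o4):
    """(O2 - O1) - (O4 - O3): the experimental group's change,
    corrected for history, maturation and regression effects
    that also appear in the control group."""
    return (o2 - o1) - (o4 - o3)

# e.g. brand-awareness scores before/after an ad campaign
effect = treatment_effect(o1=40, o2=65, o3=41, o4=48)
print(effect)   # 18: a 25-point rise, of which 7 points occurred anyway
```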
The previous design is expensive due to the pre and posttest
requirements. Alternatively, we can have a Posttest only design, i.e., the
control and experimental groups are observed only after the treatment.
The groups are selected at random and the experimental group is
subjected to the treatment straight away. Then observations are made
on both the groups. The diagrammatic representation is as follows.
R X O1
R O2
Researchers do not use the true experimental designs as they are; they
are too simplistic as such. They are often extended to further, more
complex designs.
R O1 X1 O2
R O3 X2 O4
R O5 X3 O6
In the above example, if there is any reason to believe that the groups
are not equal and factors such as the type of the city in which the store
is located influence the results, then a block design is needed.
Let us assume that there are three types of cities, class A, class B and
class C. On the basis of these classes, the cities can be divided into
three blocks, say, A, B, C. Now each city in each block is assigned to a
different group at random. Here the size of the city is known as the
blocking factor.
This design can measure both main effects and interaction effects. The
main effect is the effect due to the treatment alone; it is not influenced
by any other factors. In the above example, the main effect is the effect
of various package designs on sales.
                        Blocking Factors
Active Factors      Class A     Class B     Class C
Design 1     R        X1          X1          X1
Design 2     R        X2          X2          X2
Design 3     R        X3          X3          X3
This effect is measured by averaging the sales of each treatment over
the three blocks. The interaction effect is the effect of one factor over
others. Here the effect of the size of the city on the choice of the
package design is the interaction effect. The interaction effect may or
may not be significant. The design serves its purpose only if there is a
significant difference across the blocks.
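The block assignment described above can be sketched in code: within each block (city class), the units are shuffled and each is assigned a different treatment at random. The city names below are purely illustrative.

```python
import random

# A sketch of randomized block assignment: within each block (city class),
# the cities are shuffled and dealt out to the three treatments.
random.seed(42)  # fixed seed so the sketch is reproducible
blocks = {
    "Class A": ["A1", "A2", "A3"],
    "Class B": ["B1", "B2", "B3"],
    "Class C": ["C1", "C2", "C3"],
}
treatments = ["X1", "X2", "X3"]

assignment = {}
for block, cities in blocks.items():
    shuffled = cities[:]
    random.shuffle(shuffled)
    # each city in the block receives a different treatment at random
    for city, treatment in zip(shuffled, treatments):
        assignment[city] = (block, treatment)

for city, (block, treatment) in sorted(assignment.items()):
    print(city, block, treatment)
```

Every block ends up containing all three treatments, so block-to-block differences can be separated from treatment effects.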
When there are two major factors whose influence should be measured,
the Latin Square Design is used. Let us continue with the Nimba
example. Let us consider two major factors in this example; the size of
the city and the size of the store. Each of the factors is divided into
various levels. A matrix is built with these two blocking factors, with
levels of one blocking factor forming the rows and the levels of the other
forming the columns. Each cell represents a unique combination of
these two factors. Each of these combinations is then assigned a
separate treatment such that each treatment appears only once in a row
and once in a column. This assignment places a severe restriction on
the method: the number of levels in each of the blocking factors should
be equal to the number of treatment levels.
            Class A    Class B    Class C
Large         X1         X2         X3
Medium        X3         X1         X2
Small         X2         X3         X1
In the example the stores are divided into three levels on the basis of
their size – Large, Medium and Small, and again into three levels on the
basis of the class of the city – Class A, Class B and Class C. The
number of levels of the blocking factors is equal to the number of
treatments, that is the package designs – X1, X2 and X3. Now these
treatments are assigned in such a way that there is no repetition of a
design in a row or in a column. The best design for these combinations
is then evaluated from the sales figures.
But this design does not consider the interrelationship of the variables:
the size of the store, the size of the city and the design. To do so, we
need a three-dimensional matrix with each of the three variables forming
an axis. Thus the matrix will contain twenty-seven combinations.
Though this becomes a limitation, one can repeat the Latin Square
Design to get these interrelations, but it becomes more time consuming
and expensive. If one is not interested in these interrelationships, then
the Latin Square will yield the required results.
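The cyclic pattern in the table above can be generated programmatically. A minimal sketch, using the labels from the example:

```python
# A sketch of a cyclic 3x3 Latin square: each treatment appears exactly
# once in every row (store size) and every column (city class).
treatments = ["X1", "X2", "X3"]
rows = ["Large", "Medium", "Small"]
cols = ["Class A", "Class B", "Class C"]

square = {}
for i, row in enumerate(rows):
    for j, col in enumerate(cols):
        # the cyclic shift guarantees the Latin-square property
        square[(row, col)] = treatments[(j - i) % len(treatments)]

for row in rows:
    print(row, [square[(row, col)] for col in cols])
```

This reproduces the assignment shown in the table (X1 X2 X3 / X3 X1 X2 / X2 X3 X1) and generalises to any number of treatment levels.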
Factorial Design

                      Package Designs
Price differences     X1    X2    X3
Once the treatments are obtained, any of the previous methods like
completely randomized design, randomized block design or Latin Square
analysis can be used for obtaining observations. Table 10.2 represents
various treatments thus obtained. These treatments are then assigned
randomly to various stores if we use a completely randomized design for
further analysis.
Covariance Analysis
This method is used quite widely and is different from the pretest-
posttest control group design. In this design the groups are not randomly
assigned.
O1 X O2
O3 O4
Here two different samples are involved. While pretest is done on one
sample, posttest is done on the other. This is applicable when we know
when and whom to measure, but do not know when and with whom to
introduce treatment. It is also used in instances where there is no way to
restrict the treatment to a particular group. It is even used in instances
where the pretest is highly reactive and can influence the results. For
example, if a company plans to have an intense campaign on time
management for its employees, a pretest is likely to increase the interest
of the first group and will result in more attention to the campaign. So
the posttest is done on a different group. The test can be
diagrammatically represented as follows.
R O1 (X)
R X O2
A time series design is any of the above designs with repeated
observations both before and after treatment. It is highly useful in cases
where the treatment is by the environment and is not in the hands of the
experimenter. In such cases neither the treatment nor its time is known
before the treatment occurs. For example, the effect of the changes in
government policies can be measured, if the data before and after the
changes were collected. Moreover, when a particular study needs a long
observation period, this type of repeated observation is essential.
Extraneous Variables

Uncontrollable variables, like competition.

Examples :
Example : Pepsi Co. measured the sales before launch of ‘Oya babli’
campaign and after the ad campaign. It noticed healthy growth in the
sales due to classic picturisation and content of the ad.
                      Experimental Group    Control Group
Before Experiment            E1                  C1
After Experiment             E2                  C2
Example : Most of the telecom companies reduce the tariffs, initially only
for say one circle, study the impact of reduction in tariff on sales and
market share and then repeat the same strategy in other circles too. In
such cases, the circle where the benefit is offered is called the
Experimental Group, whereas the consumers outside the circle form the
Control Group. They do not participate in the experiment, but hope to
get the benefit at a later stage and hence buy the pre-paid or post-paid
services of the same company which offered the benefit in the other circle.
CHAPTER 3
SOURCES OF SECONDARY DATA
3.0 INTRODUCTION
Once the research process starts, the researcher charts out the research
objectives. Then the researcher turns his attention to the research
design and determines the sources of marketing data. At times,
researchers make the mistake of conducting primary research for
collecting data while data is available from secondary sources at a lower
cost. Moreover, the secondary data sources provide information that
may not be obtained by other external agencies. However, the
researcher should evaluate the secondary data for its reliability and
accuracy. The researcher should also check whether the available data
will fulfill the requirements.
In this chapter we will discuss the nature of secondary data and its
advantages and disadvantages. We shall also survey the sources of
secondary data.
Collecting primary data involves field work and further analysis on the
data collected to arrive at a conclusion. For instance, a marketer who
wants to launch a particular product may be interested in collecting data
regarding the buying habits of consumers in that particular region. The
marketer can conduct field surveys to collect the relevant data, which, in
turn, can be analyzed to arrive at a proper conclusion. But at the same
time, he can refer to any published material that has already done an
analysis. While the first method is tedious, time consuming, and
expensive, the second method, which is collecting secondary data, is
fast and inexpensive.
Relevance
Relevance refers to the extent to which the data fit the information needs
of the research problem. While secondary data is available from many
sources, it may not be relevant to the current research. This is mainly
because the secondary data is collected from sources that are not
directly related to the problems at hand. For instance, data may be
available about the lifestyle pattern of a particular society. The study
may have been conducted with respect to different age groups with an
interval of 20 years. But this data would not be relevant for a marketer
who is interested in introducing a particular product for the age group 20-
25.
Another major problem is relevance with regard to time. While
secondary data may be available, it need not be relevant to the current
time period. The marketing environment is dynamic and needs up-to-
date information. For instance, in India, census is conducted once in ten
years. So, the data becomes less and less relevant over a period of
time.
Accuracy
Sufficiency
Even if the secondary data available is relevant and accurate it may not
be sufficient to meet all the data requirements of the problem being
researched. Though the data may provide relevant material for the
research, it need not contain data regarding the problem being
researched.
Availability
Secondary sources may not be able to provide all the data relevant for
some marketing problems. Thus, the researcher should check whether
the secondary data is available or not before a decision regarding the
selection of appropriate data source is taken.
Secondary data can be collected from various sources. But before using
the data it must be evaluated properly. Let us discuss some parameters
against which the data should be evaluated.
Pertinence
Units of measurement in the data and the project at hand should be the
same and should be relevant to the period of time. For instance, in India,
all measurements are done in the SI (Système International) system while
in the UK and the US measurements are done in the Imperial system. A
marketer may need information regarding the characteristics of people
within a certain region, say, for instance, coastal Andhra Pradesh.
However, demographic statistics may not be available for the regions
specified by the marketer. Most probably, information will be available
only for cities, states or countries. There is a measurement mismatch
between the available information and the required information. So while
collecting data from an external source, the researcher should take care
to ensure that the source follows the same measurement systems.
Publisher’s Credibility
The publisher’s credibility should be evaluated before collecting the data.
Who published the data? Why did they choose to publish it? Are the
clients using the data satisfied with the project? These are some of the
questions to be raised before using secondary data. For instance,
political organizations may publish statistical data through their own
mouthpieces. These data may be incomplete and may conceal some
factors that are necessary for drawing proper conclusions. On the other hand,
an organization whose business is collecting, analyzing and selling data
will provide accurate and unbiased data. Some organizations have the
authority to collect and publish certain data. Data published by such
organizations will certainly be more credible. For instance, the revenue
department will be able to publish data regarding the income distribution
in the country.
The data available within the organization, which may be published for
purposes other than the problem at hand, is called the internal data.
Internal Data may be sales reports, accounting records, inventory
reports, budgets, profit and loss statement, etc. External data is the data
available outside the organization. This can be data made available to
the organization by external research organizations. Syndicated sources
publish and sell data periodically and library sources include information
from a wide array of publications.
Census Data
In the US, the Bureau of Census publishes census data. Data published
by the Bureau of Census is known for its quality and credibility. In India,
the census report is published by Registrar General of India. The survey
is usually done once in ten years. Census data contains demographic
information. The report contains information on different aspects such as
age, sex, education and occupation. But one of the main disadvantages
of census data is that it has a time lag of three to four years. Yet this is
the only demographic data recognized by users as authentic.
Commercial Information
Consumer purchase data is extremely useful for devising marketing
strategies. It provides information which can be used for understanding
the market share, market segments, competitors, effects of advertising,
etc. Syndicated Research agencies provide such information on a
continual basis.
Advertising Data
Test Marketing
Name of the Source                                        Information provided
9)  Directorate General of Foreign Trade (DGFT)           Import Export Regulations
10) Exim Bank                                             Creditworthiness of importers and countries
11) Export Credit Guarantee Corporation of India (ECGC)   Insurance covers and financial guarantees
                                                          available to exporters
12) Agriculture & Processed Food Export Development       High Tech Agri Farming, technology tie-ups,
    Authority (APEDA)                                     seed capital, inspection, etc.
13) Central Statistical Organisation (CSO)                Industry Economics
14) National Sample Survey (NSS)                          Per capita consumption & monthly per capita
                                                          income, literacy per state, employment
                                                          across male & female, etc.
CHAPTER 4
HYPOTHESIS
4.0 INTRODUCTION
Definitions of Hypothesis
represent random sampling fluctuations. A test of significance is to verify
if the deviation of a statistic is statistically significant or not.
Standard Error

Now we find the number of standard errors by which the sample mean
differs from the hypothesized population mean:

z = (x̄ - µ) / σx̄ = (1470 - 1500) / 15 = -2, i.e., |z| = 2

From normal distribution tables, we find that the probability of the sample
mean differing from the population mean by 2 or more standard errors is
4.5%. This is too low a chance for the sample to be from a population of
the given mean. We conclude that the hypothesis that the population
mean is Rs. 1500 is wrong. Thus we prove that there is a drop in the
average purchases per customer per week from Rs. 1500.
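The calculation above can be checked with a few lines of Python, using the normal CDF via `math.erf`; the function name is illustrative.

```python
import math

def z_test(sample_mean, hyp_mean, std_error):
    """Return the z statistic and the two-tailed probability of a
    deviation at least this large under a standard normal distribution."""
    z = (sample_mean - hyp_mean) / std_error
    # normal CDF: Phi(x) = (1 + erf(x / sqrt(2))) / 2
    p_two_tailed = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_two_tailed

# Retail store example from the text: x̄ = 1470, µ = 1500, σx̄ = 15
z, p = z_test(1470, 1500, 15)
print(z, round(p, 3))   # -2.0 0.046, i.e. the text's 4.5%
```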
Testing for statistical significance follows a well-defined pattern. Though
one may not be able to understand all the terms in these steps at this
stage, we are mentioning them here. They will be discussed in
subsequent chapters. The steps are as follows:
Choose the statistical test: The choice of the statistical test is dependent
on the power and efficiency of the test, the nature of the population, the
method of drawing the sample and the type of measurement scale.
Compute the calculated difference value: After the data is collected, the
formula for the appropriate significance test should be used to obtain the
calculated value.
Obtain critical test value: The critical value for the calculated value
should be looked up in the appropriate tables. The critical value is the
criterion that defines the region of rejection from the region of
acceptance of the null hypothesis.
Make the decision: For most tests, if the calculated value is larger than
the critical value, we reject the null hypothesis and conclude that the
alternate hypothesis is accepted. If the critical value is larger, we
conclude that we have failed to reject the null.
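The four steps can be sketched as a small function; the function name and the 1.96 critical value (a two-tailed 5% test) are illustrative, not from the text.

```python
# A minimal sketch of the significance-testing pattern, using the z test
# as the chosen statistical test.
def run_significance_test(sample_mean, hyp_mean, std_error, critical_value):
    # Step 2: compute the calculated value for the chosen test (here |z|)
    calculated = abs(sample_mean - hyp_mean) / std_error
    # Steps 3-4: compare against the critical value from the tables
    if calculated > critical_value:
        return "reject H0"
    return "fail to reject H0"

# Retail store example: |z| = 2 exceeds 1.96, so H0 is rejected
print(run_significance_test(1470, 1500, 15, 1.96))  # reject H0
```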
[Figure: a two-tailed test at the 5% significance level. The central 95% of
the area forms the acceptance region; each tail holds 2.5% of the area
and forms a rejection region.]
4.13 Formulating A Hypothesis
In the above example, the null hypothesis is that the average purchase
has not changed from Rs. 1500. It is represented by

H0 : µ = Rs. 1500

In the above example, the alternative hypothesis is that there has been a
change in the average purchases per week from Rs. 1500. We can have
three different alternative hypotheses about this change. These are
indicated below:

HA : µ ≠ Rs. 1500
HA : µ > Rs. 1500
HA : µ < Rs. 1500
In Table 4.14, four cases are presented. When the alternative hypothesis
is true, it means that the null hypothesis is false. Using this concept we
can deduce that the correct cases are accepting a true null hypothesis
and rejecting a false null hypothesis. From the table it is clear that in any
testing problem we are liable to commit two types of errors.
Here, α = 0.05, or 5%
The value that separates the acceptance region from the rejection region
is called the critical value. In the above problem the critical values are
Rs. 1470 and Rs. 1530 at a given significance level of 5%. Alternatively,
for a given
significance level we can calculate the critical values above or below
which a hypothesis can be rejected or accepted.
Let us assume that the mean has actually moved from 1500 to 1470.
Our null hypothesis is that the average purchase is 1500. This is false.
The probability of not finding this out, which is nothing but assuming that
the given hypothesis is correct, is (β) 95%. For a different population
mean the value of β will be different. Ideally, a zero β indicates an error
free test. This means that ideally 1- β must be equal to 1. The closer
this value is to 1, the better is the test. 1-β is considered the power of
a hypothesis test, for it is the probability of rejecting a false null
hypothesis.
                 Accept H0               Reject H0
H0 is true       Correct                 Wrong: Type-I error
HA is true       Wrong: Type-II error    Correct
Questions like the size of the sample, the quality of the sample size and
weighted data can be raised. These questions will be answered in
advanced statistics books and researchers should make use of them
when required.
Two samples are often used when there are two different products. Two
samples, one for each product, are taken and tested to find out whether
they belong to the same population.
Table 4.1 lists the various statistical techniques appropriate for different
measurement levels and test situations. ANOVA is discussed in the text,
but in a separate chapter. Only the most commonly used tests are
surveyed in the following sections. Non-parametric tests, except the chi-
square test, call for an involved discussion and so are not discussed
here. Refer to advanced statistics books for studying these methods in
detail.
Parametric Tests
Let us illustrate the Z test with the same retail store chain example.
Suppose we have a sample of 121 accounts. It is found that the sample
mean is Rs. 1470 and the sample standard deviation 165. Can you tell
with 90% confidence that the sample mean has not changed from Rs.
1500?
t/z = (x̄ - µ) / (s/√n)
Calculated Value :
Standard error = s/√n = 165/√121 = 15
|z| = |1470 - 1500| / 15 = 2
Degrees of freedom = n - 1 = 121 - 1 = 120
Critical Value : From the tables, for a significance level of 10%, we get a
critical value of 1.289.
Decision : Here the calculated value is greater than the critical value, and
so we reject the null hypothesis and conclude that the average has
changed.
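This calculation can be sketched in Python; the function name is illustrative.

```python
import math

# A sketch of the one-sample test statistic t/z = (x̄ - µ) / (s/√n).
def one_sample_z(sample_mean, hyp_mean, s, n):
    std_error = s / math.sqrt(n)     # 165/√121 = 15 in the example
    return (sample_mean - hyp_mean) / std_error

# Retail chain example: n = 121, x̄ = 1470, s = 165, hypothesized µ = 1500
z = one_sample_z(1470, 1500, 165, 121)
critical = 1.289  # from the tables, 10% significance, 120 degrees of freedom
print(z, abs(z) > critical)   # -2.0 True: reject H0, the average has changed
```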
The procedure is the same as for single sample tests, except the
formulae for finding z and t values. The required formula for the z test is
z = ( (x̄1 - x̄2) - (µ1 - µ2)0 ) / √( S1²/n1 + S2²/n2 )

When a pooled estimate of the variance Sp² is used, the formula is as
follows:

t/z = ( (x̄1 - x̄2) - (µ1 - µ2)0 ) / √( Sp² (1/n1 + 1/n2) )

where

Sp² = ( (n1-1)S1² + (n2-1)S2² ) / (n1 + n2 - 2)

Two Related Samples
For two related samples, the test statistic is

t = D̄ / (SD/√n)

where

D̄ = ∑D / n

SD = √[ ( ∑D² - (∑D)²/n ) / (n - 1) ]
This is the most widely used non-parametric test, particularly for nominal
data, but it can also be used for higher scales. It is used on actual values
rather than percentages. It is used to find whether the difference between
the observed and expected frequencies is significant:

χ² = ∑ (Oi - Ei)² / Ei, summed over the k categories
One sample Test
Where
Care should be taken in using the chi square method in the following
cases:
Alternative hypothesis is

HA : Oi ≠ Ei
Statistical test: The responses are divided into nominal categories and
so we should use Chi square analysis
Calculated value: Using Table 4.2, we calculate the value of chi-square
to be χ² = 12.68.
Decision: Here the calculated value is greater than the critical value, and
so we reject the null hypothesis and conclude that the categories do
have an effect on the intent to purchase a new car.
Table 4.2: The Data and Calculations for Chi-Square with Single
Sample Problem
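Since the underlying data of Table 4.2 is not reproduced here, the sketch below uses hypothetical counts to show how such a χ² value is computed.

```python
# A sketch of the one-sample chi-square computation; the category counts
# are hypothetical, not the text's Table 4.2 data.
def chi_square(observed, expected):
    """χ² = Σ (Oi - Ei)² / Ei over the k categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [45, 30, 15, 10]    # e.g. intent-to-purchase responses
expected = [25, 25, 25, 25]    # H0: all four categories equally likely
chi2 = chi_square(observed, expected)
print(chi2)   # 30.0; compare against the critical value for k-1 = 3 df
```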
The basic methodology is the same as in the one sample test, but the
formula involved is as follows. Here the data is categorized and so is
placed in a two-way contingency table.

χ² = ∑i ∑j (Oij - Eij)² / Eij
104
milk found was 0.98 litre. Can we say with 95% confidence that the
machine is working properly?

Test statistic = t/z = (x̄ - µ) / (s/√n)

Data : x̄ = 0.98 lit., µ = H0 = 1 lit., n = sample size = 100, standard deviation
= s = 0.01 lit.

Hence t/z = (0.98 - 1) / (0.01/√100) = -0.02 / 0.001 = -20, so |t/z| = 20
Test statistic = t/z = (x̄ - µ) / (s/√n)

t/z = (60 - 50) / (5/√1000) = 10 / (5/31.62) = 10 / 0.158 = 63.29
CHAPTER 5
SAMPLING
5.0 INTRODUCTION
The terminology of sampling has evolved over the period of its existence.
Knowledge of these terms is necessary for understanding sampling. Let
us examine these terms through the hypothetical case of Wild Goose, a
marketing research firm. This firm wants to find out the types of movies
the owners of VCD players in India would like to watch.
5.11 Element
5.12 Population
The subset of the elements of the population chosen for study is called
the sample or the study sample. The characteristics of a good sample
are discussed later in the chapter. Wild Goose may choose a few cities
for sampling and within these cities it may further select a few families.
The list of the families, thus selected, forms the sample used in the
study.
parameter. Statistics are used to estimate the corresponding
parameters.
Table 5.13 Samples taken to measure the average size of the family.
There are two types of errors (i) imprecision inherent in using statistics to
estimate parameters and, (ii) errors associated with applying a decided
sampling procedure. If probability samples are used, sampling theory
can estimate the degree of imprecision that may be associated with a
sampling design.
information, albeit with some uncertainty, within the given resource
constraints.
5.22 Accuracy
5.23 Impossibility
5.25 Quality
Before we deal with complex sampling methods, we shall study basic
sampling concepts.
Mean ξ = 4.15
Standard Deviation σ = 1.27
Now let us assume that we have taken five different samples and their
results are as follows.
The mean and standard deviation of the five samples do not match with
each other or with the population mean and standard deviations. Then
how do we determine the real mean, or at least the range in which it
falls?
According to the central limit theorem, for a sufficiently large sample
(n > 30, where n is the number of items in the sample), the sample mean
will be distributed around the population mean in a form of distribution
referred to as the normal distribution. This result does not depend on the
shape or the size of the population distribution. The major characteristics
of a sample mean distribution are as follows (see Figure 5.5):
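The theorem is easy to see in a small simulation. The population here is the face of a fair die (an assumed toy population, not data from the text):

```python
import random

# A small simulation of the central limit theorem described above:
# whatever the population's shape, means of sufficiently large samples
# (n > 30) cluster around the population mean.
random.seed(42)

population_mean = 3.5          # fair six-sided die: (1 + 2 + ... + 6) / 6
n, num_samples = 36, 2000      # sample size and number of samples

sample_means = [
    sum(random.randint(1, 6) for _ in range(n)) / n
    for _ in range(num_samples)
]

grand_mean = sum(sample_means) / num_samples
print(round(grand_mean, 2))    # close to 3.5
```

A histogram of `sample_means` would show the familiar bell shape even though the die itself is uniformly distributed.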
For example, 68.2% of the sample means lie within one standard error
from the mean. 95.4% of the sample means lie within two standard
errors from the mean. Similarly, one can calculate the proportion of the
means within a given limit by using normal distribution tables.
Now let us assume 95.4% of all sample means lie within two standard
errors from the sample distribution mean. So, if one takes a sample
mean, there is a 95.4% probability of finding the sample distribution
mean within the two standard errors of the given sample mean. But the
sample distribution mean is the same as that of the population mean.
This means that there is a 95.4% chance of finding the population mean
within two standard errors from the sample mean. To put it in another
way, we can say with 95.4% confidence that the population mean will be
in the range of two standard errors from the sample mean. Symbolically,
the interval is represented as x̄ ± 2σx.
[Figure: population and sample distributions]
Using this concept and a given sample mean and a given standard error,
we can calculate an interval and say with a certain degree of confidence
that the population mean will lie within the given interval. This is known
as the confidence interval estimate.
But how does one calculate the standard error of a sample mean
distribution? This can be done by the formula

         σ
σx = --------
        √n

where σx = standard error of the mean, σ = standard deviation of the
universe, and n = number of observations in the sample.

When the sample is drawn without replacement from a finite population of
size N, the finite population correction is applied:

σx = (σ/√n) · √((N − n)/N)
Let us find the interval in which the population mean may lie with a
confidence of 95.4%.
4.23 ± 2*0.03
= 4.23 ± 0.06
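The interval above can be reproduced in code. The sample size behind the example is not stated in the text, so n = 1800 is an assumption chosen here because, with the population σ = 1.27 given earlier, it yields the 0.03 standard error used in this example:

```python
import math

# How the 95.4% confidence interval above is assembled. The sample size n
# is not stated in the text; n = 1800 is an assumption chosen because,
# with the population sigma = 1.27 given earlier, it reproduces the 0.03
# standard error used in the example.
sigma, n = 1.27, 1800
sample_mean = 4.23

se = sigma / math.sqrt(n)               # standard error ≈ 0.03
lower = sample_mean - 2 * se            # 95.4% → ±2 standard errors
upper = sample_mean + 2 * se
print(round(se, 2), round(lower, 2), round(upper, 2))  # 0.03 4.17 4.29
```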
We can represent the distance from the mean by the number of standard
deviations. This number is represented by Z and is given by

      x̄ − µ
Z = ---------
        σx
Recall when we learnt about estimating sample sizes, right after we learnt about confidence
intervals but before we learnt about hypothesis testing. The sample size calculations took into
account the confidence level and the width of your confidence interval. We also learned that Type I
error is the mistake of concluding that there is a difference when really there is not one, and
“confidence” is the probability that we’re not making a Type I error. Our sample size calculations
were designed to control Type I error.
Type II error, on the other hand, is the mistake of saying that there is not a difference when there is
one. The probability of not making a Type II error is called “power”. It turns out those standard
formulas we learned for determining sample size tacitly assume only 50% power.
What this means. Suppose we are interested in a pre/post-advertising study and that we want to
detect a 10% difference in unaided awareness before and after. Using our standard sample size for
confidence interval analysis, we’d compute a sample size of 193 pre-study and 193 post-study. This
means that, even if there is a real 10% difference in the marketplace, our research design has
only a 50% chance of detecting it! In other words, there’s a 50% chance of incorrectly concluding
that our company’s advertising had no effect and is a poor use of our employer’s money.
The table identifies sample sizes required for specific levels of power and for specific magnitudes of
differences between two independent proportions at 95% confidence. Power computations allow for
other levels of confidence and for a wide variety of other statistical tests, but Figure 1 demonstrates
the concept. The table allows us to determine the sample size that best meets our statistical testing
needs, because it accounts for both Type I and Type II error. A rule of thumb is that we should
aim for at least 70% power.
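The power figures quoted here can be approximated with a standard two-proportion power formula. The baseline awareness level of 40% below is an assumption for illustration only; the text does not state it:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_proportions(p1, p2, n, z_alpha=1.96):
    """Approximate power of a two-sided two-proportion z-test
    with n respondents in each group, at 95% confidence."""
    se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    return phi(abs(p1 - p2) / se - z_alpha)

# Assumed pre-awareness of 40% rising to 50% (a 10-point difference);
# the 40% baseline is an illustration, not a figure from the text.
print(round(power_two_proportions(0.40, 0.50, 193), 2))  # roughly 50% power
print(round(power_two_proportions(0.40, 0.50, 348), 2))  # noticeably higher power
```

Under these assumptions, n = 193 per group gives power near 50%, and raising n to 348 per group roughly halves the Type II error, consistent with the discussion above.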
Note that a sample size of about 150 gives just a 40% chance of detecting a real 10%
difference in, for example, unaided advertising awareness. Why spend money on research
that has a better than even chance of calling successful advertising a failure?
The table also helps us quantify the efficiency of different sample sizes. For instance,
remember hearing that sample size is a diminishing-returns sort of thing? That's because the
width of a confidence interval is only halved when sample size is quadrupled. But power isn’t
like that; often we can cut Type II error in half by less than doubling sample size. Recall the
example of the pre and post ad test above. With 193 respondents pre and 193 post, we have
a 50% chance of detecting a real 10% change in unaided awareness. By increasing sample
size to 348 pre and 348 post (an increase in sample size that yields approximately a 40% increase
in cost) we cut our chance of missing a genuine 10% bump in unaided awareness in half.
Designing, selling or buying research without understanding the implications of sample size
on statistical power is likely to lead you to spend an unnecessary amount of money, both
directly (money spent on useless research) and indirectly (acting on false negatives, such
as concluding that a successful new ad campaign is a failure).
-------------------------------------------
www.intelliquest.com
Probability designs

A. Simple random
Description: Assign to each population member a unique number; select
sample items by use of random numbers.
Advantages:
1. Requires minimum knowledge of population in advance.
2. Free of possible classification errors.
3. Easy to analyze data and compute errors.
Disadvantages:
1. Does not make use of knowledge of population that researcher may have.
2. Larger errors for same sample size than in stratified sampling.
B. Systematic
Description: Use natural ordering or order population; select random
starting point between 1 and the nearest integer to the sampling ratio
(N/n); select items at intervals of the nearest integer to the sampling
ratio.
Advantages:
1. If population is ordered with respect to pertinent property, gives
stratification effect and hence reduces variability compared to A.
2. Simplicity of drawing sample; easy to check.
Disadvantages:
1. If sampling interval is related to periodic ordering of the population,
increased variability may be introduced.
2. Estimates of error likely to be high where there is stratification
effect.
C. Multistage random
Description: Use a form of random sampling in each of the sampling stages,
where there are at least two stages.
Advantages:
1. Sampling lists, identification and numbering required only for members
of sampling units selected in sample.
2. If sampling units are geographically defined, cuts down field costs
(i.e. travel).
Disadvantages:
1. Errors likely to be larger than in A or B for same sample size.
2. Errors increase as number of sampling units selected decreases.

With probability proportionate to size
Description: Select sampling units with probability proportionate to their
size.
Advantage: Reduces variability.
Disadvantage: Lack of knowledge of size of each sampling unit before
selection increases variability.
D. Stratified
1. Proportionate
Description: Select from every sampling unit, at other than the last
stage, a random sample proportionate to size of sampling unit.
Advantages:
1. Assures representativeness with respect to property that forms basis of
classifying units; therefore yields less variability than A or C.
2. Decreases chance of failing to include members of population because of
classification process.
3. Characteristics of each stratum can be estimated and hence comparisons
can be made.
Disadvantages:
1. Requires accurate information on proportion of population in each
stratum; otherwise increases error.
2. If stratified lists are not available, may be costly to prepare them;
possibility of faulty classification and hence increase in variability.
2. Optimum allocation
Description: Same as D1 except sample is proportionate to variability
within strata as well as their size.
Advantage: Less variability for same sample size than D1.
Disadvantage: Requires knowledge of variability of pertinent
characteristic within strata.

3. Disproportionate
Description: Same as D1 except that size of sample is not proportionate to
size of sampling unit but is indicated by analytical considerations or
convenience.
Advantage: More efficient than D1 for comparison of strata or where
different errors are optimum for different strata.
Disadvantage: Less efficient than D1 for determining population
characteristics, i.e. more variability for same sample size.
E. Cluster
Description: Select sampling units by some form of random sampling;
ultimate units are groups; select these at random and take a complete
count of each.
Advantages:
1. If clusters are geographically defined, yields lowest field costs.
2. Requires listing only of individuals in selected clusters.
3. Characteristics of clusters as well as those of population can be
estimated.
4. Can be used for subsequent samples, since clusters, not individuals,
are selected, and substitution of individuals may be permissible.
Disadvantages:
1. Larger errors for comparable size than other probability samples.
2. Requires ability to assign each member of population uniquely to a
cluster; inability to do so may result in duplication or omission of
individuals.
F. Stratified cluster
Description: Select clusters at random from every sampling unit.
Advantage: Reduces variability of plain cluster sampling.
Disadvantages:
1. Disadvantages of stratified sampling added to those of cluster
sampling.
2. Since cluster properties may change, advantage of stratification may be
reduced and make sample unusable for later research.
G. Repetitive: multiple or sequential
Description: Two or more samples of any of the above types are taken,
using results from earlier samples to design later ones or determine if
they are necessary.
Advantage: Provides estimates of population characteristics that
facilitate efficient planning of succeeding sample; therefore reduces
error of final estimate.
Disadvantages:
1. Complicates administration of fieldwork.
2. More computation and analysis required than in non-repetitive sampling.
3. Sequential sampling can be used only where a very small sample can
approximate representativeness and where the number of observations can be
increased conveniently at any stage of the research.
Non-probability designs

Judgment
Description: Select a subgroup of the population that, on the basis of
available information, can be judged to be representative of the total
population; take a complete count or subsample of this group.
Advantage: Reduces cost of preparing sample and fieldwork, since ultimate
units can be selected so that they are close together.
Disadvantages:
1. Variability and bias of estimates cannot be measured or controlled.
2. Requires strong assumptions or considerable knowledge of population and
subgroup selected.
Quota
Description: Classify population by pertinent properties; determine
desired proportion of sample from each class; fix quotas for each
observer.
Advantages:
1. Same cost considerations as Judgment.
2. Introduces some stratification effect.
Disadvantage: Introduces bias of observers' classification of subjects and
non-random selection within classes.

Convenience
Description: Select units of analysis in any convenient manner specified
by the researcher.
Advantage: Quick and inexpensive.
Disadvantage: Contains unknown amounts of both systematic and variable
error.
Snowball
Description: Select units with rare characteristics; additional units are
referred by initial respondents.
Advantage: Useful only in highly specific applications.
Disadvantage: Representativeness of rare characteristic may not be
apparent in sample selected.
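Two of the simplest designs above, simple random and systematic sampling, can be sketched on a toy population of 100 numbered members (the population and sample size are hypothetical, chosen only to show the mechanics):

```python
import random

# A quick sketch of two of the probability designs tabulated above,
# using a toy population of numbered members.
random.seed(7)
population = list(range(1, 101))   # members numbered 1..100
n = 10                             # desired sample size

# A. Simple random: draw n members by random numbers, without replacement.
simple = random.sample(population, n)

# B. Systematic: random start within the sampling interval N/n,
# then every (N/n)-th member thereafter.
interval = len(population) // n    # N/n = 10
start = random.randint(0, interval - 1)
systematic = population[start::interval]

print(sorted(simple))
print(systematic)
```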
where

        S
Sx = --------
        √n

(S being the sample standard deviation and Sx the estimated standard error
of the mean.)
Example
A team intends to find the size of the sample they require to estimate the
number of cars owned on an average by the population of Delhi. They
decided that they should have a confidence of 95% and that the size of
the interval estimate should be 0.05 car. They also conducted a
preliminary survey that resulted in a standard deviation of 0.42 car.
Sample design can be basically of two types; probability and non-
probability sampling. Each of these sampling methods contains a variety
of sampling types (sub-designs).
        e       0.05
σx = ------ = -------- ≈ 0.026
        Z       1.96

        S²       (0.42)²
n = -------- = ----------- ≈ 261
       σx²      (0.026)²
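The calculation can be reproduced in code. Note that the ≈261 figure comes from rounding the standard error to 0.026; carrying full precision through n = (Z·S/e)² gives approximately 271:

```python
# Reproducing the Delhi car-ownership sample-size calculation.
z, s, e = 1.96, 0.42, 0.05   # 95% confidence, preliminary SD, tolerable error

sigma_x = e / z                            # required standard error ≈ 0.0255
n_exact = (z * s / e) ** 2                 # ≈ 271 carrying full precision
n_rounded = (s / round(sigma_x, 3)) ** 2   # ≈ 261, using the rounded 0.026
print(round(n_exact), round(n_rounded))    # 271 261
```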
sampling methods like systematic sampling, stratified sampling etc., are
given in Table 5.6 along with their advantages and disadvantages.
Non-probability sampling
Sample size = n = (Z S / e)²

where n = sample size, Z = standard normal value for a certain confidence
level, S = population standard deviation and e = tolerable error in
estimating the variable.
Illustration
Solution:
First we compute S:

      Max value of cust. satisfaction − Min value of cust. satisfaction
S = --------------------------------------------------------------------
                                    6

      10 − 1
  = ---------- = 1.5
         6

The value of Z for a 5% significance level is 1.96.
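The computation can be completed in code. Since the illustration does not state the tolerable error e, the value e = 0.25 below is an assumption used only to demonstrate the final step n = (Z·S/e)²:

```python
# The range rule of thumb used above: S ≈ (max − min) / 6.
s_est = (10 - 1) / 6   # = 1.5 on a 1-10 satisfaction scale
z = 1.96               # 5% significance level, two-sided

# The tolerable error e is not given in the illustration; e = 0.25 is an
# assumed value purely to show the final step.
e = 0.25
n = (z * s_est / e) ** 2
print(s_est, round(n))  # 1.5 138
```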
Solution:
Sample Population : All women in Pune.
Sample Frame : All women between the age group 10-50.
Sampling Method : Stratified.
Sampling Plan : The sample frame is divided into 4 groups as follows:
Justification : Beauty creams are costly, and hence stratified sampling
will ensure representation across income levels, i.e. affordability. It is
seen that at the higher secondary school level, girls become more
conscious about their looks; hence, the age limit begins at 10. At age 50,
women might value natural beauty more. Four groups are formed to
understand the consumer profile and its preferences in depth.
Sample size : 1% from each group. (The sample frame for Pune contains 8
lacs women.)
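The allocation can be sketched in code. The four strata names and sizes below are hypothetical; only the 1% sampling fraction and the 8-lakh sample frame come from the text:

```python
# A sketch of the 1%-per-stratum allocation in the Pune plan. The four
# strata and their sizes are hypothetical; only their total (8 lakh women
# in the sample frame) and the 1% fraction come from the text.
strata = {
    "school girls (10-17)": 150_000,
    "college students (18-24)": 200_000,
    "working women (25-40)": 300_000,
    "women 41-50": 150_000,
}

# Integer 1% allocation within each stratum.
sample = {name: size // 100 for name, size in strata.items()}
total_sample = sum(sample.values())
print(sample)
print(total_sample)   # 1% of 8,00,000 = 8000
```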