Galileo Galilei used the telescope to show that the Earth revolves around the Sun
“Any organized inquiry designed and carried out to provide information for solving a
problem.” (C. William Emory)
In essence, business research provides the needed information that guides managers
to make informed decisions to successfully deal with problems.
“Management Research” is an unbiased, structured and sequential method of
enquiry directed towards a clear implicit or explicit business objective. This might
lead to validating the existing postulates, or arriving at new theories and models.
It aids the manager in his decision-making process.
Business Research Methods
Unit 1 – Introduction to Research
Organizational Level
• Regulatory Compliance
• Customer Driven
• Competition Driven
• Innovation – Technology
• Growth, Profitability
• Environmental Concerns
• Economic Considerations
• Quality, Safety, Health
• Social Aspects
• Failures, Unforeseen Changes
• Complexity of the real world
• Availability of better Tools
Types of Research
There are many types of research depending on the ‘basis’ used – application, purpose, methodology, etc.
Hence the different ‘classifications’, such as:
Relevance & Significance of Research (or this subject of BRM) for MBA / PGPM Students
Projects in any subject, when done properly, give better understanding, practical
insights, industry relevance, and application of theory / concepts
Academic Enhancements
• Scientific, objective and logical
• Scope and limitations clearly spelt out
• Purpose is to find answers to pertinent questions, or to find solutions to problems
Terms: Concept, Construct … and some more
UNIT-1
Research Methodology – Introduction
Compiled by: Dr Vanita Joshi
Business Research Methods – STRUCTURE (Session Plan)
• Sessions 1–4: Introduction to Research – Meaning & Definition, Significance, Overview of Methodology, Categories of Research, Research Types
• Sessions 5–8: Research Design – Experimental Design, Validity; Types – Pre-, Quasi-, True & Statistical; ANOVA; Randomized, Block, Latin Square & Factorial designs (SPSS); Numericals; CEC 1 Case Analysis; Assignment of Group Projects (Semester II)
• Session 9: SPSS Lab – Analysis & Interpretation of output – ANOVA
• Session 10: Case Analysis – CEC 2; SPSS
• Sessions 11–13: Scales & Measurement – Attitude Measurement; Comparative vs Non-comparative Scales; Reliability & Validity Tests
• Sessions 14–17: Survey Research – Data Collection Methods; Questionnaire Design; Qualitative vs Quantitative Methods; Observation, In-depth Interviews, Projective Techniques (Guest sessions from industry experts)
• Session 18: Data Preparation – Coding, Outliers, Missing Values; Mid Sem
• Sessions 19–23: Multivariate Techniques – Introduction; Multiple Regression – Testing of Goodness of Fit, Dummy Variables; SPSS Lab Analysis & Interpretation; Numericals; Case Analysis
• Sessions 24–28: Multivariate Techniques – Interdependence Techniques; Factor Analysis – SPSS Lab Analysis & Interpretation; Numericals; Case Analysis
• Sessions 29–30: Presentations – Projects
• Sessions 31–32: Other Multivariate Techniques
• Session 33: Report Writing
Content
Part A
Introduction- Research &
Business Research
Research
• The process of finding a solution to a problem after
thorough study and analysis of the situational factors.
• Managers in organizations are constantly engaged in
studying and analyzing issues …
• They are thus involved in some form of research activity in
their workplace.
• Sometimes their decisions are good, and sometimes not…
P&G Luring Women with their
Feminine Toothpaste
Harley Davidson Exploring New Markets
• Problem: Flat domestic sales
• Solution:
– In 1999, Harley-Davidson started a rental program which provided a way to hook
customers on riding and thereby entice them into buying a motorcycle.
– 40 percent of those enrolled in the program were female and about 30 percent
were under the age of 35
• Result:
– Motorcycle rental days went up from 401 in 1999 to a total of 224,134
worldwide in 2004.
– 32 percent of rental customers surveyed bought a bike or placed an order after
renting, another 37 percent were planning to buy one within a year.
– Nearly half of the renters spent more than $100 on Harley-Davidson accessories,
such as T-shirts and gloves.
Launching of iPod
What is Research ?
• Research is the process of finding solutions
to a problem after a thorough study and
analysis of the situational factors.
• Research is a structured enquiry that utilizes
acceptable scientific methodology to solve
the problems and create new knowledge
that is generally applicable.
What is Business Research?
• Business Research may be defined as the
“systematic and objective process of
gathering, recording and analyzing data for
aid in making business decisions” (Zikmund,
Business Research Methods, 2002, p. 6)
• A systematic process and objectivity are the distinguishing
features of business research, which is an important tool
for managers and decision-makers in corporate and
non-corporate organizations
Why study Business Research?
• Every organisation faces various operational and
planning problems .
• Research is useful to accelerate decision making; it alone
can make possible the identification of the determinants
of a problem.
• Its tools are applied effectively in studies involving sales
analysis, demand forecasting, product positioning, new
product potential, performance evaluation, etc.
Why study Business Research?
(Contd..)
• Your job as a treasurer, controller, brand
manager, product manager, marketing or sales
officer, project manager, business analyst or
consultant would involve decision making –
choosing the right course of action from many
alternatives.
• The systematic process from identifying a problem
to implementing the right course of action is the
research process.
Objectives of Research
• To gain familiarity with a phenomenon.
Seven I’s of Research
• Identification of the problem / opportunity / new product …… the reason for research
• Information …… a cornucopia of data … big data … ART
• Inquisitiveness …… can we? how? why? … thinking out of the box
• Insights …… from qualitative data
• Inferences …… from quantitative data
• Interpretation …… of the results thereof
• Implementation …… the result of research
SOME TERMINOLOGIES
• Data: measurements or records of facts made under specific
conditions
SOME TERMINOLOGIES
• Dependent variable: the variable on which the effect
of the independent variable is tested/analyzed.
SOME TERMINOLOGIES
Review: A research paper that is a critical
evaluation of research on a particular topic.
Basic
Applied
Basic Research
• Basic research – also called fundamental research.
• Aimed at gaining knowledge rather than solving a practical problem.
• First, assumptions are made and tools are used to test the hypotheses; then interpretations are drawn.
• Then general laws are framed about the phenomenon.
• Gathering knowledge for knowledge’s sake is pure or basic research (e.g., the study of human behaviour or of natural mathematical phenomena).
• Generally used to verify already established facts and theories.
• E.g., courses offered by B-schools that help bridge the gap between knowledge and its application in the corporate world.
• The time involved is flexible.
• Applies to the entire business community / universally.
• E.g., the ABC / JIT techniques of inventory management.
Applied Research- Aid Decision Making
• Applied research – aimed at discovering applications that can
be put to use in practice and that help in solving problems.
Choosing the best among various alternatives is its strategy
(e.g., marketing research, evaluation research).
• Specific, not general as in basic research
• Implications for immediate action
• Action oriented
• Conducted when a real-life problem is to be solved
and decisions are to be taken
• Both approaches use scientific methods in the
various stages of the research process.
Business Research Design
• Business research design is an arrangement of conditions for
the collection and analysis of data in a manner that aims to be
relevant to the research purpose with economy in procedure.
• Broadly speaking, we can classify research designs into the
following three kinds:
– Exploratory Research
– Descriptive Research
– Causal Research
(Descriptive and causal research together constitute Conclusive Research.)
Types of Business Research
• Exploratory
• Descriptive
• Causal
Exploratory Research
Exploratory Research (contd..)
• Initial research conducted to clarify and define the
nature of a problem.
Exploratory Research (Contd..)
• Mostly involves qualitative investigation
• Sample not strictly representative
• Flexible
Exploratory research techniques:
• Secondary (historical) data
• Comprehensive case study
• Expert opinion survey
• Focus group discussions
Exploratory Research Techniques
contd…
Comprehensive Case Study
• Focused on a single unit of analysis
• Post-hoc study
• Complete presentation of facts as they occur
• Chance of bias and subjectivity
• E.g., for the performance appraisal system to be adopted by a firm,
take ideas from case studies of companies that adopted different
appraisal systems
Exploratory Research Techniques
contd…
Expert Opinion Survey
• Used when no previous information is available
• For launching an organic product, the opinions of doctors and
dieticians could be of help
• The opinions of different experts could be taken
• A note of caution: such surveys tend to be loosely structured and
may be skewed
Exploratory Research Techniques contd…
Focus Group Discussions:
• Widely used for consumer and motivational studies
• Discussions with significant individuals associated with the
problem under study
• A small set (6–10) representative of the larger respondent
population
• The focus group discusses the concerned topic for around 90
minutes (sometimes more)
• A trained observer manages the informal, unstructured
discussion
• E.g., for an organic product survey, FGDs carried out in metros
revealed that awareness of organic products was quite low
Exploratory Research Techniques
contd…
Tips for conducting successful FGDs
• Planning and recruitment – aim for similarity and contrast, but
avoid dominant participants
• Moderating
Some uses:
• Insights into products and services usage
• Responses to new product and service features
• Responses to new packaging, graphics and other elements in
final delivery package
• Responses to new communication or benefit ideas
• Probing what new things the customers expect or dream of
Conclusive Research
• Quantitative in nature
• Consequence of exploratory study
• More Structured
• Hypothesis Formulation
Conclusive Research
Descriptive
Causal
Descriptive Research
• Research designed to describe characteristics
of a population or a phenomenon.
“I keep six honest serving-men (they taught me all I knew); their names
are What and Why and When and How and Where and Who.”
– Rudyard Kipling
Descriptive Research – Categories
• Cross-sectional studies
• Longitudinal studies – a single sample study stretched over a period of time
Descriptive Research – Cross
Sectional Studies
• Carried out at a single moment in time
• Carried out on a section of respondents from
population under study
• Relevant over the time coordinate of the study
• Extremely useful to study current patterns of
behaviour or opinion
Descriptive Research –
Longitudinal Studies
• Carried out over a period of time
• Single sample of identified population
• E.g., a panel of consumers studied to track grocery
purchases (every month/week)
Conclusive Research – Causal
Causal Research
• Research conducted to identify cause-and-effect
relationships among variables where the research
problem has already been narrowly defined.
Difference Between Qualitative and
Quantitative Research
• Qualitative research is a method of inquiry that develops
understanding in the human and social sciences, to find out how
people think and feel. A scientific and empirical research
method used to generate numerical data by employing
statistical, logical and mathematical techniques is called
quantitative research.
• Qualitative research is holistic in nature, while quantitative
research is particularistic.
• Qualitative research follows a subjective approach, as the
researcher is intimately involved, whereas the approach of
quantitative research is objective, as the researcher is
uninvolved and attempts to make precise observations and
analyses of the topic to answer the inquiry.
SCIENTIFIC METHOD OF PROBLEM SOLVING
/ RESEARCH PROCESS
DEVELOPING HYPOTHESIS
• It should be very specific and limited to
the piece of research in hand because it
has to be tested.
• The role of hypothesis is to guide the
researcher by delimiting the area of
research and to keep him on the right
track.
DEVELOPING HYPOTHESIS
A working hypothesis may be developed through:
• Discussion with colleagues and experts
about the problem, its origin and the
objectives in seeking solution
• Examinations of data and records
• Review of similar studies in the area or of
the studies on similar problems
• Personal investigation which involves
original field interviews.
PREPARING RESEARCH AND SAMPLE
DESIGN
• State the conceptual structure within which research
would be conducted
• Type of research design
• Experimental, quasi-experimental and non-experimental
• Setting of the study
• Population
• Criteria for selection
• Variables
• Sample selection
COLLECTING DATA
• There are several ways to collect the appropriate data:
• Primary data and secondary data
• By observation
• Personal interview
• Telephone interview
• Questionnaires
• Survey
EXECUTION OF PROJECT
• It is a very important step in the research process.
• If it proceeds on correct lines, the data collected will be
adequate and dependable.
• Steps should be taken to keep data collection under
statistical control, so that the collected information is in
accordance with the pre-defined design for tackling the
problem.
ANALYSIS OF DATA
• The analysis of data requires a number of closely
related operations, such as the establishment of
categories and the application of these categories to
raw data through coding, editing and tabulation,
followed by statistical inference.
ANALYSIS OF DATA
• Coding: the operation, usually done at this stage, through
which the categories of data are transformed into symbols
that may be tabulated and counted.
• Editing: the procedure that improves the quality of the data
for coding.
• Tabulation: a part of the technical procedure wherein the
classified data are put in the form of tables.
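As an illustration of editing, coding and tabulation, here is a minimal sketch in Python with pandas; the column names, category codes and responses are all hypothetical:

```python
import pandas as pd

# Hypothetical raw survey responses (inconsistent case, one incomplete record)
raw = pd.DataFrame({
    "gender": ["M", "F", "f", "M", None, "F"],
    "satisfaction": ["High", "Low", "High", "Medium", "High", "low"],
})

# Editing: improve the quality of the data before coding
# (drop incomplete records, normalise case)
edited = raw.dropna().apply(lambda col: col.str.strip().str.capitalize())

# Coding: transform the categories into symbols that can be tabulated and counted
codes = {"Low": 1, "Medium": 2, "High": 3}
edited["satisfaction_code"] = edited["satisfaction"].map(codes)

# Tabulation: put the classified data in the form of a table
print(pd.crosstab(edited["gender"], edited["satisfaction_code"]))
```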
HYPOTHESIS TESTING
• After analyzing the data, the researcher is in a
position to test the hypothesis.
• Inference
• Student’s t-test, the chi-square test and the F-test are
examples of statistical techniques used.
• In the end, the researcher either rejects or fails to reject the
null hypothesis.
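For instance, a two-sample Student's t-test can be run in Python with scipy; the group scores below are hypothetical:

```python
from scipy import stats

# Hypothetical scores for two independent groups
group_a = [72, 85, 78, 90, 66, 81, 74]
group_b = [68, 70, 65, 73, 59, 71, 64]

# Student's t-test for the difference between the two group means
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Decision rule at the 5% significance level:
# reject the null hypothesis if p < alpha, otherwise fail to reject it
alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"
print(f"t = {t_stat:.2f}, p = {p_value:.4f}: {decision} the null hypothesis")
```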
DISCUSSION
• The chapter or section of a research report
that explains what the results mean.
• It is a very important section, in which the
appropriate supporting literature should be cited.
PREPARATION OF REPORT OR
THESIS
• The layout of the report should be as
follows.
– Preliminary pages
– The main text
– The end matter
Flow Chart of the Research Process
Problem Discovery Problem Selection of
and Definition discovery exploratory research
technique
Sampling
Selection of
exploratory research
technique Probability Nonprobability
Secondary
Experience Pilot Case Collection of
(historical) Data
survey study study data
data Gathering
(fieldwork)
Data
Editing and
Problem definition Processing
coding
(statement of and
Analysis data
research objectives)
Data
Selection of processing
Research Design basic research
method Conclusions
Interpretation
and Report
of
findings
Experiment Survey
Secondary
Laboratory Field Interview Questionnaire Observation
Data Study Report
BRM-Introduction to Research 65
Source: ‘Business Research Methods’, William. G. Zikmund, Chapter 4. Page 61.
When Should Business Research be Undertaken?
Undertake business research only if the answer to each of the following questions is “yes”; otherwise, do not undertake it:
1. Is sufficient time available before the decision must be made?
2. Is the information on hand inadequate for making the decision?
3. Is the decision of high importance?
4. Are the research benefits greater than the costs?
Source: ‘Business Research Methods’, William. G. Zikmund, Chapter 1. Page 14.
CRITERIA FOR GOOD RESEARCH
PROBLEMS ENCOUNTERED BY
RESEARCHERS IN INDIA
Part C
Identification and Formulation
of Research Problem
Research Problem
• It refers to some difficulty which a researcher
experiences in the context of either a theoretical or
practical situation, and for which he or she wants to
obtain a solution.
Selecting Research Problem
• A subject which is overdone should not be
chosen.
(ii) Applied vs. Fundamental: Applied research aims at finding a solution for
some pressing practical problems, whereas basic research is directed towards finding
information that has a broad base of applications and thus, adds to the already existing
organized body of scientific knowledge.
(iii) Quantitative vs. Qualitative: Quantitative research is based on the quantitative
measurements of some characteristics. It is applicable to phenomena that can be expressed
in terms of quantities. Qualitative research, on the other hand, is concerned with qualitative
phenomenon, i.e., phenomena relating to or involving quality or kind. For instance, when we
are interested in investigating the reasons for human behaviour (i.e., why people think or do
certain things), we quite often talk of Motivation Research, an important type of qualitative
research. This type of research aims at discovering the underlying motives and desires,
using in depth interviews for the purpose. Other techniques of such research are word
association tests, sentence completion tests, story completion tests and similar other projective
techniques. Attitude or opinion research i.e., research designed to find out how people feel
or what they think about a particular subject or institution is also qualitative research.
Qualitative research is specially important in the behavioural sciences where the aim is to
discover the underlying motives of human behaviour. Through such research we can analyse
the various factors which motivate people to behave in a particular manner or which make
people like or dislike a particular thing. It may be stated, however, that applying qualitative
research in practice is a relatively difficult job and therefore, while doing such research,
one should seek guidance from experimental psychologists.
(iv) Conceptual vs. Empirical: Conceptual research is that related to some abstract idea(s) or
theory. It is generally used by philosophers and thinkers to develop new concepts or to
reinterpret existing ones. On the other hand, empirical research relies on experience or
observation alone, often without due regard for system and theory. It is data-based research,
coming up with conclusions which are capable of being verified by observation or experiment.
We can also call it the experimental type of research. In such research it is necessary to
get facts at firsthand, at their source, and actively to go about doing certain things to
stimulate the production of desired information. In such a research, the researcher must
first provide himself with a working hypothesis or guess as to the probable results. He then
works to get enough facts (data) to prove or disprove his hypothesis. He then sets up
experimental designs which he thinks will manipulate the persons or the materials concerned
so as to bring forth the desired information. Such research is thus characterised by the
experimenter's control over the variables under study and his deliberate manipulation of
one of them to study its effects. Empirical research is appropriate when proof is sought that
certain variables affect other variables in some way. Evidence gathered through experiments
or empirical studies are considered to be the most powerful support possible for testing a
given hypothesis.
(v) Some Other Types of Research: All other types of research are variations of one or more
of the above stated approaches, based on either the purpose of research, or the time
required to accomplish research, on the environment in which research is done, or on the
basis of some other similar factors. From the point of view of time, we can think of research
either as one-time research or longitudinal research. In the former case the research is
confined to a single time-period, whereas in the latter case the research is carried on over
several time-periods.
“All progress is born of inquiry. Doubt is often better than overconfidence, for it leads to inquiry,
and inquiry leads to invention” is a famous quote of Hudson Maxim, in the context of which the
significance of research can well be understood. Increased amounts of research make progress
possible. Research inculcates
scientific and inductive thinking and it promotes the development of logical habits of thinking
and organisation.
The role of research in several fields of applied economics, whether related to business or
to the economy as a whole, has greatly increased in modern times. The increasingly complex
nature of business and governance has focussed attention on the use of research in solving operational
problems. Research, as an aid to economic policy, has gained added importance, both for governance
and business.
Research provides the basis for nearly all government policies in our economic system.
For instance, government budgets rest in part on an analysis of the needs and desires of the people
and on the availability of revenues to meet those needs. The cost of needs has to be equated to
probable revenues and this is a field where research is most needed. Through research we can
devise alternative policies and can as well examine the consequences of each of these alternatives.
Decision-making may not be a part of research, but research certainly facilitates the decisions of the
policy maker. Government has to chalk out programmes for dealing with all facets of the country's
various operations and most of these are related directly or indirectly to economic conditions. The
plight of cultivators, the problems of big and small business and industry, working conditions, trade
union activities, the problems of distribution, even the size and nature of defence services are matters
requiring research. Thus, research is considered necessary with regard to the allocation of the nation's
resources. Another area in government, where research is necessary, is collecting information on the
economic and social structure of the nation. Such information indicates what is happening in the
economy and what changes are taking place. Collecting such statistical information is by no means a
routine task, but it involves a variety of research problems. These days nearly all governments
maintain a large staff of research technicians or experts to carry on this work. Thus, in the context of
government, research as a tool to economic policy has three distinct phases of operation, viz.,
(i) investigation of economic structure through continual compilation of facts; (ii) diagnosis of events
that are taking place and the analysis of the forces underlying them; and (iii) the prognosis, i.e., the
prediction of future developments.
Research has its special significance in solving various operational and planning problems
of business and industry. Operations research and market research, along with motivational research,
are considered crucial and their results assist, in more than one way, in taking business decisions.
Market research is the investigation of the structure and development of a market for the purpose of
formulating efficient policies for purchasing, production and sales. Operations research refers to the
application of mathematical, logical and analytical techniques to the solution of business problems of
cost minimisation or of profit maximisation or what can be termed as optimisation problems. Motivational
research of determining why people behave as they do is mainly concerned with market characteristics.
In other words, it is concerned with the determination of motivations underlying the consumer (market)
behaviour. All these are of great help to people in business and industry who are responsible for
taking business decisions. Research with regard to demand and market factors has great utility in
business. Given knowledge of future demand, it is generally not difficult for a firm, or for an industry
to adjust its supply schedule within the limits of its projected capacity. Market analysis has become
an integral tool of business policy these days. Business budgeting, which ultimately results in a
projected profit and loss account, is based mainly on sales estimates which in turn depends on
business research. Once sales forecasting is done, efficient production and investment programmes
can be set up, around which are grouped the purchasing and financing plans. Research, thus,
replaces intuitive business decisions by more logical and scientific decisions.
Research is equally important for social scientists in studying social relationships and in
seeking answers to various social problems. It provides the intellectual satisfaction of knowing a
few things just for the sake of knowledge and also has practical utility for the social scientist to know
for the sake of being able to do something better or in a more efficient manner. Research in social
sciences is concerned with (i) the development of a body of principles that helps in understanding
the whole range of human interactions, and (ii) the practical guidance in solving immediate problems
of human relations.
In addition to what has been stated above, the significance of research can also be understood
keeping in view the following points:
(a) To those students who are to write a master's or Ph.D. thesis, research may mean a
careerism or a way to attain a high position in the social structure;
(b) To professionals in research methodology, research may mean a source of livelihood;
(c) To philosophers and thinkers, research may mean the outlet for new ideas and insights;
(d) To literary men and women, research may mean the development of new styles and creative
work; and
(e) To analysts and intellectuals, research may mean the development of new theories.
Thus, research is the fountain of knowledge for the sake of knowledge and an important source
of providing guidelines for solving different business, governmental and social problems. It is a sort of
formal training which enables one to understand the new developments in one's field in a better way.
Research methods may be put into the following three groups:
1. The first group includes those methods which are concerned with the collection of data;
2. The second group consists of those statistical techniques which are used for establishing
relationships between the data and the unknowns;
3. The third group consists of those methods which are used to evaluate the accuracy of the
results obtained.
Research methods falling in the above stated last two groups are generally taken as the analytical
tools of research.
At times, a distinction is also made between research techniques and research methods. Research
techniques refer to the behaviour and instruments we use in performing research operations such as
making observations, recording data, techniques of processing data and the like. Research methods
refer to the behaviour and instruments used in selecting and constructing research technique. For
instance, the difference between methods and techniques of data collection can better be understood
from the details given in the following chart:
Type | Methods | Techniques
1. Library Research | (i) Analysis of historical records | Recording of notes, content analysis, tape and film listening and analysis.
| (ii) Analysis of documents | Statistical compilations and manipulations, reference and abstract guides, contents analysis.
2. Field Research | (i) Non-participant direct observation | Observational behavioural scales, use of score cards, etc.
| (ii) Participant observation | Interactional recording, possible use of tape recorders, photographic techniques.
| (iii) Mass observation | Recording mass behaviour, interview using independent observers in public places.
| (iv) Mail questionnaire | Identification of social and economic background of respondents.
| (v) Opinionnaire | Use of attitude scales, projective techniques, use of sociometric scales.
| (vi) Personal interview | Interviewer uses a detailed schedule with open and closed questions.
| (vii) Focussed interview | Interviewer focuses attention upon a given experience and its effects.
| (viii) Group interview | Small groups of respondents are interviewed simultaneously.
| (ix) Telephone survey | Used as a survey technique for information and for discerning opinion; may also be used as a follow-up of questionnaire.
| (x) Case study and life history | Cross-sectional collection of data for intensive analysis; longitudinal collection of data of intensive character.
3. Laboratory Research | Small group study of random behaviour, play and role analysis | Use of audio-visual recording devices, use of observers, etc.
From what has been stated above, we can say that methods are more general. It is the methods
that generate techniques. However, in practice, the two terms are taken as interchangeable and
when we talk of research methods we do, by implication, include research techniques within their
compass.
Research methodology is a way to systematically solve the research problem. It may be
understood as a science of studying how research is done scientifically. In it we study the various
steps that are generally adopted by a researcher in studying his research problem along with the logic
behind them. It is necessary for the researcher to know not only the research methods/techniques
but also the methodology. Researchers not only need to know how to develop certain indices or tests,
how to calculate the mean, the mode, the median or the standard deviation or chi-square, how to
apply particular research techniques, but they also need to know which of these methods or techniques,
are relevant and which are not, and what would they mean and indicate. Researchers also need to
understand the assumptions underlying various techniques and they need to know the criteria by
which they can decide that certain techniques and procedures will be applicable to certain problems
and others will not. All this means that it is necessary for the researcher to design a methodology for
his problem as the same may differ from problem to problem. For example, an architect, who designs
a building, has to consciously evaluate the basis of his decisions, i.e., he has to evaluate why and on
what basis he selects particular size, number and location of doors, windows and ventilators, uses
particular materials and not others and the like. Similarly, in research the scientist has to expose the
research decisions to evaluation before they are implemented. He has to specify very clearly and
precisely what decisions he selects and why he selects them so that they can be evaluated by others also.
From what has been stated above, we can say that research methodology has many dimensions
and research methods do constitute a part of the research methodology. The scope of research
methodology is wider than that of research methods. Thus, when we talk of research methodology
we not only talk of the research methods but also consider the logic behind the methods we use
in the context of our research study and explain why we are using a particular method or
technique and why we are not using others so that research results are capable of being
evaluated either by the researcher himself or by others. Why a research study has been undertaken,
how the research problem has been defined, in what way and why the hypothesis has been formulated,
what data have been collected and what particular method has been adopted, why particular technique
of analysing data has been used and a host of similar other questions are usually answered when we
talk about research methodology concerning a research problem or study.
Experimentation is done to test hypotheses and to discover new relationships, if any, among
variables. However, sometimes the conclusions drawn on the basis of experimental data may be
misleading for either faulty assumptions, poorly designed experiments, badly executed experiments
or faulty interpretations. As such the researcher must pay all possible attention while developing the
experimental design and drawing inferences. The purpose of survey investigations may also be to
provide scientifically gathered information to work as a basis for the researchers for their conclusions.
The scientific method is, thus, based on certain basic postulates which can be stated as under:
1. It relies on empirical evidence;
2. It utilizes relevant concepts;
3. It is committed to only objective considerations;
4. It aims at nothing but making only adequate and correct statements about population objects;
5. It results into probabilistic predictions;
6. Its methodology is made known to all concerned for critical scrutiny and for use in
testing the conclusions through replication;
7. It aims at formulating most general axioms or what can be termed as scientific theories.
Thus, the scientific method encourages a rigorous method wherein the researcher is guided by
the rules of logical reasoning, a method wherein the investigation proceeds in an orderly manner and
a method that implies internal consistency.
Fig. 1.1 The research process (Kothari): I. Define research problem → II. Review the literature (review concepts and theories; review previous research findings) → III. Formulate hypotheses → IV. Design research (including sample design) → V. Collect data (execution) → VI. Analyse data (test hypotheses, if any) → VII. Interpret and report.
Business Research Methods
Sessions 3 - 4 : Research Process Framework; Hypotheses - Types
Different researchers / experts have put forth or suggested their own version of the
research process, each somewhat different from the others
We will study two different formats – one put forth by C. R. Kothari, and
another one by Cooper–Schindler
Hypothesis
Hypothesis – A proposition or suggestion made as the basis for reasoning or
investigation (Oxford Dictionary)
An educated or informed guess about the answer to a question framed in a specific study
Types of Hypotheses –
■ A hypothesis is a statement that can be tested by scientific research. If you want to test
a relationship between two or more things, you need to write hypotheses before you
start your experiment or data collection.
What is a hypothesis?
■ A hypothesis states your predictions about what your research will find. It is a tentative
answer to your research question that has not yet been tested.
■ A hypothesis is not just a guess — it should be based on existing theories and
knowledge. It also has to be testable, which means you can support or refute it through
scientific research methods (such as experiments, observations and statistical analysis
of data).
Variables in hypotheses
■ Research Question: What effect does daily use of social media have on the attention
span of under-16s?
UNIT 4: HYPOTHESIS FORMULATION AND SAMPLING
Structure
4.0 Introduction
4.1 Objectives
4.2 Meaning and Characteristics of Hypothesis
4.3 Formulation of Hypothesis
4.4 Possible Difficulties in Formulation of a Good Hypothesis
4.5 Types of Hypotheses
4.5.1 Null Hypothesis
4.5.2 Alternative Hypothesis
4.6 Errors in Testing a Hypothesis
4.7 Importance of Hypothesis Formulation
4.8 Sampling
4.8.1 Definition of Sampling
4.8.2 Sampling Terminology
4.8.3 Purpose of Sampling
4.9 Sampling Methods
4.9.1 Non Probability Sampling
4.9.2 Probability Sampling
4.10 Importance of Sampling
4.11 Let Us Sum Up
4.12 Unit End Questions
4.13 Glossary
4.14 Suggested Readings and References
4.0 INTRODUCTION
Scientific processes, or all empirical sciences, are characterised by two inter-related
concepts, namely (a) the context of discovery (getting an idea) and (b) the context of
justification (testing and results). Hypotheses are the mechanism and container
of knowledge moving from the unknown to the known; they form the
techniques and testing ground for scientific discovery. Hypotheses are tentative
explanations and potential answers to a problem. A hypothesis gives direction
and helps the researcher interpret data. In this unit, you will be familiarised with
the term hypothesis and its characteristics, followed by hypothesis
formulation and types of hypothesis. Errors in hypothesis testing are also
highlighted.
Further, in order to test a hypothesis, researchers rarely collect data on the entire
population, owing to high cost and the dynamic nature of the individuals in the population.
Therefore, they collect data from a subset of individuals – a sample – and make
inferences about the entire population. This leads us to what we should know
about the population and the sample. So, the researcher plans a sample design and uses
various methods of sampling. This unit will acquaint you with the meaning of
sampling and the basic terminology used in sampling design.
It then moves to the purpose of sampling. Finally, various probability and
non-probability sampling methods, along with their advantages and disadvantages, are
described.
4.1 OBJECTIVES
After reading this unit, you will be able to:
• Define and describe hypothesis and its characteristics;
• Explain formulation of hypothesis;
• Enumerate the possible difficulties in formulating hypothesis;
• Explain types of hypotheses;
• Identify errors in hypothesis testing;
• Define sampling;
• Explain the purpose of sampling; and
• Analyse various probability and non-probability sampling methods.
By stating a specific hypothesis, the researcher narrows the focus of the data
collection effort and is able to design a data collection procedure which is aimed
at testing the plausibility of the hypothesis as a possible statement of the
relationship between the terms of the research problem.
It is, therefore, always useful to have a clear idea and vision about the hypothesis
the researcher intends to verify, as it will direct the research and greatly help in the
interpretation of the results.
Hypothesis plays a key role in formulating and guiding any study. The hypotheses
are generally derived from earlier research findings, existing theories and personal
observations and experience. For instance, you are interested in knowing the
effect of reward on learning. You have analysed the past research and found that
two variables are positively related. You need to convert this idea in terms of a
testable statement. At this point you may develop the following hypothesis.
Those who are rewarded shall require a lesser number of trials to learn the lesson
than those who are not rewarded.
A researcher should consider certain points while formulating a hypothesis:
i) Expected relationship or differences between the variables.
ii) Operational definition of variable.
iii) Hypotheses are formulated following the review of literature
The literature leads a researcher to expect a certain relationship.
Hypotheses are statements that are assumed to be true for the purpose of testing
their validity.
4.5 TYPES OF HYPOTHESES
As explained earlier, any assumption that you seek to validate through
investigation is called a hypothesis. Hence, theoretically, there should be only one type
of hypothesis on the basis of investigation, that is, the research hypothesis.
However, because of the conventions in scientific enquiry and the wording used in
the construction of hypotheses, hypotheses can be classified into several
types, like universal hypotheses, existential hypotheses, conceptual hypotheses,
etc. Broadly, there are two categories of hypothesis:
i) Null hypothesis
ii) Alternative hypothesis
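For instance, for the reward-and-learning example given earlier, the two categories could be worded as follows (a purely illustrative formulation):
H0 (null): μ_rewarded = μ_not rewarded – reward has no effect on the number of trials required to learn the lesson.
H1 (alternative): μ_rewarded < μ_not rewarded – those who are rewarded require fewer trials than those who are not.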
Researchers usually can not make direct observation of every individual in the
population under study. Instead, they collect data from a subset of individuals- a
sample – and use those observations to make inferences about the entire
population.
Sampling unit: Each individual or case that becomes the basis for selecting a
sample is called sampling unit or sampling elements.
Sampling frame: The list of people from which the sample is taken. It should be
comprehensive, complete and up-to-date. Examples of sampling frame: Electoral
Register; Postcode Address File; telephone book.
Self Assessment Questions (Fill in the blanks)
1) Any identifiable and well specified group of individual is known as
.............................................
2) List of all the units of the population is called ............................
3) Purposes of sampling is to derive the desired information about the
population at the minimum ..................... and maximum ....................
4) The way the researcher selects the sample is known as .....................
5) ........................... is the miniature picture of entire group.
Answers: (1) population, (2) sampling frame, (3) cost, reliability,
(4) sampling design, (5) sample.
For example, an investigator may take the students of class X into the research plan
because the class teacher of that class happens to be his/her friend. This illustrates
accidental or convenience sampling.
Quota sampling ensures that some differences are represented in the sample. In haphazard
sampling, all those interviewed might be of the same age, sex, or background.
But once the quota sampler fixes the categories and the number of cases in each
category, he or she uses haphazard or convenience sampling. Nothing prevents
the researcher from selecting people who act friendly or who want to be interviewed.
Quota sampling is not appropriate when interviewers choose whom
they like (within the above criteria), as they may select those who are easiest
to interview, so sampling bias can take place. Because the random
method is not used, it is impossible to estimate accuracy. Despite these limitations, quota
sampling is a popular method among the non-probability methods of sampling,
because it enables the researcher to introduce a few controls into the research
plan, and this method of sampling is more convenient and less costly than
many other methods of sampling.
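A minimal sketch of the quota-filling logic in Python; the categories, quota sizes and stream of respondents are all hypothetical:

```python
import random

# Quotas fixed in advance by the quota sampler: category -> number of cases
quotas = {"male": 10, "female": 10}
sample = {"male": [], "female": []}

# Hypothetical stream of respondents met in convenience order
respondents = [{"id": i, "sex": random.choice(["male", "female"])}
               for i in range(200)]

# Within each category the choice is haphazard/convenience, not random:
# the interviewer takes whoever comes along until that category's quota is full
for person in respondents:
    bucket = sample[person["sex"]]
    if len(bucket) < quotas[person["sex"]]:
        bucket.append(person)
    if all(len(sample[c]) >= quotas[c] for c in quotas):
        break

print({c: len(sample[c]) for c in quotas})  # e.g. {'male': 10, 'female': 10}
```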
For studying attitudes toward any national issue, a sample of journalists, teachers
and legislators may be taken as an example of purposive sampling, because they
can more reasonably be expected to represent the correct attitude than other
classes of people residing in the country.
Purposive sampling is somewhat less costly, more readily accessible, more
convenient, and selects only those individuals that are relevant to the research design.
v) Systematic sampling
Systematic sampling is another non-probability sampling method, though
the label ‘systematic’ is somewhat misleading in the sense that all probability
sampling methods are also systematic sampling methods. Due to this, it often
seems that systematic sampling should be included as one category of
probability sampling, but in reality this is not the case.
Despite these advantages, systematic sampling ignores all persons between every
nth element chosen; hence it is not a probability sampling plan. In systematic
sampling there is a chance of sampling error occurring if the list is arranged in
a particular order.
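A minimal sketch of systematic selection in Python; the population list and sample size are hypothetical, and the only random element is the starting point:

```python
import random

population = list(range(1, 141))  # hypothetical ordered list of 140 units
n = 20                            # desired sample size
k = len(population) // n          # sampling interval (here k = 7)

# Random start within the first interval, then every kth element thereafter;
# bias can creep in if the list is arranged in a particular order
start = random.randrange(k)
systematic_sample = population[start::k][:n]
print(systematic_sample)
```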
Activity
Make a list of some research studies where some of the non probability
methods could be used. Also justify the choice of particular sampling method
you have selected for the study.
A blindfolded person may then be asked to pick up one slip. Here, the probability
of each slip being selected is 1/40. Suppose that after selecting the slip and
noting the name written on it, he returns it to the box. In this case, the
probability of the second slip being selected is again 1/40. But if he does not
return the first slip to the box, the probability of the second slip becomes 1/39.
When an element of the population is returned to the population after being
selected, it is called sampling with replacement and when it is not returned, it is
called sampling without replacement.
Thus random sampling may be defined as one in which all possible combinations
of samples of fixed size have an equal probability of being selected.
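The slip-drawing example can be mimicked in Python; the 40 name slips are hypothetical:

```python
import random

slips = [f"name_{i}" for i in range(1, 41)]  # 40 hypothetical slips in the box

# Sampling WITH replacement: every draw gives each slip probability 1/40
with_replacement = random.choices(slips, k=5)

# Sampling WITHOUT replacement: after the first draw, each remaining slip
# has probability 1/39, then 1/38, and so on
without_replacement = random.sample(slips, k=5)

print(with_replacement)
print(without_replacement)
```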
Advantages of simple random sampling are:
1) Each person has equal chance as any other of being selected in the sample.
2) Simple random sampling serves as a foundation against which other methods
are sometimes evaluated.
3) It is most suitable where population is relatively small and where sampling
frame is complete and up-to-date.
4) As the sample size increases, it becomes more representative of universe.
5) This method is least costly and its accuracy is easily assessed.
Despite these advantages, some of the disadvantages are:
1) Complete and up-to-date catalogued universe is necessary.
2) Large sample size is required to establish the reliability.
3) When the population is geographically widely dispersed, the study of sampled
items involves larger cost and greater time.
4) Unskilled and untrained investigator may cause wrong results.
Activity
In a class of 140 students, select a simple random sample of size 20 students
with replacement technique. Also mention the probability of each one of
140 students being included in the sample.
Having divided the population into two or more strata, which are considered to
be homogeneous internally, a simple random sample for the desired number is
taken from each population stratum. Thus, in stratified random sampling the
stratification of population is the first requirement.
There can be many reasons for stratification in a population.
Two of them are:
1) Stratification tends to increase the precision in estimating the attributes of
the whole population.
2) Stratification gives some convenience in sampling. When the population is
divided into several units, a person or group of persons may be deputed to
supervise the sampling survey in each unit.
Advantages of stratified random sampling are:
1) Stratified sampling is more representative of the population, because the
formation of strata and the random selection of items from each stratum make
it hard for any stratum of the universe to be excluded, and increase the sample's
representativeness of the population or universe.
2) It is more precise and avoids bias to a great extent.
3) It saves time and cost of data collection, since the sample size can be smaller with
this method.
Despite these advantages, some of the disadvantages of stratified sampling are:
1) Improper stratification may cause wrong results.
2) Greater geographical concentration may result in heavy cost and more time.
3) Trained investigators are required for stratification.
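A minimal sketch of proportional stratified random sampling in Python; the strata, their sizes and the overall sample size are hypothetical:

```python
import random

# Hypothetical population already divided into internally homogeneous strata
strata = {
    "undergraduate": [f"UG_{i}" for i in range(300)],
    "postgraduate":  [f"PG_{i}" for i in range(100)],
}
total = sum(len(units) for units in strata.values())
sample_size = 40

# Simple random sample from each stratum, proportional to the
# stratum's share of the population (proportional allocation)
stratified_sample = []
for name, units in strata.items():
    n_stratum = round(sample_size * len(units) / total)
    stratified_sample.extend(random.sample(units, n_stratum))

print(len(stratified_sample))  # 30 UG + 10 PG = 40
```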
iii) Cluster sampling
A type of random sample that uses multiple stages and is often used to cover
wide geographic areas, in which aggregated units are randomly selected and then
samples are drawn from the sampled aggregated units or clusters.
For example, suppose the investigator wanted to survey some aspect of 3rd grade
elementary school children. First, a random sample of a number of states
from the country would be selected. Next, within each selected state, a random
selection of a certain number of districts would be made. Then within each district a
random selection of a certain number of elementary schools would be made. Finally,
within each elementary school, a certain number of children would be randomly
selected. Because each level is randomly sampled, the final sample becomes
random. However, selection of samples is done at different stages. This is also
called multi-stage sampling.
This sampling method is more flexible than the other methods. Sub-division into
second-stage units needs to be carried out only for those units selected in the first
stage. Despite these merits, this sampling method is less accurate than a single-stage
sample containing the same number of units.
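The school example above can be sketched as a multi-stage (cluster) selection in Python; all counts and labels are hypothetical:

```python
import random

random.seed(1)

# Hypothetical hierarchy: state -> district -> school -> pupils
states = {f"state_{s}":
          {f"district_{s}_{d}":
           {f"school_{s}_{d}_{k}": [f"pupil_{s}_{d}_{k}_{p}" for p in range(30)]
            for k in range(4)}
           for d in range(5)}
          for s in range(6)}

# Each stage is randomly sampled, so the final sample is random:
# states, then districts within them, then schools, then pupils
sample = []
for state in random.sample(list(states), 2):
    for district in random.sample(list(states[state]), 2):
        for school in random.sample(list(states[state][district]), 2):
            sample.extend(random.sample(states[state][district][school], 5))

print(len(sample))  # 2 states x 2 districts x 2 schools x 5 pupils = 40
```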
Self Assessment Questions
1) Non-probability sampling is one in which there is a way of assessing the
probability of an element or group of elements of the population being
included in the sample. T/F
2) Simple random sampling is the core technique and attaches equal
probability to each unit of the population to be selected. T/F
3) Cluster sampling method sometimes known as multi stage sampling
method. T/F
4) Snowball technique is a probability sampling method. T/F
5) Stratified sampling is more representative for the population than other
methods. T/F
Answer: (1) F, (2) T, (3) T, (4) F, (5) T.
The three main advantages of sampling are that the cost is lowest, data collection is
faster, and since the data set is smaller, it is possible to ensure homogeneity and
to improve the accuracy and quality of the data (Adèr, Mellenbergh & Hand, 2008).
Researchers rarely survey the entire population for two reasons: the cost is too
high, and the population is dynamic in that the individuals making up the
population may change over time. Sampling methods are of two types, i.e., non-probability
and probability sampling methods. Probability sampling methods
are those which assign some known probability to each unit of the population to be included
in the sample, and these are more representative. Three different probability sampling
methods were discussed: simple random sampling, stratified random sampling
and cluster / multi-stage sampling. The non-probability sampling methods
discussed are convenience sampling, quota sampling, purposive sampling,
snowball sampling and systematic sampling. These methods are also used, but
lack the representative character of probability samples.
4.13 GLOSSARY
Hypothesis : A tentative and testable statement of a potential
relationship between two or more variables.
Null hypothesis : The hypothesis that is of no scientific interest;
sometimes the hypothesis of no difference.
Alternative hypothesis : Statistical term for research hypothesis that
specifies values that researcher believes to hold
true.
Population : It is the aggregate from which a sample is drawn.
In statistics, it refers to any specified collection of
objects, people, organisation etc.
Population size : It is the total number of units present in the
population.
Sampling units : They are members of the population.
Sampling frame : It is the list of all the units of population.
Sampling design : It is a definite plan for obtaining a sample from a
given population.
Sample size : It is the total number of units in the sample.
Simple random sample : It is a sample in which each unit of the population
has an equal chance of being selected in the
sample.
Research Design
• Exploratory Research Design
• Conclusive Research Design
  – Descriptive Research
    • Cross-Sectional Design
      – Single Cross-Sectional Design
      – Multiple Cross-Sectional Design
    • Longitudinal Design
  – Causal Research
Business Research Methods
Sessions 5 - 6 : Exploratory Research; LR; and Descriptive
Focus Group
Interview Types: Standardised vs Non-Standardised
Variations: a) Single or b) Multiple cross-sectional designs
Variations / Types
a) Panel variety
b) Cohort groups
Survey research
Surveys - Questionnaire
Types
Merits and Demerits
Business Research Methods
Sessions 7 - 8 : Causal Research; Experimental Designs
Conditions for inferring causality:
• Concomitant variation
• Time order of occurrence of variables
• Absence of other possible causal factors
Some Terms:
Dependent Variable and Independent Variable; Test Units; Response; Quantitative Factors and Qualitative Factors; Levels of a Factor; Experiment; Treatment; Experimental Group; Control Group; Extraneous Variable; Moderating Variable; Hypothesis; Active Factor; Blocking Factor; Interaction Effect; Construct; Dimensions
Randomization; Matching
Classification of Experimental Designs
Solomon Four-Group Design (notation: R = random assignment, O = observation/measurement, X = exposure to treatment):
Experimental Group 1: R O1 X O2
Control Group 1: R O3 O4
Experimental Group 2: R X O5
Control Group 2: R O6
Statistical Designs
Extraneous variables that can threaten the validity of an experiment:
1. History
2. Maturation
3. Testing
4. Instrumentation
5. Selection bias
6. Statistical Regression
7. Experimental Mortality
Causal studies focus on an analysis of a situation or a specific problem to explain the patterns of
relationships between variables. Experiments are the most popular primary data collection methods
in studies with causal research design.
The presence of cause-and-effect relationships can be confirmed only if specific causal
evidence exists. Causal evidence has three important components:
1. Temporal sequence. The cause must occur before the effect. For example, it would not be
appropriate to credit an increase in sales to rebranding efforts if the increase had started before the
rebranding.
2. Concomitant variation. The variation must be systematic between the two variables. For example,
if a company doesn't change its employee training and development practices, then changes in
customer satisfaction cannot be caused by employee training and development.
3. Nonspurious association. Any covariation between a cause and an effect must be true and not
simply due to another variable. In other words, there should be no 'third' factor that relates to both
the cause and the effect.
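As an illustration of checking for a nonspurious association, the sketch below simulates a 'third' factor z that drives both a suspected cause x and an effect y, then compares their raw correlation with the correlation left after the linear effect of z is removed (a simple partial correlation; all data simulated):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# A hidden 'third factor' z drives both x (suspected cause) and y (effect)
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(scale=0.5, size=n)
y = 0.8 * z + rng.normal(scale=0.5, size=n)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)  # residual of x given z
    ry = y - np.polyval(np.polyfit(z, y, 1), z)  # residual of y given z
    return np.corrcoef(rx, ry)[0, 1]

print(np.corrcoef(x, y)[0, 1])  # sizeable raw correlation (spurious)
print(partial_corr(x, y, z))    # near zero once z is controlled for
```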
The following are examples of research objectives for causal research design:
• To assess the impacts of foreign direct investment on the levels of economic growth in
Taiwan
• To analyse the effects of re-branding initiatives on the levels of customer loyalty
• To identify the nature of impact of work process re-engineering on the levels of employee
motivation
• Causal studies may play an instrumental role in identifying reasons behind a wide
range of processes, as well as in assessing the impacts of changes on existing norms, processes,
etc.
• Causal studies usually offer the advantage of replication if the necessity arises
• This type of study is associated with greater levels of internal validity due to the systematic
selection of subjects
It can be difficult to reach appropriate conclusions on the basis of causal research findings. This is
due to the impact of a wide range of factors and variables in the social environment. In other words,
while causality can be inferred, it cannot be proved with a high level of certainty.
In certain cases, while correlation between two variables can be effectively established, identifying
which variable is the cause and which one is the effect can be a difficult task to accomplish.
Imagine taking 2 samples of the same plant and exposing one of them to sunlight, while the other is
kept away from sunlight. Let the plant exposed to sunlight be called sample A, while the latter is
called sample B.
If, after the duration of the research, we find that sample A grows while sample B dies, even
though both are regularly watered and otherwise given the same treatment, we can conclude
that sunlight aids growth in all similar plants.
The experimental research method is widely used in physical and social sciences, psychology, and
education. It is based on the comparison between two or more groups with a straightforward logic,
which may, however, be difficult to execute.
Mostly associated with laboratory test procedures, experimental research designs involve collecting quantitative data and performing statistical analysis on them during research. This makes experimental research an example of a quantitative research method.
Pre-experimental Research Design: Although very practical, pre-experimental research lacks several of the true-experimental criteria. It is further divided into three types:
One-shot Case Study: In this type of experimental study, only one dependent group or variable is considered. The study is carried out after some treatment which is presumed to cause change, making it a posttest-only study.
One-group Pretest-posttest Design: This design combines posttest and pretest study by testing a single group both before and after the treatment is administered, with the former test given at the beginning of treatment and the latter at the end.
Static-group Comparison:
In a static-group comparison study, 2 or more groups are placed under observation, where only one
of the groups is subjected to some treatment while the other groups are held static. All the groups
are post-tested, and the observed differences between the groups are assumed to be a result of the
treatment.
Quasi-experimental Research Design: These designs resemble true experiments but lack random assignment. They are very common in educational research, where administrators are unwilling to allow the random selection of students for experimental samples.
Some examples of quasi-experimental research design include the time series, the nonequivalent control group design, and the counterbalanced design.
The true experimental research design must contain a control group, a variable that can be manipulated by the researcher, and random assignment of subjects. The classifications of true experimental design include:
The posttest-only Control Group Design: In this design, subjects are randomly selected and assigned
to the 2 groups (control and experimental), and only the experimental group is treated. After close
observation, both groups are post-tested, and a conclusion is drawn from the difference between
these groups.
The pretest-posttest Control Group Design: For this control group design, subjects are randomly assigned to the 2 groups, both are pretested, but only the experimental group is treated. After close observation, both groups are post-tested to measure the degree of change in each group.
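As a concrete illustration of this design, here is a minimal simulation sketch; the group sizes, score distributions, and the built-in treatment effect are all invented for the example:

# Minimal sketch of a pretest-posttest control group design.
# All numbers (group size, distributions, the +5 effect) are hypothetical.
import random

random.seed(42)
pretest = [random.gauss(50, 10) for _ in range(200)]  # pretest scores
random.shuffle(pretest)                               # random assignment (R)
treatment, control = pretest[:100], pretest[100:]

def posttest(pre, treated):
    # everyone drifts a little (maturation); the treated group gets +5 on average
    return pre + random.gauss(2, 3) + (5 if treated else 0)

t_gain = [posttest(s, True) - s for s in treatment]
c_gain = [posttest(s, False) - s for s in control]
effect = sum(t_gain) / len(t_gain) - sum(c_gain) / len(c_gain)
print(f"estimated treatment effect: {effect:.2f}")  # close to 5 if randomization worked

Comparing the gains of the two groups cancels out influences such as maturation that affect both groups equally.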
Solomon four-group Design: This is the combination of the posttest-only and the pretest-posttest control group designs. In this case, the randomly selected subjects are placed into 4 groups.
The first two of these groups are tested using the posttest-only method, while the other two are
tested using the pretest-posttest method.
Only one group of carefully selected subjects is considered in this research, making it a pre-experimental research design example. We also notice that tests are carried out only at the end of the semester, and not at the beginning.
This makes it easy to conclude that it is a one-shot case study.
In the course of employment, organizations also carry out employee training to improve employee
productivity and generally grow the organization. Further evaluation is carried out at the end of each
training to test the impact of the training on employee skills, and test for improvement.
Here, the subject is the employee, while the treatment is the training conducted. This is a pretest-
posttest control group experimental research example.
However, the results may be influenced by factors like the natural smartness of a student. For example, a very smart student will grasp the material more easily than his or her peers, irrespective of the method of teaching.
Variables
Experimental research contains dependent, independent and extraneous variables. The dependent variables are the outcomes being measured for change; the test units carrying them are sometimes called the subjects of the research.
The independent variables are the experimental treatments being exerted on the dependent variables. Extraneous variables, on the other hand, are other factors affecting the experiment that may also contribute to the change.
Setting
The setting is where the experiment is carried out. Many experiments are carried out in the
laboratory, where control can be exerted on the extraneous variables, thereby eliminating them.
Other experiments are carried out in a less controllable setting. The choice of setting used in
research depends on the nature of the experiment being carried out.
Multivariable
Experimental research may include multiple independent variables, e.g. time, skills, test scores, etc.
Medicine: Experimental research is used to develop the proper treatment for diseases. In most cases, rather than directly using patients as the research subject, researchers take a sample of bacteria from the patient's body and treat it with the developed antibacterial agent.
The changes observed during this period are recorded and evaluated to determine its effectiveness.
This process can be carried out using different experimental research methods.
Education: Aside from science subjects like Chemistry and Physics, which involve teaching students how to perform experimental research, it can also be used to improve the standard of an academic institution. This includes testing students' knowledge of different topics, coming up with better teaching methods, and implementing other programs that will aid student learning.
Human Behavior: Social scientists most often use experimental research to test human behaviour. For example, consider 2 people randomly chosen as the subjects of a social interaction study, where one person is placed in a room without human interaction for 1 year.
The other person is placed in a room with a few other people, enjoying human interaction. There will
be a difference in their behaviour at the end of the experiment.
UI/UX: During the product development phase, one of the major aims of the product team is to create a great user experience with the product. Therefore, before launching the final product design, potential users are brought in to interact with the product.
For example, when finding it difficult to choose how to position a button or feature on the app interface, a random sample of product testers is allowed to test the 2 versions, and the effect of the button positioning on user interaction is recorded.
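A minimal sketch of how such a button-position test might be analyzed; the counts are hypothetical, and a two-proportion z-test is just one common way to compare the two versions:

# Hypothetical A/B test of two button positions.
from math import sqrt

clicks_a, testers_a = 42, 100  # version A
clicks_b, testers_b = 58, 100  # version B

p_a, p_b = clicks_a / testers_a, clicks_b / testers_b
p_pool = (clicks_a + clicks_b) / (testers_a + testers_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / testers_a + 1 / testers_b))
z = (p_b - p_a) / se
print(f"click rate A = {p_a:.0%}, B = {p_b:.0%}, z = {z:.2f}")
# |z| > 1.96 is conventionally read as significant at the 5% level.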
Observational Study: This type of study is carried out over a long period. It measures and observes
the variables of interest without changing existing conditions.
When researching the effect of social interaction on human behavior, the subjects who are placed in 2 different environments are observed throughout the research. No matter what kind of absurd behavior the subjects exhibit during this period, their conditions will not be changed.
This may be a very risky thing to do in medical cases because it may lead to death or worse medical
conditions.
Simulations: This procedure uses a mathematical, physical, or computer model to replicate a real-life process or situation. It is frequently used when the actual situation is too expensive, dangerous, or impractical to replicate in real life.
This method is commonly used in engineering and operational research for learning purposes and sometimes as a tool to estimate possible outcomes of real research. Some common simulation software packages are Simulink, MATLAB, and Simul8.
Not all kinds of experimental research can be carried out using simulation as a data collection tool. It is impractical for a lot of laboratory-based research that involves chemical processes.
Surveys: A survey is a tool used to gather relevant data about the characteristics of a population,
and is one of the most common data collection tools. A survey consists of a group of questions
prepared by the researcher, to be answered by the research subject.
Surveys can be shared with the respondents both physically and electronically. When collecting data
through surveys, the kind of data collected depends on the respondent, and researchers have
limited control over it.
Non-experimental research takes place in a real-life setting, where extraneous variables cannot be eliminated. Therefore, it is more difficult to draw conclusions from non-experimental studies, even though they are much more flexible and allow for a greater range of study fields.
The relationship between cause and effect cannot be established in non-experimental research, while it can be established in experimental research. This is because many extraneous variables also influence the changes in the research subject, making it difficult to point to a particular variable as the cause of a particular change.
Internal and external validity are concepts that reflect whether or not the results of a study are
trustworthy and meaningful. While internal validity relates to how well a study is conducted (its
structure), external validity relates to how applicable the findings are to the real world.
For example, if you implement a smoking cessation program with a group of individuals, how sure
can you be that any improvement seen in the treatment group is due to the treatment that you
administered?
Internal validity depends largely on the procedures of a study and how rigorously it is performed.
Internal validity is not a "yes or no" type of concept. Instead, we consider how confident we can be
with the findings of a study, based on whether it avoids traps that may make the findings
questionable.
The less chance there is for "confounding" in a study, the higher the internal validity and the more confident we can be in the findings. Confounding refers to a situation in which other factors come into play that confuse the outcome of a study. For instance, confounding might make us unsure whether we can trust that we have identified a true cause-and-effect relationship.
In short, you can only be confident that your study is internally valid if you can rule out alternative explanations for your findings. As a brief summary, you can only assume cause-and-effect when you meet the three criteria described earlier: temporal sequence, concomitant variation, and nonspurious association.
If you are looking to improve the internal validity of a study, you will want to consider
aspects of your research design that will make it more likely that you can reject alternative
hypotheses. There are many factors that can improve internal validity.
Just as there are many ways to ensure that a study is internally valid, there is also a list of potential threats to internal validity that should be considered when planning a study.
• Attrition: Participants dropping out or leaving a study, which means that the results are
based on a biased sample of only the people who did not choose to leave (and possibly who
all have something in common, such as higher motivation)
• Confounding: A situation in which changes in an outcome variable can be thought to have
resulted from some third variable that is related to the treatment that you administered.
• Diffusion: This refers to the treatment in a study spreading from the treatment group to the
control group through the groups interacting and talking with or observing one another. This
can also lead to another issue called resentful demoralization, in which a control group tries
less hard because they feel resentful over the group that they are in.
• Experimenter bias: An experimenter behaving in a different way with different groups in a
study, which leads to an impact on the results of this study (and is eliminated through
blinding)
• Historical events: May influence the outcome of studies that occur over a period of time,
such as a change in the political leader or natural disaster that influences how study
participants feel and act
• Instrumentation: It's possible to "prime" participants in a study in certain ways with the
measures that you use, which causes them to react in a way that is different than they
would have otherwise.
• Maturation: This describes the impact of time as a variable in a study. If a study takes place
over a period of time in which it is possible that participants naturally changed in some way
(grew older, became tired), then it may be impossible to rule out whether effects seen in the
study were simply due to the effect of time.
• Statistical regression: The tendency of participants who score at the extreme ends of a measure to score closer to the mean when measured again, purely as a statistical artifact rather than as an effect of the intervention (see the sketch after this list)
• Testing: Repeatedly testing participants using the same measures influences outcomes. If
you give someone the same test three times, isn't it likely that they will do better as they
learn the test or become used to the testing process so that they answer differently?
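Statistical regression (regression to the mean) is easy to demonstrate by simulation. A minimal sketch, with all distributions invented for the example, showing extreme scorers drifting back toward the mean with no intervention at all:

# Minimal sketch of regression to the mean; all numbers are hypothetical.
import random

random.seed(0)
ability = [random.gauss(100, 10) for _ in range(10_000)]  # true ability
test1 = [a + random.gauss(0, 10) for a in ability]        # noisy measure, time 1
test2 = [a + random.gauss(0, 10) for a in ability]        # same measure, time 2

lows = [i for i, s in enumerate(test1) if s < 85]         # extreme low scorers
mean1 = sum(test1[i] for i in lows) / len(lows)
mean2 = sum(test2[i] for i in lows) / len(lows)
print(f"extreme group: test1 mean = {mean1:.1f}, test2 mean = {mean2:.1f}")
# The test2 mean moves back toward 100 even though no treatment was given.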
Ecological validity, an aspect of external validity, refers to whether a study's findings can be
generalized to the real world.
While rigorous research methods can ensure internal validity, external validity, on the other hand,
may be limited by these methods.
Another term called transferability relates to external validity and refers to a qualitative research
design. Transferability refers to whether results transfer to situations with similar characteristics.
Factors That Improve External Validity
• Consider psychological realism: Make sure that participants are experiencing the events of a
study as a real event by telling them a "cover story" about the aim of the study. Otherwise,
in some cases, participants might behave differently than they would in real life if they know
what to expect or know what the aim of the study is.
• Do reprocessing or calibration: Use statistical methods to adjust for problems related to
external validity. For example, if a study had uneven groups for some characteristic (such as
age), reweighting might be used.
• Replicate: Conduct the study again with different samples or in different settings to see if
you get the same results. When many studies have been conducted, meta-analysis can also
be used to determine if the effect of an independent variable is reliable (based on examining
the findings of a large number of studies on one topic).
• Try field experiments: Conduct a study outside the laboratory in a natural setting.
• Use inclusion and exclusion criteria: This will ensure that you have clearly defined the
population that you are studying in your research.
External validity is threatened when a study does not take into account the interactions of variables in the real world.
• Pre- and post-test effects: When the pre- or post-test is in some way related to the effect
seen in the study, such that the cause-and-effect relationship disappears without these
added tests
• Sample features: When some feature of the particular sample was responsible for the effect
(or partially responsible), leading to limited generalizability of the findings
• Selection bias: Considered a threat to internal validity, selection bias describes differences between groups in a study that may relate to the independent variable (once again, something like motivation or willingness to take part in the study, or specific demographics of individuals being more likely to take part in an online survey).
• Situational factors: Time of day, location, noise, researcher characteristics, and how many
measures are used may affect the generalizability of findings.
Similarities
What are the similarities between internal and external validity? They are both factors that should
be considered when designing a study, and both have implications in terms of whether the results of
a study have meaning. Both are not "either/or" concepts, and so you will always be deciding to what
degree your study performs in terms of both types of validity.
Each of these concepts is typically reported in a research article that is published in a scholarly
journal. This is so that other researchers can evaluate the study and make decisions about whether
the results are useful and valid.
Differences
The essential difference between internal and external validity is that internal validity refers to the structure of a study and its variables, while external validity relates to how universal the results are. There are further differences between the two as well.
Internal validity focuses on showing that a difference is due to the independent variable alone, whereas external validity concerns whether the results can be translated to the world at large.
Examples of Validity
An example of a study with good internal validity would be if a researcher hypothesizes that using a
particular mindfulness app will reduce negative mood. To test this hypothesis, the researcher
randomly assigns a sample of participants to one of two groups: those who will use the app over a
defined period, and those who engage in a control task.
The researcher ensures that there is no systematic bias in how participants are assigned to the
groups, and also blinds his research assistants to the groups the students are in during
experimentation.
A strict study protocol is used that outlines the procedures of the study. Potential confounding variables are measured along with mood, such as the participants' socioeconomic status, gender, and age, among other factors. If participants drop out of the study, their characteristics are examined to make sure there is no systematic bias in terms of who stays in the study.
Conclusion
Experimental research designs are often considered to be the standard in research designs. This is
partly due to the common misconception that research is equivalent to scientific experiments—a
component of experimental research design.
In this research design, one or more subjects or dependent variables are randomly assigned to different treatments (i.e. independent variables manipulated by the researcher) and the results are observed so that conclusions can be drawn. A unique strength of experimental research is its ability to control the effect of extraneous variables.
Experimental research is suitable for research whose goal is to examine cause-effect relationships,
e.g. explanatory research. It can be conducted in the laboratory or field settings, depending on the
aim of the research that is being carried out.
Bibliography
Formplus Blog. (n.d.). Experimental Research. Retrieved from https://www.formpl.us/blog/experimental-research
RESEARCH DESIGN
Introduction:
Research Design is a plan for collecting and analyzing evidence that will make it possible for the investigator to answer whatever questions he or she
has posed. The design of an investigation touches almost all aspects of the research, from the minute details of data collection to the selection of the
techniques of data analysis.
Research Design is the overall strategy, framework, or blueprint that has been created for the collection, measurement and analysis of data and for finding answers to the research question.
i). It specifies the interviews to be conducted, observations to be made, experiments to be conducted, and data analysis to be performed.
Research design comes after the problem-formulation stage, and research design is not the same as research method.
➢ Research Design is the specific framework, a blueprint that has been created to seek answers to the research problem [collecting, measuring and analyzing data]
➢ Research Method is the technique used to collect the information required to answer the research problem.
The Researcher has a number of designs available to him for investigating the research objectives.
Choice of Research Design largely depends upon the objectives of the Research and how much is known about the problem and Research objectives.
In Applied Research and Market Research , we cannot take as much time as we want because things change.
i). Neutrality: The results collected in research should be free from bias and neutral. Conclusions should be discussed with, and evaluated by, multiple experienced individuals, and the extent to which they agree with the research's results should be considered.
ii). Reliability: A research design should ensure consistent results by indicating how the research questions are to be formed, because a researcher will always want the same results every time he performs the experiment.
iii). Validity: The validity of a research design concerns whether the measuring instruments measure what they are intended to measure, so that the truthfulness of the results can be established.
iv). Generalization: Generalization is one of the most important characteristics of research design. The results obtained from the research should be applicable to the population as a whole and not just to the limited sample studied.
A good research design results in more accurate results with minimum expenditure of time, effort and money.
Based on the information to be collected and the objective or purpose of the study, Research Design can be classified into 3 categories/kinds:
I. EXPLORATORY RESEARCH DESIGN [Formulative Research; Exploratory Research Design is sometimes referred to as Qualitative research]
Exploratory research is defined as research used to investigate a problem which is not clearly defined. It is conducted to achieve a better understanding of the existing problem, but it will not provide conclusive results. It is non-conclusive because there is no conclusion.
Exploratory research is conducted when the researcher does not know how and why a certain phenomenon occurs and what needs to be answered. The possible actions are then explored and evaluated by the decision-maker.
For such a research, a researcher starts with a general idea and uses this research as a medium to identify issues, that can be the focus for future
research
Often we lack sufficient understanding of the problem to formulate a specific hypothesis, and there are often several tentative explanations.
Example:
1). Management may conduct exploratory research to find out the causes of declining sales in the last few months.
Crime-investigation TV shows illustrate this well: they typically start with a crime that needs to be investigated. The initial step is to look for hints which can help establish what has happened [exploratory]. The clues found in the exploratory phase usually point in the direction of a specific hypothesis or explanation of the events, and investigators then focus their efforts in this direction, performing interviews with witnesses and suspects [Descriptive].
• Looking for hints (observation) is a way of exploring what the reason might be, and from there moving on to building hypotheses and reaching conclusions.
• Every research project starts with exploratory research and then moves to descriptive research.
The major emphasis is on the discovery of ideas and insights done in the following ways.
3). Consider a scenario where a juice bar owner feels that increasing the variety of juices will increase the number of customers; however, he is not sure and needs more information. The owner intends to carry out exploratory research to find out whether expanding the juice selection will bring in more customers, or whether there is a better idea.
iii). This research answers questions like 'what', 'why' and 'how'.
i). Findings of an exploratory study cannot be generalized to the whole population.
ii). Outcomes of this study are tentative because of its unstructured style of research.
The quickest and the cheapest way to formulate a hypothesis in exploratory research is by using any of the following methods:
Secondary data are data which have already been collected for purposes other than the problem at hand. These data can be located quickly and inexpensively. Secondary research thus uses material that has already been gathered, organized and published by others.
Internal secondary data come from inside the organization, such as internal company records; some are ready to use, while others require further processing before they can be used.
Published Material: Sources of published external secondary data include federal, state and local governments, non-profit organizations (e.g. Chambers of Commerce), trade associations and professional marketing research firms.
⚫ It includes
i) . Guides
ii) Directories
iv) Census
Computerized databases: These consist of information that has been made available in computer-readable form for electronic distribution.
⚫ It includes
i) . Online database.
Syndicated services: These services, also referred to as syndicated sources, are companies that collect and sell common pools of data of known commercial value, designed to serve information needs shared by a number of clients. Syndicated services can be classified based on the unit of measurement (households/consumers/institutions).
⚫ It includes
i) . Surveys
ii) Panels
Advantages:
2). No need to re-invent the wheel, because the work has already been done by somebody. If it is close to your problem at hand or your research topic, you can use it.
Insights can sometimes be gained without even entering the market, simply by crunching the data or analyzing the trend.
Disadvantages:
1). May have been collected for a purpose that does not match your need.
3) May not have been collected over a long enough period for detecting trends.
4) Access may be difficult or costly when data are collected for commercial reasons.
5) Aggregations (an aggregation is a collection, or the gathering of things together, e.g. "a desktop aggregation app that brings together Facebook, Twitter and LinkedIn") may be unsuitable for your need.
In general, validity refers to how sound your RESEARCH DESIGN and methods are: does the measure accurately measure the concept it is supposed to measure, and not something else?
1) CONTENT VALIDITY:
Content validity is an important research methodology term that refers to how well a test measures the behavior for which it is intended. For example, let's say your teacher gives you a psychology test on the psychological principles of sleep. The purpose of this test is to measure your knowledge or mastery of the psychological principles of sleep. If the test does indeed measure this, then it is said to have content validity.
2) . CRITERION VALIDITY:
Criterion validity is used to measure the ability of an instrument to predict future outcome. It is an extent to which a measure is related to an
outcome.
It is defined as the degree to which a measure of interest relates to a measure of established validity.
One of the simplest ways to assess criterion related validity is to compare it to a known standard.
Example: A new intelligence test could be statistically analyzed against a standard IQ test; the standard IQ test serves as the criterion because of its established ability to detect the intelligence of individuals. If the new intelligence test produces the same results as the standard IQ test, then the new test has high criterion validity.
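One simple way to quantify such a comparison is the correlation between the two sets of scores. A minimal sketch, with hypothetical scores for eight test-takers:

# Minimal sketch of assessing criterion validity against an established standard.
# Both score lists are hypothetical.
from statistics import correlation  # available in Python 3.10+

new_test = [95, 102, 110, 88, 120, 105, 99, 115]
standard_iq = [97, 100, 112, 90, 118, 103, 101, 113]

r = correlation(new_test, standard_iq)
print(f"criterion validity coefficient r = {r:.2f}")
# An r close to 1 means the new test reproduces the standard test's ordering well.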
3) . CONSTRUCT VALIDITY:
Construct validity is “ the degree to which a test measures what it claims to be measuring.” It is used to determine how well a test measures what
it is supposed to measure. CONSTRUCT Validity is one way to test the validity of a test, used in education, social sciences, psychology.
For example: You might try to find out if an educational program increases emotional maturity in elementary school age children. Construct validity
would measure if your research is actually measuring emotional maturity.
Validity is usually determined by comparing two instruments' ability to predict a similar outcome with a single variable being measured.
4) . FACE VALIDITY:
Face validity considers how suitable the content of a test seems to be on the surface. It’s similar to content validity,
but face validity is a more informal and subjective assessment.
Example
You create a survey to measure the regularity of people’s dietary habits. You review the survey items, which ask
questions about every meal of the day and snacks eaten in between for every day of the week. On its surface, the
survey seems like a good representation of what you want to test, so you consider it to have high face validity.
As face validity is a subjective measure, it’s often considered the weakest form of validity. However, it can be useful in
the initial stages of developing a method.
Literature Survey: This refers to "referring to the literature to develop a new hypothesis". The literature referred to includes trade journals, professional journals, market research publications, statistical publications, etc.
EXAMPLE:
The subject of interest is: Depression in older people. This is the starting point, the research idea.
NOTE: The key words depression, older people, exercise are broad areas and most databases such as Medline or BIDS will allow you to restrict your
search. This gives you an idea of how you could get started with your Literature review.
[See the NIHR RDS EM/YH Resource Packs: How to Search and Critically Evaluate Research Literature.]
➢ Once you have a few keywords, that is enough to do a preliminary literature search. A good librarian can help you with this and let you know what literature databases are available, which are most suitable for your area of interest, and whether there is any cost involved in carrying out the search. For guidance in undertaking this, one may wish to contact the local health sciences or hospital library.
➢ Do not rely on computerized databases alone; if you come across a good paper on the subject, check through the references to see what else you may need to get hold of.
➢ Avoid the urge to track down every possible reference that is vaguely connected to your subject area. Concentrate on more recent articles, because they are likely to summarise older work.
Standard therapy for treatment of depression among older people is drug treatment but many older people cannot comply with
antidepressant medication because of side effects.
In studies of younger adults there is evidence to suggest that aerobic and resistance (weight) training can help mildly depressed patients. Modified resistance training appears to be associated with a higher level of compliance than aerobic exercise and is safer for older people who are at risk of injury from falls.
Research question: Can weight training improve the quality of life of depressed older people?
Qualitative data collection methods fall into two broad groups:
Direct Methods: i). Focus Groups ii). Depth Interviews iii). Ethnography iv). Observation Method.
Indirect Methods: i). Projective Techniques — a). Association Techniques b). Completion Techniques c). Construction Techniques d). Expressive Techniques
Direct Methods
As a business owner, you can’t properly target or service your audience if you don’t gather information about their specific wants, needs and
fears. One of the most effective means of obtaining this type of information is to go directly to your audience to find out what’s on their minds.
A focus group is a common qualitative research technique used by companies for marketing purposes. It typically consists of a small number of
participants, usually about six to 12, from within a company's target market. The consumers are brought together and led through discussions of
important company and brand topics by a moderator.
A focus group is qualitative research because it asks participants for open-ended responses conveying thoughts or feelings.
Within a focus group, a moderator poses a series of questions intended to gain insight about the way the group views the brand, product, related
images, slogans, concepts or symbols. As a representative sample of consumers targeted by the company, a focus group can offer insights consistent
with those shared by the broader target market. Focus-group moderators should pose questions in a way that does not lead group members to provide the answers they think the moderator wants to hear.
In most instances, you will have to offer some type of incentive to your focus group members to ensure their active and honest participation. If you hire a research firm to conduct your focus group, that firm will typically handle the disbursement of the incentive, whether it is a monetary payment or some other reward.
A focus group is generally more useful when outcomes of research are very unpredictable and you’re looking for more open feedback rather than
comparisons of potential results as in a quantified research method. A focus group also allows consumers to express clear ideas and share feelings
that do not typically come out in a quantified survey or paper test. Because of the open conversation among group members, topics and discussions
are freer flowing and members can use comments from others to stimulate recall.
Another benefit is that the moderator can observe the dynamics among members of the focus group as they discuss their opinions with each other. In
many of these groups, the moderator will leave the room to allow focus group members to communicate with each other without feeling self-
conscious.
People participating in these groups can be divided into three roles: moderator, participant and observer.
Characteristics of Focus Groups:
Group size: 8 to 12
SIZE: Ideal recommended size for a group discussion is 8 to 12 members. Less than eight would not generate all the possible perspectives on the
topic and the group dynamics required for a meaningful session. And more than 12 would make it difficult to get any meaningful insight.
NATURE: i). The respondents must be similar in terms of the subject/policy/product knowledge and experience with the product under study. ii). Respondents from similar backgrounds must be included to avoid disagreement. iii). It is recommended that the group consist of strangers rather than subjects who know each other.
SETTING: The space or setting in which the discussion takes place should be as neutral, normal and comfortable as possible. In case one-way
mirrors or cameras are installed, there is a need to ensure that these gadgets are not directly visible.
TIME PERIOD: The discussion should be held in a single sitting unless there is a 'before' and 'after' design, which requires measuring group perceptions initially, before the study variable is introduced, and later in order to gauge the group's reactions. The ideal duration should not exceed an hour and a half. This is usually preceded by a short rapport-formation session between the moderator and the group members.
The recording: Most often it is machine recording, sometimes may be accompanied by human recording as well.
THE MODERATOR: The moderator is the one who manages the discussion. He is the key conductor of the whole session, is supposed to oversee the nature, content and validity of the data collected, and needs to possess some critical moderating skills.
i). Listening: The moderator must have a good listening ability. He must not miss the participant’s comment, due to lack of attention.
ii). Permissive: The moderator must be permissive, yet alert to the signs that the group is disintegrating.
iii). Memory: He must have a good memory. The moderator must be able to remember the comments of the participants.
Example: A discussion is centered around a new advertisement by a telecom company. A participant may make a statement early on and another statement later that contradicts the earlier one; for instance, the participant may say that s(he) never subscribed to the views expressed in the competitor's advertisement, but subsequently say that "the current advertisement of the competitor is excellent".
vi). Sensitivity: The moderator must be sensitive enough to guide the group discussion.
EXAMPLES:
2) A political party may be interested in how young adult voters would react to certain policies. By observing young adults discussing those policies, market researchers can then report their findings to their client.
DEPTH INTERVIEWS: In a depth interview there are no restrictions imposed by a formal list of questions. The interview may be conducted in a casual and informal manner, in which the flow of the conversation determines what questions are asked and the order in which they are asked.
4) It may take from 30 minutes to more than one hour.
5) To understand this technique let's take a department store example.
6) The interviewer begins by asking a general question such as,
7) Interviewer: “How do you feel about shopping at department stores?”
8) Respondent: “Shopping isn’t fun anymore”
9) Interviewer: “Why isn’t it fun anymore?”
10) Respondent: “Fun has just disappeared from shopping” [ answer is not very revealing, so the interviewer may ask a probing
question]
11) Interviewer: “Why was it fun before and what has changed?”
12) The wording of the questions and the order in which they are asked are influenced by the subject's replies. Probing is of critical importance in obtaining meaningful responses and uncovering hidden issues.
13) Probing is done by asking questions like “Why do you say that?” , “ That’s interesting, can you tell me more?” or “Would you
like to add anything else?”
The Observation method is one of the most common data collection methods in primary research. Some of the most significant scientific discoveries in human history were made using observational techniques. Today this technique is used by social scientists, natural scientists, engineers, computer scientists, educational researchers, market researchers and so on.
It is best used in situations where one may be required to study mass behaviour, like students in a college canteen, commuters in a railway station or shoppers in a mall. Such observations are conducted in natural settings and not in a laboratory or controlled experimental setting. Ethically speaking, the recording of your observations should be without bias or judgement.
The research involves observing people, and there are two common modes of observation.
In participant observation, the researcher interacts with the group being observed. This is a common method within ethnographic research in sociology and anthropology. Here the researcher may interact with the group and become a part of their community.
In non-participant observation there is no interaction with the participants. The participants' behavior is simply recorded without their knowledge; the group being observed does not know they are being observed.
There is a further classification: both these methods can be done either covertly, i.e. secretly, where the participants do not know what is going on, or overtly, meaning openly, where the participants know exactly why the observer is there.
Obviously there are advantages and disadvantages to all these methods. The advantage of covert observation (participant or non-participant) is that you are likely to see the most natural behavior and attitudes of the persons being observed. Covert methods are suitable when it is difficult to reach groups, or for groups that do not welcome an observer. The disadvantage of covert methods is that it may be difficult to get into a group (close-knit groups).
With overt methods, again participant or non-participant, the advantage is that they are quicker and simpler, because all that is required is to convince the group to allow itself to be observed. Ethically this is the best way to conduct an observational study. However, the disadvantage is that, again with participant or non-participant observation, you can never be sure whether the participants are putting up an act for the observer's benefit or being truthful and not hiding facts.
IV. ETHNOGRAPHY
Ethnography is the systematic study of people and cultures. It is designed to explore cultural phenomena.
Ethnography is like a picture of a culture. It describes the various systems and relationships that surround every day life of individuals and sometimes
specific cultural practices.
Photo Ethnography:
This is a method where in the researcher clicks pictures of behaviours, attitudes and emotions of individuals in various situations instead of
interrogating the respondents.
Projective techniques:
Projective techniques are indirect and unstructured methods of investigation. These techniques are useful in giving respondents opportunities to express their attitudes without personal embarrassment. They help respondents project their own attitudes and feelings unconsciously onto the subject under study, and they play an important role in motivational research and attitude surveys.
Example: Many a time people do not want to reveal their true motives for fear of being branded 'old fashioned'. Consider a question such as "Do you do all the household work yourself?" The answer may be 'no', though the truth is 'yes'; a 'yes' answer may not be given because it may suggest that the family is not financially sound and cannot afford a maid for help.
iv). The respondent selected may not be representative of the entire population.
In the association technique, the respondent is presented with a stimulus word and asked to respond with the first word that comes to mind. By answering quickly, the respondent gives the word that he or she associates most closely with the stimulus word.
An individual is given a clue or hint and asked to respond with the first thing that comes to mind. The stimulus can take the shape of a picture or a word, and there can be many interpretations of the same thing. A list of words is given, and the respondent does not know in which word the researcher is most interested. The interviewer records the responses, which reveal the inner feelings of the respondents. The frequency with which any word is given as a response, and the amount of time that elapses before the response is given, are important for the researcher. For example, out of 50 respondents, 20 people associate the word "Fair" with "complexion".
Example 1: What brand of detergent comes to your mind first, when I mention washing of an expensive cloth?
Example 3: In a study of cigarettes, the respondent is asked to give the first word that comes to his mind:
i). Injurious ii). Style iii). Strong iv). Stimulus v). Bad manners vi). Disease vii). Pleasure.
The subject’s response to each word is recorded and responses are timed so that respondents who hesitate or reason out can be identified.
Example 4:
The sets of responses are quite different, suggesting that the women differ in personality and in their attitudes toward housekeeping.
These findings suggest that the market for detergents could be segmented on the basis of attitudes.
iii). The number of respondents who do not respond at all to a test word within a reasonable period of time.
An individual’s pattern of responses and the details of the response are used to determine the person’s underlying attitudes or feelings on the topic
of interest.
Completion techniques include sentence completion and story completion.
Sentence completion
This technique is an extension of the association technique. It involves presenting an incomplete sentence to the respondent, to be completed in any way the respondent chooses; it is the most popular of all projective techniques and is used in almost all measuring instruments in the form of open-ended questions. Generally, respondents are given a single word or phrase as a starting point and asked to finish a set of incomplete sentences.
Example: Let us make a study dealing with people’s inner feelings towards software professionals.
Example: Suppose you want to provide a basis for developing advertising appeal for a brand of cooking oil, the following sentences may be used.
v). One important feature to be highlighted in the advertisement about cooking oil is -----------------------------------------
Story completion
Another extension of the completion technique is story completion. Here, the individual is given an incomplete story or idea. The subject is supposed
to complete the story and provide a conclusion. The completion of the story/sentence will reflect the underlying attitude, personality traits of the
person, and state of mind.
A situation is described to a respondent who is asked to complete the story based on his opinion and attitude. This technique will reveal the interest of
the respondent, but it is difficult to interpret.
Example: Two children are quarreling at the breakfast table before going to school. The younger of the two spilled coffee on her brother's shirt, which he was supposed to wear the same day for the annual sports event.
Example: Mr. X belongs to the upper-middle class. He received a telephone call where the caller said: "I am from Globe Travels. Sir, I want to tell you about our recent offer: if you travel to the US this summer, you will get two free tickets by the year end to fly to the Far East."
III). Construction Technique:
This is more or less like the completion test. Respondents can be given a picture and asked to write a story about it. The initial structure is limited and not as detailed as in the completion test. For example, two cartoons are given and a dialogue is to be written.
Cartoon tests: These tests make use of animated characters in a particular situation. The cartoon usually has a picture with two or more characters talking to each other. One or more 'balloons' containing the conversation of a character are left open, and the respondent is asked to fill them in.
Usually the statement or question by one character is given, and one needs to fill in the response made by the other character. The picture has a direct relation to the topic under study and is assumed to reveal the respondent's attitudes, feelings or intended behavior. Cartoon tests are among the easiest to administer, analyze and score.
IV). Expressive Techniques: In these, people are asked to express the feelings or attitudes of other people.
Examples:
i). Clay modelling: here the emphasis is on the manner in which the person uses or works with clay and not on the end result.
ii). Role playing: Another technique used in business research. The respondents are asked to play the role or assume the behavior of someone else.
iii). The third-person technique: In this case the respondent is presented with a verbal or visual situation and needs to express what might be the person's beliefs and attitudes.
➢ Descriptive Research is used when a researcher tries to understand the relationship between two things or to describe a particular behavior as it occurs in the environment.
➢ Descriptive research cannot establish a cause-and-effect relationship between the characteristics of interest. This is the distinct disadvantage of descriptive research.
➢ Here the researcher does not have control over the variables.
➢ The research is based on primary data.
➢ It is a theoretical type of research and answers 'what', 'why' and 'how' in a research study. It is used to describe a specific situation based on data collection and comparative study across time and situations.
➢ It is a conclusive type of research. It uses the features of both types of research, that is, qualitative research as well as quantitative research.
Survey method.
➢ Survey method involves a structured questionnaire given to respondents and designed to elicit specific information.
➢ Variety of questions are asked regarding behavior, intentions, attitudes, awareness, motivations, demographic and life style characteristics.
➢ Questions may be asked verbally, in writing or via computer.
Survey Research is one of the most popular and easy forms of research to obtain information or to collect data.
A questionnaire is prepared to contain questions related to the research problem either on paper or in any digital format. These questionnaires are
distributed among random people in the hope of getting their accurate opinion.
A questionnaire is a very convenient way of collecting information from a large number of people within a short period of time. Hence the design of the questionnaire is of utmost importance to ensure that accurate data are collected and that the results are interpretable and generalizable.
A good questionnaire should be valid, reliable, clear, interesting and succinct [a succinct questionnaire asks questions that aim to answer only the research objectives].
A survey can come in many forms: Postal survey, telephone interviews, face-to-face interviews and internet surveys.
The survey research method is popularly used in University researches and business researches.
Survey research is also called Primary research and can be used with other methods to obtain accurate outcomes.
Moreover, data collected from survey research can be used as secondary research data by other researchers.
Observation method.
A naturalistic observation study means studying subjects while they exhibit their natural behavior.
In participant observation, people being observed in the research study are aware of the observation. They are asked to take part in the observation
study.
Observational research methods are suitable for studying the behavior of the subjects under study. However, this research is incapable of providing information about the actual cause of the subjects' behavior.
Based on the time period over which research information is collected, Descriptive Research Design (DRD) is further divided into two categories:
A). In a Longitudinal study, researchers repeatedly examine the same individuals to detect any changes that might occur over a period of time. This is also known as a 'Time Series Study'. Through a longitudinal study, the researcher comes to know how the market changes over time.
Longitudinal studies are a type of correlational research in which researchers observe and collect data on a number of variables without trying to influence those variables.
Longitudinal studies involve panels. The elements of these panels may be individuals, stores, dealers etc. The panel or sample remains constant throughout the period, though there may be some dropouts and additions. The sample members in the panel are measured repeatedly; the periodicity of the study may be monthly or quarterly.
Example:
Assume a market research study on ready-to-eat food is conducted at two different points of time, T1 and T2, with a gap of 4 months. At each of these times, a sample of 2,000 households is chosen and interviewed. The brands used most in the households are recorded as follows.
Brands       At T1          At T2
Brand X      500 (25%)      600 (30%)
Brand Y      700 (35%)      650 (32.5%)
Brand Z      400 (20%)      300 (15%)
Brand M      200 (10%)      250 (12.5%)
All others   200 (10%)      200 (10%)
Total        2000 (100%)    2000 (100%)
As can be seen, between periods T1 and T2, Brand X and Brand M have shown an improvement in market share, Brand Y and Brand Z have decreased in market share, whereas all other categories remain the same. This shows that Brands X and M have gained market share at the cost of Y and Z.
The opposite of a longitudinal study is a Cross-Sectional study. Longitudinal studies repeatedly observe the same participants over a period of time, while cross-sectional studies examine different samples (or cross-sections) of the population at one point in time. The cross-sectional study is one of the most important types of descriptive research.
A cross-sectional design is one in which a sample of respondents is drawn from the target population and information is obtained from this sample once. It is scientific in its approach and is the most frequently used descriptive design in marketing research. It involves the study of individuals, usually their attitudes or beliefs, at one point in time or within a selected time period; data are collected only once.
Now Cross – Sectional Research Design [CSRD] is again divided into two:
In Single cross sectional designs, only one sample of respondents is drawn from the target population and information is obtained from this sample
only once. These designs are also called sample survey research designs.
In Multiple cross-sectional research designs there are two or more samples of respondents, and information from each sample is obtained only once.
Both types of study can prove useful in research. Because cross-sectional studies are shorter, and therefore cheaper to carry out, they can be used to discover correlations that can then be investigated in a longitudinal study. Example:
To study the relationship between smoking and stomach cancer, first conduct a cross-sectional study to see if there is a link between smoking and stomach cancer. Suppose it is discovered that a link exists in men but not in women.
We can then design a longitudinal study to further examine this relationship in men. Without the cross-sectional study, we would not have known to focus on men in particular.
COHORT ANALYSIS:
A cohort is a group of respondents who experience the same event within the same time interval.
For example, a cohort of people born in Mumbai in the year 1990 would be called a "birth cohort".
Many a times the data is obtained from different samples at different time intervals and then they are compared. Such cross-sectional surveys
conducted on different time intervals is called COHORT ANALYSIS.
For example, in India GSK offers Horlicks for Ladies and Horlicks for Children; tracking such segments over time is an interesting way of understanding how to retain consumers who belong to different cohorts.
It is unlikely that any of the individuals studied at time one will also be in the sample at time two. For example, the age cohort of people between 8
and 19 years old was selected and their soft drink consumption was examined every 10 years for 30 years. In other words, every 10 years a different
sample of respondents was drawn from the population of those who were then between 8 and 19 years old.
Attitudes are latent constructs and because of this, they are not directly observable
Attitudes are thought to have three components:
1. Affective component 2. Cognitive component 3. Behavioural component
Business Research Methods
Sessions 9 - 10 : Measurement and Scaling
Attitude …….
1. An affective component that expresses how much affinity someone has
toward the relevant matter. More simply, this is the feeling of liking or
not liking something.
Scales – two categories of Attitudinal Scales : Rating Scales, and Ranking Scales
Now that we know the four different types of scales that can be used to
measure the operationally defined dimensions and elements of a variable,
it is necessary to examine the methods of scaling (that is, assigning numbers
or symbols) to elicit the attitudinal responses of subjects toward objects,
events, or persons. There are two main categories of attitudinal scales
(not to be confused with the four different types of scales)—
the Rating scale and the Ranking scale. Rating scales have several response
categories and are used to elicit responses with regard to the object, event,
or person studied. Ranking scales, on the other hand, make comparisons
between or among objects, events, or persons and elicit the preferred
choices and ranking among them.
Rank the following services while selecting a new mobile service provider. You have to give a rank from 1 to 6 to each service. (One respondent's example.)
Q-Sort
Non-Comparative Scales:
Graphic Rating Scale:
Please indicate how much you like KitKat chocolate by pointing to the face that best shows your attitude and taste.
If you do not prefer it at all, you would point to face one "1". In case you prefer it the most, you would point to face seven "7".
Non-Comparative Scales:
Itemized Rating Scales:
Likert Scale; Semantic Differential Scale; Stapel Scale; Thurstone Scale
Likert Scale:
- Rensis Likert, 1932
- Scale Item: eg. expressing the degree of agreement or disagreement
- Number of points : 4, 5, 6 or 7
Advantages
Stapel Scale: e.g., a scale with no neutral point or zero.
Validity : the extent to which differences in observed scale scores reflect true differences among objects on the characteristic being measured, rather than systematic or random error. Perfect validity requires that there be no measurement error (X_O = X_T, E_S = 0, E_R = 0).
Validity : Are we really measuring what we intend to measure?
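Spelled out, the notation in the definition above corresponds to the standard decomposition of an observed score (the symbols match those in the parenthesis above):

X_O = X_T + E_S + E_R

where X_O is the observed scale score, X_T the true score of the characteristic, E_S systematic error, and E_R random error. Perfect validity requires E_S = 0 and E_R = 0, so that X_O = X_T.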
Types of Validity:
Content Validity or Face Validity
Construct Validity – Convergent; and Discriminant
Criterion Validity – Concurrent; and Predictive
Choosing a Scale ??
Factors to be considered for choosing a scale:
Data Properties
Number of Dimensions
Number of Scale Categories
Odd or Even Number of Categories
Balanced v/s Unbalanced Scales
Forced v/s Unforced Scales
Characteristics or Goodness of Measurement Scales:
Accuracy and Precision; Reliability; Validity; Practicality
Topic #1: Introduction to
measurement and statistics
"Statistics can be fun or at least they don't need to be feared." Many
folks have trouble believing this premise. Often, individuals walk into
their first statistics class experiencing emotions ranging from slight
anxiety to borderline panic. It is important to remember, however, that the basic mathematical concepts required to understand basic statistics are not prohibitive for any university student. The key to doing well in any statistics course can be summarized by two words: "KEEP UP!". If
you don't understand a concept - reread the material, do the practice
questions at the end of each chapter, and don't be afraid to ask your
instructor for clarification or help. This is important because the material
discussed four weeks from today will be based on material discussed
today. If you keep on top of the material and relax a little bit, you might
even find you enjoy this introduction to basic measurements and
statistics. With that preface out of the way, we can now get down to the
business of discussing, "What do the terms measurement and statistic
mean?" and "Why should we study measurement and statistics?".
What is a Statistic?
Statistics are part of our everyday life. All one needs to do is examine
the baseball boxscores in the newspaper or their bank statement
(hopefully, not in the newspaper) for examples of statistics. Statistics
in and of themselves are not anxiety producing. For example, most
individuals (particularly those familiar with baseball) will not
experience anxiety when a player's batting average is displayed on
the television screen. The "batting average" is a statistic but as we
know what it means and how to interpret it, we do not find it
particularly frightening. The idea of statistics is often anxiety
provoking simply because it is a tool with which we are unfamiliar.
Therefore, let us examine what is meant by the term statistic; Kuzma
(1984) provides a formal definition:
A body of techniques and procedures dealing with the collection,
organization, analysis, interpretation, and presentation of information
that can be stated numerically.
What is Measurement?
Normally, when one hears the term measurement, they may think in terms of measuring the length of something (e.g., the length of a piece of wood) or measuring a quantity of something (e.g., a cup of flour).
This represents a limited use of the term measurement. In statistics,
the term measurement is used more broadly and is more
appropriately termed scales of measurement. Scales of
measurement refer to ways in which variables/numbers are defined
and categorized. Each scale of measurement has certain properties which in turn determine the appropriateness of using certain statistical analyses. The four scales of measurement are nominal,
ordinal, interval, and ratio.
The table below will help clarify the fundamental differences between the four scales of measurement:

Scale    | Indicates Difference | Indicates Direction of Difference | Indicates Amount of Difference | Absolute Zero
---------|----------------------|-----------------------------------|--------------------------------|--------------
Nominal  | X                    |                                   |                                |
Ordinal  | X                    | X                                 |                                |
Interval | X                    | X                                 | X                              |
Ratio    | X                    | X                                 | X                              | X
You will notice in the above table that only the ratio scale meets the
criteria for all four properties of scales of measurement.
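To make the table concrete, here is a minimal R sketch (the variable names are invented for illustration) of how the first scale types are typically represented; ratio variables are ordinary numerics with a true zero:

blood_group <- factor(c("A", "B", "O", "AB"))           # nominal: categories, no order
rating <- ordered(c("low", "medium", "high"),           # ordinal: direction of difference
                  levels = c("low", "medium", "high"))  #   is meaningful, amount is not
height_m <- c(1.62, 1.75, 1.80)                         # ratio: equal intervals and an absolute zero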
Qualitative Research Procedures are classified as Direct (Nondisguised) or Indirect (Disguised).
Survey Research is used: “to answer questions that have been raised, to solve
problems that have been posed or observed, to assess needs and set goals,
to determine whether or not specific objectives have been met, to establish
baselines against which future comparisons can be made, to analyze trends
across time, and generally, to describe what exists, in what amount, and in
what context.” (Isaac & Michael, 1997)
Consumer Panels:
This method of collecting information is used by distributors as well as
manufacturers through their salesmen at regular intervals. Distributors get
the retail stores audited through their salesmen and use such information to
estimate market size, seasonal purchasing patterns, and so on.
The data are collected not by questions but by observation.
Mechanical observation devices: Oculometers (track eye movements); Pupilometers (measure pupil dilation).
Sociometric Analysis:
Sociometry is "a method for discovering, describing, and evaluating social status,
structure, and development through measuring the extent of acceptance or
rejection between individuals in groups".
The sociometric test or measure is a means of eliciting positive, negative, and
neutral reactions, or attractions, repulsions and indifferent attitudes within a
given group. This is achieved by asking each member of the group to specify all those in the group with whom he or she would like to participate in a particular activity.
Application: Sociometric studies have been made of home groups, work groups,
and school groups; entire communities, fraternities, school students, college
students and camps; and of such problems and processes as leadership, morale,
social adjustment, race relations.
Business Research Methods
Sessions 11 - 12 : Survey Research
Sampling:
Sample size depends upon the following factors:
- Size of the population
- Heterogeneity of the population
- Desired accuracy
- Tolerable level of error
- Resources: manpower, time, cost
Sampling: Types
Probability types: Simple Random, Systematic, Stratified, Cluster, Multi-Stage
Non-Probability types: Haphazard, Purposive, Quota, Judgemental, Convenience, Snowball
(A short R sketch of simple random and systematic sampling follows.)
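As a concrete illustration of the first two probability types, here is a minimal base-R sketch (the population size and sample size are invented for the example):

N <- 1000                                          # hypothetical population size
n <- 50                                            # desired sample size
srs <- sample(1:N, n)                              # simple random sampling: each unit has an equal chance
k <- N / n                                         # sampling interval for systematic sampling
start <- sample(1:k, 1)                            # random starting point within the first interval
sys <- seq(from = start, by = k, length.out = n)   # then every k-th unit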
Observation method
Method format:
Standardized and structured
Non-standardized and unstructured
Observation setting
Natural environment
Simulated environment
We shall generally continue to make use of the terms “independent variable” and “dependent variable,”
but shall find the distinction between the two somewhat blurred in multivariate designs, especially those
observational rather than experimental in nature. Classically, the independent variable is that which is
manipulated by the researcher. With such control, accompanied by control of extraneous variables through
means such as random assignment of subjects to the conditions, one may interpret the correlation between the
dependent variable and the independent variable as resulting from a cause-effect relationship from
independent (cause) to dependent (effect) variable. Whether the data were collected by experimental or
observational means is NOT a consideration in the choice of an analytic tool. Data from an experimental
design can be analyzed with either an ANOVA or a regression analysis (the former being a special case of the
latter) and the results interpreted as representing a cause-effect relationship regardless of which statistic was
employed. Likewise, observational data may be analyzed with either an ANOVA or a regression analysis, and
the results cannot be unambiguously interpreted with respect to causal relationship in either case.
We may sometimes find it more reasonable to refer to “independent variables” as “predictors”, and
“dependent variables” as “response-,” “outcome-,” or “criterion-variables.” For example, we may use SAT
scores and high school GPA as predictor variables when predicting college GPA, even though we wouldn’t
want to say that SAT causes college GPA. In general, the independent variable is that which one considers
the causal variable, the prior variable (temporally prior or just theoretically prior), or the variable on which one
has data from which to make predictions.
While psychologists generally think of multivariate statistics in terms of making inferences from a
sample to the population from which that sample was randomly or representatively drawn, sometimes it may
be more reasonable to consider the data that one has as the entire population of interest. In this case, one
may employ multivariate descriptive statistics (for example, a multiple regression to see how well a linear
model fits the data) without worrying about any of the assumptions (such as homoscedasticity and normality of
conditionals or residuals) associated with inferential statistics. That is, multivariate statistics, such as R2, can
be used as descriptive statistics. In any case, psychologists rarely ever randomly sample from some
population specified a priori, but often take a sample of convenience and then generalize the results to some
abstract population from which the sample could have been randomly drawn.
Rank-Data
I have mentioned the assumption of normality common to “parametric” inferential statistics. Please
note that ordinal data may be normally distributed and interval data may not, so scale of measurement is
irrelevant. Both ordinal and interval data may be distributed in any way. There is no relationship between
scale of measurement and shape of distribution for ordinal, interval, or ratio data. Rank-ordinal data will,
however, be non-normally distributed (rectangular) in the marginal distribution (not necessarily within groups),
so one might be concerned about the robustness of a statistic’s normality assumption with rectangular data.
Although this is a controversial issue, I am moderately comfortable with rank data when there are twenty to
thirty or more ranks in the sample (or in each group within the total sample).
Consider IQ scores. While these are commonly considered to be interval scale, a good case can be
made that they are ordinal and not interval. Is the difference between an IQs of 70 and 80 the same as the
difference between 110 and 120? There is no way we can know, it is just a matter of faith. Regardless of
whether IQs are ordinal only or are interval, the shape of a distribution of IQs is not constrained by the scale of
measurement. The shape could be normal, it could be very positively skewed, very negatively skewed, low in
kurtosis, high in kurtosis, etc.
One might object that psychologists got along OK for years without multivariate statistics. Why the
sudden surge of interest in multivariate stats? Is it just another fad? Maybe it is. There certainly do remain
questions that can be well answered with simpler statistics, especially if the data were experimentally
generated under controlled conditions. But many interesting research questions are so complex that they
demand multivariate models and multivariate statistics. And with the greatly increased availability of high
speed computers and multivariate software, these questions can now be approached by many users via
multivariate techniques formerly available only to very few. There is also an increased interest recently with
observational and quasi-experimental research methods. Some argue that multivariate analyses, such as
ANCOVA and multiple regression, can be used to provide statistical control of extraneous variables. While I
opine that statistical control is a poor substitute for a good experimental design, in some situations it may be
the only reasonable solution. Sometimes data arrive before the research is designed, sometimes experimental
or laboratory control is unethical or prohibitively expensive, and sometimes somebody else was just plain
sloppy in collecting data from which you still hope to distill some extract of truth.
But there is danger in all this. It often seems much too easy to find whatever you wish to find in any
data using various multivariate fishing trips. Even within one general type of multivariate analysis, such as
multiple regression or factor analysis, there may be such a variety of “ways to go” that two analyzers may
easily reach quite different conclusions when independently analyzing the same data. And one analyzer may
select the means that maximize e’s chances of finding what e wants to find or e may analyze the data many
different ways and choose to report only that analysis that seems to support e’s a priori expectations (which
may be no more specific than a desire to find something “significant,” that is, publishable). Bias against the
null hypothesis is very great.
It is relatively easy to learn how to get a computer to do multivariate analysis. It is not so easy correctly
to interpret the output of multivariate software packages. Many users doubtlessly misinterpret such output, and
many consumers (readers of research reports) are being fed misinformation. I hope to make each of you a
more critical consumer of multivariate research and a novice producer of such. I fully recognize that our
computer can produce multivariate analyses that cannot be interpreted even by very sophisticated persons.
Our perceptual world is three dimensional, and many of us are more comfortable in two dimensional space.
Multivariate statistics may take us into hyperspace, a space quite different from that in which our brains (and
thus our cognitive faculties) evolved.
Categorical Variables
We shall consider multivariate extensions of statistics for designs where we treat all of the variables as
categorical. You are already familiar with the bivariate (two-way) Pearson Chi-square analysis of contingency
tables. One can expand this analysis into 3 dimensional space and beyond, but the log-linear model covered
in Chapter 17 of Howell is usually used for such multivariate analysis of categorical data. As an example of
such an analysis consider the analysis reported by Moore, Wuensch, Hedges, & Castellow in the Journal of
Social Behavior and Personality, 1994, 9: 715-730. In the first experiment reported in this study mock jurors
were presented with a civil case in which the female plaintiff alleged that the male defendant had sexually
harassed her. The manipulated independent variables were the physical attractiveness of the defendant
(attractive or not), and the social desirability of the defendant (he was described in the one condition as being
socially desirable, that is, professional, fair, diligent, motivated, personable, etc., and in the other condition as
being socially undesirable, that is, unfriendly, uncaring, lazy, dishonest, etc.) A third categorical independent
variable was the gender of the mock juror. One of the dependent variables was also categorical, the verdict
rendered (guilty or not guilty). When all of the variables are categorical, log-linear analysis is appropriate.
When it is reasonable to consider one of the variables as dependent and the others as independent, as in this
study, a special type of log-linear analysis called a LOGIT ANALYSIS is employed. In the second experiment
in this study the physical attractiveness and social desirability of the plaintiff were manipulated.
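As a sketch of how such a model can be fit in R, using MASS::loglm on a three-way contingency table (the counts below are invented placeholders, not the study's data):

library(MASS)
# Hypothetical 2 x 2 x 2 table: juror gender x defendant social desirability x verdict
tab <- array(c(20, 10, 15, 25, 18, 12, 22, 8), dim = c(2, 2, 2),
             dimnames = list(gender = c("female", "male"),
                             desirability = c("low", "high"),
                             verdict = c("guilty", "not guilty")))
# A logit-style model: verdict depends on gender and desirability,
# which are themselves allowed to be associated with each other
loglm(~ gender * desirability + verdict * (gender + desirability), data = tab)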
Earlier research in these authors’ laboratory had shown that both the physical attractiveness and the
social desirability of litigants in such cases affect the outcome (the physically attractive and the socially
desirable being more favorably treated by the jurors). When only physical attractiveness was manipulated
(Castellow, Wuensch, & Moore, Journal of Social Behavior and Personality, 1990, 5: 547-562) jurors favored
the attractive litigant, but when asked about personal characteristics they described the physically attractive
litigant as being more socially desirable (kind, warm, intelligent, etc.), despite having no direct evidence about
social desirability. It seems that we just assume that the beautiful are good. Was the effect on judicial
outcome due directly to physical attractiveness or due to the effect of inferred social desirability? When only
social desirability was manipulated (Egbert, Moore, Wuensch, & Castellow, Journal of Social Behavior and
Personality, 1992, 7: 569-579) the socially desirable litigants were favored, but jurors rated them as being more
physically attractive than the socially undesirable litigants, despite having never seen them! It seems that we
also infer that the bad are ugly. Was the effect of social desirability on judicial outcome direct or due to the
effect on inferred physical attractiveness? The 1994 study attempted to address these questions by
simultaneously manipulating both social desirability and physical attractiveness.
In the first experiment of the 1994 study it was found that the verdict rendered was significantly affected
by the gender of the juror (female jurors more likely to render a guilty verdict), the social desirability of the
defendant (guilty verdicts more likely with socially undesirable defendants), and a strange Gender x Physical
Attractiveness interaction: Female jurors were more likely to find physically attractive defendants guilty, but
male jurors’ verdicts were not significantly affected by the defendant’s physical attractiveness (but there was a
nonsignificant trend for them to be more likely to find the unattractive defendant guilty). Perhaps female jurors
deal more harshly with attractive offenders because they feel that they are using their attractiveness to take
advantage of a woman.
The second experiment in the 1994 study, in which the plaintiff’s physical attractiveness and social
desirability were manipulated, found that only social desirability had a significant effect (guilty verdicts were
more likely when the plaintiff was socially desirable). Measures of the strength of effect of the
independent variables in both experiments indicated that the effect of social desirability was much greater than
any effect of physical attractiveness, leading to the conclusion that social desirability is the more important
factor—if jurors have no information on social desirability, they infer social desirability from physical
attractiveness and such inferred social desirability affects their verdicts, but when jurors do have relevant
information about social desirability, litigants’ physical attractiveness is of relatively little importance.
Continuous Variables
We shall usually deal with multivariate designs in which one or more of the variables is considered to
be continuously distributed. We shall not nit-pick on the distinction between continuous and discrete variables,
as I am prone to do when lecturing on more basic topics in statistics. If a discrete variable has a large number
of values and if changes in these values can be reasonably supposed to be associated with changes in the
magnitudes of some underlying construct of interest, then we shall treat that discrete variable as if it were
continuous. IQ scores provide one good example of such a variable.
MULTIPLE REGRESSION
Univariate regression. Here you have only one variable, Y. Predicted Y will be that value which
satisfies the least squares criterion – that is, the value which makes the sum of the squared deviations about it
as small as possible: Ŷ = a, error = Y − Ŷ. For one and only one value of a, the intercept, is it true that Σ(Y − Ŷ)² is as small as possible. Of course you already know that value (the mean), as it was one of the three definitions of
the mean you learned very early in PSYC 6430. Although you did not realize it at the time, the first time you
calculated a mean you were actually conducting a regression analysis.
Consider the data set 1,2,3,4,5,6,7. Predicted Y = mean = 4. Here is a residuals plot. The sum of the
squared residuals is 28. The average squared residual, also known as the residual variance, is 28/7 = 4. I am
considering the seven data points here to be the entire population of interest. If I were considering these data
a sample, I would divide by 6 instead of 7 to estimate the population residual variance. Please note that this
residual variance is exactly the variance you long ago learned to calculate as σ² = Σ(Y − μ)² / n.
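The arithmetic here is easy to verify; a minimal R sketch using the same seven data points:

y <- c(1, 2, 3, 4, 5, 6, 7)
a <- mean(y)        # the univariate "regression" prediction: 4
res <- y - a        # residuals
sum(res^2)          # 28, the sum of squared residuals
mean(res^2)         # 28/7 = 4, the population residual variance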
Bivariate regression. Here we have a value of X associated with each value of Y. If X and Y are not
independent, we can reduce the residual (error) variance by using a bivariate model. Using the same values of
Y, but now each paired with a value of X, here is a scatter plot with regression line in black and residuals in
red.
The residuals are now -2.31, .30, .49, -.92, .89, -.53, and 2.08. The sum of the squared residuals is
11.91, yielding a residual variance of 11.91/7 = 1.70. With our univariate regression the residual variance was
4. By adding X to the model we have reduced the error in prediction considerably.
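In R the bivariate model is a one-liner; the X values used in the text's example are not reproduced here, so the x below is a hypothetical stand-in paired with the same Y's:

y <- c(1, 2, 3, 4, 5, 6, 7)
x <- c(2, 1, 4, 3, 6, 5, 7)    # hypothetical predictor values
fit <- lm(y ~ x)               # least-squares regression line
mean(resid(fit)^2)             # residual variance about the line (dividing by n, as in the text)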
Trivariate regression. Here we add a second X variable. If that second X is not independent from
error variance in Y from the bivariate regression, the trivariate regression should provide even better prediction
of Y.
Here is a three-dimensional scatter plot of the trivariate data (produced with Proc g3d):
The lines (“needles”) help create the illusion of three-dimensionality, but they can be suppressed.
The predicted values here are those on the plane that passes through the three-dimensional space
such that the residuals (differences between predicted Y, on the plane, and observed Y) are as small as
possible.
The sum of the squared residuals now is .16 for a residual variance of .16/7 = .023. We have almost
eliminated the error in prediction.
Hyperspace. If we have three or more predictors, our scatter plot will be in hyperspace, and the
predicted values of Y will be located on the “regression surface” passing through hyperspace in such a way
that the sum of the squared residuals is as small as possible.
Dimension-Jumping. In univariate regression the predicted values are a constant. You have a point
in one-dimensional space. In bivariate regression the predicted values form a straight line regression surface
in two-dimensional space. In trivariate regression the predicted values form a plane in three dimensional
space. I have not had enough bourbons and beers tonight to continue this into hyperspace.
Standard multiple regression. In a standard multiple regression we have one continuous Y variable
and two or more continuous X variables. Actually, the X variables may include dichotomous variables and/or
categorical variables that have been “dummy coded” into dichotomous variables. The goal is to construct a
linear model that minimizes error in predicting Y. That is, we wish to create a linear combination of the X
variables that is maximally correlated with the Y variable. We obtain standardized regression coefficients (β weights: ẐY = β1Z1 + β2Z2 + … + βpZp) that represent how large an “effect” each X has on Y above and beyond the effect of the other X’s in the model. The predictors may be entered all at once (simultaneous)
or in sets of one or more (sequential). We may use some a priori hierarchical structure to build the model
sequentially (enter first X1, then X2, then X3, etc., each time seeing how much adding the new X improves the
model, or, start with all X’s, then first delete X1, then delete X2, etc., each time seeing how much deletion of an
X affects the model). We may just use a statistical algorithm (one of several sorts of stepwise selection) to
build what we hope is the “best” model using some subset of the total number of X variables available.
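A sketch of these entry strategies in R (the data-frame and variable names are hypothetical):

m1 <- lm(y ~ x1, data = dat)              # enter x1 first
m2 <- update(m1, . ~ . + x2)              # then add x2
anova(m1, m2)                             # how much did adding x2 improve the model?
full <- lm(y ~ x1 + x2 + x3, data = dat)
step(full, direction = "backward")        # one sort of stepwise (statistical) selection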
For example, I may wish to predict college GPA from high school grades, SATV, SATQ, score on a
“why I want to go to college” essay, and quantified results of an interview with an admissions officer. Since
some of these measures are less expensive than others, I may wish to give them priority for entry into the
model. I might also give more “theoretically important” variables priority. I might also include sex and race as
predictors. I can also enter interactions between variables as predictors, for example, SATM x SEX, which
would be literally represented by an X that equals the subject’s SATM score times e’s sex code (typically 0 vs.
1 or 1 vs. 2). I may fit nonlinear models by entering transformed variables such as LOG(SATM) or SATM². We
shall explore lots of such fun stuff later.
As an example of a multiple regression analysis, consider the research reported by McCammon,
Golden, and Wuensch in the Journal of Research in Science Teaching, 1988, 25, 501-510. Subjects were
students in freshman and sophomore level Physics courses (only those courses that were designed for
science majors, no general education <football physics> courses). The mission was to develop a model to
predict performance in the course. The predictor variables were CT (the Watson-Glaser Critical Thinking
Appraisal), PMA (Thurstone’s Primary Mental Abilities Test), ARI (the College Entrance Exam Board’s
Arithmetic Skills Test), ALG (the College Entrance Exam Board’s Elementary Algebra Skills Test), and ANX
(the Mathematics Anxiety Rating Scale). The criterion variable was subjects’ scores on course examinations.
All of the predictor variables were significantly correlated with one another and with the criterion variable. A
simultaneous multiple regression yielded a multiple R of .40 (which is more impressive if you consider that the
data were collected across several sections of different courses with different instructors). Only ALG and CT
had significant semipartial correlations (indicating that they explained variance in the criterion that was not
explained by any of the other predictors). Both forward and backwards selection analyses produced a
model containing only ALG and CT as predictors. At Susan McCammon’s insistence, I also separately
analyzed the data from female and male students. Much to my surprise I found a remarkable sex difference.
Among female students every one of the predictors was significantly related to the criterion, among male
students none of the predictors was. There were only small differences between the sexes on variance in the
predictors or the criterion, so it was not a case of there not being sufficient variability among the men to support
covariance between their grades and their scores on the predictor variables. A posteriori searching of the
literature revealed that Anastasi (Psychological Testing, 1982) had noted a relatively consistent finding of sex
differences in the predictability of academic grades, possibly due to women being more conforming and more
accepting of academic standards (better students), so that women put maximal effort into their studies,
whether or not they like the course, and accordingly they work up to their potential. Men, on the other hand, may
be more fickle, putting forth maximum effort only if they like the course, thus making it difficult to predict their
performance solely from measures of ability.
CANONICAL CORRELATION/REGRESSION:
Also known as multiple multiple regression or multivariate multiple regression. All other multivariate
techniques may be viewed as simplifications or special cases of this “fully multivariate general linear model.”
We have two sets of variables (set X and set Y). We wish to create a linear combination of the X variables
(b1X1 + b2X2 + .... + bpXp), called a canonical variate, that is maximally correlated with a linear combination of
the Y variables (a1Y1 + a2Y2 + .... + aqYq). The coefficients used to weight the X’s and the Y’s are chosen with
one criterion, maximize the correlation between the two linear combinations.
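Base R's cancor() performs the core computation; a minimal sketch with invented matrices standing in for the two sets:

set.seed(1)
X <- matrix(rnorm(100 * 4), ncol = 4)   # set X (e.g., four personality scales)
Y <- matrix(rnorm(100 * 2), ncol = 2)   # set Y (e.g., two attitude/behavior measures)
cc <- cancor(X, Y)
cc$cor       # the canonical correlations
cc$xcoef     # weights defining the canonical variates of the X's
cc$ycoef     # weights defining the canonical variates of the Y's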
As an example, consider the research reported by Patel, Long, McCammon, & Wuensch (Journal of
Interpersonal Violence, 1995, 10: 354-366). We had two sets of data on a group of male college students.
The one set was personality variables from the MMPI. One of these was the PD (psychopathically deviant)
scale, Scale 4, on which high scores are associated with general social maladjustment and hostility. The
second was the MF (masculinity/femininity) scale, Scale 5, on which low scores are associated with
stereotypical masculinity. The third was the MA (hypomania) scale, Scale 9, on which high scores are
associated with overactivity, flight of ideas, low frustration tolerance, narcissism, irritability, restlessness,
hostility, and difficulty with controlling impulses. The fourth MMPI variable was Scale K, which is a validity
scale on which high scores indicate that the subject is “clinically defensive,” attempting to present himself in a
favorable light, and low scores indicate that the subject is unusually frank. The second set of variables was a
pair of homonegativity variables. One was the IAH (Index of Attitudes Towards Homosexuals), designed to
measure affective components of homophobia. The second was the SBS, (Self-Report of Behavior Scale),
designed to measure past aggressive behavior towards homosexuals, an instrument specifically developed for
this study.
With luck, we can interpret the weights (or, even better, the loadings, the correlations between each
canonical variable and the variables in its set) so that each of our canonical variates represents some
underlying dimension (that is causing the variance in the observed variables of its set). We may also think of
a canonical variate as a superordinate variable, made up of the more molecular variables in its set. After
constructing the first pair of canonical variates we attempt to construct a second pair that will explain as much
as possible of the (residual) variance in the observed variables, variance not explained by the first pair of
canonical variates. Thus, each canonical variate of the X’s is orthogonal to (independent of) each of the other
canonical variates of the X’s and each canonical variate of the Y’s is orthogonal to each of the other canonical
variates of the Y’s. Construction of canonical variates continues until you can no longer extract a pair of
canonical variates that accounts for a significant proportion of the variance. The maximum number of pairs
possible is the smaller of the number of X variables or number of Y variables.
In the Patel et al. study both of the canonical correlations were significant. The first canonical
correlation indicated that high scores on the SBS and the IAH were associated with stereotypical masculinity
(low Scale 5), frankness (low Scale K), impulsivity (high Scale 9), and general social maladjustment and
hostility (high Scale 4). The second canonical correlation indicated that having a low IAH but high SBS (not
being homophobic but nevertheless aggressing against gays) was associated with being high on Scales 5 (not
being stereotypically masculine) and 9 (impulsivity). The second canonical variate of the homonegativity
variables seems to reflect a general (not directed towards homosexuals) aggressiveness.
LOGISTIC REGRESSION
Logistic regression is used to predict a categorical (usually dichotomous) variable from a set of
predictor variables. With a categorical dependent variable, discriminant function analysis is usually employed if
all of the predictors are continuous and nicely distributed; logit analysis is usually employed if all of the
predictors are categorical; and logistic regression is often chosen if the predictor variables are a mix of
continuous and categorical variables and/or if they are not nicely distributed (logistic regression makes no
assumptions about the distributions of the predictor variables). Logistic regression has been especially popular
with medical research in which the dependent variable is whether or not a patient has a disease.
For a logistic regression, the predicted dependent variable is the estimated probability that a particular
subject will be in one of the categories (for example, the probability that Suzie Cue has the disease, given her
set of scores on the predictor variables).
As an example of the use of logistic regression in psychological research, consider the research done
by Wuensch and Poteat and published in the Journal of Social Behavior and Personality, 1998, 13, 139-150.
College students (N = 315) were asked to pretend that they were serving on a university research committee
hearing a complaint against animal research being conducted by a member of the university faculty. Five
different research scenarios were used: Testing cosmetics, basic psychological theory testing, agricultural
(meat production) research, veterinary research, and medical research. Participants were asked to decide
whether or not the research should be halted. An ethical inventory was used to measure participants’ idealism
(persons who score high on idealism believe that ethical behavior will always lead only to good consequences,
never to bad consequences, and never to a mixture of good and bad consequences) and relativism (persons
who score high on relativism reject the notion of universal moral principles, preferring personal and situational
analysis of behavior).
Since the dependent variable was dichotomous (whether or not the respondent decided to halt the
research) and the predictors were a mixture of continuous and categorical variables (idealism score, relativism
score, participant’s gender, and the scenario given), logistic regression was employed. The scenario variable
was represented by k−1 dichotomous dummy variables, each representing the contrast between the medical
scenario and one of the other scenarios. Idealism was negatively associated and relativism positively
associated with support for animal research. Women were much less accepting of animal research than were
men. Support for the theoretical and agricultural research projects was significantly less than that for the
medical research.
In a logistic regression, odds ratios are commonly employed to measure the strength of the partial
relationship between one predictor and the dependent variable (in the context of the other predictor variables).
It may be helpful to consider a simple univariate odds ratio first. Among the male respondents, 68 approved
continuing the research, 47 voted to stop it, yielding odds of 68 / 47. That is, approval was 1.45 times more
likely than nonapproval. Among female respondents, the odds were 60 / 140. That is, approval was only .43
times as likely as was nonapproval. Inverting these odds (odds less than one are difficult for some people to
comprehend), among female respondents nonapproval was 2.33 times as likely as approval. The ratio of these odds, (68/47)/(60/140) = 3.38, indicates that a man was 3.38 times more likely to approve the research than was a woman.
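The counts above make this easy to check in R:

odds_men   <- 68 / 47        # 1.45: approval odds among men
odds_women <- 60 / 140       # 0.43: approval odds among women
1 / odds_women               # 2.33: nonapproval odds among women
odds_men / odds_women        # 3.38: the odds ratio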
The odds ratios provided with the output from a logistic regression are for partial effects, that is, the
effect of one predictor holding constant the other predictors. For our example research, the odds ratio for
gender was 3.51. That is, holding constant the effects of all other predictors, men were 3.51 times more likely
to approve the research than were women.
The odds ratio for idealism was 0.50. Inverting this odds ratio for easier interpretation, for each one
point increase on the idealism scale there was a doubling of the odds that the respondent would not approve
the research. The effect of relativism was much smaller than that of idealism, with a one point increase on the
nine-point relativism scale being associated with the odds of approving the research increasing by a
multiplicative factor of 1.39. Inverted odds ratios for the dummy variables coding the effect of the scenario
variable indicated that the odds of approval for the medical scenario were 2.38 times higher than for the meat
scenario and 3.22 times higher than for the theory scenario.
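A minimal sketch of such a model in R (the data frame and variable names are hypothetical; R dummy-codes the scenario factor automatically, and releveling makes the medical scenario the reference category):

jury$scenario <- relevel(factor(jury$scenario), ref = "medical")
fit <- glm(support ~ idealism + relativism + gender + scenario,
           family = binomial, data = jury)   # support = 1 if respondent approved the research
exp(coef(fit))                               # partial odds ratios for each predictor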
Classification: The results of a logistic regression can be used to predict into which group a subject
will fall, given the subject’s scores on the predictor variables. For a set of scores on the predictor variables, the
model gives you the estimated probability that a subject will be in group 1 rather than in group 2. You need a
decision rule to determine into which group to classify a subject given that estimated probability. While the
most obvious decision rule would be to classify the subject into group 1 if p > .5 and into group 2 if p < .5, you
may well want to choose a different decision rule given the relative seriousness of making one sort of error (for
example, declaring a patient to have the disease when she does not) or the other sort of error (declaring the
patient not to have the disease when she does). For a given decision rule (for example, classify into group 1 if
p > .7) you can compute several measures of how effective the classification procedure is. The Percent
Correct is based on the number of correct classifications divided by the total number of classifications. The
Sensitivity is the percentage of occurrences correctly predicted (for example, of all who actually have the
disease, what percentage were correctly predicted to have the disease). The Specificity is the percentage of
nonoccurrences correctly predicted (of all who actually are free of the disease, what percentage were correctly
predicted not to have the disease). Focusing on error rates, the False Positive rate is the percentage of
predicted occurrences which are incorrect (of all who were predicted to have the disease, what percentage
actually did not have the disease), and the False Negative rate is the percentage of predicted nonoccurrences
which are incorrect (of all who were predicted not to have the disease, what percentage actually did have the
disease). For a screening test to detect a potentially deadly disease (such as breast cancer), you might be
quite willing to use a decision rule that makes false positives fairly likely, but false negatives very unlikely. I
understand that the false positive rate with mammograms is rather high. That is to be expected in an initial
screening test, where the more serious error is the false negative. Although a false positive on a mammogram
can certainly cause a woman some harm (anxiety, cost and suffering associated with additional testing), it may
be justified by making it less likely that tumors will go undetected. Of course, a positive on a screening test is
followed by additional testing, usually more expensive and more invasive, such as collecting tissue for biopsy.
For our example research, the overall percentage correctly classified is 69% with a decision rule being
“if p > .5, predict the respondent will support the research.” A slightly higher overall percentage correct (71%)
would be obtained with the rule “if p > .4, predict support” (73% sensitivity, 70% specificity) or with the rule “if p
> .54, predict support” (52% sensitivity, 84% specificity).
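Continuing the hypothetical model above, alternative decision rules are easy to compare:

p <- predict(fit, type = "response")                 # estimated probability of support
pred <- ifelse(p > .4, "support", "stop")            # the rule "if p > .4, predict support"
table(predicted = pred, observed = jury$support)     # basis for percent correct, sensitivity, specificity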
HIERARCHICAL LINEAR MODELING
Here you have data at two or more levels, with cases at one level nested within cases at the next
higher level. For example, you have pupils at the lowest level, nested within schools at the second level, with
schools nested within school districts at the third level.
You may have different variables at the different levels and you may be interested in relating variables
to one another within levels and between levels.
Consider the research conducted by Rowan et al. (1991). At the lowest level the cases were teachers.
They provided ratings of the climate at the school (the “dependent” variables: Principal Leadership, Teacher
Control <of policy>, and Staff Cooperation) as well as data on Level 1 predictors such as race, sex, years of
experience, and subject taught. Teachers were nested within schools. Level 2 predictors were whether the
school was public or Catholic, its size, percentage minority enrollment, average student SES, and the like. At
Level 1, ratings of the climate were shown to be related to the demographic characteristics of the teacher. For
example, women thought the climate better than did men, and those teaching English, Science, and Math
thought the climate worse than did those teaching in other domains. At Level 2, the type of school (public or
Catholic) was related to ratings of climate, with climate being rated better at Catholic schools than at public
schools.
As another example, consider the analysis reported by Tabachnick and Fidell (2007, pp. 835-852),
using data described in the article by Fidell et al. (1995). Participants from households in three different
neighborhoods kept track of
• How annoyed they were by aircraft noise the previous night
• How long it took them to fall asleep the previous night
• How noisy it was at night (this was done by a noise-monitoring device in the home).
At the lowest level, the cases were nights (data were collected across several nights). At the next level
up the cases were the humans. Nights were nested within humans. At the next level up the cases were
households. Humans were nested within households. Note that Level 1 represents a repeated measures
dimension (nights).
There was significant variability in annoyance both among humans and among households, and both
sleep latency and noise level were significantly related to annoyance. The three different neighborhoods did
not differ from each other on amount of annoyance.
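Such nested (multilevel) models are commonly fit in R with the lme4 package; a minimal sketch, assuming one row per night and hypothetical names throughout:

library(lme4)
m <- lmer(annoyance ~ sleep_latency + noise_level + (1 | household/person),
          data = nights)   # random intercepts for persons nested within households
summary(m)                 # fixed effects for the night-level predictors; variance components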
PRINCIPAL COMPONENTS AND FACTOR ANALYSIS
Here we start out with one set of variables. The variables are generally correlated with one another.
We wish to reduce the (large) number of variables to a smaller number of components or factors (I’ll explain
the difference between components and factors when we study this in detail) that capture most of the variance
in the observed variables. Each factor (or component) is estimated as being a linear (weighted) combination of
the observed variables. We could extract as many factors as there are variables, but generally most of them
would contribute little, so we try to get a few factors that capture most of the variance. Our initial extraction
generally includes the restriction that the factors be orthogonal, independent of one another.
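A minimal R sketch of the extraction and rotation steps (the item matrix is an invented stand-in; the rotation uses the psych package, an assumption, since base R's prcomp() does not rotate):

set.seed(1)
items <- matrix(rnorm(200 * 45), ncol = 45)   # stand-in for 200 respondents x 45 items
pc <- prcomp(items, scale. = TRUE)            # orthogonal components
summary(pc)                                   # proportion of variance per component
library(psych)
principal(items, nfactors = 7, rotate = "varimax")   # 7 components, orthogonal rotation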
Consider the analysis reported by Chia, Wuensch, Childers, Chuang, Cheng, Cesar-Romero, & Nava in
the Journal of Social Behavior and Personality, 1994, 9, 249-258. College students in Mexico, Taiwan, and the
US completed a 45 item Cultural Values Survey. A principal components analysis produced seven
components (each a linear combination of the 45 items) which explained in the aggregate 51% of the variance
in the 45 items. We could have explained 100% of the variance with 45 components, but the purpose of the
PCA is to explain much of the variance with relatively few components. Imagine a plot in seven dimensional
space with seven perpendicular (orthogonal) axes. Each axis represents one component. For each variable I
plot a point that represents its loading (correlation) with each component. With luck I’ll have seven “clusters” of
dots in this hyperspace (one for each component). I may be able to improve my solution by rotating the axes
so that each one more nearly passes through one of the clusters. I may do this by an orthogonal rotation
(keeping the axes perpendicular to one another) or by an oblique rotation. In the latter case I allow the axes
to vary from perpendicular, and as a result, the components obtained are no longer independent of one
another. This may be quite reasonable if I believe the underlying dimensions (that correspond to the extracted
components) are correlated with one another.
With luck (or after having tried many different extractions/rotations), I’ll come up with a set of loadings
that can be interpreted sensibly (that may mean finding what I expected to find). From consideration of which
items loaded well on which components, I named the components Family Solidarity (respect for the family),
Executive Male (men make decisions, women are homemakers), Conscience (important for family to conform
to social and moral standards), Equality of the Sexes (minimizing sexual stereotyping), Temporal
Farsightedness (interest in the future and the past), Independence (desire for material possessions and
freedom), and Spousal Employment (each spouse should make decisions about his/her own job). Now, using
weighting coefficients obtained with the analysis, I computed for each subject a score that estimated how much
of each of the seven dimensions e had. These component scores were then used as dependent variables in
3 x 2 x 2, Culture x Sex x Age (under 20 vs. over 20) ANOVAs. US students (especially the women) stood out
as being sexually egalitarian, wanting independence, and, among the younger students, placing little
importance on family solidarity. The Taiwanese students were distinguished by scoring very high on the
temporal farsightedness component but low on the conscience component. Among Taiwanese students the
men were more sexually egalitarian than the women and the women more concerned with independence than
were the men. The Mexican students were like the Taiwanese in being concerned with family solidarity but not
with sexual egalitarianism and independence, but like the US students in attaching more importance to
conscience and less to temporal farsightedness. Among the Mexican students the men attached more
importance to independence than did the women.
Factor analysis also plays a prominent role in test construction. For example, I factor analyzed
subjects’ scores on the 21 items in Patel’s SBS discussed earlier. Although the instrument was designed to
measure a single dimension, my analysis indicated that three dimensions were being measured. The first
factor, on which 13 of the items loaded well, seemed to reflect avoidance behaviors (such as moving away
from a gay, staring to communicate disapproval of proximity, and warning gays to keep away). The second
factor (six items) reflected aggression from a distance (writing anti-gay graffiti, damaging a gay’s property,
making harassing phone calls). The third factor (two items) reflected up-close aggression (physical fighting).
Despite this evidence of three factors, item analysis indicated that the instrument performed well as a measure
of a single dimension. Item-total correlations were good for all but two items. Cronbach’s alpha was .91, a
value which could not be increased by deleting from the scale any of the items. Cronbach’s alpha is
considered a measure of the reliability or internal consistency of an instrument. It can be thought of as the
mean of all possible split-half reliability coefficients (correlations between scores on half of the items vs. the
other half of the items, with the items randomly split into halves) with the Spearman-Brown correction (a
correction for the reduction in the correlation due to having only half as many items contributing to each score
used in the split-half reliability correlation coefficient—reliability tends to be higher with more items, ceteris
paribus). Please read the document Cronbach's Alpha and Maximized Lambda4. Follow the instructions there to
conduct an item analysis with SAS and with SPSS. Bring your output to class for discussion.
In recent years there has been considerable criticism of the use of Cronbach’s alpha as an estimate of
reliability. Many have suggested use of McDonald’s omega in place of Cronbach’s alpha. See From Alpha to
Omega: A Practical Solution to the Pervasive Problem of Internal Consistency Estimation. ECU folks have
access to the article through our library’s E-Journal Portal, and my current students can find it in
BlackBoard/Articles/Factor and Principal Components Analysis/McDonald’s Omega. I found a SAS macro to
compute omega, but never tried it out, since it is so easy to compute using JASP or R.
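Both statistics are indeed easy to obtain with the psych package; a sketch, with sbs_items a hypothetical data frame of the item scores:

library(psych)
alpha(sbs_items)   # Cronbach's alpha, item-total correlations, alpha-if-item-deleted
omega(sbs_items)   # McDonald's omega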
STRUCTURAL EQUATION MODELING (SEM)
This is a special form of hierarchical multiple regression analysis in which the researcher specifies a
particular causal model in which each variable affects one or
more of the other variables both directly and through its effects
upon intervening variables. The less complex models use only
unidirectional paths (if X1 has an effect on X2, then X2 cannot
have an effect on X1) and include only measured variables. Such
an analysis is referred to as a path analysis. Patel’s data,
discussed earlier, were originally analyzed (in her thesis) with a
path analysis. The model was that the MMPI scales were
noncausally correlated with one another but had direct causal
effects on both IAH and SBS, with IAH having a direct causal
effect on SBS. The path analysis was not well received by reviewers at the first journal to which we sent the manuscript, so we
reanalyzed the data with the atheoretical canonical correlation/regression analysis presented earlier and
submitted it elsewhere. Reviewers of that revised manuscript asked that we supplement the canonical
correlation/regression analysis with a hierarchical multiple regression analysis (essentially a path analysis).
In a path analysis one obtains path coefficients, measuring the strength of each path (each causal or
noncausal link between one variable and another) and one assesses how well the model fits the data. The
arrows from ‘e’ represent error variance (the effect of variables not included in the model). One can compare
two different models and determine which one better fits the data. Our analysis indicated that the only
significant paths were from MF to IAH (–.40) and from MA (.25) and IAH (.4) to SBS.
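In R, a path analysis of this kind can be written out in the lavaan package; a sketch under the assumption that lavaan is available, with variable names following the scales described above:

library(lavaan)
model <- '
  IAH ~ MF          # path from masculinity/femininity to IAH
  SBS ~ MA + IAH    # paths from hypomania and IAH to SBS
'
fit <- sem(model, data = patel)
summary(fit, standardized = TRUE, fit.measures = TRUE)   # path coefficients and model fit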
SEM can include latent variables (factors), constructs that are not directly measured but rather are
inferred from measured variables (indicators).
The relationships between latent variables are referred to as the structural part of a model (as opposed
to the measurement part, which is the relationship between latent variables and measured variables). As an
example of SEM including latent variables, consider the research by Greenwald and Gillmore (Journal of
Educational Psychology, 1997, 89, 743-751) on the validity of student ratings of instruction (check out my
review of this research). Their analysis indicated that when students expect to get better grades in a class they
work less on that class and evaluate the course and the instructor more favorably. The indicators (measured
variables) for the Workload latent variable were questions about how much time the students spent on the
course and how challenging it was. Relative expected grade (comparing the grade expected in the rated
course with that the student usually got in other courses) was a more important indicator of the Expected
Grade latent variable than was absolute expected grade. The Evaluation latent variable was indicated by
questions about challenge, whether or not the student would take this course with the same instructor if e had
it to do all over again, and assorted items about desirable characteristics of the instructor and course.
Greenwald’s research suggests that instructors who have lenient grading policies will get good
evaluations but will not motivate their students to work hard enough to learn as much as they do with
instructors whose less lenient grading policies lead to more work but less favorable evaluations.
[Figure: fragment of the structural equation model, showing the Workload and Evaluation latent variables with indicator loadings (.57 for Hours Worked per Credit Hour; .93 for an Evaluation indicator).]
Confirmatory factor analysis can be considered a special case of SEM. In confirmatory factor
analysis the focus is on testing an a priori model of the factor structure of a group of measured variables.
Tabachnick and Fidell (5th edition) present an example (pages 732 - 749) in which the tested model
hypothesizes that intelligence in learning disabled children, as estimated by the WISC, can be represented by
two factors (possibly correlated with one another) with a particular simple structure (relationship between the
indicator variables and the factors).
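Confirmatory factor analysis uses the same machinery; a lavaan sketch of a two-correlated-factor model (the indicator names are invented stand-ins for WISC subtests):

library(lavaan)
model <- '
  verbal  =~ information + similarities + vocabulary
  spatial =~ block_design + object_assembly + coding
'                                  # lavaan lets the two factors correlate by default
fit <- cfa(model, data = wisc)
summary(fit, fit.measures = TRUE)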
DISCRIMINANT FUNCTION ANALYSIS
You wish to predict group membership from a set of two or more continuous variables. The analysis
creates a set of discriminant functions (weighted combinations of the predictors) that will enable you to
predict into which group a case falls, based on scores on the predictor variables (usually continuous, but
could include dichotomous variables and dummy coded categorical predictors). The total possible number of
discriminant functions is one less than the number of groups, or the number of predictor variables, whichever is
less. Generally only a few of the functions will do a good job of discriminating group membership. The second
function, orthogonal to the first, analyzes variance not already captured by the first; the third uses the residuals
from the first and second, etc. One may think of the resulting functions as dimensions on which the groups
differ, but one must remember that the weights are chosen to maximize the discrimination among groups,
not to make sense to you. Standardized discriminant function coefficients (weights) and loadings
(correlations between discriminant functions and predictor variables) may be used to label the functions. One
might also determine how well a function separates each group from all the rest to help label the function. It is
possible to do hierarchical/stepwise analysis and factorial (more than one grouping variable) analysis.
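A minimal sketch with MASS::lda() (the group and predictor names are hypothetical):

library(MASS)
d <- lda(group ~ x1 + x2 + x3, data = dat)   # derives the discriminant functions
d$scaling                                    # discriminant function coefficients (weights)
predict(d)$class                             # predicted group membership for each case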
Consider what the IRS does with the data they collect from “random audits” of taxpayers. From each
taxpayer they collect data on a number of predictor variables (gross income, number of exemptions, amount of
deductions, age, occupation, etc.) and one classification variable, is the taxpayer a cheater (underpaid e’s
taxes) or honest. From these data they develop a discriminant function model to predict whether or not a
return is likely fraudulent. Next year their computers automatically test every return, and if yours fits the profile
of a cheater you are called up for a “discriminant analysis” audit. Of course, the details of the model are a
closely guarded secret, since if a cheater knew the discriminant function e could prepare his return with the
maximum amount of cheating that would result in e’s (barely) not being classified as a cheater.
As another example, consider the research done by Poulson, Braithwaite, Brondino, and Wuensch
(1997, Journal of Social Behavior and Personality, 12, 743-758). Subjects watched a simulated trial in which
the defendant was accused of murder and was pleading insanity. There was so little doubt about his having
killed the victim that none of the jurors voted for a verdict of not guilty. Aside from not guilty, their verdict
options were Guilty, NGRI (not guilty by reason of insanity), and GBMI (guilty but mentally ill). Each mock juror
filled out a questionnaire, answering 21 questions (from which 8 predictor variables were constructed) about
e’s attitudes about crime control, the insanity defense, the death penalty, the attorneys, and e’s assessment of
the expert testimony, the defendant’s mental status, and the possibility that the defendant could be
rehabilitated. To avoid problems associated with multicollinearity among the 8 predictor variables (they were
very highly correlated with one another, and such multicollinearity can cause problems in a multivariate
analysis), the scores on the 8 predictor variables were subjected to a principal components analysis, with the
resulting orthogonal components used as predictors in a discriminant analysis. The verdict choice (Guilty,
NGRI, or GBMI) was the criterion variable.
Both of the discriminant functions were significant. The first function discriminated between jurors
choosing a guilty verdict and subjects choosing a NGRI verdict. Believing that the defendant was mentally ill,
believing the defense’s expert testimony more than the prosecution’s, being receptive to the insanity defense,
opposing the death penalty, believing that the defendant could be rehabilitated, and favoring lenient treatment
were associated with rendering a NGRI verdict. Conversely, the opposite orientation on these factors was
associated with rendering a guilty verdict. The second function separated those who rendered a GBMI verdict
from those choosing Guilty or NGRI. Distrusting the attorneys (especially the prosecution attorney), thinking
rehabilitation likely, opposing lenient treatment, not being receptive to the insanity defense, and favoring the
death penalty were associated with rendering a GBMI verdict rather than a guilty or NGRI verdict.
MULTIVARIATE ANALYSIS OF VARIANCE (MANOVA)
This is essentially a DFA turned around. You have two or more continuous Y’s and one or more categorical X’s. You may also throw in some continuous X’s (covariates, giving you a MANCOVA, multivariate analysis of covariance). The most common application of MANOVA in psychology is as a device to guard
against inflation of familywise alpha when there are multiple dependent variables. The logic is the same as
that of the protected t-test, where an omnibus ANOVA on your K-level categorical X must be significant before
you make pairwise comparisons among your K groups’ means on Y. You do a MANOVA on your multiple Y’s.
If it is significant, you may go on and do univariate ANOVAs (one on each Y), if not, you stop. In a factorial
analysis, you may follow-up any effect which is significant in the MANOVA by doing univariate analyses for
each such effect.
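The protected strategy just described looks like this in base R (a sketch with hypothetical names):

m <- manova(cbind(sentence, seriousness) ~ attractiveness * crime_type, data = trial)
summary(m, test = "Wilks")   # omnibus multivariate test for each effect
summary.aov(m)               # follow-up univariate ANOVAs, one per Y, if the MANOVA is significant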
As an example, consider the MANOVA I did with data from a simulated jury trial with Taiwanese
subjects (see Wuensch, Chia, Castellow, Chuang, & Cheng, Journal of Cross-Cultural Psychology, 1993, 24,
414-427). The same experiment had earlier been done with American subjects. X’s consisted of whether or
not the defendant was physically attractive, sex of the defendant, type of alleged crime (swindle or burglary),
culture of the defendant (American or Chinese), and sex of subject (juror). Y’s consisted of length of sentence
given the defendant, rated seriousness of the crime, and ratings on 12 attributes of the defendant. I did two
MANOVAs, one with length of sentence and rated seriousness of the crime as Y’s, one with ratings on the 12
attributes as Y’s. On each I first inspected the MANOVA. For each effect (main effect or interaction) that was
significant on the MANOVA, I inspected the univariate analyses to determine which Y’s were significantly
associated with that effect. For those that were significant, I conducted follow-up analyses such as simple
interaction analyses and simple main effects analyses. A brief summary of the results follows: Female
subjects gave longer sentences for the crime of burglary, but only when the defendant was American;
attractiveness was associated with lenient sentencing for American burglars but with stringent sentencing for
American swindlers (perhaps subjects thought that physically attractive swindlers had used their
attractiveness in the commission of the crime and thus were especially deserving of punishment); female jurors
gave more lenient sentences to female defendants than to male defendants; American defendants were rated
more favorably (exciting, happy, intelligent, sociable, strong) than were Chinese defendants; physically
attractive defendants were rated more favorably (attractive, calm, exciting, happy, intelligent, warm) than were
physically unattractive defendants; and the swindler was rated more favorably (attractive, calm, exciting,
independent, intelligent, sociable, warm) than the burglar.
In MANOVA the Y’s are weighted to maximize the correlation between their linear combination and the
X’s. A different linear combination (canonical variate) is formed for each effect (main effect or interaction—in
fact, a different linear combination is formed for each treatment df—thus, if an independent variable consists
of four groups, three df, there are three different linear combinations constructed to represent that effect, each
orthogonal to the others). Standardized discriminant function coefficients (weights for predicting X from
the Y’s) and loadings (for each linear combination of Y’s, the correlations between the linear combination and
the Y’s themselves) may be used to better define the effects of the factors and their interactions. One may
also do a “stepdown analysis” where one enters the Y’s in an a priori order of importance (or based solely on
statistical criteria, as in stepwise multiple regression). At each step one evaluates the contribution of the newly
added Y, above and beyond that of the Y’s already entered.
As an example of an analysis which uses more of the multivariate output than was used with the
example two paragraphs above, consider again the research done by Moore, Wuensch, Hedges, and
Castellow (1994, discussed earlier under the topic of log-linear analysis). Recall that we manipulated the
physical attractiveness and social desirability of the litigants in a civil case involving sexual harassment. In
each of the experiments in that study we had subjects fill out a rating scale, describing the litigant (defendant or
plaintiff) whose attributes we had manipulated. This analysis was essentially a manipulation check, to verify
that our manipulations were effective. The rating scales were nine-point scales, for example,
Awkward Poised
1 2 3 4 5 6 7 8 9
There were 19 attributes measured for each litigant. The data from the 19 variables were used as
dependent variables in a three-way MANOVA (social desirability manipulation, physical attractiveness
manipulation, gender of subject). In the first experiment, in which the physical attractiveness and social
desirability of the defendant were manipulated, the MANOVA produced significant effects for the social
desirability manipulation and the physical attractiveness manipulation, but no other significant effects. The
canonical variate maximizing the effect of the social desirability manipulation loaded most heavily (r > .45) on
the ratings of sociability (r = .68), intelligence (r = .66), warmth (r = .61), sensitivity (r = .50), and kindness (r =
.49). Univariate analyses indicated that compared to the socially undesirable defendant, the socially desirable
defendant was rated significantly more poised, modest, strong, interesting, sociable, independent, warm,
genuine, kind, exciting, sexually warm, secure, sensitive, calm, intelligent, sophisticated, and happy. Clearly
the social desirability manipulation was effective.
The canonical variate that maximized the effect of the physical attractiveness manipulation loaded
heavily only on the physical attractiveness ratings (r = .95), all the other loadings being less than .35. The
mean physical attractiveness ratings were 7.12 for the physically attractive defendant and 2.25 for the
physically unattractive defendant. Clearly the physical attractiveness manipulation was effective. Univariate
analyses indicated that this manipulation had significant effects on several of the ratings variables. Compared
to the physically unattractive defendant, the physically attractive defendant was rated significantly more poised,
strong, interesting, sociable, physically attractive, warm, exciting, sexually warm, secure, sophisticated, and
happy.
In the second experiment, in which the physical attractiveness and social desirability of the plaintiff
were manipulated, similar results were obtained. The canonical variate maximizing the effect of the social
desirability manipulation loaded most heavily (r > .45) on the ratings of intelligence (r = .73), poise (r = .68),
sensitivity (r = .63), kindness (r = .62), genuineness (r = .56), warmth (r = .54), and sociability (r = .53).
Univariate analyses indicated that compared to the socially undesirable plaintiff the socially desirable
plaintiff was rated significantly more favorably on all nineteen of the adjective scale ratings.
The canonical variate that maximized the effect of the physical attractiveness manipulation loaded
heavily only on the physical attractiveness ratings (r = .84), all the other loadings being less than .40. The
mean physical attractiveness ratings were 7.52 for the physically attractive plaintiff and 3.16 for the physically
unattractive plaintiff. Univariate analyses indicated that this manipulation had significant effects on several of
the ratings variables. Compared to the physically unattractive plaintiff the physically attractive plaintiff was
rated significantly more poised, interesting, sociable, physically attractive, warm, exciting, sexually warm,
secure, sophisticated, and happy.
ANOVA
An ANOVA may be done as a multiple regression, with the categorical X’s coded as “dummy variables.”
A K-level X is represented by K-1 dichotomous dummy variables. An interaction between two X’s is
represented by products of the main effects X’s. For example, were factors A and B both dichotomous, we
could code A with X1 (0 or 1), B with X2 (0 or 1), and A x B with X3, where X3 equals X1 times X2. Were A
dichotomous and B had three levels, the main effect of B would require two dummy variables, X2 and X3, and
the A x B interaction would require two more dummy variables, X4 (the product of X1 and X2) and X5 (the
product of X1 and X3). [Each effect will require as many dummy variables as the df it has.] In the multiple
regression the SS due to X1 would be SSA, SSB would be the combined SS for X2 and X3, and the
interaction SS would be the combined SS for X4 and X5. There are various ways we can partition the SS, but
we shall generally want to use Overall and Spiegel’s Method I, where each effect is partialled for all other
effects. That is, for example, SSA is the SS that is due solely to A (the increase in the SSreg when we added
A’s dummy variable(s) to a model that already includes all other effects). Any variance in Y that is ambiguous
(could be assigned to more than one effect) is disregarded. There will, of course, be such ambiguous variance
only when the independent variables are nonorthogonal (correlated, as indicated by the unequal cell sizes).
Overall and Spiegel’s Method I least-squares ANOVA is the method that is approximated by the “by hand”
unweighted means ANOVA that you learned earlier.
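To make the dummy-variable view concrete, here is a small sketch in Python with statsmodels; the data are invented and deliberately unbalanced, and the Sum (effect) coding plus Type III sums of squares reproduce the "each effect partialled for all other effects" logic of Overall and Spiegel's Method I:

# ANOVA run as a regression with coded factors, unique (Type III) sums of squares.
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "A": rng.choice(["a1", "a2"], size=50),        # dichotomous factor -> 1 code
    "B": rng.choice(["b1", "b2", "b3"], size=50),  # 3-level factor -> 2 codes
})
df["y"] = rng.normal(size=50)

# C(..., Sum) requests sum-to-zero coding; the A:B interaction codes are
# products of the main-effect codes, as described in the text.
model = ols("y ~ C(A, Sum) * C(B, Sum)", data=df).fit()
print(anova_lm(model, typ=3))                      # each SS adjusted for all other effects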
ANCOV
In the analysis of covariance you enter one or more covariates (usually continuous, but may be
dummy coded categorical variables) into the multiple correlation before or at the same time that you enter
categorical predictor variables (dummy codes). The effect of each factor or interaction is the increase in the
SSreg when that factor is added to a model that already contains all of the other factors and interactions and all
of the covariates.
In the ideal circumstance, you have experimentally manipulated the categorical variables (independent
variables), randomly assigned subjects to treatments, and measured the covariate(s) prior to the manipulation
of the independent variable. In this case, the inclusion of the covariate(s) in the model will reduce what would
otherwise be error in the model, and this can greatly increase the power of your analysis. Consider the
following partitioning of the sums of squares of post-treatment wellness scores. The Treatment variable is
Type of Therapy used with your patients, three groups. The F ratio testing the treatment will be the ratio of the
Treatment Mean Square to the Error Mean Square.
[Figure: pie chart partitioning the sums of squares of the post-treatment wellness scores into Treatment and Error.]
Your chances of getting a significant result are going to be a lot better if you can do something to
reduce the size of the error variance, which goes into the denominator of the F ratio. Reducing the size of the
Mean Square Error (the denominator of the F ratio) will increase the value of F and lower the p value.
Suppose you find that you have, for each of your subjects, a score on the wellness measure taken prior to the
treatment. Those baseline scores are likely well correlated with the post-treatment scores. You add the
baseline wellness to the model; that is, baseline wellness becomes a covariate.
[Figure: pie chart partitioning the sums of squares into Treatment, Baseline, and Error; adding the baseline covariate shrinks the Error slice.]
Wow! You have cut the error in half. This will greatly increase the value of the F testing the effect of
the treatment. In statistics, getting a big F is generally a good thing, as it leads to significant results.
Now you discover that you also have, for each subject, a pre-treatment measure of blood levels of
scatophobin, a neurohormone thought to be associated with severity of the treated illness. You now include
that as a second covariate.
[Figure: pie chart partitioning the sums of squares into Treatment, Baseline, Scatophobin, and Error; the Error slice shrinks further.]
Double WOW! You have reduced the error variance even more, gaining additional power and
additional precision with respect to your estimates of effect sizes (tighter confidence intervals).
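A sketch of this wellness example in Python with statsmodels follows; the data, the effect sizes, and the variable names (wellness, baseline, scatophobin) are all invented for illustration:

# ANCOVA: Treatment tested after the covariates have soaked up error variance.
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
n = 90
df = pd.DataFrame({
    "treatment": np.repeat(["T1", "T2", "T3"], n // 3),
    "baseline": rng.normal(50, 10, n),
    "scatophobin": rng.normal(100, 15, n),
})
# Post-treatment wellness correlates with both covariates, as in the example.
df["wellness"] = (0.8 * df["baseline"] - 0.1 * df["scatophobin"]
                  + rng.normal(0, 5, n))

# Each effect is tested as the increase in SSreg when it is added to a model
# already containing everything else (unique, Type III sums of squares).
model = ols("wellness ~ baseline + scatophobin + C(treatment, Sum)", data=df).fit()
print(anova_lm(model, typ=3))   # the covariates reduce MSerror, sharpening the Treatment F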
If your categorical predictor variables are correlated with the covariate(s), then removing the
effects of the covariates may also remove some of the effects of the factors, which may not be what you
wanted to do. Such a confounding of covariates with categorical predictors often results from:
• subjects not being randomly assigned to treatments
• the covariates being measured after the manipulation of the independent variable(s) -- and those
manipulations changed subjects’ scores on the covariates
• the categorical predictors being nonexperimental (not manipulated).
Typically the psychologist considers the continuous covariates to be nuisance variables, whose
effects are to be removed prior to considering the effects of categorical predictor variables. The same model
can be used to predict scores on a continuous outcome variable from a mixture of continuous and categorical
predictor variables, even when the researcher does not consider the continuous covariates to be nuisance
variables. For example, consider the study by Wuensch and Poteat discussed earlier as an example of logistic
regression. A second dependent variable was respondents’ scores on a justification variable (after reading the
case materials, each participant was asked to rate on a 9-point scale how justified the research was, from “not
at all” to “completely”). We used an ANCOV model to predict justification scores from idealism, relativism,
gender, and scenario. Although the first two predictors were continuous (“covariates”), we did not consider
them to be nuisance variables; we had a genuine interest in their relationship with the dependent variable. A
brief description of the results of the ANCOV follows:
There were no significant interactions between predictors, but each predictor had a significant main
effect. Idealism was negatively associated with justification, β = −0.32, r = −0.36, F(1, 303) = 40.93, p < .001,
relativism was positively associated with justification, β = .20, r = .22, F(1, 303) = 15.39, p < .001, mean
justification was higher for men (M = 5.30, SD = 2.25) than for women (M = 4.28, SD = 2.21), F(1, 303) =
13.24, p < .001, and scenario had a significant omnibus effect, F(4, 303) = 3.61, p = .007. Using the medical
scenario as the reference group, the cosmetic and the theory scenarios were found to be significantly less
justified.
REPEATED MEASURES DESIGNS
An ANOVA may include one or more categorical predictors for which the groups are not independent.
Subjects may be measured at each level of the treatment variable (repeated measures, within-subjects).
Alternatively, subjects may be blocked on the basis of variables known to be related to the dependent variable
and then, within each block, randomly assigned to treatments (the randomized blocks design). In either case,
a repeated measures ANOVA may be appropriate if the dependent variable is normally distributed and other
assumptions are met.
The traditional repeated measures analysis of variance (aka the “univariate approach”) has a sphericity
assumption: the standard error of the difference between pairs of means is constant across all pairs of
means. That is, for comparing the mean at any one level of the repeated factor versus any other level of the
repeated factor, the standard error of the difference is the same as it would be for any other pair of levels of the repeated factor. Howell
(page 443 of the 6th edition of Statistical Methods for Psychology) discusses compound symmetry, a
somewhat more restrictive assumption. There are adjustments (of degrees of freedom) to correct for violation
of the sphericity assumption, but at a cost of lower power.
A more modern approach, the multivariate approach to repeated measures designs, does not have
such a sphericity assumption. In the multivariate approach the effect of a repeated measures dimension (for
example, whether this score represents Suzie Cue’s headache duration during the first, second, or third week
of treatment) is coded by computing k−1 difference scores (one for each degree of freedom for the repeated
factor) and then treating those difference scores as dependent variables in a MANOVA.
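The following sketch (in Python, with hypothetical data) shows the core of the multivariate approach for a single k-level repeated factor: compute the k−1 difference scores and test their mean vector against zero. With no between-subjects factor, the MANOVA reduces to a one-sample Hotelling T², which makes no sphericity assumption:

# Multivariate approach to one-way repeated measures, computed by hand.
# scores: n subjects x k = 3 repeated measurements (e.g., weekly headache durations).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
scores = rng.normal(size=(20, 3)) + np.array([5.0, 4.0, 3.5])

diffs = np.diff(scores, axis=1)          # k-1 = 2 difference scores per subject
n, p = diffs.shape
dbar = diffs.mean(axis=0)                # mean difference vector
S = np.cov(diffs, rowvar=False)          # covariance matrix of the differences

T2 = n * dbar @ np.linalg.solve(S, dbar) # Hotelling's T-squared
F = (n - p) / (p * (n - 1)) * T2         # exact F transformation, df = (p, n - p)
p_value = stats.f.sf(F, p, n - p)
print(F, p_value)                        # test of the repeated-measures main effect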
You are already familiar with the basic concepts of main effects, interactions, and simple effects from
our study of independent samples ANOVA. We remain interested in these same sorts of effects in ANOVA
with repeated measures, but we must do the analysis differently. While it might be reasonable to conduct such
an analysis by hand when the design is quite simple, typically computer analysis will be employed.
If your ANOVA design has one or more repeated factors and multiple dependent variables, then you
can do a doubly multivariate analysis, with the effect of the repeated factor being represented by a set of
k−1 difference scores for each of the two or more dependent variables. For example, consider my study on the
effects of cross-species rearing of house mice (Animal Learning & Behavior, 1992, 20, 253-258). Subjects
were house mice that had been reared by house mice, deer mice, or Norway rats. The species of the foster
mother was a between-subjects (independent samples) factor. I tested them in an apparatus where they could
visit four tunnels: One scented with clean pine shavings, one scented with the smell of house mice, one
scented with the smell of deer mice, and one scented with the smell of rats. The scent of the tunnel was a
within-subjects factor, so I had a mixed factorial design (one or more between-subjects factor, one or more
within-subjects factor). I had three dependent variables: The latency until the subject first entered each tunnel,
how many visits the subject made to each tunnel, and how much time each subject spent in each tunnel.
Since the doubly multivariate analysis indicated significant effects (interaction between species of the foster
mother and scent of the tunnel, as well as significant main effects of each factor), singly multivariate ANOVA
(that is, on one dependent variable at a time, but using the multivariate approach to code the repeated factor)
was conducted on each dependent variable (latency, visits, and time). The interaction was significant for each
dependent variable, so simple main effects analyses were conducted. The basic finding (somewhat simplified
here) was that with respect to the rat-scented tunnel, those subjects who had been reared by a rat had shorter
latencies to visit the tunnel, visited that tunnel more often, and spent more time in that tunnel. If you consider
that rats will eat house mice, it makes good sense for a house mouse to be disposed not to enter tunnels that
smell like rats. Of course, my rat-reared mice may have learned to associate the smell of rat with obtaining
food (nursing from their rat foster mother) rather than being food!
CLUSTER ANALYSIS
In a cluster analysis the goal is to cluster cases (research units) into groups that share similar
characteristics. Contrast this goal with the goal of principal components and factor analysis, where one groups
variables into components or factors based on their having similar relationships with latent variables.
While cluster analysis can also be used to group variables rather than cases, I have no familiarity with that
application.
I have never had a set of research data for which I thought cluster analysis appropriate, but I wanted to
play around with it, so I obtained, from online sources, data on faculty in my department: Salaries, academic
rank, course load, experience, and number of published articles. I instructed SPSS to group the cases (faculty
members) based on those variables. I asked SPSS to standardize all of the variables to z scores. This
results in each variable being measured on the same scale and the variables being equally weighted. I had
SPSS use agglomerative hierarchical clustering. With this procedure each case initially is a cluster of its
own. SPSS compares the distance between each case and the next and then clusters together the two cases
which are closest. I had SPSS use the squared Euclidian distance between cases as the measure of
distance. This is quite simply $\sum_{i=1}^{v}(X_i - Y_i)^2$, the sum across variables (from i = 1 to v) of the squared
difference between the score on variable i for the one case (Xi) and the score on variable i for the other case
(Yi). At the next step SPSS recomputes all the distances between entities (cases and clusters) and then
groups together the two with the smallest distance. When one or both of the entities is a cluster, SPSS
computes the averaged squared Euclidian distance between members of the one entity and members of the
other entity. This continues until all cases have been grouped into one giant cluster. It is up to the researcher
to decide when to stop this procedure and accept a solution with k clusters, where k can be any number from 1 to
the number of cases.
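A rough Python equivalent of the procedure just described, using SciPy and invented stand-in data in place of the faculty file:

# Agglomerative hierarchical clustering: z-scored variables, squared Euclidean
# distance, average linkage between entities, tree cut at k clusters.
import numpy as np
from scipy.stats import zscore
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)
X = rng.normal(size=(30, 5))             # 30 faculty x 5 variables (salary, rank, ...)
Xz = zscore(X, axis=0)                   # equal weighting: every variable on the same scale

# 'average' linkage = mean distance between members of the two entities,
# matching the between-cluster distance rule described in the text.
Z = linkage(Xz, method="average", metric="sqeuclidean")

for k in (2, 3, 4):                      # inspect 2-, 3-, and 4-cluster solutions
    labels = fcluster(Z, t=k, criterion="maxclust")
    print(k, np.bincount(labels)[1:])    # cluster sizes for each solution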
SPSS produces both tables and graphics that help the analyst follow the process and decide which
solution to accept. I obtained 2, 3, and 4 cluster solutions. In the k = 2 solution the one cluster consisted of all
the adjunct faculty (excepting one) and the second cluster consisted of everybody else. I compared the two
clusters (using t tests) and found that, compared to the regular faculty, the adjuncts had significantly lower salary,
experience, course load, rank, and number of publications.
In the k = 3 solution the group of regular faculty was split into two groups, with one group consisting of
senior faculty (those who have been in the profession long enough to get a decent salary and lots of
publications) and the other group consisting of junior faculty (and a few older faculty who just never did the
things that get one merit pay increases). I used plots of means to show that the senior faculty had greater
salary, experience, rank, and number of publications than did the junior faculty.
In the k = 4 solution the group of senior faculty was split into two clusters. One cluster consisted of
the acting chair of the department (who had a salary and a number of publications considerably higher than the
others) and the other cluster consisted of the remaining senior faculty (excepting those few who had been
clustered with the junior faculty).
There are other ways of measuring the distance between clusters and other methods of doing the
clustering. For example, one can do divisive hierarchical clustering, in which one starts out with all cases in
one big cluster and then splits off cases into new clusters until every case is a cluster all by itself.
Aziz and Zickar (2006: A cluster analysis investigation of workaholism as a syndrome, Journal of
Occupational Health Psychology, 11, 52-62) is a good example of the use of cluster analysis with
psychological data. Some have defined workaholism as being high in work involvement, high in drive to work,
and low in work enjoyment. Aziz and Zickar obtained measures of work involvement, drive to work, and work
enjoyment and conducted a cluster analysis. One of the clusters in the three-cluster solution did look like
workaholics – high in work involvement and drive to work but low in work enjoyment. A second cluster
consisted of positively engaged workers (high on work involvement and work enjoyment) and a third consisted
of unengaged workers (low in involvement, drive, and enjoyment).
• Multivariate Effect Size Estimation – supplemental chapter from Kline, Rex. B. (2004).
Beyond significance testing: Reforming data analysis methods in behavioral research.
Washington, DC: American Psychological Association.
• MANOVA, Familywise Error, and the Boogey Man
Endnote
† A high Scale 5 score indicates that the individual is more like members of the other gender than are most
people. A man with a high Scale 5 score lacks stereotypical masculine interests, and a woman with a high
Scale 5 score has interests that are stereotypically masculine. Low Scale 5 scores indicate stereotypical
masculinity in men and stereotypical femininity in women. MMPI Scale scores are “T-scores” – that is, they
have been standardized to mean 50, standard deviation 10. The normative group was residents of Minnesota
in the 1930s. The MMPI-2 was normed on what should be a group more representative of US residents.
Multiple Regression Using SPSS
nasser.hasan@miami.edu
Overview
• Brief introduction to Multiple Linear Regression.
o Model specification
o Assumptions
• Variable Selection.
Simple Linear Regression
[Diagram: a single predictor X, plus an error term, pointing to the outcome Y.]
Multiple Linear Regression
[Diagram: predictors X1, X2, …, Xk, plus an error term, pointing to the outcome Y.]
Assumptions
6) No perfect collinearity.
Example: predicting exam score. Predictors: hours spent revising, anxiety scores, and A-level entry points.
[Diagram: the three predictors, plus an error term, pointing to exam score.]
SPSS Output
The overall model is significantly useful in explaining exam score, F(3, 16) = 32.81,
p < .05.
• Hours has a significant effect on exam score, t(16) = 3.23, p < .05.
• Anxiety does not have a significant effect on exam score, t(16) = 1.80, p = .09.
• A-level has a significant effect on exam score, t(16) = 4.24, p < .05.
Variable selection methods (a sketch of forward selection follows this list):
• Forward.
• Backward.
• Stepwise.
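As a concrete illustration of the first of these, here is a minimal forward-selection sketch in Python with statsmodels, using AIC as the entry criterion (an assumption; SPSS's stepwise procedures use F-to-enter and F-to-remove instead) and invented exam data:

# Forward selection: at each step, add the predictor that most improves AIC.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
df = pd.DataFrame({"hours": rng.normal(20, 5, 40),
                   "anxiety": rng.normal(50, 10, 40),
                   "alevel": rng.normal(25, 4, 40)})
df["exam"] = 2 * df["hours"] + 0.5 * df["alevel"] + rng.normal(0, 5, 40)

remaining, selected = {"hours", "anxiety", "alevel"}, []
best_aic = smf.ols("exam ~ 1", data=df).fit().aic        # intercept-only baseline
while remaining:
    aics = {x: smf.ols(f"exam ~ {' + '.join(selected + [x])}", data=df).fit().aic
            for x in remaining}
    x_best = min(aics, key=aics.get)
    if aics[x_best] >= best_aic:                          # no improvement: stop
        break
    best_aic = aics[x_best]
    selected.append(x_best)
    remaining.remove(x_best)
print(selected, best_aic)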
Multiple Regression Analysis using SPSS Statistics
Introduction
Multiple regression is an extension of simple linear regression. It is used when we
want to predict the value of a variable based on the value of two or more other
variables. The variable we want to predict is called the dependent variable (or
sometimes, the outcome, target or criterion variable). The variables we are using to
predict the value of the dependent variable are called the independent variables (or
sometimes, the predictor, explanatory or regressor variables).
For example, you could use multiple regression to understand whether exam
performance can be predicted based on revision time, test anxiety, lecture
attendance and gender. Alternatively, you could use multiple regression to understand
whether daily cigarette consumption can be predicted based on smoking duration,
age when started smoking, smoker type, income and gender.
Multiple regression also allows you to determine the overall fit (variance explained)
of the model and the relative contribution of each of the predictors to the total
variance explained. For example, you might want to know how much of the variation
in exam performance can be explained by revision time, test anxiety, lecture
attendance and gender "as a whole", but also the "relative contribution" of each
independent variable in explaining the variance.
This "quick start" guide shows you how to carry out multiple regression using SPSS
Statistics, as well as interpret and report the results from this test. However, before
we introduce you to this procedure, you need to understand the different
assumptions that your data must meet in order for multiple regression to give you a
valid result. We discuss these assumptions next.
Assumptions
When you choose to analyse your data using multiple regression, part of the process
involves checking to make sure that the data you want to analyse can actually be
analysed using multiple regression. You need to do this because it is only
appropriate to use multiple regression if your data "passes" eight assumptions that
are required for multiple regression to give you a valid result. In practice, checking
for these eight assumptions just adds a little bit more time to your analysis, requiring
you to click a few more buttons in SPSS Statistics when performing your analysis, as
well as think a little bit more about your data, but it is not a difficult task.
Before we introduce you to these eight assumptions, do not be surprised if, when
analysing your own data using SPSS Statistics, one or more of these assumptions is
violated (i.e., not met). This is not uncommon when working with real-world data
rather than textbook examples, which often only show you how to carry out multiple
regression when everything goes well! However, don’t worry. Even when your data
fails certain assumptions, there is often a solution to overcome this. First, let's take a
look at these eight assumptions.
Example
A health researcher wants to be able to predict "VO2max", an indicator of fitness and
health. Normally, performing this procedure requires expensive laboratory equipment
and necessitates that an individual exercise to their maximum (i.e., until they can no
longer continue exercising due to physical exhaustion). This can put off those
individuals who are not very active/fit and those individuals who might be at higher
risk of ill health (e.g., older unfit subjects). For these reasons, it has been desirable
to find a way of predicting an individual's VO2max based on attributes that can be
measured more easily and cheaply. To this end, a researcher recruited 100
participants to perform a maximum VO2max test, but also recorded their "age",
"weight", "heart rate" and "gender". Heart rate is the average of the last 5 minutes of
a 20 minute, much easier, lower workload cycling test. The researcher's goal is to be
able to predict VO2max based on these four attributes: age, weight, heart rate and
gender.
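Outside SPSS, the same model could be fit in a few lines of Python with statsmodels; the file name vo2max.csv and the assumption that the data have been exported to CSV with these column names are illustrative only:

# Multiple regression of VO2max on age, weight, heart rate, and gender.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("vo2max.csv")          # columns: VO2max, age, weight, heart_rate, gender
model = smf.ols("VO2max ~ age + weight + heart_rate + gender", data=df).fit()
print(model.summary())                  # R-squared, overall F test, and the coefficients table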
Setup in SPSS Statistics
In SPSS Statistics, we created six variables: (1) VO2max, which is the maximal
aerobic capacity; (2) age, which is the participant's age; (3) weight, which is the
participant's weight (technically, it is their 'mass'); (4) heart_rate, which is the
participant's heart rate; (5) gender, which is the participant's gender; and (6) caseno,
which is the case number. The caseno variable is used to make it easy for you to
eliminate cases (e.g., "significant outliers", "high leverage points" and "highly
influential points") that you have identified when checking for assumptions. In our
enhanced multiple regression guide, we show you how to correctly enter data in
SPSS Statistics to run a multiple regression when you are also checking for
assumptions. You can learn about our enhanced data setup content on
our Features: Data Setup page. Alternatively, see our generic, "quick start"
guide: Entering Data in SPSS Statistics.
Test Procedure in SPSS Statistics
The seven steps below show you how to analyse your data using multiple regression
in SPSS Statistics when none of the eight assumptions in the previous
section, Assumptions, have been violated. At the end of these seven steps, we show
you how to interpret the results from your multiple regression. If you are looking for
help to make sure your data meets assumptions #3, #4, #5, #6, #7 and #8, which are
required when using multiple regression and can be tested using SPSS Statistics,
you can learn more in our enhanced guide (see our Features: Overview page to
learn more).
Note: The procedure that follows is identical for SPSS Statistics versions 18 to
28, as well as the subscription version of SPSS Statistics, with version 28 and
the subscription version being the latest versions of SPSS Statistics. However,
in version 27 and the subscription version, SPSS Statistics introduced a new
look to their interface called "SPSS Light", replacing the previous look
for versions 26 and earlier versions, which was called "SPSS Standard".
Therefore, if you have SPSS Statistics versions 27 or 28 (or the subscription
version of SPSS Statistics), the images that follow will be light grey rather than
blue. However, the procedure is identical.
Interpreting and Reporting the Output of Multiple Regression Analysis
SPSS Statistics will generate quite a few tables of output for a multiple regression
analysis. In this section, we show you only the three main tables required to
understand your results from the multiple regression procedure, assuming that no
assumptions have been violated. A complete explanation of the output you have to
interpret when checking your data for the eight assumptions required to carry out
multiple regression is provided in our enhanced guide. This includes relevant
scatterplots and partial regression plots, histogram (with superimposed normal
curve), Normal P-P Plot and Normal Q-Q Plot, correlation coefficients and
Tolerance/VIF values, casewise diagnostics and studentized deleted residuals.
However, in this "quick start" guide, we focus only on the three main tables you need
to understand your multiple regression results, assuming that your data has already
met the eight assumptions required for multiple regression to give you a valid result:
The first table of interest is the Model Summary table. This table provides the R, R²,
adjusted R², and the standard error of the estimate, which can be used to determine
how well a regression model fits the data:
Statistical significance
The F-ratio in the ANOVA table (see below) tests whether the overall regression
model is a good fit for the data. The table shows that the independent variables
statistically significantly predict the dependent variable, F(4, 95) = 32.393, p < .0005
(i.e., the regression model is a good fit of the data).
Unstandardized coefficients indicate how much the dependent variable varies with
an independent variable when all other independent variables are held constant.
Consider the effect of age in this example. The unstandardized coefficient, B1,
for age is equal to -0.165 (see Coefficients table). This means that for each one
year increase in age, there is a decrease in VO2max of 0.165 ml/min/kg.
You can test for the statistical significance of each of the independent variables. This
tests whether the unstandardized (or standardized) coefficients are equal to 0 (zero)
in the population. If p < .05, you can conclude that the coefficients are statistically
significantly different to 0 (zero). The t-value and corresponding p-value are located
in the "t" and "Sig." columns, respectively, as highlighted below:
You can see from the "Sig." column that all independent variable coefficients are
statistically significantly different from 0 (zero). Although the intercept, B0, is tested
for statistical significance, this is rarely an important or interesting finding.
• General
A multiple regression was run to predict VO2max from gender, age, weight and heart
rate. These variables statistically significantly predicted VO2max, F(4, 95) =
32.393, p < .0005, R² = .577. All four variables added statistically significantly to the
prediction, p < .05.
If you are unsure how to interpret regression equations or how to use them to make
predictions, we discuss this in our enhanced multiple regression guide. We also
show you how to write up the results from your assumptions tests and multiple
regression output if you need to report this in a dissertation/thesis, assignment or
research report. We do this using the Harvard and APA styles. You can learn more
about our enhanced content on our Features: Overview page.
Exploratory Factor Analysis
1. INTRODUCTION
Many scientific studies share the feature that “numerous variables are used to
characterize objects” (Rietveld & Van Hout 1993: 251). Examples are studies in which
questionnaires consisting of many questions (variables) are used, and studies in which
mental ability is tested via several subtests, like verbal skills tests, logical reasoning ability
tests, etcetera (Darlington 2004). Because of the large number of variables in play,
such a study can become rather complicated. Besides, it could well be that some of the variables
measure different aspects of the same underlying variable.
For situations such as these, (exploratory¹) factor analysis was invented². Factor
analysis attempts to bring intercorrelated variables together under more general, underlying
variables. More specifically, the goal of factor analysis is to reduce “the dimensionality of the
original space and to give an interpretation to the new space, spanned by a reduced number of
new dimensions which are supposed to underlie the old ones” (Rietveld & Van Hout 1993:
254), or to “explain the variance in the observed variables in terms of underlying latent factors”
(Habing 2003: 2). Thus, factor analysis offers not only the possibility of gaining a clear view
of the data, but also the possibility of using the output in subsequent analyses (Field 2000;
Rietveld & Van Hout 1993).
In this paper an example will be given of the use of factor analysis. This will be done
by carrying out a factor analysis on data from a study in the field of applied linguistics, using
SPSS for Windows. For this to be understandable, however, it is necessary to discuss the
theory behind factor analysis.
¹ Next to exploratory factor analysis, confirmatory factor analysis exists. This paper is only about exploratory factor analysis, and will henceforth simply be named factor analysis.
² A salient detail is that it was exactly the problem of the multiple tests of mental ability that made the psychologist Charles Spearman invent factor analysis in 1904 (Darlington 2004).
2.2.1. Measurements
Since factor analysis starts from a correlation matrix, the variables used should first of all
be measured at (at least) an interval level. Secondly, the variables should be roughly normally
distributed; this makes it possible to “generalize the results of your analysis beyond the
sample collected” (Field 2000: 444). Thirdly, the sample size should be taken into
consideration, as correlations are not resistant (Moore & McCabe 2002: 103), and can hence
seriously influence the reliability of the factor analysis (Field 2000: 443; Habing 2003).
According to Field (2000: 443) “much has been written about the necessary sample
size for factor analysis resulting in many ‘rules-of-thumb’”. Field himself, for example, states
that a researcher should have “at least 10-15 subjects per variable” (p. 443). Habing (2003),
however, states that “you should have at least 50 observations and at least 5 times as many
observations as variables” (p. 3). Fortunately, Monte Carlo studies have resulted in more
specific statements concerning sample size (Field 2000: 443; Habing 2003). The general
conclusion of these studies was that “the most important factors in determining reliable factor
solutions was the absolute sample size and the absolute magnitude of factor loadings” (Field
2000: 443): the more frequent and higher the loadings are on a factor, the smaller the sample
can be.
Field (2000) also reports on a more recent study which concludes that “as
communalities become lower the importance of sample size increases” (p. 43). This
conclusion is consistent with the one above, as communalities are computed from the
factor loadings: the communality of a variable is the sum of the squared loadings of
this variable on all extracted factors (Rietveld & Van Hout 1993: 264). As such, the
communality of a variable represents the proportion of the variance in that variable that can be
accounted for by all (‘common’) extracted factors. Thus, if the communality of a variable is
high, the extracted factors account for a large proportion of the variable’s variance. This
means that this particular variable is reflected well via the extracted factors, and hence that the
factor analysis is reliable. When the communalities are not very high, though, the sample size
has to compensate for this.

[Figure 1: overview of the steps in a factor analysis, from Rietveld & Van Hout (1993: 291). The flowchart runs from obtaining reliable measurements, via the correlation matrix, to the choice between principal component analysis (unities in the diagonal of the correlation matrix) and factor analysis (estimated communalities), then factor rotation (orthogonal or oblique), yielding the results (factor loadings and factor scores), which are interpreted by the researcher or used in subsequent analyses such as multiple regression.]
In SPSS a convenient option is offered to check whether the sample is big enough: the
Kaiser-Meyer-Olkin measure of sampling adequacy (KMO-test). The sample is adequate if
the value of KMO is greater than 0.5. Furthermore, SPSS can calculate an anti-image matrix
of covariances and correlations. All elements on the diagonal of this matrix should be greater
than 0.5 if the sample is adequate (Field 2000: 446).
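For readers working outside SPSS, both adequacy checks can be computed in Python with the factor_analyzer package (an assumption; the paper itself uses SPSS, and the file name items.csv is hypothetical):

# KMO measure of sampling adequacy and Bartlett's test of sphericity.
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

df = pd.read_csv("items.csv")                      # only the variables entering the analysis
chi2, p = calculate_bartlett_sphericity(df)        # H0: the correlation matrix is an identity matrix
kmo_per_item, kmo_overall = calculate_kmo(df)      # sampling adequacy, per item and overall
print(p < .05, kmo_overall > 0.5)                  # both should be True for an adequate sample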
Consider, for example, the following correlation matrix of six variables (lower triangle shown):
1.00
0.77 1.00
0.66 0.87 1.00
0.09 0.04 0.11 1.00
0.12 0.06 0.10 0.51 1.00
0.08 0.14 0.08 0.61 0.49 1.00
In this matrix two clusters of variables with high intercorrelations are represented. As has
already been said before, these clusters of variables could well be “manifestations of the same
underlying variable” (Rietveld & Van Hout 1993: 255). The data of this matrix could then be
reduced down into these two underlying variables or factors.
With respect to the correlation matrix, two things are important: the variables have to
be intercorrelated, but they should not correlate too highly (extreme multicollinearity and
singularity) as this would cause difficulties in determining the unique contribution of the
variables to a factor (Field 2000: 444). In SPSS the intercorrelation can be checked by using
Bartlett’s test of sphericity, which “tests the null hypothesis that the original correlation matrix
is an identity matrix” (Field 2000: 457). This test has to be significant: when the correlation
matrix is an identity matrix, there would be no correlations between the variables. Multi-
collinearity, then, can be detected via the determinant of the correlation matrix, which can
also be calculated in SPSS: if the determinant is greater than 0.00001, then there is no
multicollinearity (Field 2000: 445).
In factor analysis, the communalities have to be estimated, which makes factor analysis more complicated than
principal component analysis, but also more conservative.
1. The new variables (principal components) should be chosen in such a way that the first component
accounts for the maximum part of the variance, the second component the maximum part of the
remaining variance, and so on.
2. The scores on the new variables (components) are not correlated.
It has been found that the differences “between factor analysis and principal component analysis decreased when the number of
variables and the magnitudes of the factor loadings increased”.
The choice between factor analysis and principal component analysis thus depends on the number of variables and the
magnitude of the factor loadings. After having made this choice, the question arises of how
many factors to retain. Common rules of thumb are:
1. Retain only those factors with an eigenvalue larger than 1 (Guttman-Kaiser rule);
2. Keep the factors which, in total, account for about 70-80% of the variance;
3. Make a scree plot⁵; keep all factors before the breaking point or elbow.
It is furthermore always important to check the communalities after factor extraction. If the
communalities are low, the extracted factors account for only a little part of the variance, and
more factors might be retained in order to provide a better account of the variance. As has
already been discussed in section 2.2.1., the sample size also comes into play when
considering these communalities.
⁵ See Field (2000: 436): a scree plot is “a graph of each eigenvalue (Y-axis) against the factor with which it is associated (X-axis)”.
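The first two retention rules, and the scree plot of rule 3, can be checked directly from the eigenvalues of the correlation matrix. A small Python sketch follows, with a randomly generated stand-in matrix in place of real data:

# Eigenvalue-based retention criteria: Guttman-Kaiser rule, cumulative variance,
# and a scree plot (look for the elbow).
import numpy as np
import matplotlib.pyplot as plt

R = np.corrcoef(np.random.default_rng(7).normal(size=(100, 6)), rowvar=False)

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]       # eigenvalues, largest first
print("Guttman-Kaiser keeps:", np.sum(eigvals > 1))
print("cumulative % variance:", np.cumsum(eigvals) / eigvals.sum() * 100)

plt.plot(range(1, len(eigvals) + 1), eigvals, "o-")
plt.xlabel("factor")
plt.ylabel("eigenvalue")
plt.show()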
[Figure 2: graphical representation of factor rotation. The left graph represents orthogonal rotation and the right one oblique rotation. The stars represent the loadings of the original variables on the factors. Source for this figure: Field 2000: 439.]
There are several methods to carry out rotations. SPSS offers five: varimax, quartimax,
equamax, direct oblimin and promax. The first three options are orthogonal rotation; the last
two oblique. It depends on the situation, but mostly varimax is used in orthogonal rotation and
direct oblimin in oblique rotation. Orthogonal rotation results in a rotated component / factor
matrix that presents the ‘post-rotation’ loadings of the original variables on the extracted
factors, and a transformation matrix that gives information about the angle of rotation. In
oblique rotation the results are a pattern matrix, structure matrix, and a component correlation
matrix. The pattern matrix presents the ‘pattern loadings’ (“regression coefficients of the
variable on each of the factors”; Rietveld & Van Hout 1993: 281) while the structure matrix
presents ‘structure loadings’ (“correlations between the variables and the factors”; ibid.); most
of the time the pattern matrix is used to interpret the factors. The component correlation
matrix presents the correlation between the extracted factors / components, and is thus
important for choosing between orthogonal and oblique rotation.
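As a sketch of extraction plus rotation outside SPSS (an assumption; the paper itself uses SPSS), the Python factor_analyzer package accepts both orthogonal and oblique rotation options; the file name items.csv is hypothetical:

# Principal-component extraction with an oblique (direct oblimin) rotation;
# swapping rotation="varimax" gives an orthogonal rotation instead.
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("items.csv")

fa = FactorAnalyzer(n_factors=2, rotation="oblimin", method="principal")
fa.fit(df)
print(fa.loadings_)                 # pattern loadings after rotation
print(fa.get_factor_variance())     # variance accounted for per factor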
The second result of factor analysis is the factor scores. These factor scores can be
useful in several ways. Field (2000: 431) and Rietveld & Van Hout (1993: 289-290) name the
following:
1. If one wants to find out “whether groups or clusters of subjects can be distinguished
that behave similarly in scoring on a test battery, [and] the latent, underlying variables
are considered to be more fundamental than the original variables, the clustering of
factor scores in the factor space can provide useful clues to that end” (Rietveld & Van
Hout: 289)
2. The factor scores can serve as a solution to multicollinearity problems in multiple
regression. After all, the factor scores are uncorrelated (in the case of orthogonal
rotation);
3. The factor scores can also be useful in big experiments, containing several measures
using the same subjects. If it is already known in advance that “a number of dependent
variables used in the experiment in fact constitute similar measures of the same
underlying variable, it may be a good idea to use the scores on the different factors,
instead of using the scores on the original variables” (Rietveld & Van Hout: 290).
In SPSS the factor scores for each subject can be saved as variables in the data editor.
Doing this via the Anderson-Rubin method (option in SPSS) ensures that the factor scores are
uncorrelated, and hence usable in a multiple regression analysis. If it does not matter whether
factor scores are correlated or not, the Regression method can be used. The correlation
between factor scores can also be represented in a factor score covariance matrix, which is
displayed in the SPSS output (next to a factor score coefficient matrix, which in itself is not
particularly useful (Field 2000: 467)).
This needs some more specification. First of all, I will consider the targeted grammar rules:
these rules are degrees of comparison and subordination. Degrees of comparison (mooi –
mooier – mooist, ‘beautiful – more beautiful – most beautiful’) are morphosyntactic and
considered both meaningful and simple (Andringa, [year]). Subordination, on the other hand,
has a syntactic basis: hij is een aardige jongen omdat hij een aardige jongen is (‘he is a nice
boy because he a nice boy is’, with the finite verb at the end of the subordinate clause). This
word-order rule in subordination is considered meaningless but simple (ibid.).
Above, the difference between implicit and explicit form-focused instruction has
already been explained. In Andringa (2004: 15) the implicit instruction is operationalized by
embedding the target structures in text-comprehension exercises (why does Coca cola
advertise? A- they advertise, because they a lot of cola want to sell; B- …). The explicit
instruction is done via the presentation of the target structures in grammar exercises (look at
the highlighted words: which sentence is correct? A- Coca cola advertises, because they a lot
of cola want to sell; B- …).
The dependent variables are measured by a written proficiency test and declarative
knowledge test for both target structures. The proficiency test is divided into 19
submeasurements per target structure; the declarative knowledge test measures have only
been divided into the two target structures⁶. In addition, the declarative knowledge test is only
administered as a posttest in order to ensure that no learning effect will have taken place. Both
the proficiency test and the declarative knowledge test are included in the appendix.
Finally, the first three learner variables are represented as covariates. The writing
proficiency is measured by examining the grammatical accuracy (errors/clause), fluency
(words/clause) and spelling (sp.errors/clause) of the written proficiency test. General L2
proficiency is tested via the so-called C-test (see appendix), Cito ISK-test. Aptitude is
measured by means of a memory test and a grammatical sensitivity test (see appendix). In
addition, a grammatical judgement test has been administered.
From this then, Andringa (2004: 18) formulated the following hypotheses:
• Declarative knowledge:
1. A – Degrees of comparison: expl. FonFs > impl. FonF
B – Subordination: expl. FonFs > impl. FonF
• Implicit knowledge / proficiency:
2. A – Degrees of comparison: expl. FonFs < impl. FonF
B – Subordination: expl. FonFs < impl. FonF
⁶ An obvious reason for this is that in the proficiency test spontaneous language use is tested, while in the declarative knowledge test no spontaneous language is used (see appendix).
In other words, Andringa expects that the explicit instruction is most effective for declarative
knowledge and the implicit instruction for proficiency (i.e. spontaneous language use).
Component      1       2       3       4       5       6
1           1.263   -.551   1.993   -.161   -.450   1.935
2           -.551   1.165   -.270    .101   2.053   -.822
3           1.993   -.270   3.078   -.132   1.587   1.492
4           -.161    .101   -.132   1.273   -.243   1.607
5           -.450   2.053   1.587   -.243   4.021   -.843
6           1.935   -.822   1.492   1.607   -.843   4.063
Extraction Method: Principal Component Analysis.
Rotation Method: Oblimin with Kaiser Normalization.
⁷ The data, next to the complete output of the factor analysis, can be examined on the floppy disk in the appendix (in the envelope).
In oblique rotation, interpretation of the factors mostly takes place by examining the pattern
matrix (see section 2.2.5.). Therefore the pattern matrix is represented below:
Pattern Matrix (Extraction Method: Principal Component Analysis; Rotation Method: Oblimin with Kaiser Normalization; rotation converged in 33 iterations). Loadings are grouped by component; variables with no loading above the display cutoff are listed without a value:

Component 1: Tot vergr. tr. correct Pretrap (.928); Tot correct lexicaal Pretrap (.854); Tot vergrot. trap Pretrap (.837); Tot correct Pretrap (.828); Tot correct gespeld Pretrap (.786); Gram. Judg. trappen (.725); Cito ISK toets (.608); Tot approp. contexts Pretrap (.514); C-toets; Tot non-use in approp. context Pretrap.

Component 2: Tot cor os Presub (-.869); Tot cor os other Presub (-.808); Gram. Judg. subord (-.687); Tot types os Presub (-.687); Written Gram err / clause (correction tr) (.677); Written Gram err / clause (correction os) (.656); Tot cor os "als" Presub (-.650); Tot approp. contexts Presub (-.513).

Component 3: Tot overtr. tr. Pretrap (.891); Tot overt. tr. correct Pretrap (.838); Written Words/Clause (.574); Tot incorrect Pretrap.

Component 4: Written Spell errors / clause (.919); Written spell err / tot words (.918); Aptitude Memory (-.610); Aptitude Analysis; Contrasten Pretrap.

Component 5: Tot inc os Presub (.896); Tot inc os other (.895); Tot non-use in approp. context Presub (-.611).

Component 6: Verdachte -e Pretrap (.743); Avoided Presub; Tot inc os "als" Presub; Avoided Pretrap.
The first factor consists mainly of degrees of comparison measurements (‘trappen’), but the
ISK-test of the Cito also loads highly on this factor. This could mean that the ISK-test requires the
same kind of knowledge as degrees of comparison assignments. The second
factor contains subordination measures and grammatical errors; the subordination measures
correlate negatively with the grammatical errors. In the third factor, a fluency measure is
clustered with measures of the superlative in degrees of comparison. It could be, then, that the
superlative is mostly used when the language learners are relatively fluent. In the fourth
factor, spelling errors and memory correlate negatively, which implies in this case that
learners with better memory make fewer spelling errors. This seems rather plausible. The fifth
factor consists of subordination measures: incorrect usage correlates negatively with non-use.
Thus, where non-use does not occur, incorrect usage occurs. This probably takes place in
difficult exercises. Finally, the sixth factor consists of just one variable: ‘verdachte –e’ in
degrees of comparison (mooie instead of mooi or mooier).
From this my suggestions for the names of the factors are:
1. ‘General language proficiency’ (because of the loading of the ISK-toets)
2. ‘Grammatical proficiency’
3. ‘Fluency’
4. ‘(Language) Memory’
5. ‘Behaviour in difficult exercises’
6. ‘Suspected –e’
As such, it was possible to examine which kinds of knowledge come into play in which
measurements. Furthermore, it can be inferred that subordination and degrees of comparison
are indeed different kinds of rules, as they load separately on different factors. For the future,
it might be interesting to carry out a similar analysis using other and more target structures.
After all, if it is known which kinds of knowledge are important for learning grammatical
structures, then these kinds of knowledge can be emphasized in teaching a particular target
structure. Thus, language teaching can become more effective.
4. CONCLUSION
In this paper an example is given of the use of factor analysis in an applied linguistics study.
Unfortunately, the data was not appropriate because of multicollinearity, and hence no sound
conclusions can be drawn from this analysis. Nevertheless, a principal component analysis
has been carried out with oblique rotation. This resulted in six correlated factors,
constituting several aspects of ‘knowledge’ in language learning. It turned out that the
measurements of the two target structures loaded on different factors, which could indicate
that different kinds of knowledge are needed for these structures. This can be important for
language teaching, as it gives opportunities to improve language teaching effectiveness.
APPENDIX:
• Proficiency test
• C-test
• Memory test
Example
Factor analysis is frequently used to develop questionnaires: after all if you want to measure
an ability or trait, you need to ensure that the questions asked relate to the construct that you
intend to measure. I have noticed that a lot of students become very stressed about SPSS.
Therefore I wanted to design a questionnaire to measure a trait that I termed ‘SPSS anxiety’. I
decided to devise a questionnaire to measure various aspects of students’ anxiety towards
learning SPSS. I generated questions based on interviews with anxious and non-anxious
students and came up with 23 possible questions to include. Each question was a statement
followed by a five-point Likert scale ranging from ‘strongly disagree’ through ‘neither agree nor
disagree’ to ‘strongly agree’. The questionnaire is printed in Field (2005, p. 639).
The questionnaire was designed to predict how anxious a given individual would be about
learning how to use SPSS. What’s more, I wanted to know whether anxiety about SPSS could
be broken down into specific forms of anxiety. So, in other words, are there other traits that
might contribute to anxiety about SPSS? With a little help from a few lecturer friends I
collected 2571 completed questionnaires (at this point it should become apparent that this
example is fictitious!). The data are stored in the file SAQ.sav.
• Questionnaires are made up of multiple items each of which elicits a
response from the same person. As such, it is a repeated measures design.
• Given we know that repeated measures go in different columns, different
questions on a questionnaire should each have their own column in SPSS.
Initial Considerations
Sample Size
Correlation coefficients fluctuate from sample to sample, much more so in small samples than
in large. Therefore, the reliability of factor analysis is also dependent on sample size. Field
(2005) reviews many suggestions about the sample size necessary for factor analysis and
concludes that it depends on many things. In general over 300 cases is probably adequate but
communalities after extraction should probably be above 0.5 (see Field, 2005).
Data Screening
SPSS will nearly always find a factor solution to a set of variables. However, the solution is
unlikely to have any real meaning if the variables analysed are not sensible. The first thing to
do when conducting a factor analysis is to look at the inter-correlation between variables. If
our test questions measure the same underlying dimension (or dimensions) then we would
expect them to correlate with each other (because they are measuring the same thing). If we
find any variables that do not correlate with any other variables (or very few) then you should
consider excluding these variables before the factor analysis is run. The correlations between
variables can be checked using the correlate procedure (see Chapter 4) to create a correlation
matrix of all variables. This matrix can also be created as part of the main factor analysis.
The opposite problem is when variables correlate too highly. Although mild multicollinearity is
not a problem for factor analysis it is important to avoid extreme multicollinearity (i.e.
variables that are very highly correlated) and singularity (variables that are perfectly
correlated). As with regression, singularity causes problems in factor analysis because it
becomes impossible to determine the unique contribution to a factor of the variables that are
highly correlated (as was the case for multiple regression). Therefore, at this early stage we
look to eliminate any variables that don’t correlate with any other variables or that correlate
very highly with other variables (r > .9). Multicollinearity can be detected by looking at the
determinant of the R-matrix (see next section).
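(An illustrative aside, not part of the original handout: this screening step can be sketched in
Python. The file name and the 0.3 "barely correlates" cut-off below are assumptions for
illustration; only the r > .9 rule and the determinant check come from the text.)

    import numpy as np
    import pandas as pd

    # Hypothetical file name: the 23 SAQ items exported from SPSS as CSV
    items = pd.read_csv("saq.csv")
    R = items.corr().to_numpy()          # inter-item correlation matrix

    p = R.shape[0]
    off = R[~np.eye(p, dtype=bool)].reshape(p, p - 1)   # off-diagonal entries, row by row

    # Items that barely correlate with anything (0.3 is an illustrative cut-off)
    weak = [c for c, row in zip(items.columns, off) if np.all(np.abs(row) < 0.3)]

    # Pairs that correlate too highly (r > .9), risking singularity
    high_pairs = np.argwhere(np.triu(np.abs(R), k=1) > 0.9)

    # Field (2005) suggests the determinant of the R-matrix should exceed about 0.00001;
    # the handout's data give 5.271E-04, which passes this check.
    print(np.linalg.det(R), weak, high_pairs)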
As well as looking for interrelations, you should ensure that variables have roughly normal
distributions and are measured at an interval level (which Likert scales are, perhaps wrongly,
assumed to be!). The assumption of normality is important only if you wish to generalize the
results of your analysis beyond the sample collected.
KMO and Bartlett’s test of sphericity produces the Kaiser-Meyer-Olkin measure of sampling
adequacy and Bartlett’s test (see Field, 2005, Chapters 11 & 12). The value of KMO should be
greater than 0.5 if the sample is adequate.
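(Another illustrative aside: neither statistic needs SPSS. A minimal Python sketch, assuming R is
the inter-item correlation matrix and n the number of cases; the formulas are the standard ones
for Bartlett's test and the KMO index.)

    import numpy as np
    from scipy.stats import chi2

    def bartlett_sphericity(R, n):
        """Bartlett's test: H0 is that the correlation matrix is an identity matrix."""
        p = R.shape[0]
        statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
        df = p * (p - 1) / 2
        return statistic, chi2.sf(statistic, df)

    def kmo(R):
        """Kaiser-Meyer-Olkin measure of sampling adequacy (should exceed 0.5)."""
        inv = np.linalg.inv(R)
        d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
        partial = -inv / d                       # matrix of partial correlations
        mask = ~np.eye(R.shape[0], dtype=bool)   # off-diagonal elements only
        r2 = (R[mask] ** 2).sum()
        q2 = (partial[mask] ** 2).sum()
        return r2 / (r2 + q2)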
Factor Extraction on SPSS
Click on the Extraction button to access the extraction dialog box (Figure 3). There are several ways to
conduct factor analysis and the choice of method depends on many things (see Field, 2005).
For our purposes we will use principal component analysis, which strictly speaking isn’t factor
analysis; however, the two procedures often yield similar results (see Field, 2005, 15.3.3).
The Display box has two options: to display the Unrotated factor solution and a Scree plot. The
scree plot was described earlier and is a useful way of establishing how many factors should be
retained in an analysis. The unrotated factor solution is useful in assessing the improvement of
interpretation due to rotation. If the rotated solution is little better than the unrotated solution
then it is possible that an inappropriate (or less optimal) rotation method has been used.
Q01 Q02 Q03 Q04 Q05 Q19 Q20 Q21 Q22 Q23
Correlation Q01 1.000 -.099 -.337 .436 .402 -.189 .214 .329 -.104 -.004
Q02 -.099 1.000 .318 -.112 -.119 .203 -.202 -.205 .231 .100
Q03 -.337 .318 1.000 -.380 -.310 .342 -.325 -.417 .204 .150
Q04 .436 -.112 -.380 1.000 .401 -.186 .243 .410 -.098 -.034
Q05 .402 -.119 -.310 .401 1.000 -.165 .200 .335 -.133 -.042
Q06 .217 -.074 -.227 .278 .257 -.167 .101 .272 -.165 -.069
Q07 .305 -.159 -.382 .409 .339 -.269 .221 .483 -.168 -.070
Q08 .331 -.050 -.259 .349 .269 -.159 .175 .296 -.079 -.050
Q09 -.092 .315 .300 -.125 -.096 .249 -.159 -.136 .257 .171
Q10 .214 -.084 -.193 .216 .258 -.127 .084 .193 -.131 -.062
Q11 .357 -.144 -.351 .369 .298 -.200 .255 .346 -.162 -.086
Q12 .345 -.195 -.410 .442 .347 -.267 .298 .441 -.167 -.046
Q13 .355 -.143 -.318 .344 .302 -.227 .204 .374 -.195 -.053
Q14 .338 -.165 -.371 .351 .315 -.254 .226 .399 -.170 -.048
Q15 .246 -.165 -.312 .334 .261 -.210 .206 .300 -.168 -.062
Q16 .499 -.168 -.419 .416 .395 -.267 .265 .421 -.156 -.082
Q17 .371 -.087 -.327 .383 .310 -.163 .205 .363 -.126 -.092
Q18 .347 -.164 -.375 .382 .322 -.257 .235 .430 -.160 -.080
Q19 -.189 .203 .342 -.186 -.165 1.000 -.249 -.275 .234 .122
Q20 .214 -.202 -.325 .243 .200 -.249 1.000 .468 -.100 -.035
Q21 .329 -.205 -.417 .410 .335 -.275 .468 1.000 -.129 -.068
Q22 -.104 .231 .204 -.098 -.133 .234 -.100 -.129 1.000 .230
Q23 -.004 .100 .150 -.034 -.042 .122 -.035 -.068 .230 1.000
Sig. (1-tailed): virtually all of the correlations are significant (most at p < .001); the only
clearly non-significant correlation is between Q23 and Q01 (p = .410), and a handful of other
Q23 correlations are only marginally significant (e.g. p = .043 with Q04, p = .017 with Q05,
p = .039 with Q20).
a. Determinant = 5.271E-04
SPSS Output 1
SPSS Output 2 shows several very important parts of the output: the Kaiser-Meyer-Olkin
measure of sampling adequacy and Bartlett's test of sphericity. The KMO statistic varies
between 0 and 1; as noted above, a value greater than 0.5 indicates that the sample is
adequate, and Bartlett's test should be significant for factor analysis to be appropriate.
[SPSS Output 3: the Total Variance Explained table lists, for each of the 23 components, the
initial eigenvalues, the extraction sums of squared loadings and the rotation sums of squared
loadings, each expressed as a total, a percentage of variance and a cumulative percentage.
The first few factors explain relatively large amounts of variance (especially factor 1),
whereas subsequent factors explain only small amounts.]
In the final part of the table (labelled Rotation Sums of Squared Loadings), the eigenvalues
of the factors after rotation are
displayed. Rotation has the effect of optimizing the factor structure and one consequence for
these data is that the relative importance of the four factors is equalized. Before rotation,
factor 1 accounted for considerably more variance than the remaining three (31.696%
compared to 7.560, 5.725, and 5.336%), however after rotation it accounts for only
16.219% of variance (compared to 14.523, 11.099 and 8.475% respectively).
SPSS Output 4 shows the table of communalities before and after extraction. Principal
component analysis works on the initial assumption that all variance is common; therefore,
before extraction the communalities are all 1. The communalities in the column labelled
Extraction reflect the common variance in the data structure. So, for example, we can say that
43.5% of the variance associated with question 1 is common, or shared, variance. Another
way to look at these communalities is in terms of the proportion of variance explained by the
underlying factors. After extraction some of the factors are discarded and so some information
is lost. The amount of variance in each variable that can be explained by the retained factors is
represented by the communalities after extraction.
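(Illustrative aside: a communality after extraction is just the sum of a variable's squared
loadings on the retained components. A quick sketch with made-up loading values:)

    import numpy as np

    # Hypothetical loadings for two variables on four retained components
    loadings = np.array([[0.586, 0.210, -0.120, 0.190],
                         [0.310, 0.548, 0.140, 0.060]])

    # A variable's communality is the sum of its squared loadings
    # across the retained components.
    communalities = (loadings ** 2).sum(axis=1)
    print(communalities)    # proportion of each variable's variance the factors explain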
Communalities
        Initial   Extraction
Q01     1.000     .435
Q02     1.000     .414
Q03     1.000     .530
Q04     1.000     .469
Q05     1.000     .343
Q06     1.000     .654
Q07     1.000     .545
Q08     1.000     .739
Q09     1.000     .484
Q10     1.000     .335
Q11     1.000     .690
Q12     1.000     .513
Q13     1.000     .536
Q14     1.000     .488
Q15     1.000     .378
Q16     1.000     .487
Q17     1.000     .683
Q18     1.000     .597
Q19     1.000     .343
Q20     1.000     .484
Q21     1.000     .550
Q22     1.000     .464
Q23     1.000     .412
Extraction Method: Principal Component Analysis.

Component Matrix(a) (loadings below .4 suppressed; sorted by size). The variables loading most
highly on component 1 are Q18 (.701), Q07 (.685), Q16 (.679), Q13 (.673), Q12 (.669),
Q21 (.658), Q14 (.656), Q11 (.652), Q17 (.643), Q04 (.634), Q03 (-.629), Q15 (.593),
Q01 (.586), Q05 (.556), Q08 (.549), Q10 (.437), Q20 (.436) and Q19 (-.427). The remaining
substantive loadings fall on the later components: Q09 (.627), Q06 (.562 and .571), Q02 (.548),
Q23 (.507) and Q22 (.465), with secondary loadings of -.400 for Q11, .401 and -.417 for Q08,
and -.404 for Q20.
Extraction Method: Principal Component Analysis. a. 4 components extracted.
SPSS Output 4
This output also shows the component matrix before rotation. This matrix contains the
loadings of each variable onto each factor. By default SPSS displays all loadings; however, we
requested that all loadings less than 0.4 be suppressed in the output and so there are blank
spaces for many of the loadings. This matrix is not particularly important for interpretation.
At this stage SPSS has extracted four factors. Factor analysis is an exploratory tool and so it
should be used to guide the researcher to make various decisions: you shouldn't leave the
computer to make them. One important decision is the number of factors to extract. By
Kaiser's criterion we should extract four factors and this is what SPSS has done. However, this
criterion is accurate when there are fewer than 30 variables and communalities after extraction
are greater than 0.7 or when the sample size exceeds 250 and the average communality is
greater than 0.6. The communalities are shown in SPSS Output 4, and none exceed 0.7. The
average of the communalities can be found by adding them up and dividing by the number of
communalities (11.573/23 = 0.503). So, on both grounds Kaiser's rule may not be accurate.
However, you should consider the huge sample that we have, because the research into
Kaiser's criterion gives recommendations for much smaller samples. We can also use the scree
plot, which we asked SPSS to produce. The scree plot is shown below with a thunderbolt
indicating the point of inflexion on the curve. This curve is difficult to interpret because the
curve begins to tail off after three factors, but there is another drop after four factors before a
stable plateau is reached. Therefore, we could probably justify retaining either two or four
factors. Given the large sample, it is probably safe to assume Kaiser's criterion; however, you
could rerun the analysis specifying that SPSS extract only two factors and compare the results.
SPSS Output 5: [scree plot of Eigenvalue against Component Number (components 1-23); the curve
tails off after the third component, with a further drop after the fourth before a stable
plateau is reached.]
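(Illustrative aside, assuming R is the correlation matrix from the screening sketch above: the
eigenvalues, Kaiser's criterion and the scree plot can be reproduced in a few lines of Python.)

    import numpy as np
    import matplotlib.pyplot as plt

    eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # R: inter-item correlation matrix

    k = int(np.sum(eigenvalues > 1))       # Kaiser's criterion: eigenvalues above 1
    print("components retained:", k)

    # The communalities after extracting k components sum to the first k eigenvalues,
    # so the average communality is that sum divided by the number of variables
    # (for the SAQ data the handout reports 11.573/23 = 0.503).
    print("average communality:", eigenvalues[:k].sum() / len(eigenvalues))

    # Scree plot: eigenvalue against component number
    plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
    plt.xlabel("Component Number")
    plt.ylabel("Eigenvalue")
    plt.show()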
✓ If there are fewer than 30 variables and communalities after extraction are
greater than 0.7, or if the sample size exceeds 250 and the average
communality is greater than 0.6, then retain all factors with eigenvalues
above 1 (Kaiser’s criterion).
✓ If none of the above apply, a scree plot can be used when the sample size
is large (around 300 or more cases).
Factor Rotation
The first analysis I asked you to run was using an orthogonal rotation. SPSS Output 6 shows
the rotated component matrix (also called the rotated factor matrix in factor analysis) which is
a matrix of the factor loadings for each variable onto each factor. This matrix contains the
same information as the component matrix in SPSS Output 4 except that it is calculated after
rotation. There are several things to consider about the format of this matrix. First, factor
loadings less than 0.4 have not been displayed because we asked for these loadings to be
suppressed. If you didn't select this option, or didn't adjust the criterion value to 0.4, then
your output will differ. Second, the variables are listed in the order of size of their factor
loadings because we asked for the output to be Sorted by size. If this option was not selected
your output will look different. Finally, for all other parts of the output I suppressed the
variable labels (for reasons of space) but for this matrix I have allowed the variable labels to
be printed to aid interpretation.
Compare this matrix with the unrotated solution. Before rotation, most variables loaded highly
onto the first factor and the remaining factors didn't really get a look in. However, the rotation
of the factor structure has clarified things considerably: there are four factors and variables
load very highly onto only one factor (with the exception of one question). The suppression of
loadings less than 0.4 and ordering variables by loading size also makes interpretation
considerably easier (because you don't have to scan the matrix to identify substantive
loadings).
Rotated Component Matrix(a) (loadings below .4 suppressed; sorted by size)

Component 1:
  I have little experience of computers (.800)
  SPSS always crashes when I try to use it (.684)
  I worry that I will cause irreparable damage because of my incompetence with computers (.647)
  All computers hate me (.638)
  Computers have minds of their own and deliberately go wrong whenever I use them (.579)
  Computers are useful only for playing games (.550)
  Computers are out to get me (.459)
Component 2:
  I can't sleep for thoughts of eigen vectors (.677)
  I wake up under my duvet thinking that I am trapped under a normal distribution (.661)
  Standard deviations excite me (-.567)
  People try to tell you that SPSS makes statistics easier to understand but it doesn't
  (.523; also .473 on component 1)
  I dream that Pearson is attacking me with correlation coefficients (.516)
  I weep openly at the mention of central tendency (.514)
  Statistics makes me cry (.496)
  I don't understand statistics (.429)
Component 3:
  I have never been good at mathematics (.833)
  I slip into a coma whenever I see an equation (.747)
  I did badly at mathematics at school (.747)
Component 4:
  My friends are better at statistics than me (.648)
  My friends are better at SPSS than I am (.645)
  If I'm good at statistics my friends will think I'm a nerd (.586)
  My friends will think I'm stupid for not being able to cope with SPSS (.543)
  Everybody looks at me when I use SPSS (.427)

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 9 iterations.
SPSS Output 6
✓ Use orthogonal rotation when you believe your factors should be theoretically
independent (unrelated to each other).
✓ Use oblique rotation when you believe factors should be related to each
other.
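(Illustrative aside: varimax is an orthogonal rotation that maximizes the variance of the
squared loadings within each factor. A compact Python sketch of the classic algorithm; SPSS
additionally applies Kaiser normalization, which is omitted here.)

    import numpy as np

    def varimax(L, max_iter=100, tol=1e-6):
        """Varimax-rotate a (variables x factors) loading matrix L."""
        p, k = L.shape
        T = np.eye(k)          # accumulated orthogonal rotation
        d_old = 0.0
        for _ in range(max_iter):
            LR = L @ T
            # SVD of the gradient of the varimax criterion
            u, s, vt = np.linalg.svd(
                L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p))
            T = u @ vt
            d = s.sum()
            if d_old > 0 and d < d_old * (1 + tol):
                break
            d_old = d
        return L @ T           # rotated loadings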
Interpretation
The next step is to look at the content of questions that load onto the same factor to try to
identify common themes. If the mathematical factor produced by the analysis represents some
real-world construct then common themes among highly loading questions can help us identify
what the construct might be. The questions that load highly on factor 1 seem to all relate to
using computers or SPSS. Therefore we might label this factor fear of computers. The
questions that load highly on factor 2 all seem to relate to different aspects of statistics;
therefore, we might label this factor fear of statistics. The three questions that load highly on
factor 3 all seem to relate to mathematics; therefore, we might label this factor fear of
mathematics. Finally, the questions that load highly on factor 4 all contain some component of
social evaluation from friends; therefore, we might label this factor peer evaluation. This
analysis seems to reveal that the initial questionnaire, in reality, is composed of four sub-
scales: fear of computers, fear of statistics, fear of maths, and fear of negative peer
evaluation. There are two possibilities here. The first is that the SAQ failed to measure what it
set out to (namely SPSS anxiety) but does measure some related constructs. The second is
that these four constructs are sub-components of SPSS anxiety; however, the factor analysis
does not indicate which of these possibilities is true.
Guided Example
The University of Sussex is constantly seeking to employ the best people possible as lecturers
(no, really, it is). Anyway, they wanted to revise a questionnaire based on Bland’s theory of
research methods lecturers. This theory predicts that good research methods lecturers should
have four characteristics: (1) a profound love of statistics; (2) an enthusiasm for experimental
design; (3) a love of teaching; and (4) a complete absence of normal interpersonal skills.
These characteristics should be related (i.e. correlated). The ‘Teaching Of Statistics for
Scientific Experiments’ (TOSSE) already existed, but the university revised this questionnaire
and it became the ‘Teaching Of Statistics for Scientific Experiments — Revised’ (TOSSE—R).
They gave this questionnaire to 239 research methods lecturers around the world to see if it
supported Bland’s theory.
The questionnaire items (an excerpt) are below; each is rated on a five-point agreement scale
(strongly disagree to strongly agree).
1. I once woke up in the middle of a vegetable patch hugging a turnip that I'd mistakenly dug
up thinking it was Roy's largest root
2. If I had a big gun I'd shoot all the students I have to teach
6. Teaching others makes me want to swallow a large bottle of bleach because the pain of my
burning oesophagus would be light relief in comparison
11. I like it when people tell me I've helped them to understand factor rotation
12. People fall asleep as soon as I open my mouth to speak
14. I'd rather think about appropriate dependent variables than go to the pub
17. I enjoy sitting in the park contemplating whether to use participant observation in my next
experiment
21. Thinking about Bonferroni corrections gives me a tingly feeling in my groin
23. I often spend my spare time talking to the pigeons ... and even they die of boredom
24. I tried to build myself a time machine so that I could go back to the 1930s and follow Fisher
around on my hands and knees licking the floor on which he'd just trodden
25. I love teaching
27. I love teaching because students have to pretend to like me or they'll get bad marks
Is the sample size adequate? Explain your answer quoting any relevant
statistics.
Your Answer:
How many factors should be retained? Explain your answer quoting any
relevant statistics.
Your Answer:
Which items load onto which factors? Do these factors make psychological
sense (i.e. can you name them based on the items that load onto them?)
Your Answer:
Unguided Example
Re-run the SAQ analysis using oblique rotation (Use Field, 2005 to help you).
Compare the results to the current analysis. Also, look over Field (2005) and
find out about Factor Scores and how to interpret them.
Example:
In a verbal learning task, nonsense syllables are presented for later recall. Three different
groups of subjects see the nonsense syllables at a 1-, 5-, or 10-second presentation rate.
The data (number of errors) for the three groups are as follows:
Number of errors:
1-second rate: 1, 4, 5, 6, 4
5-second rate: 9, 8, 7, 10, 6
10-second rate: 3, 5, 7, 7, 8
The research question is whether the three groups have the same error rates.
First enter the data and define your variables. Remember that to do this, you can simply
double-click at the top of the variable’s
column, and the screen will change from “data view” to “variable view,” prompting you
to enter properties of the variable. For your dependent variable, giving the variable a
name and a label is sufficient. For your independent variable (the grouping variable), you
will also want to have value labels identifying what numbers correspond with which
groups. See the following figure for how to do this.
Step 3: Select One-way ANOVA from the command list in the menu as follows:
Note: There is more than one way to run ANOVA analysis in SPSS. For now, the
easiest way to do it is to go through the “compare means” option. However, since the
analysis of variance procedure is based on the general linear model, you could also use
the analyze/general linear model option to run the ANOVA. This command allows for
the analysis of much, much more sophisticated experimental designs than the one we
have here, but using it on these data would yield the same result as the One-way ANOVA
command.
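(Illustrative aside, not part of the original handout: the same one-way ANOVA can be
cross-checked in Python with scipy.)

    from scipy import stats

    # Number of errors at the 1-, 5- and 10-second presentation rates
    rate_1s = [1, 4, 5, 6, 4]
    rate_5s = [9, 8, 7, 10, 6]
    rate_10s = [3, 5, 7, 7, 8]

    f_stat, p_value = stats.f_oneway(rate_1s, rate_5s, rate_10s)
    print(f_stat, p_value)    # F(2, 12) = 6.0, p = .016: the means are not all equal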
Once you’ve selected the One-way ANOVA, you will get a dialog box like the one at the
top. Select your dependent and grouping variables (notice that unlike in the independent
samples t-test, you do not need to define your groups—SPSS assumes that you will
include all groups in the analysis).
If you wish to do this with syntax commands, you can see what the syntax looks like by
selecting “paste” when you are in the One-way ANOVA dialog box.
Once you have determined that differences exist among the group means, post hoc
pairwise and multiple comparisons can determine which means differ. SPSS presents
several choices, but different post hoc tests vary in the degree to which they control Type
I error. Furthermore, some tests are more appropriate than others based on the
organization of one's data. The following information focuses on choosing an
appropriate test by comparing the tests.
Step 2. If you can assume equal variances, the F statistic is used to test the hypothesis. If
the test statistic's significance is below the desired alpha (typically, alpha = 0.05),
then at least one group is significantly different from another group.
Step 3. Once you have determined that differences exist among the means, post hoc
pairwise and multiple comparisons can be used to determine which means differ.
Pairwise multiple comparisons test the difference between each pair of means,
and yield a matrix where asterisks indicate significantly different group means at
an alpha level of 0.05.
Step 4. Choose an appropriate post hoc test:
a. Unequal Group Sizes: Whenever you violate the equal n assumption for groups,
select any of the following post hoc procedures in SPSS: LSD, Games-Howell,
Dunnett's T3, Scheffé, and Dunnett's C.
b. Unequal Variances: Whenever you violate the equal variance assumption for
groups (i.e., the homogeneity of variance assumption), check any of the following
post hoc procedures in SPSS: Tamhane’s T2, Games-Howell, Dunnett's T3, and
Dunnett's C.
Fisher's LSD (Least Significant Difference): This test is the most liberal of all
Post Hoc tests and its critical t for significance is not affected by the number
of groups. This test is appropriate when you have 3 means to compare. It
is not appropriate for additional means.
Bonferroni (AKA, Dunn’s Bonferroni): This test does not require the overall
ANOVA to be significant. It is appropriate when the number of
comparisons (c = k(k-1)/2) exceeds the number
of degrees of freedom (df) between groups (df = k-1). This test is very
conservative and its power quickly declines as the c increases. A good rule of
thumb is that the number of comparisons (c) be no larger than the degrees of
freedom (df).
Newman-Keuls: If there is more than one true null hypothesis in a set of
means, this test will overestimate the familywise error rate. It is
appropriate to use this test when the number of comparisons exceeds the
number of degrees of freedom (df) between groups (df = k-1) and one does
not wish to be as conservative as the Bonferroni.
Tukey's HSD (Honestly Significant Difference): This test is perhaps the most
popular post hoc. It reduces Type I error at the expense of Power. It is
appropriate to use this test when one desires all the possible comparisons
between a large set of means (6 or more means).
Tukey's b (AKA, Tukey’s WSD (Wholly Significant Difference)): This test
strikes a balance between the Newman-Keuls and Tukey's more conservative
HSD regarding Type I error and Power. Tukey's b is appropriate to use
when one is making more than k-1 comparisons, yet fewer than (k(k-1))/2
comparisons, and needs more control of Type I error than Newman-
Keuls.
Scheffé: This test is the most conservative of all post hoc tests. Compared to
Tukey's HSD, Scheffé has less Power when making pairwise (simple)
comparisons, but more Power when making complex comparisons. It is
appropriate to use Scheffé's test only when making many post hoc
complex comparisons (e.g. more than k-1).
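(Illustrative aside: SciPy 1.8 or later also offers Tukey's HSD, which can be used to
cross-check SPSS's post hoc output for the presentation-rate example above.)

    from scipy.stats import tukey_hsd

    # Pairwise post hoc comparisons for the presentation-rate data above
    result = tukey_hsd([1, 4, 5, 6, 4], [9, 8, 7, 10, 6], [3, 5, 7, 7, 8])
    print(result)   # prints each pairwise mean difference with an adjusted p-value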
Example:
A 2 x 2 factorial design crosses Family type (Dual Parent vs. Single Parent) with
Schooling (Home vs. Public).
Again, we’ll do the 5 steps of hypothesis testing for each F-test. Because step 5 can be
addressed for all three hypotheses in one fell swoop using SPSS, that will come last.
Here are the first 4 steps for each hypothesis:
Because there are two factors, there are now two columns for "group": one for family
type (1: dual-parent; 2: single-parent) and one for schooling type (1: home; 2: public).
Achievement is placed in the third column. Note: if we had more than two factors, we
would have more than two group columns. See how that works? Also, if we had more
than 2 levels in a given factor, we would use 1, 2, 3 (etc.) to denote level.
Step 3. When you see a pop-up window like this one below, plop Fmly_type and
Schooling into the "Fixed Factors" window and Achieve into the "Dependent Variable"
window...
I have highlighted the important parts of the summary table. As with the one-way
ANOVA, MS = SS/df and F = MSeffect / MSerror for each effect of interest. Also, values
add up to the numbers in the "Corrected Total" row.
An effect is significant if p < α or, equivalently, if Fobs > Fcrit. The beauty of SPSS is that
we don't have to look up Fcrit if we know p. Because p < α for each of the three effects
(two main effects and one interaction), all three are statistically significant.
One way to plot the means (I used SPSS for this – the "Plots" option in the ANOVA
dialog window) is:
The two main effects and interaction effect are very clear in this plot. It would be very
good practice to conduct this factorial ANOVA by hand and see that the results match
what you get from SPSS.
Note that these sums of squares match those in the ANOVA summary table. So the F-
values are:
Note that these F values are within rounding error of those in the ANOVA summary
table.
According to Table C.3 (because α =.05 ), the critical value for all three F-tests is 4.49.
All three Fs exceed this critical value, so we have evidence for a main effect of family
type, a main effect of schooling type, and the interaction of family type and schooling
type. This agrees with our intuition based on the mean plots.
Schooling: Home / Public
Dual Parent: Home 0.50, 0.30, 0.43, 0.52, 0.41 (Mean(1j) = 0.432);
Public 0.54, 0.29, 0.31, 0.47, 0.48 (Mean(1j) = 0.418); row mean Mean(i) = 0.425
Single Parent: Home 0.12, 0.32, 0.22, 0.19, 0.19 (Mean(2j) = 0.208);
Public 0.88, 0.69, 0.91, 0.86, 0.82 (Mean(2j) = 0.832); row mean Mean(i) = 0.520
Column means Mean(j): Home 0.320; Public 0.625; grand mean 0.4725
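(Illustrative aside: the by-hand factorial computation the text recommends can be checked with a
short numpy script. All three F values it prints, roughly 5.5, 57.1 and 62.5, exceed the
critical value of 4.49 quoted in the text.)

    import numpy as np

    # Achievement scores: axis 0 = family type (dual, single),
    # axis 1 = schooling (home, public), axis 2 = the five children per cell
    data = np.array([[[0.50, 0.30, 0.43, 0.52, 0.41],    # dual parent, home
                      [0.54, 0.29, 0.31, 0.47, 0.48]],   # dual parent, public
                     [[0.12, 0.32, 0.22, 0.19, 0.19],    # single parent, home
                      [0.88, 0.69, 0.91, 0.86, 0.82]]])  # single parent, public

    n = 5
    grand = data.mean()
    fam_m = data.mean(axis=(1, 2))     # family-type means: 0.425, 0.520
    sch_m = data.mean(axis=(0, 2))     # schooling means: 0.320, 0.625
    cell_m = data.mean(axis=2)

    ss_family = 2 * n * ((fam_m - grand) ** 2).sum()
    ss_school = 2 * n * ((sch_m - grand) ** 2).sum()
    ss_inter = n * ((cell_m - fam_m[:, None] - sch_m[None, :] + grand) ** 2).sum()
    ss_error = ((data - cell_m[..., None]) ** 2).sum()

    ms_error = ss_error / 16           # error df = 2 * 2 * (5 - 1)
    for name, ss in [("family", ss_family), ("schooling", ss_school),
                     ("interaction", ss_inter)]:
        print(name, "F =", round(ss / ms_error, 2))   # each effect has df = 1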
Once you have determined that differences exist among the group means, post hoc
multiple comparisons can determine which means differ.
The Logic
Just as there is a repeated measures or dependent samples version of the Student t test,
there is a repeated measures version of ANOVA. Repeated measures ANOVA follows
the logic of univariate ANOVA to a large extent. As the same participants appear in all
conditions of the experiment, however, we are able to allocate more of the variance. In
univariate ANOVA we partition the variance into that caused by differences within
groups and that caused by differences between groups, and then compare their ratio. In
repeated measures ANOVA we can calculate the individual variability of participants as
the same people take part in each condition. Thus we can partition more of the error (or
within condition) variance. The variance caused by differences between individuals is
not helpful when deciding whether there is a difference between occasions. If we can
calculate it we can subtract it from the error variance and then compare the ratio of error
variance to that caused by changes in the independent variable between occasions. So
repeated measures allows us to compare the variance caused by the independent variable
to a more accurate error term which has had the variance caused by differences in
individuals removed from it. This increases the power of the analysis and means that
fewer participants are needed to have adequate power.
The Model
For the sake of simplicity, I will demonstrate the analysis using the following three
participants, each measured on four occasions.

Participant   Occasion 1   Occasion 2   Occasion 3   Occasion 4   Row total
1             7            7            5            5            ∑ = 24
2             6            6            5            3            ∑ = 20
3             5            4            4            3            ∑ = 16
Column total  ∑ = 18       ∑ = 17       ∑ = 14       ∑ = 11
SPSS Analysis
Step 1: Enter the data
When you enter the data remember that it consists of three participants measured on four
occasions and each row is for a separate participant. Thus, for this data you have three
rows and four variables, one for each occasion.
Step 2. To perform a repeated measures ANOVA you need to go through Analyze to
General Linear Model, which is where you found one of the ways to perform Univariate
ANOVA. This time, however, you click on Repeated Measures.
Next we need to put the variables into the within subjects box; as you can see we have
already put occasion 1 in slot (1) and occasion 2 in slot (2). We could also ask for some
descriptive statistics by going to Options and selecting Descriptives; once this is done,
press OK and the following output should appear.
Within-Subjects Factors
Measure: MEASURE_1
FACTOR1   Dependent Variable
1         OCCAS1
2         OCCAS2
3         OCCAS3
4         OCCAS4
(This first box just tells us what the variables are.)
EDPR 7/8542, Spring 2005 17
Dr. Jade Xu
[Descriptive Statistics and Multivariate Tests tables omitted.]
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
Within Subjects Effect FACTOR1: Mauchly's W = .000; Approx. Chi-Square = . ; df = 5; Sig. = . ;
Epsilon(a): Greenhouse-Geisser = .667; Huynh-Feldt = . ; Lower-bound = .333
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed
dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance.
Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept. Within Subjects Design: FACTOR1
Tests of Within-Subjects Effects
Measure: MEASURE_1
Type III Sum
Source of Squares df Mean Square F Sig.
FACTOR1 Sphericity Assumed 10.000 3 3.333 10.000 .009
Greenhouse-Geisser 10.000 2.000 5.000 10.000 .028
Huynh-Feldt 10.000 . . . .
Lower-bound 10.000 1.000 10.000 10.000 .087
Error(FACTOR1) Sphericity Assumed 2.000 6 .333
Greenhouse-Geisser 2.000 4.000 .500
Huynh-Feldt 2.000 . .
Lower-bound 2.000 2.000 1.000
This is the most important box for repeated measures ANOVA. As you can see the F
value is 10.
Tests of Within-Subjects Contrasts
Measure: MEASURE_1
Type III Sum
Source FACTOR1 of Squares df Mean Square F Sig.
FACTOR1 Linear 9.600 1 9.600 48.000 .020
Quadratic .333 1 .333 1.000 .423
Cubic 6.667E-02 1 6.667E-02 .143 .742
Error(FACTOR1) Linear .400 2 .200
Quadratic .667 2 .333
Cubic .933 2 .467
The Within-Subjects Contrasts box tests for significant trends. In this case there is a
significant linear trend, which means there is a tendency for the data to fall on
a straight line: the mean for occasion 1 is larger than that for occasion 2, which is
larger than occasion 3, which is larger than occasion 4. If we had a quadratic trend we
would have an inverted U or a U shaped pattern. It is important to remember that this
box is only of interest if the overall F value is significant and that it is a test of a trend not
a specific test of differences between occasions. For that we need to look at post hoc
tests.
Hand Calculation:
It would be very good practice to conduct this repeated-measures ANOVA by hand and
see that the results match what you get from SPSS. The only formula needed is the
formula for the sum of squares that we used for univariate ANOVA: ∑x² - (∑x)²/n.
EDPR 7/8542, Spring 2005 19
Dr. Jade Xu
The first step is to calculate the total variability or the total sum of squares (SST). It will
not surprise you to learn that this is the same as if you were doing a univariate ANOVA.
That is, (49+49+25+25+36+36+25+9+25+16+16+9) - (60)²/12 = 320 - 300 = 20.
We now calculate the variability due to occasions. This variability is calculated exactly
the same way as the within group variability is calculated for a univariate ANOVA. So
for this data the variability due to occasions is the sum of the variability within each
occasion.
The variability for occasion 1 is (49+36+25) - (18)²/3; for occasion 2 it is (49+36+16) -
(17)²/3. See if you can work out the sums for occasions 3 and 4. The answers should
come to occasion 3 = 0.67 and occasion 4 = 2.67 and, added to occasions 1 and 2, we get a
sum of 10. Again this is no surprise as it should be the same for within group variability
for a univariate ANOVA. This time, however, this variability is very important as it is
not a measure of error but a measure of the effect of the independent variable, as the
participants have remained the same but the independent variable has altered with the
occasion.
We now need to calculate the variation caused by individual variability, that is the three
participants in this study differ in their overall measures here. This calculation is
something we have not met in univariate ANOVA but the principles remain the same.
Looking at the data you can see that overall participant 1 had the highest score of 24,
participant 2 had an overall score of 20 and participant 3 had an overall score of 16.
To calculate individual variability we still use the sum of squares formula ∑x² - (∑x)²/n.
In this case we get (24²+20²+16²) - (60)²/3 = (576+400+256) - 3600/3 = 1232 - 1200 = 32.
Alarm bells may be ringing as you will see that we have more variability than the total.
However, we have not adjusted this figure for the number of occasions; to do that we
divide in this instance by 4 to get a figure of 8. So the variability that is caused by
differences in participants is 8.
Another way to calculate the individual variability is to divide the squared row totals by
the number of occasions and then subtract the correction factor. This would give us
24²/4 + 20²/4 + 16²/4 = 144 + 100 + 64 = 308, and then subtracting the correction factor
gives us 308 - 300 = 8.
At this stage we have calculated all of the variabilities that we need to perform our
analysis; the variability due to occasions that is caused by the differences in the
independent variable across occasions is 10, the variability due to differences in
individuals is 8 and the residual variability is 2. The residual or error variability is the
total variability (20) minus the sum of the variability due to occasions and individuals
(18).
The next step is to calculate the degrees of freedom so that we can turn these variabilities
into variances or mean squares (MS). The total degrees of freedom is 12 -1 =11, as in
the univariate ANOVA. The degrees of freedom for individuals is the number of
participants minus 1, in this case 3 -1=2 and for occasions it is the number of occasions
EDPR 7/8542, Spring 2005 20
Dr. Jade Xu
minus 1, in this case 4-1=3. The residual degrees of freedom is the total minus the sum
of the degrees of freedom from individuals and occasions; 11-5=6.
The mean squares are then calculated by dividing the variability by the degrees of
freedom. The mean square for individuals is 8/2=4, for occasions 10/3= 3.33 and for the
residual 2/6=0.33.
To calculate the F statistic it is important to remember that we are not interested in the
individual variability of subjects. This is part of the error variance which we cannot
control for in a univariate ANOVA, but which we can measure in a repeated measures
design and then discard. What we are concerned with is whether our independent
variable which changes on different occasions has an effect on subject's performance.
The F ratio we are interested in is, therefore, MSoccasions / MSresidual; 3.33/0.33 which is 10.
If you were calculating statistics by hand you would now need to go to a table of values
for Fcrit to work out whether it is significant or not. We won't do that; we will just
perform the same calculation in SPSS and check the significance there.
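(Illustrative aside: the whole hand calculation can be verified with a few lines of numpy, here
using the equivalent "totals" formula for each source of variability.)

    import numpy as np

    # Three participants (rows) measured on four occasions (columns)
    scores = np.array([[7, 7, 5, 5],
                       [6, 6, 5, 3],
                       [5, 4, 4, 3]])
    n_subj, n_occ = scores.shape
    cf = scores.sum() ** 2 / scores.size                          # 60**2/12 = 300

    ss_total = (scores ** 2).sum() - cf                           # 320 - 300 = 20
    ss_subjects = (scores.sum(axis=1) ** 2 / n_occ).sum() - cf    # 308 - 300 = 8
    ss_occasions = (scores.sum(axis=0) ** 2 / n_subj).sum() - cf  # 310 - 300 = 10
    ss_error = ss_total - ss_subjects - ss_occasions              # 2

    ms_occasions = ss_occasions / (n_occ - 1)                     # 10/3 = 3.33
    ms_error = ss_error / ((n_occ - 1) * (n_subj - 1))            # 2/6 = 0.33
    print("F =", ms_occasions / ms_error)                         # 10.0, as in SPSS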
Notes:
1. The information presented in this handout is modified from the following websites:
http://employees.csbsju.edu/rwielk/psy347/spssinst.htm
http://www.colby.edu/psychology/SPSS/
http://www.oswego.edu/~psychol/spss/spsstoc.html
http://webpub.alleg.edu/dept/psych/SPSS/SPSS1wANOVA.html
4. Research Report Writing:
Introduction: Mostly, research work is presented in written form. The practical utility of a
research study depends heavily on the way it is presented to those who are expected to act on
the basis of research findings. Research report is a written document containing key aspects
of research project.
Research report is a medium to communicate research work with relevant people. It is also a
good source of preservation of research work for the future reference. Many times, research
findings are not followed because of improper presentation. Preparation of research report is
not an easy task. It is an art. It requires a good deal of knowledge, imagination, experience,
and expertise. It demands considerable time and money.
Definitions:
1. In simple words:
Research report is the systematic, articulate, and orderly presentation of research work in a
written form.
Research report is a research document that contains basic aspects of the research project.
Research report involves relevant information on the research work carried out. It may be
hand-written, typed, or computerized.
Report Format:
There is no one best format for all reports. Format depends on several relevant variables. One
must employ a suitable format to create desirable impression with clarity. Report must be
attractive. It should be written systematically and bound carefully. A report must use the
format (often called structure) that best fit the needs and wants of its readers. Normally,
following format is suggested as a basic outline, which has sufficient flexibly to meet the
most situations.
Types of reports –
There are many different formats for reporting research: journal articles, technical research
reports, monographs or books, and graduate theses or dissertations. Research may also be
reported orally, for example at professional meetings; these oral reports, however, are usually
based on previous written reports.
The Harvard referencing style is another popular style using the author-date system for in-
text citations.
In-text citation:
It consists mainly of the authors' last name and the year of publication (and page numbers if it
is directly quoted) in round brackets placed within the text. If there is no discernible author,
the title and date are used.
Reference list:
The reference list should be ordered alphabetically by the last name of the first author of each
work. References with no author are ordered alphabetically by the first significant word of
the title.
Use only the initials of the authors' given names, with no full stop and no space between the
initials. The last name comes first.
Here is the general pattern for citing a book with one author using Harvard style (placeholder
author and title, for illustration):
In-text citation: (Author year)
Reference list entry: Author, AA year, Title of book, Publisher, Place of publication.
Bibliography –
A bibliography is a list of all of the sources you have used (whether referenced or not) in the
process of researching your work. In general, a bibliography should include the authors'
names, the titles of the works, the publication dates, and the publishers and places of
publication (or URLs for online sources).
Footnotes –
Footnotes are notes placed at the bottom of a page. They cite references or comment on a
designated part of the text above it. For example, say you want to add an interesting comment
to a sentence you have written, but the comment is not directly related to the argument of
your paragraph. In this case, you could add the symbol for a footnote. Then, at the bottom of
the page you could reprint the symbol and insert your comment. Here is an example:
This is an illustration of a footnote.1 The number “1” at the end of the previous sentence
corresponds with the note below. See how it fits in the body of the text?
1 At the bottom of the page you can insert your comments about the sentence preceding the
footnote.
When your reader comes across the footnote in the main text of your paper, he or she could
look down at your comments right away, or else continue reading the paragraph and read
your comments at the end. Because this makes it convenient for your reader, most citation
styles require that you use either footnotes or endnotes in your paper. Some, however, allow
you to make parenthetical references (author, date) in the body of your work.
Footnotes are not just for interesting comments, however. Sometimes they simply refer to
relevant sources -- they let your reader know where certain material came from, or where
they can look for other sources on the subject.
1. Title: The report should have a proper title explaining or giving an appropriate glimpse of
the subject matter contained within it.
2. Objective: The report should be factual. The whims and ideas of the person preparing the
report should not be allowed to influence the report.
3. Timeline: The report should relate to a certain period and the period of time should be
indicated on the top of the report.
4. Clarity: The report should be clear, brief and concise. Clarity should not be sacrificed at
the cost of brevity.
6. Clarity on action plan: A report should distinguish between controllable and non-
controllable factors and should report them separately. It is because management can take
suitable action regarding controllable factors.
7. Margin of error: The report should be taken as correct within the permissible degree of
inaccuracy. The margin of error allowed will depend upon the purpose for which the report is
prepared.
8. Scope of report: The report should draw manager’s attention immediately to the
exceptional matters so that management by exception may be carried out effectively. Thus,
reports should highlight significant deviations from standards.
9. Infographics: Visual reporting through graphs, charts and diagrams should be preferred to
descriptive reports because visual reporting attracts the eye more quickly and leaves a lasting
impression on the mind.
10. Detailed analysis: In all possible cases a detailed analysis should be given for all the
resultant variances between actual results for the period and standards/budgets, be it sales,
purchases, production, profit or loss, capital expenditure, working capital position, etc., so
that exact causes of low performance may be known and timely corrective action may be
taken.
11. Comparability: Where comparison is reflected in a report, it should be ensured that the
comparison is between comparable (i.e., like) matters so that a meaningful comparison may be
made and an idea about efficiency or inefficiency may be formed.
12. Proper Language: Researcher must use a suitable language. Language should be
selected as per its target users.
13. Reliability: A research report must be reliable: managers must be able to trust it and be
willing to make decisions on the basis of it.
14. Proper Format: An ideal report is one which is prepared in a commonly used format. One
must comply with contemporary practices; a completely new format should not be used.
The body of your report is a detailed discussion of your work for those readers who want to
know in some depth and completeness what was done. The body of the report shows what
was done, how it was done, what the results were, and what conclusions and
recommendations can be drawn.
Introduction
The introduction states the problem and its significance, states the technical goals of the
work, and usually contains background information that the reader needs to know in order to
understand the report. Consider, as you begin your introduction, who your readers are and
what background knowledge they have. For example, the information needed by someone
educated in medicine could be very different from that needed by someone working in your own
field of engineering.
While academic reports often include extensive literature reviews, reports written in industry
often have the literature review in an appendix.
Summary or background
This section gives the theory or previous work on which the experimental work is based if
that information has not been included in the introduction.
Methods/procedures
This section describes the major pieces of equipment used and recaps the essential steps of
what was done. In scholarly articles, a complete account of the procedures is important.
However, general readers of technical reports are not interested in a detailed methodology.
This is another instance in which it is necessary to think about who will be using your
document and tailor it according to their experience, needs, and situation.
A common mistake in reporting procedures is to use the present tense. This use of the present
tense results in what is sometimes called “the cookbook approach” because the description
sounds like a set of instructions. Avoid this and use the past tense in your
“methods/procedures” sections.
Results
This section presents the data or the end product of the study, test, or project and includes
tables and/or graphs and a brief interpretation of what the data show. When interpreting your
data, be sure to consider your reader, what their situation is and how the data you have
collected will pertain to them.
Discussion of results
This section explains what the results show, analyzes uncertainties, notes significant trends,
compares results with theory, evaluates limitations or the chance for faulty interpretation, or
discusses assumptions. The discussion section sometimes is a very important section of the
report, and sometimes it is not appropriate at all, depending on your reader, situation, and
purpose.
It is important to remember that when you are discussing the results, you must be specific.
Avoid vague statements such as “the results were very promising.”
Conclusions
This section interprets the results and is a product of thinking about the implications of the
results. Conclusions are often confused with results. A conclusion is a generalization about
the problem that can reasonably be deduced from the results.
Be sure to spend some time thinking carefully about your conclusions. Avoid such obvious
statements as “X doesn’t work well under difficult conditions.” Be sure to also consider how
your conclusions will be received by your readers, as well as by your shadow readers:
those to whom the report is not addressed, but who will still read and be influenced by your report.
Recommendations
The recommendations are the directions or actions that you think must be taken, or additional
work that is needed to expand the knowledge obtained in your report. In this part of your report,
it is essential to understand your reader. At this point you are asking the reader to think or do
something about the information you have presented. In order to achieve your purposes and
have your reader do what you want, consider how they will react to your recommendations
and phrase your words in a way to best achieve your purposes.
Sample exercise: Assume that you were walking down the street, staring at the treetops, and stepped in a deep
puddle while wearing expensive new shoes. What results, conclusions, and recommendations
might you draw from this situation?
• Results: The shoes got soaking wet, the leather cracked as it dried, and the soles
separated from the tops.
• Conclusions: These shoes were not waterproof and not meant to be worn when
walking in water. In addition, the high price of the shoes is not closely linked with
durability.
• Recommendations: In the future, the wearer of this type of shoe should watch out for
puddles, not just treetops. When buying shoes, the wearer should determine the extent
of the shoes’ waterproofing and/or any warranties on durability.
Bibliography
Chand, S. (n.d.). Essentials of a good report. YourArticleLibrary. Retrieved from
https://www.yourarticlelibrary.com/management/essentials-of-a-good-report-business-management/25777