
Research Methodology


RESEARCH METHODOLOGY & STATISTICS

Almas Jabeen Abdul Razak


What is research?
• Research = Re + Search
• It is the process of finding a solution to a problem.
• It is the process of arriving at a dependable solution to a problem through planned and systematic collection, analysis and interpretation of data.
• It seeks answers only to those questions that can be answered on the basis of available facilities.
• It is a movement from the known to the unknown.
Person → observes (again and again) → Phenomena → Collection of data → Analysis → Conclusion
DEFINITIONS OF RESEARCH
• Redman & Mory – "Research is a systematized effort to gain knowledge."
• Emory defines research as "any organized inquiry designed and carried out to provide information for solving a problem."
FEATURES OF RESEARCH
• It gathers new knowledge/data from primary/first-hand sources.
• It requires a plan.
• It requires expertise.
• Research is a patient and unhurried activity.
• It places emphasis upon the discovery of general principles.
• It is an exact, systematic and accurate investigation.
• It is logical and objective.
• It endeavours to organize data in quantitative form.
• Researchers carefully record and report the data.
• Conclusions and generalizations are arrived at carefully and cautiously.
OBJECTIVES OF RESEARCH
1. THEORETICAL OBJECTIVE
• To formulate new theories, principles, etc.
• This type of theory is explanatory, because it explains the relationships between variables.
• It is mainly pursued in physics, chemistry, mathematics, etc.
2. FACTUAL OBJECTIVE
• To find out new facts.
• It is descriptive in nature.
• This is mainly historical research, which describes facts or events that have previously happened.
3. APPLICATION OBJECTIVE
• It does not contribute new knowledge to the fund of human knowledge, but suggests new applications; application here means improvement and modification in practice.
GENERAL OBJECTIVES OF RESEARCH
• To gain familiarity with a phenomenon or to achieve new insight into it.
• To portray accurately the characteristics of a particular individual, situation or group.
• To determine the frequency with which something occurs or with which it is associated with something else.
• To test hypotheses of causal relationships between variables.
PURPOSE OF RESEARCH
• Research extends knowledge of human beings, social life and the environment.
• Research reveals the mysteries of nature.
• Research establishes generalizations and general laws and contributes to theory building in various fields of knowledge.
• Research verifies and tests existing facts and theories.
• Research helps us improve our knowledge and ability to handle situations.
• Research analyzes inter-relationships between variables and derives causal explanations, which lead to a better understanding of the world in which we live.
• Research aims at finding solutions to problems, e.g. socio-economic problems, health problems, organizational and human-relations problems, and so on.
• Research also aims at developing new tools, concepts and theories for a better understanding of unknown phenomena.
• Research helps national planning boards focus national development. It enables planners to evaluate alternative strategies and on-going programs.
• Research provides functional data for rational decision making and the formulation of strategies and policies.
TYPES OF RESEARCH
• Pure research
• Applied research
• Exploratory research
• Descriptive research
• Analytical research
• Experimental research
• Historical research
• Diagnostic research
• Action research
• Evaluation research
• Conclusion-oriented research
• Decision-oriented research
• One-time research
• Longitudinal research
• Field-setting research (case study and survey)
• PURE RESEARCH: It is conducted for the purpose of developing scientific theories, by discovering basic principles or broad generalizations of a discipline, rather than for the purpose of solving some immediate problem.
• APPLIED RESEARCH: The purpose of applied research is to improve a product or a process and to test theoretical concepts in actual problem situations. It seeks immediate and practical results.
• EXPLORATORY RESEARCH: It is the preliminary study of an unfamiliar problem about which the researcher has little or no knowledge. Exploratory research is necessary to get initial insight into the problem for the purpose of formulating a more precise investigation.
• DESCRIPTIVE RESEARCH: It is a fact-finding investigation describing, recording, analyzing and interpreting conditions that exist. It gives a proper basis for understanding current problems and guides the planning and formulation of policies.
• ANALYTICAL RESEARCH: It is a system of procedures and techniques of analysis applied to quantitative data. It is used in fields in which numerical data are involved.
• EXPERIMENTAL – This method provides the best approach for the study of cause-and-effect relationships under controlled conditions. It is popular in the natural sciences.
• HISTORICAL – It is concerned with some past phenomenon; in this process, evidence about the past is systematically collected, evaluated, verified and synthesized.
• DIAGNOSTIC – It is directed towards what is happening, why it is happening and what can be done about it. It aims at the cause of a problem and the possible solutions for it.
• ACTION – The purpose of action research is to acquire a new skill or a new approach to solve a certain problem. Test marketing research for a new product is a good example of action research.
• EVALUATION – It is done for assessing the effectiveness of social or economic programs implemented, or for assessing the impact of developmental projects.
• CONCLUSION ORIENTED – Here the researcher is free to pick up a problem, to redesign the enquiry as he or she proceeds, and to conceptualize it as he or she visualizes.
• DECISION ORIENTED – It is always undertaken for the needs of a decision maker, and here the researcher is not free to embark upon research according to his or her own inclination.
• ONE-TIME RESEARCH – Here the research is confined to only a single period of time.
• LONGITUDINAL RESEARCH – Research carried on over several periods of time for the purpose of getting a feasible solution.
• CASE STUDY – It is an in-depth, comprehensive study of a person, an episode, a program or a social unit.
• SURVEY RESEARCH – It is a method of research involving collection of data directly from a population or a sample at a particular period.
APPROACHES TO RESEARCH

QUANTITATIVE APPROACH
It is rooted in the philosophy of rationalism; follows a rigid, structured and predetermined set of procedures to explore; aims to quantify the extent of variation in a phenomenon; emphasizes the measurement of variables and the objectivity of the process; believes in substantiation on the basis of large sample sizes; gives importance to the validity and reliability of findings; communicates findings in an aggregate and analytical manner; and draws conclusions and inferences that can be generalized.

QUALITATIVE APPROACH
It is embedded in the philosophy of empiricism; follows an open, flexible and unstructured approach to enquiry; aims to explore diversity rather than to quantify; emphasizes the description and narration of feelings, perceptions and experiences rather than their measurement; communicates findings in a descriptive and narrative rather than analytical manner; and places little or no emphasis on generalization.
RESEARCH PROCESS
1. Problem identification
   A. Problem identification
   B. Considerations in selecting a research problem
   C. Steps in formulating a research problem
2. Literature review
   A. Need for literature review
   B. Sources
   C. Steps
3. Formulation of objectives
   A. General and specific objectives
   B. Hypothesis
   C. Variables
4. Research design
   A. Research design and plan
   B. Types of study
   C. Data collection tools and techniques
   D. Sampling
   E. Pilot study
   F. Data collection
5. Data processing
   A. Editing
   B. Categorizing
   C. Coding
   D. Summarizing
6. Data analysis
   A. Statistics
   B. Uni-variate analysis
   C. Parametric measures
   D. Non-parametric measures
   E. Econometrics
7. Report writing
   A. Report writing
   B. Stages
   C. Content
PROBLEM IDENTIFICATION
Problem identification
• A problem is identified after narrowing down a broad topic area to a highly specific research problem. A researcher normally selects a single problem at a time because of its unique needs and purposes.

Steps in formulating a research problem
• Identify a broad field or subject area of interest to you.
• Dissect the broad area into sub-areas.
• Select what is of most interest to you.
• Raise research questions.

Considerations in selecting a research problem
Each problem taken up for research has to be judged on the basis of some criteria:
• Relevance
• Avoidance of duplication
• Feasibility
• Political acceptability
• Applicability
• Urgency of the data needed
• Ethical acceptability
REVIEW OF LITERATURE
NEED FOR REVIEW OF LITERATURE
• Prevents duplicating work that has been done before.
• Helps to know what others have learned and reported about the problem.
• Makes the researcher more familiar with the various types of methodologies.
• Gives good background knowledge about the problem and why research is needed in this area.
• Helps to know the theoretical perspective of the problem.
SOURCES
• Subject catalogues of libraries.
• Documentation services.
• Bibliographies.
• Lists of books and publishers' bulletins.
• Journals.
• Government reports.
• Research abstracts.
• Information on research done.
STEPS IN REVIEWING THE LITERATURE
• Searching for the existing literature in your area of study.
• Reviewing the selected literature.
• Developing a theoretical framework.
• Developing a conceptual framework.
OBJECTIVES
• General objectives: They state what is expected to be achieved by the study; they are the overall thrust of the study. They are concerned with the main associations and relationships that a person seeks to discover or establish.
• Specific objectives: They should be numerically listed and worded clearly and unambiguously. They address the various aspects of the problem and should specify what will be done, where, and for what purpose.
HYPOTHESIS
A hypothesis is a specific statement of prediction. It describes in concrete terms what a researcher expects to happen in his or her study.
Goode and Hatt define it as "a proposition which can be put to a test to determine its validity".
In short, a hypothesis is a tentative solution, explanation, guess, assumption, proposition or statement regarding the problem facing the researcher.
TYPES OF HYPOTHESIS
• Descriptive hypothesis: It intends to describe some characteristic of an object, a situation, an individual or even an organization.
• Relational hypothesis: It intends to describe the relationship between variables.
• Empirical/working hypothesis: This is a hypothesis framed in the early stages of research. It may be altered or modified as the research proceeds.
• Null hypothesis: This states that there is no significant difference between the parameter and the statistic being compared.
• Alternative hypothesis: This is the research hypothesis, which involves the claim to be tested.
• Analytical hypothesis: These are used when one specifies how changes in one property lead to changes in another.
• Common-sense hypothesis: These are based on what is observed, together with common ideas existing among people.
• Statistical hypothesis: These are developed from samples that are measurable. They are of two types:
1. Hypotheses which indicate a difference
2. Hypotheses which indicate a relationship
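Testing a null against an alternative hypothesis can be sketched numerically. The following is a minimal, hypothetical one-sample z-test: the sample values and the "known" population standard deviation are invented for illustration, not taken from any study in this text.

```python
import math

# Hypothetical one-sample z-test: H0: mu = 50 vs H1: mu != 50.
# The sample and sigma below are assumed purely for illustration.
sample = [52, 48, 55, 51, 53, 49, 54, 50, 56, 52]
mu0 = 50.0        # hypothesized population mean (null hypothesis)
sigma = 3.0       # assumed known population standard deviation

mean = sum(sample) / len(sample)
z = (mean - mu0) / (sigma / math.sqrt(len(sample)))

# Reject H0 at the 5% significance level if |z| exceeds 1.96.
reject_null = abs(z) > 1.96
print(round(z, 3), reject_null)  # 2.108 True
```

Here the computed z statistic exceeds the critical value, so the null hypothesis of "no difference" is rejected in favour of the alternative.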
VARIABLES
A variable is a characteristic of a person, object or phenomenon that can take on different values.
Variables are conditions or characteristics that the experimenter manipulates, controls or observes.
A variable is anything that changes.

Types of variables
• Numerical variables: When variables are expressed in numbers, they are called numerical variables.
• Categorical variables: When the values of a variable are expressed in categories, they are called categorical variables.
• Dependent and independent variables: The variable that is used to measure the problem under study is called the dependent variable. The variables that are used to describe or measure the factors that are assumed to cause, or at least influence, the problem are called independent variables.
• Active variables: The variables that are directly manipulated by the experimenter are called active variables.
• Attribute variables: They are those characteristics which cannot be altered by the experimenter.
• Intervening variables: Certain factors or variables may influence the relationship even though they cannot be observed directly; they are called intervening variables.
• Extraneous variables: They are those uncontrolled variables that may have a significant influence upon the results of a study.
RESEARCH DESIGN
A research design is a logical and systematic plan prepared for directing a research study.
It constitutes the blueprint for the collection, measurement and analysis of data.
It is the plan, structure and strategy of investigation conceived so as to obtain answers to research questions.
Essentials of a good research design
• Plan
• Outline
• Blueprint
• Scheme
CLASSIFICATION OF DESIGNS
• Experimental
• Exploratory
• Descriptive
• Historical
• Case studies
• Survey
• Combination of any of these
RESEARCH PLAN
• A research plan prescribes the boundaries of research activity and enables the researcher to channel his or her energies into the right work.
• Various questions need to be answered while preparing the plan:
What is the study about?
Why is the study being made?
What is its scope?
What are the objectives of the study?
What kind of data are needed?
What are the sources?
What is the sample size?
What are the techniques?
How should the data be processed?
What is the cost involved? etc.
CONTENTS OF A
RESEARCH PLAN
• Introduction
• Statement of the problem
• Review of the previous studies
• Scope of the study
• Objective of the study
• Conceptual model
• Hypothesis
• Operational definition of concepts
• Geographical area to be covered
• Reference period
• Methodology
• Sampling plan
• Tools for gathering data
• Plan of analysis
• Chapter scheme
• Time budget
• Financial budget
SAMPLING
Sampling is the statistical process of selecting a subset (called a
“sample”) of a population of interest for purposes of making
observations and statistical inferences about that population.
Sampling, therefore, is the process of selecting a few (a sample)
from a bigger group (the sampling population) to become the
basis for estimating or predicting the prevalence of an unknown
piece of information, situation or outcome regarding the
bigger group.
Characteristics of a good sample
Representativeness
Accuracy
Precision
Size
SAMPLING PROCESS
• Define the population or universe
• State the sampling frame
• Specify the sampling unit
• Select the sampling method
• Determine the sample size
• Specify the sampling plan
• Select the sample
TECHNIQUES OF SAMPLING
Probability sampling:
• Simple random sampling
• Stratified random sampling
• Systematic random sampling
• Cluster sampling
• Multi-stage sampling
• Matched-pair sampling
Non-probability sampling:
• Convenience sampling
• Judgment sampling
• Quota sampling
• Snowball sampling
Probability sampling: It is a technique in which every unit in the
population has a chance (non-zero probability) of being selected in the
sample, and this chance can be accurately determined.
All probability sampling techniques have two attributes in common:
• Every unit in the population has a known non-zero probability of
being sampled, and
• The sampling procedure involves random selection at some point.
The different types of probability sampling techniques include:
Simple random sampling. In this technique, all possible subsets of
a population are given an equal probability of being selected. Simple
random sampling involves randomly selecting respondents from a
sampling frame, but with large sampling frames, usually a table of
random numbers or a computerized random number generator is
used.
Stratified sampling. In stratified sampling, the sampling frame is
divided into homogeneous and non-overlapping subgroups (called
“strata”), and a simple random sample is drawn within each
subgroup.
• Systematic sampling (also known as interval sampling) relies on
arranging the study population according to some ordering scheme and
then selecting elements at regular intervals through that ordered list.
• Cluster sampling. If you have a population dispersed over a wide geographic region, it may not be feasible to conduct simple random sampling of the entire population. In such cases, it may be reasonable to divide the population into “clusters” (usually along geographic boundaries), randomly sample a few clusters, and measure all units within those clusters.
• Multistage sampling can be a complex form of cluster sampling. It refers to sampling plans where the sampling is carried out in stages, using smaller and smaller sampling units at each stage.
• Matched-pairs sampling. Sometimes, researchers may want to compare two subgroups within one population based on a specific criterion. The matched-pairs sampling technique is often an ideal way of understanding bipolar differences between different subgroups within a given population.
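As a minimal sketch, three of the probability techniques above can be illustrated with Python's standard `random` module. The frame of 100 numbered units and the two strata are assumptions invented for the example.

```python
import random

random.seed(42)                    # fixed seed so the illustration is reproducible
population = list(range(1, 101))   # assumed sampling frame of 100 units

# Simple random sampling: every unit has an equal chance of selection.
simple = random.sample(population, 10)

# Systematic sampling: random start, then every k-th unit of the ordered frame.
k = len(population) // 10          # sampling interval k = N / n = 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: split the frame into non-overlapping strata,
# then draw a simple random sample within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [u for s in strata.values() for u in random.sample(s, 5)]

print(len(simple), len(systematic), len(stratified))  # 10 10 10
```

Note how the stratified draw guarantees representation from both strata, while simple random sampling gives no such guarantee.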
Nonprobability sampling is a sampling technique in which some units
of the population have zero chance of selection or where the probability of
selection cannot be accurately determined. Typically, units are selected
based on certain non-random criteria, such as quota or convenience.
• Convenience sampling. Also called accidental or opportunity sampling,
this is a technique in which a sample is drawn from that part of the
population that is close to hand, readily available, or convenient.
• Quota sampling. The population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units from each segment based on a specified proportion.
• Snowball sampling. In snowball sampling, you start by identifying a few
respondents that match the criteria for inclusion in your study, and then ask
them to recommend others they know who also meet your selection
criteria.
• Purposive sampling (also known as judgment, selective or subjective sampling) is a sampling technique in which the researcher relies on his or her own judgment when choosing members of the population to participate in the study.
PILOT STUDY
• Pilot study is a small scale preliminary study conducted in
order to evaluate feasibility, time, cost, adverse events, and
effect size (Statistical variability) in an attempt to predict an
appropriate sample size and improve upon the study design
prior to performance of a full scale research project.
• Although a pilot study cannot eliminate all systematic errors or unexpected problems, it reduces the likelihood of making a Type I or Type II error. Both types of errors make the main study a waste of effort, time, and money.
SAMPLE SIZE
Before you can calculate a sample size, you need to determine a few things about the target population and the sample you need:

Population Size – How many total people fit your demographic?

Margin of Error (Confidence Interval) – No sample will be perfect, so you need to decide how much error to allow. The confidence interval determines how much higher or lower than the population mean you are willing to let your sample mean fall. If you’ve ever seen a political poll on the news, you’ve seen a confidence interval. It will look something like this: “68% of voters said yes to Proposition Z, with a margin of error of +/- 5%.”

Confidence Level – How confident do you want to be that the actual mean falls within your confidence interval? The most common confidence levels are 90%, 95%, and 99%.

Standard Deviation – How much variance do you expect in your responses? Since we haven’t actually administered our survey yet, the safe decision is to use 0.5 – this is the most forgiving number and ensures that your sample will be large enough.
• Your confidence level corresponds to a Z-score. This is a constant value needed for this equation. Here are the Z-scores for the most common confidence levels:
• 90% – Z-score = 1.645
• 95% – Z-score = 1.96
• 99% – Z-score = 2.576
• If you choose a different confidence level, consult a Z-score table to find your score.
• Next, plug your Z-score, standard deviation, and margin of error into this equation:

Necessary Sample Size = (Z-score)² × StdDev × (1 − StdDev) / (margin of error)²
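The formula above can be computed directly in Python. The function name is an assumption; the Z-score table follows the values listed in the text, and the defaults match the "safe" choices discussed above (0.5 standard deviation, ±5% margin of error).

```python
import math

# Z-scores for the common confidence levels listed above.
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def necessary_sample_size(confidence=0.95, std_dev=0.5, margin_of_error=0.05):
    """Necessary Sample Size = Z^2 * StdDev * (1 - StdDev) / (margin of error)^2."""
    z = Z_SCORES[confidence]
    n = (z ** 2) * std_dev * (1 - std_dev) / margin_of_error ** 2
    return math.ceil(n)  # round up so the sample is never too small

# 95% confidence, 0.5 standard deviation, +/- 5% margin of error:
print(necessary_sample_size())  # 385
```

This reproduces the familiar survey rule of thumb: roughly 385 respondents for a 95% confidence level with a ±5% margin of error.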
DATA COLLECTION
Data are the facts and figures collected for statistical investigation. Data
collection is the process of gathering and measuring information on targeted
variables in an established systematic fashion, which then enables one to
answer relevant questions and evaluate outcomes.
There are two types of data:
• 1. Primary data,
• 2. Secondary data (desk research)
The primary data are those which are collected afresh and for the first time,
and thus happen to be original in character or information collected or
generated by the researcher for the purpose of the project immediately at hand.
The secondary data are those which have already been collected by someone else and which have already passed through the statistical process. Secondary data refer to information that has been collected by someone other than the researcher, for purposes other than those of the research project at hand. Books, journals, manuscripts, diaries, letters, etc., all become secondary sources of data, as they are written or compiled for a separate purpose.
METHODS OF COLLECTING DATA
1. Observation method
2. Interview method
3. Survey method
4. Experimentation
5. Projective techniques
6. Sociometry
7. Content analysis
Observation
Observation is one of the cheaper and more effective techniques of data collection. Observation, in simple terms, is defined as watching things with some purpose in view. Observation is a systematic and deliberate study, through the eye, of spontaneous occurrences at the time they occur.
Observation has mainly three components: sensation, attention and perception.
Types of Observation
• Participant observation: In this observation, the observer is a part of
the phenomenon or group which is observed and he acts as both an
observer and a participant
• Non-Participant observation: In this type of observation, the
researcher does not actually participate in the activities of the group to
be studied. There is no emotional involvement on the part of the
observer
• Controlled observation: This type of observation is found quite useful either in the laboratory or in the field. Controlled observation is carried out with standardized observational techniques and the exercise of maximum control over extrinsic and intrinsic variables.
• Uncontrolled observation: If the observation takes place in natural settings, it may be termed uncontrolled observation. The main aim of this observation is to get a spontaneous picture of life.
• Direct observation: In this type of observation, the event or
the behavior of the person is observed as it occurs. This
method is flexible and allows the observer to see and record
subtle aspects of events and behavior as they occur.
• Indirect observation: This does not involve the physical presence of the observer, and the recording is done by mechanical, photographic or electronic devices. This method is less flexible than direct observation.
INTERVIEW
It may be defined as a two-way systematic conversation between an investigator and an informant, initiated for obtaining information relevant to a specific study.
It involves not only conversation, but also learning from the respondent's gestures, facial expressions, pauses and environment.
Interviewing process
• Preparation
• Introduction
• Developing rapport
• Carrying the interview forward
• Recording the interview
• Closing the interview
Types of interviews
• Structured or directive interview:
This is an interview made with a detailed standardized schedule. The same
questions are put to all the respondents and in the same order. This type of
interview is used for large-scale formalized surveys
• Unstructured or non-directive interview
In this type of interview, no detailed pre-planned schedule is used; only a broad interview guide is used. Questions are not standardized and not ordered in a particular way. This technique is more useful in case studies than in large surveys.
• Semi-structured or focused interview
The investigator attempts to focus the discussion on the actual effects of a given experience to which the respondents have been exposed. The situation is analyzed prior to the interview. An interview guide specifying topics related to the research hypothesis is used. The interview is focused on the subjective experiences of the respondent.
• Clinical interview
It is concerned with broad underlying feelings or motivations, or with the course of the individual's life experiences. The "personal history" interview used in social case work, prison administration, psychiatric clinics and in individual life-history research is the most common type of clinical interview.
• Depth interview
This is an intensive and searching interview aiming at studying the
respondent’s opinion, emotions or convictions on the basis of an
interview guide. This deliberately aims to elicit unconscious as well as
extremely personal feelings and emotions
• Telephone interviews
It is a non-personal method of data collection. It may be used as a
major method or supplementary method
• Group interview
It is a method of collecting primary data in which a number of
individuals with a common interest interact with each other
EXPERIMENTATION
Experimentation is a research process used to observe cause-and-effect relationships under controlled conditions.
In other words, it aims at studying the effect of an independent variable on a dependent variable by keeping the other independent variables constant through some type of control.
There are broadly two types of experiment:
• Laboratory experiment: Here the investigator creates the conditions in which he wants to make his study, through manipulation of variables.
• Field experiment: It occurs in real-life or natural settings, where less control can be exerted.
SURVEY METHOD
A survey is a research method for collecting information from a selected group of people using standardized questionnaires or interviews.
It is a non-experimental, descriptive research method which is used to study large and small populations.
A survey is a fact-finding study involving critical inspection to gather information, often a study of an area with respect to a certain condition or its prevalence. There are two types of survey:
• Cross-sectional surveys are conducted to collect information from the population at a single point in time. The purpose is to collect a body of data in connection with two or more variables.
• Longitudinal survey: A longitudinal survey is one that takes place over a period of time; the data are gathered over a period of time. There are three types of longitudinal survey:
Trend studies: The simplest type of longitudinal analysis of survey data is called trend analysis, which examines overall change over time.
Cohort studies: A cohort study selects either an entire cohort of people or a randomly selected sample of them as the focus of data collection.
Panel studies: Here the same sample of the population is surveyed repeatedly. Panel studies are very difficult to conduct.
METHODS OF SURVEY
There are two methods:
1. Census method: A complete survey of the population is called the census method. Here the entire population is the subject matter of the survey.
2. Sampling method: A sample is representative of the population; only a sample or sub-set is selected for conducting the survey.
PROJECTIVE TECHNIQUES
They involve the presentation of ambiguous stimuli to the respondents for interpretation. In doing so, the respondents reveal their inner characteristics.
These techniques for the collection of data have been developed by psychologists to use the projections of respondents for inferring about underlying motives, urges or intentions which are such that the respondent either resists revealing them or is unable to figure them out himself.
These techniques play an important role in motivational research and in attitude surveys.
• Types of projective techniques
Projective techniques may be divided into three broad categories:
1. Visual: to show the respondent a picture and ask him to describe the persons or objects in the picture.
2. Verbal: these techniques involve the use of words both for stimulus and for response.
3. Expressive: under this technique, subjects are asked to improvise or act out a situation in which they have been assigned various roles.
SOCIOMETRY
Sociometry is a quantitative method for measuring social
relationships.
It was developed by psychotherapist Jacob L. Moreno in his studies
of the relationship between social structures and psychological
well-being.
The term sociometry relates to its Latin etymology, socius meaning
companion, and metrum meaning measure. Jacob Moreno defined
sociometry as "the inquiry into the evolution and organization of
groups and the position of individuals within them."
The basic technique in sociometry is the sociometric test. This is a test under which each member of a group is asked to choose, from all other members, those with whom he prefers to associate in a specific situation.
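The results of a sociometric test can be tallied in a few lines of code. In this hypothetical sketch, each group member names the colleagues they prefer to work with, and a member's "choice status" is simply how often others chose them; all the names and choices are invented.

```python
from collections import Counter

# Hypothetical sociometric test: whom each member prefers to associate with.
choices = {
    "Asha":  ["Ben", "Carol"],
    "Ben":   ["Carol"],
    "Carol": ["Asha", "Ben"],
    "Dev":   ["Carol"],
}

# Choice status: the number of times each member was chosen by others.
status = Counter(name for picks in choices.values() for name in picks)
print(status.most_common())  # [('Carol', 3), ('Ben', 2), ('Asha', 1)]
```

Such tallies are the raw material for a sociogram, which maps the most- and least-chosen members of the group.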
CONTENT ANALYSIS
• Human beings communicate through language. Language helps to convey our emotions, knowledge, opinions, attitudes and values. Print media, television, radio and movies also communicate ideas, beliefs and values. The analysis of communication content, written and pictorial, has now become a methodological procedure for extracting data from a wide range of communications.
• Content analysis is a method of social research that aims at the analysis of the content, qualitative and quantitative, of documents, books, newspapers, magazines and other forms of written material.
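In its simplest quantitative form, content analysis counts how often each coding category occurs in a body of text. The short passage and the category dictionary below are invented for illustration.

```python
import re
from collections import Counter

# Invented text and coding categories, purely for illustration.
text = """The new policy improves health outcomes. Critics argue the
policy ignores education, but supporters say health comes first."""

categories = {"health": ["health"],
              "education": ["education"],
              "policy": ["policy"]}

# Tokenize the text, then sum the counts of each category's terms.
words = Counter(re.findall(r"[a-z]+", text.lower()))
frequencies = {cat: sum(words[term] for term in terms)
               for cat, terms in categories.items()}
print(frequencies)  # {'health': 2, 'education': 1, 'policy': 2}
```

Real coding schemes usually map many synonyms and phrases to each category, but the counting logic stays the same.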
TOOLS FOR DATA COLLECTION
• The questionnaire
A questionnaire is a research instrument consisting of a set of questions (items) intended to capture responses from respondents in a standardized manner.
Questions may be unstructured or structured. Unstructured questions ask respondents to provide a response in their own words, while structured questions ask respondents to select an answer from a given set of choices.
Characteristics of a Good Questionnaire:
1. It deals with an important or significant topic.
2. Its significance is carefully stated on the questionnaire itself or in its covering letter.
3. It seeks only data which cannot be obtained from resources like books, reports and records.
4. It is as short as possible, only long enough to get the essential data.
5. It is attractive in appearance, neatly arranged and clearly duplicated or printed.
6. Directions are clear and complete, and important terms are clarified.
7. The questions are objective, with no clues, hints or suggestions.
8. Questions are presented in order from simple to complex.
9. Double negatives, adverbs and descriptive adjectives are avoided.
10. Double-barreled questions, i.e. putting two questions in one, are also avoided.
Response formats. Questions may be structured or unstructured. Responses to structured questions are captured using one of the following response formats:
• Dichotomous response, where respondents are asked to select one of two
possible choices, such as true/false, yes/no, or agree/disagree. An example
of such a question is: Do you think that the death penalty is justified under
some circumstances (circle one): yes / no
• Nominal response, where respondents are presented with more than two
unordered options, such as: What is your industry of employment:
manufacturing / consumer services / retail / education / healthcare / tourism
& hospitality / other.
• Ordinal response, where respondents have more than two ordered options,
such as: what is your highest level of education: high school / college degree
/ graduate studies.
• Interval-level response, where respondents are presented with a 5-point or
7-point Likert scale, semantic differential scale, or Guttman scale.
• Continuous response, where respondents enter a continuous (ratio-scaled)
value with a meaningful zero point, such as their age or tenure in a firm.
These responses generally tend to be of the fill-in-the blanks type.
Types of questions to be avoided.
• Leading questions
• Loaded questions
• Ambiguous questions
• Double barreled questions
• Long questions
• Double-negative questions
SCHEDULES
A schedule is the tool or instrument used to collect data from
respondents while an interview is being conducted. The schedule is
presented by the interviewer, who asks the questions and notes down
the answers.
CHECKLIST
This is the simplest of all devices. It consists of a prepared list of
items pertinent to an object or a particular task.
The presence or absence of each item may be indicated by checking
yes or no, or on a multi-point scale. It ensures complete
consideration of all aspects of an object.
OPINIONNAIRE
This is a list of questions or statements pertaining to an issue or a
program. It is used for studying the opinions of people.
CHECKING THE VALIDITY AND
RELIABILITY OF A RESEARCH TOOL
Sound measurement must meet the tests of validity, reliability and
practicality. In fact, these are the three major considerations one
should use in evaluating a measurement tool
• Validity
It is the most critical criterion and indicates the degree to which an
instrument measures what it is supposed to measure. Validity can
also be thought of as utility. In other words, validity is the extent to
which differences found with a measuring instrument reflect true
differences among those being tested. But the question arises: how
can one determine validity without direct confirming knowledge?
The answer may be that we seek other relevant evidence that
confirms the answers we have found with our measuring tool. What
counts as relevant evidence often depends upon the nature of the research
problem and the judgment of the researcher.
• Test of Reliability
The test of reliability is another important test of sound
measurement. A measuring instrument is reliable if it provides
consistent results. A reliable measuring instrument contributes
to validity, but a reliable instrument need not be a valid
one.
Two aspects of reliability viz., stability and equivalence deserve
special mention.
The stability aspect is concerned with securing consistent results
with repeated measurements of the same person and with the
same instrument
The equivalence aspect considers how much error may get
introduced by different investigators or different samples of the
items being studied
• Test of Practicality
The practicality characteristic of a measuring instrument can be
judged in terms of economy, convenience and interpretability.
From the operational point of view, the measuring instrument
ought to be practical i.e., it should be economical, convenient and
interpretable.
Economy consideration suggests that some trade-off is needed
between the ideal research project and that which the budget can
afford
Convenience test suggests that the measuring instrument should
be easy to administer. For this purpose one should give due
attention to the proper layout of the measuring instrument
Interpretability consideration is specially important when persons
other than the designers of the test are to interpret the results
MEASUREMENT AND
SCALING
Measurement
Measurement can be described as a way of obtaining symbols to
represent the properties of persons, objects, events or states under
study - in which the symbols have the same relevant relationship
to each other as do the things represented
Scaling
The ability to assign numbers to objects in such a way that:
• Numbers reflect the relationship between the objects with
respect to the characteristics involved
• It allows investigators to make comparison of amount and
change in the property being measured
Four (4) primary types of scales –
Nominal, Ordinal, Interval and Ratio
NOMINAL SCALE
• Least restrictive of all scales.
• Does not possess order, distance or origin
• Numbers assigned serve only as a label or tags for identifying
objects, properties or events
• Permissible mathematical operations: percentage, frequency,
mode, contingency coefficients
• ORDINAL SCALE
• Possesses order but not distance or origin
• Numbers assigned preserve the order relationship (rank) and
the ability to distinguish between elements according to a
single attribute
• Permissible mathematical operations: all of the above, plus
median, percentile, rank correlation, sign test and run test
• INTERVAL SCALE
• Possesses the characteristics of order and distance
• DOES NOT possess origin
• Numbers are assigned in such a way that they preserve both
order and distance but do not have a unique starting point
• Permissible mathematical operations: all of the above, plus
mean, average deviation, standard deviation, correlation,
t-test and F-test
• RATIO SCALE
• Possesses the characteristics of order, distance and origin
• Numbers are assigned in such a way that they preserve
order, distance and origin
• Permissible mathematical operations: ALL
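The hierarchy of permissible operations can be sketched with Python's standard-library statistics module. The data below are illustrative only; the point is which summary each scale supports.

```python
# Sketch: the highest-level summary each scale of measurement permits.
# Illustrative data, not from the text.
import statistics as st

nominal  = ["retail", "education", "retail", "tourism"]  # labels only
ordinal  = [1, 2, 2, 3]           # e.g. high school < college < graduate
interval = [20.0, 22.5, 21.0, 23.5]   # e.g. temperature in degrees C

mode_nominal   = st.mode(nominal)     # mode is meaningful for nominal data
median_ordinal = st.median(ordinal)   # median needs only order
mean_interval  = st.mean(interval)    # mean needs equal intervals
```

A ratio-scaled variable (e.g. age) would additionally support meaningful ratios, since it has a true zero.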
Other scaling techniques
RATING SCALES
In rating or ranking scales the respondent assigns numerical
positions to an individual to specify the degree of his observations.
Following are the rating scales:
Graphic rating scale
Here the different points of the scale run from one extreme of the
attitude to the other. Considering the description of the points
along the scale, the rater indicates his rating or preference by
putting a tick mark on the point chosen by him.
Itemized rating scale
Also known as a numerical scale; generally five or seven points
are given on the scale to represent different categories of items.
The respondent picks one of these categories and marks it on the
scale. The first point represents the lowest category and the last
point the highest.
Comparative rating scale
Here the comparative position of an individual is indicated
with reference to other individual .
Rank order scale
It is used for comparative or relative rating. Here an
individual's position is indicated in relation to others. When the
rater rates himself, it is called self-rating.
Attitude scales
These are used not to rate individuals but to examine their
views (agreement or disagreement) on a particular subject.
Following are the different scales:
Likert Scale
The Likert scale requires the respondents to indicate a degree of agreement
or disagreement with each of a series of statements about the stimulus
objects
The analysis can be conducted on an item-by-item basis (profile analysis),
or a total (summated) score can be calculated.
Semantic Differential Scale
The semantic differential is a seven-point rating scale with end points
associated with bipolar labels that have semantic meaning.
The negative adjective or phrase sometimes appears at the left side of the
scale and sometimes at the right.
This controls the tendency of some respondents, particularly those with
very positive or very negative attitudes, to mark the right- or left-hand
sides without reading the labels.
Individual items on a semantic differential scale may be scored on either
a -3 to +3 or a 1 to 7 scale.
Stapel Scale
The Stapel scale is a unipolar rating scale with ten
categories numbered from -5 to +5, without a neutral point
(zero). This scale is usually presented vertically.
The data obtained by using a Stapel scale can be analyzed
in the same way as semantic differential data.
Differential scale - Thurstone technique
Here attitude scaling is done with the help of judges
PROCESSING THE DATA
Editing
Editing is the first step in data processing. Editing is the
process of examining the data collected in
questionnaires/schedules to detect errors and omissions and
to see that they are corrected and the schedules are ready
for tabulation. There are mainly two types of editing:
Field editing
Central editing
• Classification of Data
Classification or categorization is the process of grouping the
statistical data under various understandable homogeneous
groups for the purpose of convenient interpretation
Classification becomes necessary when there is diversity in the
collected data; it is required for meaningful presentation and
analysis. However, it is meaningless for homogeneous data. A good
classification should have the characteristics of clarity,
homogeneity, equality of scale, purposefulness and accuracy.
Coding of Data
Coding is the process/operation by which data/responses
are organized into classes/categories and numerals or other
symbols are given to each item according to the class in
which it falls. In other words, coding involves two
important operations;
(a) deciding the categories to be used and
(b) allocating individual answers to them.
• Tabulation of Data
Tabulation is the process of summarizing raw data and displaying it in
compact form for further analysis. Therefore, preparing tables is a very
important step. Tabulation may be by hand, mechanical, or electronic.
The choice is made largely on the basis of the size and type of study,
alternative costs, time pressures, and the availability of computers, and
computer programmes. If the number of questionnaires is small and their
length is short, hand tabulation is quite satisfactory.
Table may be divided into:
• (i) Frequency tables,
• (ii) Response tables,
• (iii) Contingency tables
• (iv) Uni-variate tables,
• (v) Bi-variate tables,
• (vi) Statistical table and
• (vii) Time series tables
Data Diagrams
Diagrams are charts and graphs used to present data. They attract the
reader's attention and help present data more effectively, allowing
creative presentation. Data diagrams are classified into:
• Charts: A chart is a diagrammatic form of data presentation. Bar charts,
rectangles, squares and circles can be used to present data. Bar charts are
uni-dimensional, while rectangular, squares and circles are two-
dimensional.
• Graphs: The method of presenting numerical data in visual form is called
a graph. A graph gives the relationship between two variables by means of
either a curve or a straight line. Graphs may be divided into two
categories: (1) graphs of time series and (2) graphs of frequency
distribution. In graphs of time series one of the factors is time and the
other or others are the study factors. Graphs of frequency show the
distribution, for example of executives by income, age, etc.
DATA ANALYSIS
The purpose of analysis is to summarize and organize the
collected data with a view to solving a variety of social, economic
and developmental problems, which helps the researcher bring new
ideas and creative thinking into the research investigation, draw
conclusions and make suggestions for future courses of action.
Objects of analysis
• Simplification & summarization
• Comparison
• Forecasting
• Policy formulation
STATISTICS
• It is the science of collecting, organizing, analyzing and
interpreting data.
Statistics are of two types
Descriptive
Inferential
Descriptive statistics uses the data to provide descriptions of the
population, either through numerical calculations or graphs or
tables
inferential statistics makes inferences and predictions about a
population based on a sample of data taken from the population
in question.
Probability distribution
These are distributions which are not obtained by actual
observation or experiment but are mathematically deduced
on certain assumptions.
Classification of theoretical distributions
They are classified into two categories:
1. Discrete theoretical distributions
2. Continuous theoretical probability distributions
Discrete distributions include:
• Binomial distribution
• Poisson distribution
Continuous distributions include:
• Normal distribution
Discrete
• Binomial distribution
It is also known as the Bernoulli distribution.
It is associated with the Swiss mathematician James Bernoulli.
It is the probability distribution expressing the probability
of one set of dichotomous variables,
that is, success or failure.
It is used in business decision-making situations and also in
quality control etc.
There are only two possible outcomes in a trial.
The trials are independent.
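The binomial probability of k successes in n independent trials can be sketched in pure Python. The quality-control numbers below are illustrative, not from the text.

```python
# Sketch: binomial probability mass function, pure Python.
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# e.g. chance of exactly 2 defectives among 10 items with a 10% defect rate
p2 = binom_pmf(2, 10, 0.1)
```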
• Poisson distribution
• It was originated by French mathematician Simeon Denis
Poisson
• It is a limiting form of the binomial distribution
• The binomial can only be used if the number of trials is known
in advance
• In real-life situations one often cannot enumerate the possible
number of trials
• The Poisson distribution is employed in situations where
the number of successes is relatively small
• All Poisson distributions are skewed to the right
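The Poisson probability of k events when only the average rate is known can likewise be sketched in pure Python (the complaint-rate example is illustrative):

```python
# Sketch: Poisson probability mass function, pure Python.
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events when the average number of events is lam)."""
    return (lam**k) * exp(-lam) / factorial(k)

# e.g. probability of exactly 3 complaints on a day averaging 2 complaints
p3 = poisson_pmf(3, 2.0)
```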
Continuous Distribution
Normal distribution
• it was described by Abraham De
Moivre
• In a ND Mean=median=mode
• It is a bell shaped curve
• Total area under the curve is 1
• 50% of the values are less than the
mean and 50% of the values are above
the mean
• It is symmetrical about the center
• We could use normal curve to predict
the chance of happening something.
• It gives us the idea the what the data
actually look like.
• It also tells us that 68.26% of all
observations lie within ±1 standard
deviation, 95.44% within ±2 standard
deviations and 99.73% within ±3
standard deviations of the mean.
UNIVARIATE ANALYSIS
It deals with data sets pertaining to a single variable.
It includes
• Measures of central tendency
• Measures of dispersion
Measures of central tendency
A measure of central tendency (also referred to as measures of
center or central location) is a summary measure that attempts to
describe a whole set of data with a single value that represents
the middle or center of its distribution. Following are the
different measure of central tendency
• Mean
• Median
• Mode
• Geometric mean
• Harmonic mean
• Quadratic mean
• Mean :The mean is the sum of the value of each observation in a
dataset divided by the number of observations. This is also known
as the arithmetic average.
• Median :The median is the middle value in distribution when the
values are arranged in ascending or descending order.
• Mode :The mode is the most commonly occurring value in a
distribution.
• Geometric mean – the nth root of the product of the data values,
where there are n of these items. This measure is valid only for
data that are measured absolutely on a strictly positive scale
• Harmonic mean – the reciprocal of the arithmetic mean of the
reciprocals of the data values. This measure too is valid only for
data that are measured absolutely on a strictly positive scale
• The Quadratic mean (often known as the root mean square) is
useful in engineering, but is not often used in statistics. This is
because it is not a good indicator of the center of the distribution
when the distribution includes negative values.
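All six measures listed above are available in Python's standard-library statistics module; a minimal sketch on an illustrative data set:

```python
# Sketch: measures of central tendency on one small illustrative data set.
import statistics as st

data = [2, 4, 4, 5, 9]

mean   = st.mean(data)            # arithmetic average
median = st.median(data)          # middle value of the ordered data
mode   = st.mode(data)            # most frequent value
gmean  = st.geometric_mean(data)  # nth root of the product
hmean  = st.harmonic_mean(data)   # reciprocal of mean of reciprocals
# quadratic mean (root mean square), not in the statistics module:
qmean  = (sum(x * x for x in data) / len(data)) ** 0.5
```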
MEASURES OF
DISPERSION
Dispersion in statistics is a way of describing how spread out a set of data
is. When the dispersion is large, the values in the set are widely
scattered; when it is small, they are tightly clustered.
• Range: the difference between the smallest and largest number in a set
of data.
• Standard deviation: It is the probably the most common measure. It
tells you how spread out numbers are from the mean,
• Interquartile range (IQR): It describes where the bulk of the data lies
(the “middle fifty” percent).
• Interdecile range: The difference between the first decile (10%) and
the last decile (90%).
• Variance : It is the expectation of the squared deviation of a random
variable from its mean, and it informally measures how far a set of
(random) numbers are spread out from their mean
Two sets of data:
Set A: -10, 0, 10, 20, 30        Set B: 8, 9, 10, 11, 12
Range = 40                       Range = 4
Variance = 200                   Variance = 2
SD = √200 ≈ 14.14                SD = √2 ≈ 1.41
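The two data sets above can be checked with the standard-library statistics module (population formulas, dividing by N):

```python
# Sketch: range, population variance and population SD for the two sets.
import statistics as st

a = [-10, 0, 10, 20, 30]
b = [8, 9, 10, 11, 12]

range_a = max(a) - min(a)   # 40
range_b = max(b) - min(b)   # 4
var_a = st.pvariance(a)     # 200
var_b = st.pvariance(b)     # 2
sd_a = st.pstdev(a)         # square root of 200
sd_b = st.pstdev(b)         # square root of 2
```

Both sets have the same mean (10), yet set A is far more dispersed, which is exactly what these measures capture.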
Parametric and Non Parametric
measures
Parametric Measures
Conventional statistical procedures are also called parametric
tests.
In a parametric test a sample statistic is used to estimate a
population parameter.
The main assumption behind parametric testing is that the
samples are drawn from a normally distributed population.
Testing of Hypothesis
The various steps involved in testing are
• Select a data sample from the population
• Make an assumption that whether the data is normally distributed or not
• Set up a null hypothesis that is H0: µ= specified value
• Set up an alternative Hypothesis H1 : µ ≠specified value
µ > specified value
µ< specified value
• Choose an alpha or significance level of 5% or 1%
(alpha is the probability of rejecting a null hypothesis that is in
fact true)
• Select the test statistic
• Decide the critical value : critical value is the value of test statistics which
separates acceptance region from rejection region.
• Form a decision rule and compute the test statistic value
• Conclusion or decision
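The steps above can be sketched end to end for a one-sample, two-tailed z test. All numbers are illustrative, and the population SD is assumed known:

```python
# Sketch: one-sample two-tailed z test following the listed steps.
from math import erf, sqrt

mu0, alpha = 100, 0.05   # H0: mu = 100, significance level 5%
xbar, sigma, n = 103, 10, 36   # sample mean, known population SD, size

# test statistic
z = (xbar - mu0) / (sigma / sqrt(n))
# two-tailed p-value from the standard normal CDF
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
# decision rule: reject H0 when p < alpha
reject_h0 = p_value < alpha
```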
Here while testing there are two types of hypothesis.
1. Directional
2. Non directional
A directional hypothesis is one in which the direction of the
relationship (positive or negative) is specified, i.e. the one-tailed test.
A non-directional hypothesis is used in a two-tailed test,
where we state that there is no significant difference between the observed
and expected frequencies.
Two types of errors can also be committed while testing a
hypothesis:
Type 1 error
Type 2 error
A Type 1 error occurs when the null hypothesis is rejected although it is true.
A Type 2 error occurs when the null hypothesis is accepted although it is false.
In order to keep these errors small we fix the confidence
level, commonly at 95%.
Testing normality
Normality: this assumption is only violated if there are
large and obvious departures from normality.
This can be checked by:
• Inspecting a histogram
• Skewness and kurtosis (kurtosis describes the peakedness of the
curve; skewness describes the symmetry of the curve)
• Kolmogorov-Smirnov (K-S) test (if sample size is ≥ 50)
• Shapiro-Wilk test (if sample size is < 50)
(A sig. value > 0.05 indicates normality of the distribution)
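A minimal sketch of the Shapiro-Wilk check, assuming SciPy is installed (the simulated sample is illustrative):

```python
# Sketch: Shapiro-Wilk normality test on a simulated sample of n = 40
# (n < 50, so Shapiro-Wilk rather than K-S is the suggested choice).
import random
from scipy import stats

random.seed(0)
sample = [random.gauss(50, 5) for _ in range(40)]  # drawn from a normal

w_stat, p_sw = stats.shapiro(sample)
is_normal = p_sw > 0.05   # p > 0.05: no evidence against normality
```

For n ≥ 50, `stats.kstest` against a fitted normal plays the analogous role.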
Parametric measures
• Z test
The Z score is a test statistic that helps you decide whether or not to
reject the null hypothesis. The p-value is the probability of obtaining a
result at least as extreme as the one observed, assuming the null
hypothesis is true. Z scores are expressed in units of standard deviation.
A z-test is a statistical test used to determine whether two population means are
different when the variances are known and the sample size is large. The test
statistic is assumed to have a normal distribution, and nuisance parameters such
as standard deviation should be known for an accurate z-test to be performed.
The formula for calculating the Z value (for a single sample mean) is
z = (x̄ - µ) / (σ / √n)
Uses
• Testing of hypothesis for means
• Testing significance between the mean of the two samples
• Testing significance of difference between two standard deviation
Assumption
• The sampling distribution of the statistic is normal
• Sample values are close to parameter values
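The second use listed (significance of the difference between two sample means) can be sketched as a two-sample z test; the means, SDs and sizes below are illustrative:

```python
# Sketch: two-sample z test for a difference of means with known SDs.
from math import erf, sqrt

x1, s1, n1 = 52.0, 8.0, 100   # sample 1: mean, known SD, size
x2, s2, n2 = 50.0, 9.0, 120   # sample 2: mean, known SD, size

# standard error of the difference, then the z statistic
z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)
# two-tailed p-value from the standard normal CDF
p_two_tailed = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
```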
• Students t test
A t-test is any statistical hypothesis test in which the test statistic follows
a Student's t-distribution under the null hypothesis. It can be used to
determine if two sets of data are significantly different from each other.
The formula for calculating t is
t = (x̄ - µ) / (s / √n)
Uses of t test
• It is used to test whether the two samples have the same mean when
the samples are small
• It is used to test the significance of mean of a random sample
• It is used to test difference between the means of two dependent
sample
• It is used to test the significance of an observed correlation coefficient
Assumptions
• Normal distribution
• The population standard deviation is not known
• Sample size is less than 30
ANOVA
• The term variance was introduced in the statistical analysis by R.A.Fisher
• F test is the name introduced to honor R.A.Fisher
• The F test is used to determine whether two independent estimates of population
variance differ significantly between themselves, or to establish whether both
samples have come from the same universe
Uses of F distribution
• It can be used to test the hypothesis
• It can be used to test the equality of variances of two population when samples are
drawn
• To test the equality of means of three or more population
• It is used for testing the significance of an observed sample multiple correlation
• It is used to test the linearity of regression
Assumption
• Sample follow a normal distribution
• All observation are randomly selected
• The ratio of the greater variance to the smaller variance should be equal
to or greater than one
• The F distribution is always formed by the ratio of squared values, therefore it
can never be a negative number
F = Greater variance / Smaller variance
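The third use listed (equality of three or more population means) is one-way ANOVA; a sketch with SciPy assumed and illustrative group data:

```python
# Sketch: one-way ANOVA testing whether three group means are equal.
from scipy.stats import f_oneway

g1 = [5, 6, 7, 6, 5]
g2 = [8, 9, 7, 8, 9]
g3 = [4, 5, 4, 6, 5]

# F = between-group variance estimate / within-group variance estimate
f_stat, p_value = f_oneway(g1, g2, g3)
means_differ = p_value < 0.05
```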
Non parametric
• Non-parametric tests are used when the assumptions required by
parametric tests are not met
• All tests involving ranked data are non-parametric
• Non-parametric tests are distribution-free
Assumptions of non-parametric tests
• Sample observations are independent
• The variables are continuous
• The sample drawn is a random sample
• Observations are measured on an ordinal scale
Non-parametric tests

One sample: Chi square test, Sign test, Kolmogorov-Smirnov test,
Run test
Two sample: Wilcoxon signed-rank test, Mann–Whitney U test,
Median test, The Wald–Wolfowitz runs test
K samples: Kruskal–Wallis test, Median test
Non Parametric tests
Chi square test
• The chi square test was first introduced by Karl Pearson.
• It is a test which explains the magnitude of difference between
observed frequencies and expected frequencies under certain
assumptions.
• The greater the discrepancy between observed and expected frequencies,
the greater shall be the value of χ2.
Assumptions
• The observation are always assumed to be independent of each other.
• All the events must be mutually exclusive
• A sample with sufficiently large size is assumed
• The distribution resembles the normal but starts at zero and is skewed
with a long tail to the right
By using this test we can find the deviation between the
observed values and the expected values.
It is used when the variable is categorical or ordinal.
It is a type of binomial test in which we determine who is
different from whom, i.e. the post hoc test.
• As a test of independence
χ2 is used to find whether two or more attributes are associated or
not.
Here it is tested whether the variables are independent of each other.
• χ2 as a test of homogeneity
It is an extension of test of independence
Here it determines whether the two or more independent random
samples are drawn from the same population or from different
population
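The test of independence can be sketched with SciPy (assumed installed) on an illustrative 2×2 contingency table:

```python
# Sketch: chi-square test of independence on hypothetical 2x2 counts,
# e.g. group membership (rows) vs. a yes/no preference (columns).
from scipy.stats import chi2_contingency

observed = [[30, 20],
            [20, 30]]

chi2, p_value, dof, expected = chi2_contingency(observed)
independent = p_value > 0.05   # fail to reject H0 of independence
```

Note that for a 2×2 table SciPy applies Yates' continuity correction by default.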
SIGN TEST
It is applied when the sample is drawn from a continuous
symmetrical population.
Here the mean is expected to lie at the center, with an equal
number of units lying above and below it.
Simple and easy to interpret
Makes no assumptions about distribution of the data
Not very powerful
To evaluate H0 we only need to know the signs of the differences
If half the differences are positive and half are negative, then the
median = 0 (H0 is true).
If the signs are more unbalanced, then that is evidence against
H0.
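Because under H0 the signs follow a Binomial(n, 1/2) distribution, a two-sided sign test needs only a binomial tail sum. A pure-Python sketch on illustrative paired differences:

```python
# Sketch: two-sided sign test on hypothetical paired differences.
from math import comb

diffs = [1.2, -0.5, 2.0, 0.8, 1.5, -0.3, 0.9, 1.1]

pos = sum(d > 0 for d in diffs)      # number of positive signs
n = sum(d != 0 for d in diffs)       # zeros are discarded

# two-sided p-value: tail probability of the rarer sign, doubled
k = min(pos, n - pos)
p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)
```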
• Kolmogorov Smirnov test
For testing the relationship between an empirical
distribution and some theoretical distribution or between
two empirical distribution goodness of fit test are employed
K-S can be applied to test the relationship between a
theoretical and a sample frequency distribution for one
sample test or between two sample distributions.
RUN TEST for randomness
The run test is designed to determine whether a
sample is random or not.
The total number of runs in a sample indicates whether the
sample is random or not.
Median test
The median test is used to determine the significance of difference
between median of two or more independent groups .
The object is to find out whether the median of different sample drawn
randomly are alike or can be taken as drawn from the same population.
It is an application of Chi square test for two variables each having two
subgroups.
Mann–Whitney U test
In statistics, the Mann–Whitney U test (also called the Mann–
Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–
Mann–Whitney test) is a nonparametric test designed to test the
significance of the difference between the results of two samples drawn at
random from the same population but administered differently.
It can be used as an alternative to t test when parametric assumptions
are not met. It is nearly as efficient as the t-test on normal distributions
Here the observation are at least expressed in ordinal scale .
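A sketch with SciPy assumed, using illustrative groups whose values do not overlap (which drives U to its extreme):

```python
# Sketch: Mann-Whitney U as a distribution-free two-sample comparison.
from scipy.stats import mannwhitneyu

group_a = [12, 15, 11, 18, 14]
group_b = [22, 19, 25, 21, 24]

# every value in group_b exceeds every value in group_a, so U = 0
u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
```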
Wilcoxon signed-rank test
The Wilcoxon signed-rank test is a non-parametric statistical
hypothesis test used when comparing two related samples,
matched samples, or repeated measurements on a single sample
to assess whether their population mean ranks differ (i.e. it is
a paired difference test). It can be used as an alternative to
the paired Student's t-test, t-test for matched pairs, or the t-test
for dependent samples when the population cannot be assumed
to be normally distributed.
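A sketch with SciPy assumed, on illustrative paired before/after scores:

```python
# Sketch: Wilcoxon signed-rank test on hypothetical paired scores.
from scipy.stats import wilcoxon

before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after  = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]

# zero differences are discarded; remaining |differences| are ranked
w_stat, p_value = wilcoxon(before, after)
```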
Run test
The Wald–Wolfowitz runs test , named after Abraham
Wald and Jacob Wolfowitz, is a non-parametric statistical test
that checks a randomness hypothesis for a two-valued data
sequence. More precisely, it can be used to test whether the two
samples were drawn from the same population.
• K sample test
Kruskal – Wallis test
The Mann–Whitney U test is used to test the significance of the
difference between the results of two independent samples
where the dependent variable is measured on an ordinal scale. The
K-W test extends the Mann–Whitney U test to three or
more independent groups.
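A sketch with SciPy assumed, on three illustrative groups:

```python
# Sketch: Kruskal-Wallis H test across three independent groups.
from scipy.stats import kruskal

g1 = [7, 8, 6, 9, 7]
g2 = [5, 4, 6, 5, 3]
g3 = [9, 10, 8, 11, 9]

# H is computed from the rank sums of the pooled observations
h_stat, p_value = kruskal(g1, g2, g3)
groups_differ = p_value < 0.05
```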
Median test
It has already been discussed under the two-sample tests. The same
test can be extended to meet the requirements of K samples.
ECONOMETRICS
• In narrow sense
Econometrics means Economic Measurement.
• In Broader sense
It may be defined as the social science in which the tools of economic
theory, mathematics and statistical inference are applied to the analysis of
economic phenomena.
Types of Econometrics
• Theoretical
Theoretical Econometrics is concerned with the development of appropriate
methods for measuring economic relationships specified by
econometric models.
• Applied
In applied econometrics, we use the tools of theoretical econometrics to
study some special fields of economics and business, such as production
function, investment function, demand and supply function.
Methodology of Econometric
1. Statement of theory or hypothesis
2. Specification of the mathematical model of the theory
3. Specification of the Statistical or Econometric model
4. Obtaining Data
5. Estimation of the parameters of the Econometric Model
6. Hypothesis testing
7. Forecasting or Prediction
8. Using the model for control or policy purpose
Types of Data
• Time Series Data
• Cross Sectional Data
• Pooled Data
• Time Series Data
Time series is a sequence of data points, measured typically at
successive time instants spaced at uniform time intervals. Time
series data have a natural temporal ordering.
• Daily- Weather, Stock Price
• Monthly- Unemployment rate
• Quarterly- GDP
• Yearly- National Budgets
• Decennially- Population Census
• Cross Sectional Data
Cross-sectional data or cross section is a type of one-dimensional data
set. It refers to data collected by observing many subjects such as
individuals, firms or countries/regions at the same point of time, or
without regard to differences in time.
For example, suppose we want to measure the use of a particular mobile
brand on this campus. We could draw a random sample of 100 students from
the population, record their mobile use, and calculate what percentage
of the sample uses that brand; say 60% of our sample used that
particular brand of mobile. This cross-sectional sample provides us
with a snapshot of that population at that one point in time. Note
that, based on one cross-sectional sample, we do not know whether use
of this brand is increasing or decreasing; we can only describe the
current proportion.
Pooled Data
Pooled or combined data contain elements of both time series and
cross-sectional data.
CORRELATIONAL
ANALYSIS
• Correlation analysis is an attempt to determine the degree of relationship
between variables. It is the analysis of co variation between two variables.
• The coefficient of correlation ranges between -1 and +1 and quantifies the
direction and strength of the linear association between the two variables. 
• The correlation between two variables can be positive (i.e., higher levels
of one variable are associated with higher levels of the other) or negative
(i.e., higher levels of one variable are associated with lower levels of the
other).
Significance of correlational analysis
• It is used as basis for the study of regression
• In business it helps the management to estimate costs, sales, price, and
other variables.
• It helps to reduce the range of uncertainty associated with decision
making
Assumption of correlation
• A cause-and-effect relationship exists between the variables.
• The relationship between the variables is linear.
• The variables follow a normal distribution.
Classification of correlation

On the basis of direction: positive correlation, negative correlation
On the basis of linearity: linear correlation, non-linear correlation
On the basis of variables: simple, partial and multiple correlation
• Positive correlation
If the variables move and vary in the same direction, the correlation
is called positive, i.e. an increase in the value of one variable
leads to an increase in the other.
E.g.
P: 5 10 15 20 25 30
Q: 15 20 25 30 35 40
Negative correlation
Here the variables move in opposite directions.
E.g.
X: 2 3 4 5 6 7
Y: 6 5 4 3 2 1
Linear correlation and non linear correlation
The distinction between linear and non linear correlation is based upon
the consistency of the ratio of changes between the variable understudy.
If the amount of change in one variable bears a constant ratio to the
change in the other variable, the correlation is said to be linear.
Simple correlation
An analysis where a relationship exists between two variables, one
independent and the other dependent, is known as simple correlation
analysis.
Simple correlation measures the strength and type of the relationship
between two variables on the assumption that no other variable
comes into play, so no other variable need be taken into account.
It is also called 'zero-order correlation'.
The statistical measure of simple correlation is known as the
'coefficient of linear correlation', with symbol 'r'.
It can be either positive or negative.
The coefficient of simple determination, with symbol r², gives the
proportion of variation in the dependent variable (y) accounted for by
the regressor (x).
E.g. if the value of r² = 0.81, this means 81% of the variation
in the dependent variable has been explained by the regressor.
• Partial correlation
It represent the relationship between two variables after the
effect of one or more other distracting variable , if any has
been eliminated.
Determination of partial correlation is essential to
understand the cause effect relationship between variables
under observation.
For e.g. ,
In a study it was observed that the correlation between
education and income was positive. But this might be
entirely due to a third variable, say the person's economic
status: people with higher economic status earn more
money.
Accordingly, education and income may show a high
correlation.
• Multiple correlation
The coefficient of multiple correlation determines the nature
and extent of the relationship between one
dependent variable and two or more independent variables.
The statistical measure of such a relationship is known as the
coefficient of multiple correlation, with symbol R.
METHODS OF STUDYING
CORRELATION
a) Scatter diagram
b) Karl Pearson's coefficient of correlation
c) Spearman’s Rank correlation coefficient
d) Method of least squares
Karl Pearson's Coefficient of Correlation
Pearson's 'r' is the most common correlation coefficient, denoted by 'r'. It measures the degree of linear relationship between two variables, say x and y.
When deviations are taken from the actual mean:
r(x, y) = Σxy / √(Σx² · Σy²)
(where x and y denote deviations from the means of X and Y)
When deviations are taken from an assumed mean:
r = [N Σdxdy − Σdx Σdy] / [√(N Σdx² − (Σdx)²) · √(N Σdy² − (Σdy)²)]
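A minimal sketch of the actual-mean formula, using the positive-correlation series P and Q from the earlier example (a perfectly linear pair, so r comes out as 1):

```python
import math

X = [5, 10, 15, 20, 25, 30]
Y = [15, 20, 25, 30, 35, 40]

mean_x, mean_y = sum(X) / len(X), sum(Y) / len(Y)

# x and y are deviations from the respective actual means
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

r = sum(a * b for a, b in zip(x, y)) / math.sqrt(
    sum(a ** 2 for a in x) * sum(b ** 2 for b in y))

print(r)       # 1.0 — Y rises by 5 whenever X rises by 5
print(r ** 2)  # coefficient of determination r²
```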
• Spearman's Rank Coefficient of Correlation
When the variables under study in a statistical series are not capable of quantitative measurement but can be arranged in serial order, Pearson's correlation coefficient cannot be used; in such cases Spearman's rank correlation is used.
R = 1 − (6 ΣD²) / [N (N² − 1)]
R = rank correlation coefficient
D = difference of ranks between paired items in the two series
N = total number of observations
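The rank-correlation formula can be sketched directly, assuming no tied ranks (which the simple formula requires); the two judges' rankings below are invented:

```python
def spearman_rank(x, y):
    """Spearman's R = 1 - 6*sum(D^2) / (N*(N^2 - 1)), no ties assumed."""
    def ranks(v):
        # rank 1 for the smallest value, rank N for the largest
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

judge1 = [1, 2, 3, 4, 5]   # already ranks
judge2 = [2, 1, 4, 3, 5]
print(spearman_rank(judge1, judge2))  # 0.8
```

Here D² = 1 + 1 + 1 + 1 + 0 = 4, so R = 1 − 24/120 = 0.8.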
• Scatter Diagram Method
A scatter diagram is a graph of observed plotted points where each point represents a pair of values of X and Y as coordinates. It portrays the relationship between these two variables graphically.
REGRESSION ANALYSIS
In statistical modeling, regression analysis is a statistical process
for estimating the relationships among variables.
More specifically, regression analysis helps one understand how
the typical value of the dependent variable (or 'criterion
variable') changes when any one of the independent variables is
varied, while the other independent variables are held fixed.
Regression analysis is widely used for prediction and forecasting
Regression line is the line which gives the best estimate of one
variable from the value of any other given variable. „
The regression line gives the average relationship between the
two variables in mathematical form
Regression can be simple linear regression or multiple linear
regression
Simple linear regression
It describes a causal relation: how the dependent variable changes because of a change in the independent variable while all other variables are held constant.
Simple linear regression represents a set of clustered data points with a best-fit line.
The line of best fit is the line that represents the data set with the smallest distance between the line and each of the data points.
For a linear regression to work, the data set must have two variables that are correlated.
• Simple linear regression has 2 main objectives:
1. Establish whether there is a relationship between the variables
2. Forecast new observations.
• Standard form for simple linear regression:
Y = β₀ + β₁X + ε
Y = dependent variable
X = independent variable
β₀, β₁ = intercept and slope of the line of best fit
ε = error term
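Fitting the line of best fit by least squares can be sketched with the standard closed-form slope and intercept (b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², b0 = ȳ − b1·x̄); the data points below are illustrative:

```python
def fit_line(X, Y):
    """Least-squares slope and intercept for Y = b0 + b1*X."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(X, Y))
          / sum((x - mx) ** 2 for x in X))
    b0 = my - b1 * mx
    return b0, b1

X = [1, 2, 3, 4, 5]
Y = [2.1, 4.0, 6.2, 7.9, 10.1]   # roughly Y = 2X with noise

b0, b1 = fit_line(X, Y)
print(b0, b1)                    # intercept near 0, slope near 2

# Objective 2: forecast a new observation
y_new = b0 + b1 * 6
```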
Multiple linear regression model
It is about modeling a data set with two or more independent variables and one dependent variable.
Here the dependent variable is expressed as a function of two or more independent variables in a single equation.
• Assumptions of multiple linear regression
1. Only relevant variables are included
2. A linear relationship is required
3. Causality of variables
4. All variables are normally distributed
5. Homoscedasticity is assumed.
6. Absence of multicollinearity is assumed in the model.
• Standard form for the multiple regression model:
Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + ε
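The multiple-regression form can be sketched with NumPy's least-squares solver; the two-predictor data below are invented (roughly Y = 2 + X1 + X2 plus noise):

```python
import numpy as np

X1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
X2 = np.array([2, 1, 4, 3, 6, 5], dtype=float)
Y  = np.array([5.1, 4.9, 9.2, 9.1, 13.0, 12.8])

# Design matrix: a column of ones for the intercept, then the predictors
A = np.column_stack([np.ones_like(X1), X1, X2])
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
b0, b1, b2 = coeffs
print(b0, b1, b2)

fitted = A @ coeffs   # predicted Y for each observation
```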
MULTICOLLINEARITY
Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly linearly related. We have perfect multicollinearity if, for example, the correlation between two independent variables is equal to 1 or −1.
A multiple regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to others.
It is good to have a relationship between the dependent and independent variables, but bad to have a relationship among the independent variables: with multicollinearity, the effect of a single variable is hard to measure.
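A common diagnostic (not covered in the slides, added here as an illustration) is the variance inflation factor; with just two predictors it reduces to VIF = 1/(1 − r²), so a near-perfect correlation between predictors blows the VIF up:

```python
import math

def pearson(a, b):
    """Pearson's r between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) *
                    sum((y - mb) ** 2 for y in b))
    return num / den

# Two nearly collinear predictors (x2 is roughly 2 * x1)
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson(x1, x2)
vif = 1 / (1 - r ** 2)
print(r)    # close to 1
print(vif)  # very large: either predictor's individual effect is hard to pin down
```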
Heteroskedasticity
• Heteroskedasticity, in statistics, is when the standard deviations of a variable, monitored over a specific amount of time, are non-constant. Heteroskedasticity often arises in two forms: conditional and unconditional.
Conditional heteroskedasticity identifies non-constant volatility when future periods of high and low volatility cannot be identified.
Unconditional heteroskedasticity applies when future periods of high and low volatility can be identified.
• Unconditional Heteroskedasticity
Unconditional Heteroskedasticity is predictable, and most often relates to
variables that are cyclical by nature. This can include higher retail sales
reported during the traditional holiday shopping period, or the increase in
air conditioner repair calls during warmer months.
In finance, conditional Heteroskedasticity is often seen in the prices of
stocks and bonds. The level of volatility of these equities cannot be
predicted over any period of time. Unconditional Heteroskedasticity can
be used when discussing variables that have identifiable
seasonal variability, such as electricity usage.
As it relates to statistics, heteroskedasticity (also spelled heteroscedasticity) refers to the error variance, or dependence of scatter, within a minimum of one independent variable within a particular sample. These variations can be used to calculate the margin of error between data sets, such as expected results and actual results, as they provide a measure for the deviation of data points from the mean value.
FACTOR ANALYSIS
• Factor analysis identifies correlation between and among
variables to bind them into one underlying factor
• Factor analysis reduces a larger number of variables into a smaller number of factors.
• E.g. , in a set of variables (V1,V2,V3,V4,V5,V6)
• A correlational relationship may be found between
V1,V2,V3
• So these variables can be identified as a factor because there is a higher degree of relationship among these three.
• Accordingly, a large number of variables will be reduced to a small number of factors.
• Factor analysis is also referred to as data reduction.
• Factor analysis considers either pairs of respondents or pairs of variables, i.e., Q-type and R-type factor analysis.
• The important term used in factor analysis is a factor, which is the weighted linear combination of the variables under study.
• The factor loading in factor analysis indicates the extent of closeness of relationship among the variables constituting a factor.
• Another term that needs to be pointed out in factor analysis is communality, which indicates the extent to which a variable has been accounted for by the underlying factors taken together. The higher the communality, the better the variable has been captured by the factors; the lower, the more it has been left out.
• Eigenvalue: the sum of squares of the factor loadings relating to a factor is called its eigenvalue. It indicates the relative importance of the factor in accounting for the set of variables considered.
• Factor rotation: it is done to reveal different structures in the data. Different structures give different results but they are statistically equivalent. There are 2 types of rotation: orthogonal and oblique.
• Assumptions of factor analysis
No outliers in the data set
Adequate sample size
The data set must possess no perfect multicollinearity
Homoscedasticity is not required
Linearity of variables
The data must be at least interval data
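The eigenvalue idea can be sketched from the correlation matrix: its eigenvalues sum to the number of variables, and a large leading eigenvalue signals one dominant underlying factor. The data matrix below is invented, with three mutually correlated variables:

```python
import numpy as np

# Rows are observations, columns are the variables V1, V2, V3
data = np.array([
    [1.0,  2.0, 1.5],
    [2.0,  4.0, 3.1],
    [3.0,  5.0, 4.2],
    [4.0,  8.0, 6.1],
    [5.0,  9.0, 7.8],
    [6.0, 12.0, 9.0],
])

corr = np.corrcoef(data, rowvar=False)   # 3x3 correlation matrix
eigvals = np.linalg.eigvalsh(corr)       # eigenvalues in ascending order
print(sorted(eigvals, reverse=True))     # leading eigenvalue close to 3
```

Because all three variables move together here, the leading eigenvalue absorbs nearly all the variance, suggesting a single factor.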
CLUSTER ANALYSIS
• Cluster analysis is a process of identifying natural homogeneous groups existing in data, so that similarity within groups and differences among groups may be used for understanding the basic character of the data.
• It is applied to large set of data which may consist of many
variables.
• It is applied to data recorded on an interval scale
• Here internal homogeneity and external heterogeneity are determined
• There are basically two types of clusters
Hierarchical cluster
Non hierarchical cluster
• Hierarchical clustering: here the two closest objects are first grouped and treated as a single cluster. The same process is then carried out until there is a single cluster containing all the items.
• Non-hierarchical clustering: here the items are dispersed into predetermined groups successively in an iterative process until some defined groups emerge.
Linkage function of clustering: it is used to find the distance between two clusters. There are two types of linkage:
Single linkage
Complete linkage
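The two linkage functions can be sketched directly: single linkage takes the minimum pairwise distance between two clusters, complete linkage the maximum. The clusters and points below are illustrative:

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def single_linkage(c1, c2):
    """Distance between the closest pair of points across the clusters."""
    return min(dist(p, q) for p in c1 for q in c2)

def complete_linkage(c1, c2):
    """Distance between the farthest pair of points across the clusters."""
    return max(dist(p, q) for p in c1 for q in c2)

cluster_a = [(0, 0), (1, 0)]
cluster_b = [(4, 0), (6, 0)]

print(single_linkage(cluster_a, cluster_b))    # 3.0 : closest pair (1,0)-(4,0)
print(complete_linkage(cluster_a, cluster_b))  # 6.0 : farthest pair (0,0)-(6,0)
```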
CONJOINT ANALYSIS
• It is a technique useful in determining relative value of
different attributes of an item
• In marketing research it helps to find the most desirable combination of attributes for a product or service that exists or is proposed to be introduced in the market.
• Conjoint analysis is applied to categorical variables
• It is done to analyze most important feature of a product.
• It gives relative importance to the factor that are taken for
consideration.
• It helps us to develop alternative sets of combination of
different levels of product.
• The respondents are given a chance to rate or rank accordingly
It is applied in the following fields
• New product development
• Transport industry
DISCRIMINANT ANALYSIS
• It is a statistical technique useful in classifying individuals or observations into two or more mutually exclusive groups, on the basis of a set of predictor variables.
• In DA there is one nominal dependent variable and two or more interval-scaled independent variables.
• The IVs have certain common characteristic features which are useful in discriminating among individuals.
• The main objective of DA is to classify the observed cases into two or more groups.
• DA is applied in following areas
1. Credit rating
2. Prediction of sickness
3. Portfolio selection
4. Market research
5. Classification of various attributes
• Discriminant function
• Linear discriminant function
It is a linear function of the predictor variables, weighted in such a way that it will discriminate among groups while minimizing errors. In case the dependent variable is classified into only two groups, this is known as simple discriminant analysis.
In case the dependent variable is classified into more than two groups, it is termed a multiple discriminant function.
Bivariate discriminant analysis for two groups
If the number of variables included in the discriminant function is 2, there is a straight-line classification boundary. An individual on one side belongs to group 1 and on the other side to group 2.
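A heavily simplified sketch of the two-group case: classifying each case to the nearer group mean yields a straight-line boundary between the groups. This nearest-centroid rule is a simplification of the full discriminant function, and the two groups below (think "healthy" vs. "sick" firms) are invented:

```python
import math

group1 = [(1.0, 2.0), (2.0, 3.0), (1.5, 2.5)]   # e.g. "healthy" firms
group2 = [(5.0, 6.0), (6.0, 7.0), (5.5, 6.5)]   # e.g. "sick" firms

def mean(pts):
    """Centroid of a list of 2-D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

m1, m2 = mean(group1), mean(group2)

def classify(p):
    """Assign a case to group 1 or 2 by distance to the group means."""
    return 1 if math.dist(p, m1) < math.dist(p, m2) else 2

print(classify((2.0, 2.5)))  # 1
print(classify((5.0, 7.0)))  # 2
```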
DECOMPOSITION ANALYSIS
• It means analysis of a set of data to reveal its composition and thereby express it in terms of the extent of change over time in its components.
• It reveals the extent of change in structure , the composition and the
intensity of a set of data
• It is suitable for large mass of data such as financial statements,
performance reports , budget etc.
• It reveals significant changes in the structure of data over a period of
time or from one organization to another.
• It pinpoints the area of change
• With the availability of computers, large data-based statements can now be subjected to decomposition analysis.
DA can be applied in the following areas
Business data analysis
Prediction of financial distress
REPORT WRITING
• Research report is a research document that contains
basic aspects of the research project
• Research report is the systematic, articulate, and orderly
presentation of research work in a written form.
• It may be in hand-written, typed, or computerized form.
Report writing stages
• Understanding the report brief
• Gathering and selecting information
• Organizing your material
• Analyzing your material
• Writing the report
• Reviewing and redrafting
• Presentation
Content of research report
Research report is divided into three parts as:
I. First Part (Formality Part):
(i) Cover page
(ii) Title page
(iii) Certificate or statement
(iv) Index (brief contents)
(v) Table of contents (detailed index)
(vi) Acknowledgement
(vii) List of tables and figures used
(viii) Preface/foreword/introduction
(ix) Summary report
II. Main Report (Central Part of Report):
(i) Statement of objectives
(ii) Methodology and research design
(iii) Types of data and its sources
(iv) Sampling decisions
(v) Data collection methods
(vi) Data collection tools
(vii) Fieldwork
(viii) Analysis and interpretation (including tables, charts, figures, etc.)
(ix) Findings
(x) Limitations
(xi) Conclusions and recommendations
(xii) Any other relevant detail
III. Appendix (Additional Details):
(i) Copies of forms used
(ii) Tables not included in findings
(iii) A copy of questionnaire
(iv) Detail of sampling and rate of response
(v) Statement of expenses
(vi) Bibliography – list of books, magazines, journals, and
other reports
(vii) Any other relevant information
References
• Research Methodology – K. R. Sharma
• Methodology of Research in Social Sciences – Dr O. R. Krishnaswamy, Dr M. Ranganathan
• Business Research Methods – Naval Bajpai
• Research Methodology: A Step-by-Step Guide for Beginners – Ranjit Kumar
• Introduction to Econometrics – G. S. Maddala & Kajal Lahiri
• Quantitative Techniques – Dr K. Venugopalan
• www.wikipedia.org
THANK YOU