Unit IV (Research Methods in Business)
PROCESSING OF DATA
1. Editing of Data
2. Coding of Data
3. Classification of Data
4. Tabulation of Data
5. Data Diagrams
1. Editing of Data
Editing is the first step in data processing. Editing is the process of examining
the data collected in questionnaires/schedules to detect errors and omissions
and to see that they are corrected and the schedules are ready for tabulation.
When the whole data collection is over, a final and thorough check-up is
made. Mildred B. Parten, in her book, points out that the editor is responsible
for seeing that the data are:
As accurate as possible,
Consistent with other facts secured,
Uniformly entered,
As complete as possible,
Acceptable for tabulation and arranged to facilitate coding and tabulation.
TYPES OF EDITING
1. Editing for quality asks the following questions: Are the data forms
complete? Are the data free of bias? Are the recordings free of errors? Are the
inconsistencies in responses within limits? Is there evidence of dishonesty on
the part of enumerators or interviewers? Is there any wanton manipulation of
data?
2. Editing for tabulation makes certain accepted modifications to the data, or
even rejects certain pieces of data, in order to facilitate tabulation. For
instance, an extremely high or low data value may be ignored or bracketed
within a suitable class interval.
3. Field editing is done by the enumerator. The schedule filled in by the
enumerator or the respondent may contain abbreviations, illegible writing
and the like. These are rectified by the enumerator. This should be done soon
after the enumeration or interview, while memory is still fresh. Field editing
should not extend to filling omissions with guessed data.
4. Central editing is done by the researcher after all schedules,
questionnaires or forms have been received from the enumerators or
respondents. Obvious errors can be corrected. For missing data or
information, the editor may substitute values by reviewing the information
provided by other, similarly placed respondents. A definitely inappropriate
answer is removed, and “no answer” is entered when reasonable attempts to
get the appropriate answer fail to produce results.
Editors must keep in view the following points while performing their
work:
2. Coding of Data
Coding is necessary for efficient analysis; through it, the many replies
may be reduced to a small number of classes which contain the critical
information required for analysis. Coding decisions should usually be taken at
the designing stage of the questionnaire. This makes it possible to pre-code
the questionnaire choices, which in turn helps computer tabulation, as one
can key-punch directly from the original questionnaires. In the case of hand
coding, some standard method may be used. One such method is to code in the
margin with a colored pencil. Another is to transcribe the data from the
questionnaire to a coding sheet. Whatever method is adopted, one should see
that coding errors are altogether eliminated or reduced to the minimum level.
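As a rough illustration of pre-coding, the sketch below maps the replies to one
closed-ended question onto numeric codes. The codebook, the code 9 for "no
answer" and the function names are all hypothetical; they are not taken from
any particular questionnaire.

```python
# Hypothetical codebook for one closed-ended question (an assumption for
# illustration; a real codebook is fixed when the questionnaire is designed).
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
NO_ANSWER = 9  # assumed code for missing or unusable replies


def code_response(raw):
    """Return the numeric code for one raw reply, or NO_ANSWER if unknown."""
    key = raw.strip().lower()
    return CODEBOOK.get(key, NO_ANSWER)


def code_responses(raw_replies):
    """Code a list of raw replies in questionnaire order."""
    return [code_response(reply) for reply in raw_replies]


coded = code_responses(["Agree", " neutral ", "", "Strongly Agree"])
print(coded)  # [4, 3, 9, 5]
```

Keeping every reply that falls outside the codebook under one explicit code
makes coding errors easier to spot than silently dropping such replies.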
answer/codes of all the respondents. Transcription may not be necessary
when only simple tables are required and the number of respondents is small.
3. Classification of Data
Classification or categorization is the process of grouping the statistical data
under various understandable homogeneous groups for the purpose of
convenient interpretation. A uniformity of attributes is the basic criterion for
classification; and the grouping of data is made according to similarity.
Classification becomes necessary when there is diversity in the data
collected, so that the data can be presented and analysed meaningfully.
However, it serves no purpose for homogeneous data. A good classification
should have the characteristics of clarity, homogeneity, equality of scale,
purposefulness and accuracy.
Objectives of Classification
1. Complex, scattered and haphazard data are organized into a concise,
logical and intelligible form.
2. The characteristics of similarity and dissimilarity are made clear.
3. Comparative studies become possible.
4. The significance of the data becomes easier to understand, and a good deal
of human energy is thereby saved.
5. The underlying unity amongst different items is made clear and expressed.
6. Data are so arranged that analysis and generalization become possible.
classification according to attributes. The former groups the variables, that
is, it quantifies the variables in cohesive groups, while the latter groups the
data on the basis of attributes or qualities. Again, classification may be
multiple or dichotomous. Multiple classification makes many (more than two)
groups on the basis of some quality or attribute, while dichotomous
classification makes two groups on the basis of the presence or absence of a
certain quality. Grouping the workers of a factory under various income
groups (class intervals) is a multiple classification; dividing them into
skilled workers and unskilled workers is a dichotomous classification. The
tabular form of such classification is known as a statistical series, which
may be inclusive or exclusive.
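The factory example can be sketched in a few lines. The worker records, the
class width of 1000 and the helper name income_class are hypothetical. The
intervals follow the usual exclusive convention, in which the lower limit
belongs to the class and the upper limit belongs to the next class.

```python
from collections import Counter


def income_class(income, width=1000):
    """Label the exclusive class interval that contains `income`.

    Lower limit included, upper limit excluded, so an income of exactly
    2000 falls in "2000-3000", not in "1000-2000".
    """
    lower = (income // width) * width
    return f"{lower}-{lower + width}"


# Hypothetical worker records, for illustration only.
workers = [
    {"income": 1500, "skilled": True},
    {"income": 2000, "skilled": False},
    {"income": 2750, "skilled": True},
    {"income": 1999, "skilled": False},
    {"income": 3200, "skilled": True},
]

# Multiple classification: more than two groups (income class intervals).
multiple = Counter(income_class(w["income"]) for w in workers)

# Dichotomous classification: presence or absence of one quality (skill).
dichotomous = Counter("skilled" if w["skilled"] else "unskilled" for w in workers)

print(dict(multiple))     # {'1000-2000': 2, '2000-3000': 2, '3000-4000': 1}
print(dict(dichotomous))  # {'skilled': 3, 'unskilled': 2}
```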
4. Tabulation of Data
Tabulation is the process of summarizing raw data and displaying it in
compact form for further analysis. Therefore, preparing tables is a very
important step. Tabulation may be by hand, mechanical, or electronic. The
choice is made largely on the basis of the size and type of study, alternative
costs, time pressures, and the availability of computers and computer
programmes. If the number of questionnaires is small and their length short,
hand tabulation is quite satisfactory.
Tables may be divided into: (i) frequency tables, (ii) response tables, (iii)
contingency tables, (iv) uni-variate tables, (v) bi-variate tables, (vi)
statistical tables and (vii) time series tables.
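To make two of these table types concrete, the sketch below tabulates a small
set of hypothetical replies: first as a frequency table on one variable, then
as a contingency table that cross-classifies two variables. The data and the
variable names are invented for illustration.

```python
from collections import Counter

# Hypothetical coded replies: (gender, answer) for each respondent.
responses = [
    ("male", "yes"),
    ("female", "no"),
    ("female", "yes"),
    ("male", "yes"),
    ("female", "yes"),
]

# Frequency (uni-variate) table: counts of a single variable.
frequency = Counter(answer for _, answer in responses)

# Contingency (bi-variate) table: counts of each combination of two variables.
contingency = Counter(responses)

print("Answer  Count")
for answer in ("yes", "no"):
    print(f"{answer:<7} {frequency[answer]}")

print()
print("Gender  yes  no")
for gender in ("male", "female"):
    row = [contingency[(gender, answer)] for answer in ("yes", "no")]
    print(f"{gender:<7} {row[0]:<4} {row[1]}")
```

A Counter returns 0 for combinations that never occur, so empty cells of the
contingency table need no special handling.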
Generally a research table has the following parts: (a) table number, (b) title
of the table, (c) caption (column heading), (d) stub (row heading), (e) body,
(f) head note, (g) foot note.
14. Source: The source of the data must be given. For primary data, write
“primary data”.
It is not always necessary to present facts in tabular form if they can be
presented more simply in the body of the text. Tabular presentation enables
the reader to follow the data more quickly than textual presentation. A table
should not merely repeat information covered in the text. The same
information should not, of course, be presented in both tabular and graphical
form. Smaller and simpler tables may be presented in the text, while large
and complex tables may be placed at the end of the chapter or report.
5. Data Diagrams
Diagrams are charts and graphs used to present data. They attract the
reader's attention, help present data more effectively and allow creative
presentation. Data diagrams are classified into:
1. Charts: A chart is a diagrammatic form of data presentation. Bars,
rectangles, squares and circles can be used to present data. Bar charts are
one-dimensional, while rectangles, squares and circles are two-dimensional.
2. Graphs: The method of presenting numerical data in visual form is called a
graph. A graph shows the relationship between two variables by means of
either a curve or a straight line. Graphs may be divided into two categories:
(1) graphs of time series and (2) graphs of frequency distribution. In graphs
of time series, one of the factors is time and the other or others are the study
factors. Graphs of frequency distribution show, for example, the distribution
of executives by income, age and so on.
CONCEPT OF STANDARD ERROR
Standard error (SE) is a statistic that reveals how accurately sample data
represents the whole population. It measures the accuracy with which a
sample distribution represents a population by using standard deviation.
Step 1: Note the number of measurements (n) and determine the sample
mean (μ), which is the average of all the measurements.
Step 2: Determine how much each measurement (xi) deviates from the mean.
Step 3: Square all the deviations determined in step 2 and add them together:
Σ(xi – μ)²
Step 4: Divide the sum from step 3 by one less than the total number of
measurements (n – 1).
Step 5: Take the square root of the number obtained in step 4. This is the
standard deviation (σ).
Step 6: Finally, divide the standard deviation by the square root of the
number of measurements (n) to get the standard error of your estimate:
SE = σ / √n.
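The six steps above can be sketched as a short function. The function name
and the sample values are hypothetical and chosen only so that the arithmetic
is easy to follow by hand.

```python
import math


def standard_error(measurements):
    """Standard error of the mean, following steps 1 to 6 literally."""
    n = len(measurements)                           # Step 1: number of measurements
    if n < 2:
        raise ValueError("at least two measurements are needed")
    mean = sum(measurements) / n                    # Step 1: sample mean
    deviations = [x - mean for x in measurements]   # Step 2: deviations from the mean
    sum_of_squares = sum(d * d for d in deviations) # Step 3: sum of squared deviations
    variance = sum_of_squares / (n - 1)             # Step 4: divide by n - 1
    std_dev = math.sqrt(variance)                   # Step 5: standard deviation
    return std_dev / math.sqrt(n)                   # Step 6: divide by sqrt(n)


# Hypothetical sample: mean 5, sum of squared deviations 32, n = 8.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error(sample), 3))  # 0.756
```

For this sample the standard deviation is √(32/7) and the standard error is
√(32/7) / √8 = √(4/7), about 0.756.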
CRITERIA FOR JUDGING SIGNIFICANCE AT VARIOUS LEVELS
p-value
The p-value helps to quantify the evidence against the null hypothesis:
A large p-value suggests that the observed evidence is very likely if the null
hypothesis is true.
A small p-value (equal to or less than the significance level) suggests that the
observed evidence is not very likely if the null hypothesis is true, i.e. either a
very unusual event has happened or the null hypothesis is incorrect.
The p-value is compared with a pre-defined cut-off for the test (the
significance level). If it is equal to or smaller than this value, the estimated
effect is considered to be significant. Often a cut-off of 0.05 or 0.01 is chosen
(written ‘p ≤ 0.05’ or ‘p ≤ 0.01’).
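The comparison of a p-value with the significance level can be written as a
very small function. The name judge_significance and the p-value of 0.03 are
hypothetical; the point is only that the same result can be significant at one
level and not at another.

```python
def judge_significance(p_value, alpha=0.05):
    """Compare a p-value with the chosen significance level (cut-off)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # "Equal to or less than" the cut-off counts as significant.
    return "significant" if p_value <= alpha else "not significant"


# Hypothetical p-value judged at the two customary levels.
for level in (0.05, 0.01):
    print(f"p = 0.03 at the {level} level: {judge_significance(0.03, level)}")
```

With p = 0.03 the effect is judged significant at the 0.05 level but not at the
0.01 level, which is why the level should be fixed before the data are examined.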
HYPOTHESIS
remember that simplicity of the hypothesis has nothing to do with its
significance.
6. Hypothesis should be consistent with most known facts i.e., it must be
consistent with a substantial body of established facts. In other words, it
should be one which judges accept as being the most likely.
7. Hypothesis should be amenable to testing within a reasonable time.
One should not use even an excellent hypothesis, if the same cannot be tested
in a reasonable time for one cannot spend a lifetime collecting data to test it.
8. Hypothesis must explain the facts that gave rise to the need for
explanation. This means that by using the hypothesis plus other known and
accepted generalizations, one should be able to deduce the original problem
condition. Thus a hypothesis must actually explain what it claims to explain;
it should have empirical reference.
IMPORTANCE OF HYPOTHESIS
1. Helps in the testing of the theories.
2. Serves as a great platform in the investigation activities.
3. Provides guidance to the research work or study.
4. Hypothesis sometimes suggests theories.
5. Helps in knowing the needs of the data.
6. Explains social phenomena.
7. Develops the theory.
8. Also acts as a bridge between the theory and the investigation.
FORMULATION OF HYPOTHESIS
(i) Making a formal statement: The step consists in making a formal
statement of the null hypothesis (H0) and also of the alternative hypothesis
(Ha). This means that hypotheses should be clearly stated, considering the
nature of the research problem. For instance, suppose Mr. Mohan of the Civil
Engineering Department wants to test the load-bearing capacity of an old
bridge, which must be more than 10 tons. In that case he can state his
hypotheses as under:
Null hypothesis H0: μ = 10 tons
Alternative hypothesis Ha: μ > 10 tons
Take another example. The average score in an aptitude test administered at
the national level is 80. To evaluate a state’s education system, 100 of the
state’s students were selected on a random basis, and their average score was
75. The state wants to know if there is a significant difference between the
local scores and the national scores. In such a situation the hypotheses may
be stated as under:
Null hypothesis H0: μ = 80
Alternative hypothesis Ha: μ ≠ 80
The formulation of hypotheses is an important step which must be
accomplished with due care in accordance with the object and nature of the
problem under consideration. It also indicates whether we should use a
one-tailed test or a two-tailed test. If Ha is of the type greater than (or of the
type less than), we use a one-tailed test, but when Ha is of the type “whether
greater or smaller”, we use a two-tailed test.
(ii) Selecting a significance level: The hypotheses are tested on a
pre-determined level of significance, and as such the same should be specified.
Generally, in practice, either the 5% level or the 1% level is adopted for the
purpose. The factors that affect the level of significance are: (a) the
magnitude of the difference between sample means; (b) the size of the
samples; (c) the variability of measurements within samples; and (d) whether
the hypothesis is directional or non-directional (a directional hypothesis is
one which predicts the direction of the difference between, say, means). In
brief, the level of significance must be adequate in the context of the purpose
and nature of the enquiry.
(iii) Deciding the distribution to use: After deciding the level of significance,
the next step in hypothesis testing is to determine the appropriate sampling
distribution. The choice generally remains between normal distribution and
the t-distribution. The rules for selecting the correct distribution are similar to
those which we have stated earlier in the context of estimation.
(iv) Selecting a random sample and computing an appropriate value:
Another step is to select a random sample(s) and compute an appropriate
value from the sample data concerning the test statistic utilizing the relevant
distribution. In other words, draw a sample to furnish empirical data.
(v) Calculation of the probability: One has then to calculate the probability
that the sample result would diverge as widely as it has from expectations, if
the null hypothesis were in fact true.
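Steps (i) to (v) can be sketched for the aptitude-test example above (national
mean 80, sample mean 75, n = 100). The source does not give a standard
deviation, so the value of 20 used here is purely an assumption, as are the
function name and the choice of the normal distribution (a z-test), which
presumes the standard deviation is known and the sample is large.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, hypothesized_mean, std_dev, n, alpha=0.05, tail="two"):
    """One-sample z-test; returns (z, p_value, reject_h0)."""
    # Test statistic: distance of the sample mean from the H0 value,
    # measured in standard errors.
    z = (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":        # Ha: mean differs from the H0 value
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    elif tail == "greater":  # Ha: mean is greater than the H0 value
        p_value = 1 - standard_normal.cdf(z)
    elif tail == "less":     # Ha: mean is less than the H0 value
        p_value = standard_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'greater' or 'less'")
    return z, p_value, p_value <= alpha


# H0: mean = 80, Ha: mean != 80 (two-tailed). std_dev=20 is an ASSUMED value.
z, p, reject = z_test(sample_mean=75, hypothesized_mean=80, std_dev=20, n=100)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0 at 5%: {reject}")
```

Under the assumed standard deviation, z is -2.5, so H0 would be rejected at the
5% level but not at the 1% level; a different standard deviation could change
that conclusion.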
TYPES OF HYPOTHESIS
1. Working Hypothesis
A working hypothesis is a preliminary assumption of the researcher about the
research topic, made particularly when sufficient information is not available
to establish a hypothesis, and as a step towards formulating the final research
hypothesis. Working hypotheses are used to design the final research plan, to
place the research problem in its right context and to reduce the research
topic to an acceptable size.
2. Scientific Hypothesis
A scientific hypothesis contains a statement based on or derived from
sufficient theoretical and empirical data.
3. Alternative Hypothesis
An alternative hypothesis belongs to a set of two hypotheses (research and
null) and states the opposite of the null hypothesis. In statistical tests of a
null hypothesis, acceptance of H0 (the null hypothesis) means rejection of the
alternative hypothesis; similarly, rejection of H0 means acceptance of the
alternative hypothesis.
4. Research Hypothesis
A research hypothesis is a researcher’s proposition about some social fact
without reference to its particular attributes. The researcher believes that it
is true and wants it to be tested for possible disproof, e.g., Muslims have
more children than Hindus, or drug abuse is found among upper-class students
living in hostels or rented rooms. A research hypothesis may be derived from
theories or may result in the development of theories.
5. Null Hypothesis
A null hypothesis is the reverse of a research hypothesis. It is a hypothesis
of no relationship. Null hypotheses do not exist in reality but are used to
test research hypotheses.
6. Statistical Hypothesis
A statistical hypothesis, according to Winter, is a statement/observation about
statistical populations that one seeks to support or refute. Things are
reduced to numerical quantities and decisions are made about these
quantities, e.g., income difference between two groups: group A is richer than
group B. The null hypothesis will be: group A is not richer than group B. Here,
variables are reduced to measurable quantities.
BASIC CONCEPTS CONCERNING TESTING OF HYPOTHESES
the lot and plan our decision saying that if there are none or only 1 defective
item among the 10, we will accept H0 otherwise we will reject H0 (or accept
Ha). This sort of basis is known as decision rule.
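The decision rule just described (accept H0 if there are 0 or 1 defective
items among the 10 tested) can be explored numerically. The sketch below
assumes that items are defective independently with a constant rate, so that
the number of defectives follows a binomial distribution; the defect rate of
0.1 and the function name are hypothetical.

```python
from math import comb


def prob_accept(n_items, max_defectives, defect_rate):
    """Probability that the rule accepts H0: at most `max_defectives`
    defective items among `n_items`, under a binomial model."""
    return sum(
        comb(n_items, k) * defect_rate**k * (1 - defect_rate) ** (n_items - k)
        for k in range(max_defectives + 1)
    )


# Rule from the text: test 10 items, accept H0 if 0 or 1 are defective.
# defect_rate=0.1 is an ASSUMED true rate, used only for illustration.
p_accept = prob_accept(n_items=10, max_defectives=1, defect_rate=0.1)
print(f"P(accept H0) = {p_accept:.4f}, P(reject H0) = {1 - p_accept:.4f}")
```

Repeating the calculation for several assumed defect rates shows how often a
given rule leads to acceptance or rejection under each of them.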
(d) Type I and Type II errors: In the context of testing of hypotheses, there
are basically two types of errors we can make. We may reject H0 when H0 is
true and we may accept H0 when in fact H0 is not true. The former is known
as Type I error and the latter as Type II error. In other words, Type I error
means rejection of hypothesis which should have been accepted and Type II
error means accepting the hypothesis which should have been rejected. Type I
error is denoted by α (alpha), known as α error, also called the level of
significance of the test; and Type II error is denoted by β (beta), known as β error,
FLOW DIAGRAM FOR HYPOTHESIS TESTING