CHAPTER 6: DATA ANALYSIS
Data analysis and interpretation refers to the application of deductive and inductive logic to the research process. The
data are often classified by division into subgroups and are then analyzed and synthesized in such a way that
hypotheses may be verified or rejected. The final result may be a new principle or generalization. Data are examined
in terms of comparison between the more homogeneous segments within the group and by comparison with some
outside criterion.
Each statistical method is based upon its own specific assumptions regarding the sample, the population and the research
conditions. Unless these factors are considered in advance, the researcher may find it impossible to make
valid comparisons for the purpose of drawing inferences.
The following are the stages through which the raw data must be processed in order to ultimately deliver the final
product: editing, coding, classification, and tabulation.
6.2.1 Editing
Editing refers to finding and removing any errors, incompleteness or inconsistency in the data. If the raw data are
erroneous, incomplete or inconsistent, these deficiencies will be carried through all subsequent stages of processing
and will greatly distort the results of the inquiry. Therefore, at this stage, certain questions are specified for 100
percent editing because they are known to be especially troublesome or particularly critical to the study objectives. The
editor is responsible for seeing that the data are:
(i) as accurate as possible;
(ii) consistent with other facts secured;
(iii) uniformly entered;
(iv) as complete as possible;
(v) acceptable for tabulation; and
(vi) arranged to facilitate coding and tabulation.
With regard to the points or stages at which editing should be done, one can distinguish between field editing and central editing.
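As a small illustration of such editing checks, the sketch below flags incomplete, out-of-range and inconsistently entered records in a handful of invented survey responses; the field names, codes and thresholds are assumptions made for the example only.

# Editing checks on a small, invented set of survey records.
import pandas as pd

df = pd.DataFrame({
    "age":    [34, 27, None, 151, 42],
    "sex":    ["M", "f ", "F", "M", "x"],
    "income": [25000, 31000, 18000, None, 40000],
})

# Completeness: flag records with missing answers for follow-up.
incomplete = df[df.isna().any(axis=1)]

# Accuracy: flag implausible values for review rather than silently deleting them.
out_of_range = df[(df["age"] < 0) | (df["age"] > 120)]

# Uniform entry: normalize the sex codes, then flag anything still inconsistent.
df["sex"] = df["sex"].str.strip().str.upper()
inconsistent = df[~df["sex"].isin(["M", "F"])]

print(len(incomplete), "incomplete,", len(out_of_range), "out of range,",
      len(inconsistent), "inconsistently coded")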
6.2.2 Coding
Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a
limited number of categories or classes. Coding is necessary for efficient analysis; through it, the many replies
are reduced to a small number of classes which contain the critical information required for analysis. Coding
decisions should usually be taken at the design stage of the questionnaire.
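A minimal coding sketch is shown below; the question, the answer labels and the numeric codes in the codebook are illustrative assumptions.

# Coding: verbal answers are assigned numerals from a codebook so that the
# replies fall into a limited number of classes.
responses = ["Agree", "Strongly agree", "Disagree", "Agree", "Neutral"]

codebook = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

coded = [codebook[answer] for answer in responses]
print(coded)   # [4, 5, 2, 4, 3]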
6.2.3 Classification
Most research studies result in a large volume of raw data which must be reduced into homogeneous groups if
meaningful relationships are to be identified. In this step, data having common characteristics are placed in one class, and in this way
the entire data set is divided into a number of groups or classes. Classification can be of the following two types,
depending upon the nature of the phenomenon involved:
a) Classification according to attributes: Data are classified on the basis of common characteristics, which can
either be descriptive or numerical. Descriptive characteristics refer to qualitative phenomena which cannot be
measured quantitatively; only their presence or absence in an individual item can be noted. Data obtained in this
way on the basis of certain attributes are known as statistics of attributes, and their classification is said to be
classification according to attributes. Such classification can be simple or manifold:
    In simple classification we consider only one attribute and divide the universe into two classes: one
    consisting of items possessing the given attribute and the other consisting of items which do not
    possess it.
    In manifold classification we consider two or more attributes simultaneously and divide the data into a
    number of classes.
b) Classification according to class intervals: Unlike descriptive characteristics, numerical characteristics
refer to quantitative phenomena which can be measured in some statistical unit. Data relating to income,
production, age, weight, etc. come under this category. Such data are known as statistics of variables and are
classified on the basis of class intervals.
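As a small illustration, the sketch below classifies a handful of invented records first according to an attribute (literacy, a simple two-class division) and then according to class intervals of income; the field names and the interval width are assumptions.

# Classification according to attributes and according to class intervals.
from collections import Counter

people = [
    {"name": "A", "literate": True,  "income": 12000},
    {"name": "B", "literate": False, "income": 30500},
    {"name": "C", "literate": True,  "income": 45000},
    {"name": "D", "literate": True,  "income": 8000},
]

# Simple classification by one attribute: items possessing it vs. not possessing it.
possessing     = [p["name"] for p in people if p["literate"]]
not_possessing = [p["name"] for p in people if not p["literate"]]
print("literate:", possessing, "not literate:", not_possessing)

# Classification by class intervals: group income into classes of width 20,000.
def income_class(income, width=20000):
    lower = (income // width) * width
    return f"{lower}-{lower + width}"

frequency = Counter(income_class(p["income"]) for p in people)
print(frequency)   # e.g. Counter({'0-20000': 2, '20000-40000': 1, '40000-60000': 1})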
6.2.4 Tabulation
When a mass of data has been assembled, it becomes necessary for the researcher to arrange it in some kind of
concise and logical order. This procedure is referred to as tabulation. Thus, tabulation is the process of summarizing
raw data and displaying them in compact form for further analysis. In the broader sense, tabulation is an
orderly arrangement of data in columns and rows.
Tabulation can be done by hand or by mechanical or electronic devices. The choice depends on the size and
type of study, cost conditions, time pressures and the availability of tabulating machines or computers. The generally accepted principles of tabulation are listed below, followed by a brief worked example:
(i) Every table should have a clear, concise and adequate title so as to make the table intelligible without
reference to the text.
(ii) Every table should be given a distinct number to facilitate easy reference.
(iii) The column headings and row headings of the table should be clear and brief.
(iv) Units of measurement under each heading or sub-heading must always be indicated.
(v) Explanatory footnotes, if any, concerning the table should be placed directly beneath the table.
(vi) Source of data must be indicated below the table.
(vii) It is generally considered better to approximate figures before tabulation, as this reduces
unnecessary detail in the table itself.
(viii) In order to emphasize the relative significance of certain categories, different kinds of type, spacing and
indentations may be used.
(ix) Abbreviations should be avoided to the extent possible, and ditto marks should not be used in the table.
(x) The table should be made as logical, clear, accurate and simple as possible. Very large amounts of data should not be
crowded into a single table.
(xi) The arrangement of the categories in the table may be chronological, geographical, alphabetical or according
to magnitude to facilitate comparison.
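Here is the brief tabulation sketch referred to above, assuming a hypothetical pair of variables ('sex' and 'response') and made-up records; it builds a one-way frequency table and a two-way table with pandas.

# A one-way frequency table and a two-way (cross-classified) table with pandas.
import pandas as pd

df = pd.DataFrame({
    "sex":      ["M", "F", "F", "M", "F", "M"],
    "response": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
})

# One-way table: frequency of each response.
print(df["response"].value_counts())

# Two-way table: responses cross-classified by sex, with row and column totals.
table = pd.crosstab(df["sex"], df["response"], margins=True, margins_name="Total")
print(table)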
6.3 DATA ANALYSIS
Data analysis means studying the tabulated material in order to determine inherent facts or meanings. It involves
breaking down existing complex factors into simpler parts and putting the parts together in new arrangements for
the purpose of interpretation.
A plan of analysis can and should be prepared in advance, before the actual collection of material. A preliminary
analysis based on this skeleton plan should, as the investigation proceeds, develop into a complete final analysis, enlarged
and reworked as and when necessary. This process requires an alert, flexible and open mind, and caution is necessary at
every step. Where a plan of analysis has not been made beforehand, four approaches are helpful in getting started on
analyzing the gathered data:
(i) To think in terms of significant tables that the data permit.
(ii) To examine carefully the statement of the problem and the earlier analysis and to study the original
records of the data.
(iii) To get away from the data and to think about the problem in layman’s terms.
(iv) To attack the data by making various simple statistical calculations.
In the general process of analyzing research data, statistical methods have contributed a great deal.
Simple statistical calculations find a place in almost any research study dealing with large or even small groups of
individuals, while complex statistical computations form the basis of many types of research. It may not be out of
place, therefore, to enumerate some statistical methods of analysis used in educational research.
After the research tools have been administered and scored, the data are collected and organized. The collected data are known
as 'raw data.' Raw data are meaningless unless certain statistical treatment is applied to them. Analysis of data
means making the raw data meaningful, or drawing some results from the data after proper treatment. The 'null
hypotheses' are tested with the help of the analyzed data so as to obtain significant results. Thus, the analysis of data
serves the following main functions:
(i) To make the raw data meaningful,
(ii) To test null hypotheses,
(iii) To obtain significant results,
(iv) To draw some inferences or make generalization, and
(v) To estimate parameters.
There are two approaches employed in the analysis of data: parametric analysis and non-parametric analysis.
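As an illustration of these two approaches, here is a minimal sketch that runs a parametric test and its common non-parametric counterpart on the same null hypothesis of no difference between two groups; the scores, group names and choice of tests are assumptions for illustration only.

# A parametric and a non-parametric test of the same null hypothesis,
# computed with scipy on invented data.
from scipy import stats

group_a = [72, 85, 78, 90, 66, 81, 75, 88]
group_b = [64, 70, 69, 77, 61, 73, 68, 71]

# Parametric: independent-samples t-test (assumes roughly normal data).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: Mann-Whitney U test (rank-based, no normality assumption).
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.4f}")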
The terms "statistics" and "data analysis" refer to essentially the same thing: the study of how we describe, combine, and
make inferences from numbers. A lot of people are scared of numbers (quant phobia), but statistics has less to
do with the numbers themselves and more to do with the rules for arranging them. It even lets you create some of those rules yourself,
so instead of treating it as a lot of memorization, it is best to see it as an extension of the research mentality,
something researchers do (crunch numbers) to gain command over the numbers. After a while, the
principles behind the computations become clear, and there is no better way to reach that point than by
understanding the research purpose of statistics.
Without statistics, all you're doing is making educated guesses. In social science, that may seem like all that's
necessary, since we're studying the seemingly obvious anyway. However, there's a difference between something socially or
meaningfully significant and something statistically significant. Statistical significance is, first of all, short and
simple: you communicate as much with just one number as with a paragraph of description. Some people don't like
statistics because of this reductionism, but it has become the settled way researchers communicate with one another.
Secondly, statistical significance is what policy and decision making are based on. Policymakers will dismiss
anything non-statistical as anecdotal evidence.
Finally, just because something is statistically significant doesn't make it true. It's better than guessing, but you can
lie and deceive with statistics. Since statistics can mislead, there is no substitute for knowing something about the
topic, so that the researcher, following the most common interpretative approach, can say what is both meaningful
and statistically significant.
1) Descriptive statistics fall into one of two categories: measures of central tendency (mean, median, and mode)
or measures of dispersion (standard deviation and variance). Their purpose is to explore hunches that may have
come up during the course of the research process, but most people compute them to examine the normality of
their data. Examples include descriptive analysis of sex, age, race, social class, and so forth. (A short
computational sketch covering all three categories of statistics is given after this list.)
2) Relational statistics: The most commonly used relational statistic is correlation; it is a measure of the
strength of some relationship between two variables, not of causality. Interpretation of a correlation coefficient
does not allow even the slightest hint of causality. The most a researcher can say is that the variables share
something in common; that is, are related in some way. The more two things have something in common, the
more strongly they are related. There can also be negative relations, but the important quality of correlation
coefficients is not their sign but their absolute value. Relational analyses fall into one of three categories:
univariate, bivariate, and multivariate analysis. Univariate analysis is the study of one variable for a
subpopulation, for example, age of murderers, and the analysis is often descriptive. Bivariate analysis is the
study of a relationship between two variables, for example, murder and meanness, and the most commonly
known technique here is correlation. Multivariate analysis is the study of relationship between three or more
variables, for example, murder, meanness, and gun ownership, and for all techniques in this area, you simply
take the word "multiple" and put it in front of the bivariate technique used, as in multiple correlation.
3) Inferential statistics, also called inductive statistics, fall into one of two categories: tests for differences of
means and tests of statistical significance. The latter are further subdivided into parametric and
nonparametric tests, depending on whether the data are assumed to follow a known distribution, such as the
normal distribution (parametric), or no such distributional assumption is made (nonparametric). The purpose of
difference-of-means tests is to test hypotheses, and the most common techniques are called Z-tests. The most
common parametric tests of significance are the F-test, the t-test, and regression. Regression is the closest thing
to estimating causality in data analysis, because it measures how well the numbers "fit" a projected straight line.
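The following is a minimal sketch, on invented age and exam-score data, of the three categories just described: descriptive measures, a correlation coefficient, and a simple linear regression (a difference-of-means t-test was sketched earlier in the parametric/non-parametric example). All variable names and values are illustrative assumptions, not data from the text.

# Descriptive, relational and inferential statistics on invented data.
import statistics
from scipy import stats

ages   = [21, 25, 25, 29, 32, 25, 40, 37, 29, 25]
scores = [55, 61, 58, 70, 75, 60, 88, 82, 72, 64]

# 1) Descriptive statistics: central tendency and dispersion.
print("mean:", statistics.mean(ages),
      "median:", statistics.median(ages),
      "mode:", statistics.mode(ages))
print("variance:", statistics.variance(ages),
      "std dev:", round(statistics.stdev(ages), 2))

# 2) Relational statistics: Pearson correlation measures the strength of the
#    relationship between age and score, not causality.
r, r_p = stats.pearsonr(ages, scores)
print(f"correlation: r = {r:.2f} (p = {r_p:.4f})")

# 3) Inferential statistics: regression fits a straight line predicting score
#    from age and reports how well the data fit that line.
reg = stats.linregress(ages, scores)
print(f"regression: score = {reg.slope:.2f} * age + {reg.intercept:.2f} "
      f"(R^2 = {reg.rvalue ** 2:.2f})")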