CHAPTER 6: DATA ANALYSIS
Data analysis and interpretation refers to the application of deductive and inductive logic to the research process. The
data are often classified by division into subgroups and are then analyzed and synthesized in such a way that
hypotheses may be verified or rejected. The final result may be a new principle or generalization. Data are examined
in terms of comparison between the more homogeneous segments within the group and by comparison with some
outside criterion.
Each statistical method is based upon its own specific assumptions regarding the sample, the population and the research
conditions. Unless these factors are considered in advance, the researcher may find it impossible to make
valid comparisons for the purpose of drawing inferences.
The following are the stages through which the raw data must be processed in order to ultimately deliver the final
product: editing, coding, classification, and tabulation.
6.2.1 Editing
Editing refers to finding and removing any errors, incompleteness or inconsistency in the data. If the raw data are
erroneous, incomplete or inconsistent, these deficiencies will be carried through all subsequent stages of processing
and will greatly distort the results of the inquiry. Therefore, at this stage, certain questions are specified for 100
percent editing because they are known to be especially troublesome or particularly critical to the study objectives. The
editor is responsible for seeing that the data are:
(i) as accurate as possible;
(ii) consistent with other facts secured;
(iii) uniformly entered;
(iv) as complete as possible;
(v) acceptable for tabulation; and
(vi) arranged to facilitate coding and tabulation.
With regard to the points or stages at which editing should be done, one can distinguish between field editing and central editing.
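As a small illustration of such editing checks, the sketch below flags incomplete, out-of-range and inconsistently entered records in a handful of invented survey responses; the field names, codes and thresholds are assumptions made for the example only.

# Editing checks on a small, invented set of survey records.
import pandas as pd

df = pd.DataFrame({
    "age":    [34, 27, None, 151, 42],
    "sex":    ["M", "f ", "F", "M", "x"],
    "income": [25000, 31000, 18000, None, 40000],
})

# Completeness: flag records with missing answers for follow-up.
incomplete = df[df.isna().any(axis=1)]

# Accuracy: flag implausible values for review rather than silently deleting them.
out_of_range = df[(df["age"] < 0) | (df["age"] > 120)]

# Uniform entry: normalize the sex codes, then flag anything still inconsistent.
df["sex"] = df["sex"].str.strip().str.upper()
inconsistent = df[~df["sex"].isin(["M", "F"])]

print(len(incomplete), "incomplete,", len(out_of_range), "out of range,",
      len(inconsistent), "inconsistently coded")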
6.2.2 Coding
Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a
limited number of categories or classes. Coding is necessary for efficient analysis; through it, the many replies
are reduced to a small number of classes which contain the critical information required for analysis. Coding
decisions should usually be taken at the design stage of the questionnaire.
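A minimal coding sketch is shown below; the question, the answer labels and the numeric codes in the codebook are illustrative assumptions.

# Coding: verbal answers are assigned numerals from a codebook so that the
# replies fall into a limited number of classes.
responses = ["Agree", "Strongly agree", "Disagree", "Agree", "Neutral"]

codebook = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

coded = [codebook[answer] for answer in responses]
print(coded)   # [4, 5, 2, 4, 3]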
6.2.3 Classification
Most research studies result in a large volume of raw data which must be reduced into homogeneous groups if
meaningful relationships are to be identified. In this step, data having common characteristics are placed in one class, and in this way
the entire data set is divided into a number of groups or classes. Classification can be of the following two types,
depending upon the nature of the phenomenon involved:
a) Classification according to attributes: Data are classified on the basis of common characteristics, which can
either be descriptive or numerical. Descriptive characteristics refer to qualitative phenomena which cannot be
measured quantitatively; only their presence or absence in an individual item can be noted. Data obtained in this
way on the basis of certain attributes are known as statistics of attributes, and their classification is said to be
classification according to attributes. Such classification can be simple or manifold:
    In simple classification we consider only one attribute and divide the universe into two classes: one
    consisting of items possessing the given attribute and the other consisting of items which do not
    possess it.
    In manifold classification we consider two or more attributes simultaneously and divide the data into a
    number of classes.
b) Classification according to class intervals: Unlike descriptive characteristics, numerical characteristics
refer to quantitative phenomena which can be measured in some statistical unit. Data relating to income,
production, age, weight, etc. come under this category. Such data are known as statistics of variables and are
classified on the basis of class intervals.
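As a small illustration, the sketch below classifies a handful of invented records first according to an attribute (literacy, a simple two-class division) and then according to class intervals of income; the field names and the interval width are assumptions.

# Classification according to attributes and according to class intervals.
from collections import Counter

people = [
    {"name": "A", "literate": True,  "income": 12000},
    {"name": "B", "literate": False, "income": 30500},
    {"name": "C", "literate": True,  "income": 45000},
    {"name": "D", "literate": True,  "income": 8000},
]

# Simple classification by one attribute: items possessing it vs. not possessing it.
possessing     = [p["name"] for p in people if p["literate"]]
not_possessing = [p["name"] for p in people if not p["literate"]]
print("literate:", possessing, "not literate:", not_possessing)

# Classification by class intervals: group income into classes of width 20,000.
def income_class(income, width=20000):
    lower = (income // width) * width
    return f"{lower}-{lower + width}"

frequency = Counter(income_class(p["income"]) for p in people)
print(frequency)   # e.g. Counter({'0-20000': 2, '20000-40000': 1, '40000-60000': 1})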
6.2.4 Tabulation
When a mass of data has been assembled, it becomes necessary for the researcher to arrange it in some kind of
concise and logical order. This procedure is referred to as tabulation. Thus, tabulation is the process of summarizing
raw data and displaying them in compact form for further analysis. In the broader sense, tabulation is an
orderly arrangement of data in columns and rows.
Tabulation can be done by hand or by mechanical or electronic devices. The choice depends on the size and
type of study, cost conditions, time pressures and the availability of tabulating machines or computers. The generally accepted principles of tabulation are listed below, followed by a brief worked example:
(i) Every table should have a clear, concise and adequate title so as to make the table intelligible without
reference to the text.
(ii) Every table should be given a distinct number to facilitate easy reference.
(iii) The column headings and row headings of the table should be clear and brief.
(iv) Units of measurement under each heading or sub-heading must always be indicated.
(v) Explanatory footnotes, if any, concerning the table should be placed directly beneath the table.
(vi) Source of data must be indicated below the table.
(vii) It is generally considered better to approximate figures before tabulation, as this reduces
unnecessary detail in the table itself.
(viii) In order to emphasize the relative significance of certain categories, different kinds of type, spacing and
indentations may be used.
(ix) Abbreviations should be avoided to the extent possible, and ditto marks should not be used in the table.
(x) The table should be made as logical, clear, accurate and simple as possible. Very large amounts of data should not be
crowded into a single table.
(xi) The arrangement of the categories in the table may be chronological, geographical, alphabetical or according
to magnitude to facilitate comparison.
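Here is the brief tabulation sketch referred to above, assuming a hypothetical pair of variables ('sex' and 'response') and made-up records; it builds a one-way frequency table and a two-way table with pandas.

# A one-way frequency table and a two-way (cross-classified) table with pandas.
import pandas as pd

df = pd.DataFrame({
    "sex":      ["M", "F", "F", "M", "F", "M"],
    "response": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
})

# One-way table: frequency of each response.
print(df["response"].value_counts())

# Two-way table: responses cross-classified by sex, with row and column totals.
table = pd.crosstab(df["sex"], df["response"], margins=True, margins_name="Total")
print(table)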
6.3 DATA ANALYSIS
Data analysis means studying the tabulated material in order to determine inherent facts or meanings. It involves
breaking down existing complex factors into simpler parts and putting the parts together in new arrangements for
the purpose of interpretation.
A plan of analysis can and should be prepared in advance, before the actual collection of material. A preliminary
analysis based on this skeleton plan should, as the investigation proceeds, develop into a complete final analysis, enlarged
and reworked as and when necessary. This process requires an alert, flexible and open mind, and caution is necessary at
every step. Where a plan of analysis has not been made beforehand, four approaches are helpful in getting started on
analyzing the gathered data:
(i) To think in terms of significant tables that the data permit.
(ii) To examine carefully the statement of the problem and the earlier analysis and to study the original
records of the data.
(iii) To get away from the data and to think about the problem in layman’s terms.
(iv) To attack the data by making various simple statistical calculations.
In the general process of analyzing research data, statistical methods have contributed a great deal.
Simple statistical calculations find a place in almost any research study dealing with large or even small groups of
individuals, while complex statistical computations form the basis of many types of research. It may not be out of
place, therefore, to enumerate some statistical methods of analysis used in educational research.
After the research tools have been administered and scored, the data are collected and organized. The collected data are known
as 'raw data.' Raw data are meaningless unless certain statistical treatment is applied to them. Analysis of data
means making the raw data meaningful, or drawing some results from the data after proper treatment. The 'null
hypotheses' are tested with the help of the analyzed data so as to obtain significant results. Thus, the analysis of data
serves the following main functions:
(i) To make the raw data meaningful,
(ii) To test null hypotheses,
(iii) To obtain significant results,
(iv) To draw some inferences or make generalization, and
(v) To estimate parameters.
There are two approaches employed in the analysis of data: parametric analysis and non-parametric analysis.
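As an illustration of these two approaches, here is a minimal sketch that runs a parametric test and its common non-parametric counterpart on the same null hypothesis of no difference between two groups; the scores, group names and choice of tests are assumptions for illustration only.

# A parametric and a non-parametric test of the same null hypothesis,
# computed with scipy on invented data.
from scipy import stats

group_a = [72, 85, 78, 90, 66, 81, 75, 88]
group_b = [64, 70, 69, 77, 61, 73, 68, 71]

# Parametric: independent-samples t-test (assumes roughly normal data).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: Mann-Whitney U test (rank-based, no normality assumption).
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.4f}")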
The terms "statistics" and "data analysis" refer to essentially the same thing: the study of how we describe, combine, and
make inferences from numbers. A lot of people are scared of numbers (quant phobia), but statistics has less to
do with the numbers themselves and more to do with the rules for arranging them. It even lets you create some of those rules yourself,
so instead of treating it as a lot of memorization, it is best to see it as an extension of the research mentality,
something researchers do (crunch numbers) to gain command over the numbers. After a while, the
principles behind the computations become clear, and there is no better way to reach that point than by
understanding the research purpose of statistics.
Without statistics, all you're doing is making educated guesses. In social science, that may seem like all that's
necessary, since we're studying the seemingly obvious anyway. However, there's a difference between something socially or
meaningfully significant and something statistically significant. Statistical significance is, first of all, short and
simple: you communicate as much with just one number as with a paragraph of description. Some people don't like
statistics because of this reductionism, but it has become the settled way researchers communicate with one another.
Secondly, statistical significance is what policy and decision making are based on. Policymakers will dismiss
anything non-statistical as anecdotal evidence.
Finally, just because something is statistically significant doesn't make it true. It's better than guessing, but you can
lie and deceive with statistics. Since statistics can mislead, there is no substitute for knowing something about the
topic, so that the researcher, following the most common interpretative approach, can say what is both meaningful
and statistically significant.
1) Descriptive statistics fall into one of two categories: measures of central tendency (mean, median, and mode)
or measures of dispersion (standard deviation and variance). Their purpose is to explore hunches that may have
come up during the course of the research process, but most people compute them to examine the normality of
their data. Examples include descriptive analysis of sex, age, race, social class, and so forth. (A short
computational sketch covering all three categories of statistics is given after this list.)
2) Relational statistics: The most commonly used relational statistic is correlation; it is a measure of the
strength of some relationship between two variables, not of causality. Interpretation of a correlation coefficient
does not allow even the slightest hint of causality. The most a researcher can say is that the variables share
something in common; that is, are related in some way. The more two things have something in common, the
more strongly they are related. There can also be negative relations, but the important quality of correlation
coefficients is not their sign but their absolute value. Relational analyses fall into one of three categories:
univariate, bivariate, and multivariate analysis. Univariate analysis is the study of one variable for a
subpopulation, for example, age of murderers, and the analysis is often descriptive. Bivariate analysis is the
study of a relationship between two variables, for example, murder and meanness, and the most commonly
known technique here is correlation. Multivariate analysis is the study of relationship between three or more
variables, for example, murder, meanness, and gun ownership, and for all techniques in this area, you simply
take the word "multiple" and put it in front of the bivariate technique used, as in multiple correlation.
3) Inferential statistics, also called inductive statistics, fall into one of two categories: tests for differences of
means and tests of statistical significance. The latter are further subdivided into parametric and
nonparametric tests, depending on whether the data are assumed to follow a known distribution, such as the
normal distribution (parametric), or no such distributional assumption is made (nonparametric). The purpose of
difference-of-means tests is to test hypotheses, and the most common techniques are called Z-tests. The most
common parametric tests of significance are the F-test, the t-test, and regression. Regression is the closest thing
to estimating causality in data analysis, because it measures how well the numbers "fit" a projected straight line.
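The following is a minimal sketch, on invented age and exam-score data, of the three categories just described: descriptive measures, a correlation coefficient, and a simple linear regression (a difference-of-means t-test was sketched earlier in the parametric/non-parametric example). All variable names and values are illustrative assumptions, not data from the text.

# Descriptive, relational and inferential statistics on invented data.
import statistics
from scipy import stats

ages   = [21, 25, 25, 29, 32, 25, 40, 37, 29, 25]
scores = [55, 61, 58, 70, 75, 60, 88, 82, 72, 64]

# 1) Descriptive statistics: central tendency and dispersion.
print("mean:", statistics.mean(ages),
      "median:", statistics.median(ages),
      "mode:", statistics.mode(ages))
print("variance:", statistics.variance(ages),
      "std dev:", round(statistics.stdev(ages), 2))

# 2) Relational statistics: Pearson correlation measures the strength of the
#    relationship between age and score, not causality.
r, r_p = stats.pearsonr(ages, scores)
print(f"correlation: r = {r:.2f} (p = {r_p:.4f})")

# 3) Inferential statistics: regression fits a straight line predicting score
#    from age and reports how well the data fit that line.
reg = stats.linregress(ages, scores)
print(f"regression: score = {reg.slope:.2f} * age + {reg.intercept:.2f} "
      f"(R^2 = {reg.rvalue ** 2:.2f})")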