Data Editing and Validation
Preeti Shukla
Data Editing
Technically speaking, data editing is the processing of data, involving text editing, coding, classification and tabulation, so that the data can be analyzed and further validated.
[Figure: example data on students' weight and age, classified into groups A and B]
Analysis of Data
(Searching for patterns of relationship among groups of data)
1. Descriptive Analysis (study of the distributions of variables)
   - One-dimensional analysis
   - Bivariate analysis
   - Multivariate analysis
2. Inferential Analysis (hypothesis testing for drawing inferences)
   - Parametric tests
One-dimensional Analysis
Involves the calculation of several measures, mostly concerning one variable. These measures include:
1. Measures of central tendency (mean, median and mode). The mean is the sum of the values of a variable divided by the number of values $n$ in the series:
$\bar{X} = \frac{\sum X_i}{n}$
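The measures of central tendency above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the `ages` values are hypothetical, and the standard-library `statistics` module is used for the median and mode.

```python
from statistics import mean, median, multimode

# Hypothetical ages of seven students (illustrative values only).
ages = [10, 12, 8, 12, 11, 9, 12]

n = len(ages)
mean_by_formula = sum(ages) / n  # X-bar = (sum of X_i) / n

print("mean:", mean_by_formula)
print("median:", median(ages))      # middle value of the sorted series
print("mode(s):", multimode(ages))  # most frequent value(s)
```

`multimode` returns a list because a series may have more than one most-frequent value.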
Measures of dispersion
These measures give an idea of how the values of a variable are scattered around the mean value of a series. They are:
1. Range: the simplest possible measure of dispersion, defined as the difference between the two extreme values of a variable.
Range = (Highest value of variable) - (Lowest value of variable)
2. Mean deviation: the average of the absolute differences between the values of a variable and the average of the series.
Mean deviation = $\frac{\sum |X_i - \bar{X}|}{n}$
3. Standard deviation: the square root of the average of the squared deviations from the arithmetic mean.
Standard deviation = $\sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$
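As a worked illustration of the dispersion measures, the following sketch computes the range, mean deviation and standard deviation for a small hypothetical series. It assumes the divisor $n$ (the plain average of squared deviations), as the definitions here suggest; many textbooks use $n - 1$ for sample data.

```python
from math import sqrt

# Hypothetical series (illustrative values only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
x_bar = sum(values) / n  # arithmetic mean of the series

# Range: highest value minus lowest value.
value_range = max(values) - min(values)

# Mean deviation: average absolute difference from the mean.
mean_deviation = sum(abs(x - x_bar) for x in values) / n

# Standard deviation: square root of the average squared deviation (divisor n).
std_deviation = sqrt(sum((x - x_bar) ** 2 for x in values) / n)

print(value_range, mean_deviation, std_deviation)
```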
In the case of positive skewness we have Mode < Median < Mean, and in the case of negative skewness we have Mean < Median < Mode.
Bivariate Analysis
It involves the analysis of two variables or attributes in a two-way classification, using the following methods:
1. Simple regression and correlation
2. Association of attributes
3. Two-way ANOVA (software: SPSS)
[Figure: example variables: height, weight, altitude]
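Simple regression and correlation for two variables can be sketched as follows. The height and weight values are hypothetical, and the least-squares line and Pearson correlation coefficient are computed directly from their textbook definitions rather than with SPSS.

```python
from math import sqrt

# Hypothetical paired observations (illustrative values only).
heights = [150, 160, 170, 180, 190]  # cm
weights = [50, 56, 64, 70, 80]       # kg

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Sums of squares and cross-products of deviations from the means.
sxx = sum((x - mean_x) ** 2 for x in heights)
syy = sum((y - mean_y) ** 2 for y in weights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))

# Least-squares regression of weight on height: weight = intercept + slope * height.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Pearson correlation coefficient.
r = sxy / sqrt(sxx * syy)

print(f"weight = {intercept:.2f} + {slope:.2f} * height, r = {r:.3f}")
```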
Multivariate Analysis
It is the simultaneous analysis of more than two variables or attributes in a multiway classification. It involves:
1. Multiple regression and multiple correlation
2. Multi-ANOVA
3. Canonical analysis
4. Cluster analysis
[Figure: example variables: height, weight, father's income]
Inferential analysis
It is done through parametric tests of hypotheses and nonparametric tests of hypotheses.
χ² test
For comparing the sample variance with a theoretical variance.
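A minimal sketch of the χ² statistic for comparing a sample variance with a theoretical variance is given below. The sample and the hypothesized variance `sigma0_sq` are hypothetical, and the sketch assumes the usual form of the statistic, $\chi^2 = (n - 1) s^2 / \sigma_0^2$ with $n - 1$ degrees of freedom, where $s^2$ is the sample variance with divisor $n - 1$.

```python
from statistics import variance

# Hypothetical sample and hypothesized (theoretical) variance.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sigma0_sq = 4.0

n = len(sample)
s_sq = variance(sample)  # sample variance, divisor n - 1

# Test statistic and its degrees of freedom.
chi_sq = (n - 1) * s_sq / sigma0_sq
df = n - 1

print(f"chi-square = {chi_sq:.3f} with {df} degrees of freedom")
```

The computed statistic is then compared with the tabulated χ² value for the chosen significance level and the same degrees of freedom.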
Nonparametric Tests
(No assumptions about the form of the population distribution)
1. Sign test
2. Fisher-Irwin test
3. Rank correlation
4. One-sample runs test
5. Chi-square test
First, the absolute values of the differences are taken (ignoring signs) and ranked, with the smallest difference ranked 1 and so on; then the signs are restored to the ranks.
1. The ranks with negative signs total 15 and those with positive signs total 40.
2. From the tabulated theoretical value, the smaller rank sum would have to be 8 or less for 10 pairs to reject the null hypothesis.
3. Since 15 > 8, the null hypothesis that elongation is unaffected by electric current treatment is not rejected.
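The ranking procedure described above can be sketched in Python. The `differences` below are hypothetical and are not the elongation data of the example; the function name `signed_rank_sums` is also an illustrative choice. Tied absolute values receive the average of the ranks they span, and zero differences are dropped, which is a common convention assumed here.

```python
def signed_rank_sums(differences):
    """Return (sum of ranks with negative sign, sum of ranks with positive sign)."""
    # Drop zero differences and order the rest by absolute value.
    nonzero = sorted((d for d in differences if d != 0), key=abs)
    neg_sum = pos_sum = 0.0
    i = 0
    while i < len(nonzero):
        # Find the run of tied absolute values starting at position i.
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for d in nonzero[i:j]:
            if d < 0:
                neg_sum += avg_rank
            else:
                pos_sum += avg_rank
        i = j
    return neg_sum, pos_sum


# Hypothetical paired differences for 10 pairs (illustrative values only).
differences = [3, -1, 5, -2, 4, 6, -7, 8, 9, 10]
neg_sum, pos_sum = signed_rank_sums(differences)
smaller_sum = min(neg_sum, pos_sum)

critical_value = 8  # tabulated value for 10 pairs, as quoted in the text above
reject_null = smaller_sum <= critical_value
print(neg_sum, pos_sum, reject_null)
```

As a check, the two rank sums always add up to $n(n + 1)/2$, which is 55 for 10 pairs, in agreement with 15 + 40 in the example.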
References:
1. en.wikipedia.org
2. Research Methodology, C. R. Kothari
3. Statistical Methods, Snedecor & Cochran
Acknowledgement
Dr. L. M. Tewari Sir
Dr. Ashish Tewari Sir
Dr. Chitra Pandey Ma'am
Dr. Geeta Tewari Ma'am
Respected teachers and dear friends