
Quantitative Research Methods

Data Processing and Analysis

Zelalem G. Terfa (PhD)


Center for Environment and Development
College of Development Studies
Addis Ababa University
Steps in data processing

1. Editing data
2. Coding data
3. Entering data
4. Managing data
5. Cleaning data
6. Recoding data
7. Handling missing data
8. Generating descriptive statistics
Steps in data processing

▪ Editing: the process of examining the collected raw data to detect errors and omissions and to correct these where possible.
  – To ensure that the data are accurate and consistent with other facts gathered.
  – We need to do both field editing and central editing.

▪ Field editing: the review of forms by the investigator to complete or clarify what the interviewer has recorded in abbreviated or unclear form.
  – Done as soon as possible after the interview, preferably on the very same day.

▪ Central editing: carried out when all forms or questionnaires have been completed and returned to the office.
  – The editor must strike out an answer if it is inappropriate and there is no basis for determining the correct response.
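A minimal sketch of what central-editing checks might look like in practice, assuming the returned questionnaires have been entered into a pandas DataFrame; the column names (respondent_id, age, income) and the plausibility limits are hypothetical:

```python
import pandas as pd

# Hypothetical survey responses; column names are illustrative only.
df = pd.DataFrame({
    "respondent_id": [1, 2, 2, 3, 4],
    "age": [34, 251, 251, 28, None],       # 251 is an implausible value
    "income": [1200, 900, 900, -50, 700],  # negative income is inconsistent
})

# Flag duplicate questionnaires returned to the office.
duplicates = df[df.duplicated(subset="respondent_id", keep=False)]

# Flag implausible or inconsistent entries for the editor to review.
suspect_age = df[(df["age"] < 0) | (df["age"] > 110)]
suspect_income = df[df["income"] < 0]

print(duplicates)
print(suspect_age)
print(suspect_income)
```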
Steps in data processing
▪ Data coding: the process of assigning numerals to answers.
  – Coding decisions should usually be taken at the design stage of the questionnaire.
  – The coding categories must be exhaustive and mutually exclusive.
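A short illustration of coding, assuming Likert-type answers and a hypothetical code book defined at the questionnaire design stage (the categories are exhaustive and mutually exclusive):

```python
import pandas as pd

# Hypothetical answers to a single closed question.
answers = pd.Series(["agree", "disagree", "neutral", "agree", "strongly agree"])

# Code book defined at the questionnaire design stage.
code_book = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

coded = answers.map(code_book)
print(coded.tolist())   # [4, 2, 3, 4, 5]
```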
Steps in data processing
▪ Data entry: transferring the data from paper forms into data management software; this is less common these days.
▪ Data managing: involves collecting, organizing, protecting and storing data.
▪ Dealing with missing data
  – Sample unit non-response
    • Population weighting can be used to adjust for selection bias and unit non-response bias.
  – Item non-response
    • Two ways of dealing with missing data:
      – Case deletion: observations/cases with missing data points are removed from the analysis altogether. This is the most common method of handling missing data.
      – Imputation methods: impute values for missing cases, for example recoding them to the mean of the valid cases, perhaps with some other adjustments.
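A minimal sketch contrasting case deletion with unconditional mean imputation, using pandas on a made-up data frame with item non-response in a hypothetical income variable:

```python
import pandas as pd

# Hypothetical dataset with item non-response in 'income'.
df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38],
    "income": [1800, None, 2500, None, 2100],
})

# Case deletion: drop observations with any missing item.
complete_cases = df.dropna()

# Unconditional mean imputation: replace missing values with
# the mean of the observed (non-missing) values.
mean_imputed = df.fillna({"income": df["income"].mean()})

print(complete_cases)
print(mean_imputed)
```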
Steps in data processing
Imputation methods:

▪ Unconditional mean imputation: replaces the missing values of a variable with the average of the observed (non-missing) observations for that variable.

▪ Imputation from the unconditional distribution: replaces the missing values with random draws from the observed values. It preserves the shape of the distribution.

▪ Conditional mean imputation: fits a regression model to the cases with non-missing observations. The estimated equation, with its parameter estimates and the known (non-missing) covariates, is then used to predict the missing values of the dependent variable.
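A sketch of conditional mean (regression) imputation under the simple assumption of one fully observed covariate x and a dependent variable y with missing values; all numbers are made up:

```python
import numpy as np

# Hypothetical data: y has missing values, x is fully observed.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, np.nan, 6.2, 7.9, np.nan, 12.3])

observed = ~np.isnan(y)

# Fit y = b0 + b1*x on the complete cases.
X_obs = np.column_stack([np.ones(observed.sum()), x[observed]])
coef, *_ = np.linalg.lstsq(X_obs, y[observed], rcond=None)

# Predict the missing values from the estimated equation.
X_mis = np.column_stack([np.ones((~observed).sum()), x[~observed]])
y_imputed = y.copy()
y_imputed[~observed] = X_mis @ coef
print(y_imputed)
```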
Steps in data processing

Imputation methods:

▪ Imputation from the conditional distribution: a simple extension of regression imputation that prevents the distortion of covariances in conditional mean imputation by adding a stochastic term with mean 0 and variance equal to the residual variance from the regression.

▪ Multiple imputation: the previously discussed imputation methods fill in each missing data gap with a single value and are therefore referred to as single imputation methods.
  – Multiple imputation incorporates a random component in each imputed value, which rules out the uniqueness of the completed data set. Here a vector of imputed values, rather than a single value, is generated for each missing datum.
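A simplified sketch of the idea behind the last two methods: stochastic regression imputation (prediction plus a random draw with the regression's residual variance), repeated m times to produce m completed data sets. A full multiple-imputation procedure would also reflect uncertainty in the regression parameters; that step is omitted here, and the data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with missing y values (as in the previous sketch).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, np.nan, 6.2, 7.9, np.nan, 12.3])
observed = ~np.isnan(y)

X_obs = np.column_stack([np.ones(observed.sum()), x[observed]])
coef, residuals, *_ = np.linalg.lstsq(X_obs, y[observed], rcond=None)
resid_var = residuals[0] / (observed.sum() - 2)   # residual variance

m = 5                       # number of imputed data sets
X_mis = np.column_stack([np.ones((~observed).sum()), x[~observed]])
completed = []
for _ in range(m):
    y_m = y.copy()
    # Imputation from the conditional distribution: prediction plus a
    # stochastic term with mean 0 and the regression's residual variance.
    noise = rng.normal(0.0, np.sqrt(resid_var), size=(~observed).sum())
    y_m[~observed] = X_mis @ coef + noise
    completed.append(y_m)

# Each missing datum now has a vector of m imputed values.
print(np.column_stack(completed))
```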
Quantitative Data Analysis
Quantitative Data Analysis
Types of variables – a revisit

• Interval/ratio variables: variables where the distances between the categories are identical across the range.
• Ordinal variables: variables whose categories can be rank ordered but the distances between the categories are not equal across the range.
• Nominal variables: variables whose categories cannot be rank ordered; also known as categorical variables.
▪ Dichotomous variables: variables containing data that have only two categories.

It is important to understand the types of variables before starting the analysis.
• The data produced from quantitative research can be analysed using both descriptive and inferential statistics.
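As an illustration only, the variable types above might be declared in pandas as follows (the column names and values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "region":   ["north", "south", "east"],    # nominal
    "severity": ["low", "high", "medium"],      # ordinal
    "employed": [True, False, True],            # dichotomous
    "income":   [1800.0, 2500.0, 2100.0],       # interval/ratio
})

# Nominal: categories with no order.
df["region"] = pd.Categorical(df["region"])

# Ordinal: categories with a defined rank order.
df["severity"] = pd.Categorical(
    df["severity"], categories=["low", "medium", "high"], ordered=True
)

print(df.dtypes)
```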

Quantitative Data Analysis

Descriptive statistics

• These involve summarising and depicting data.
• One type of descriptive statistic is the average value, or measure of central tendency.
  • Mean: sum of all values divided by the number (or frequency) of the values.
  • Mode: most common value. For example, the most common test result amongst a cohort of students.
  • Median: the value mid-way through the whole range, with 50% of values below and 50% above. For example, the height that is mid-way through the range of heights in a class of children.
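A tiny worked example of the three measures of central tendency, using Python's standard statistics module on made-up test scores:

```python
import statistics

# Hypothetical test scores for a cohort of students.
scores = [55, 62, 62, 70, 74, 81, 90]

print(statistics.mean(scores))    # sum of values / number of values
print(statistics.median(scores))  # mid-way value: 50% below, 50% above
print(statistics.mode(scores))    # most common value (62)
```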
Quantitative Data Analysis

• Measures of dispersion: these indicate how spread out the data are.
  • Range: the largest value minus the smallest.
  • Standard deviation: the average amount by which sample values vary from the mean.

• It may also be useful to show data in diagrammatic form within your Results, using charts and plots.
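Continuing the same made-up scores, a sketch of the two measures of dispersion (the sample standard deviation is used here):

```python
import statistics

# Same hypothetical test scores as above.
scores = [55, 62, 62, 70, 74, 81, 90]

data_range = max(scores) - min(scores)   # largest minus smallest = 35
sample_sd = statistics.stdev(scores)     # sample standard deviation

print(data_range, round(sample_sd, 2))
```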
Quantitative Data Analysis

Inferential statistics

▪ Important for the estimation of population values and for testing statistical hypotheses.

▪ Concerned with making generalizations from sample data by using concepts of probability.

▪ In general, it consists of generalizing from sample to population, performing hypothesis testing, determining relationships among variables, making predictions, etc.

• It may involve two variables (bivariate analysis) or multiple variables (multivariate analysis).
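As one common example of inferential statistics, a hedged sketch of a one-sample t-test with scipy, testing a hypothesis about a population mean from sample data; the sample values and the hypothesised mean of 12.0 are made up:

```python
from scipy import stats

# Hypothetical sample drawn from a larger population.
sample = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 12.2, 11.8]

# Test the null hypothesis that the population mean equals 12.0.
t_stat, p_value = stats.ttest_1samp(sample, popmean=12.0)

print(round(t_stat, 3), round(p_value, 3))
# A small p-value (e.g. < 0.05) would suggest the population mean differs from 12.0.
```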
Quantitative Data Analysis

• Which statistical test(s) should you use?
  • That depends on your research question, and you should aim to decide which statistical tests you will use as part of your Research Design, before you start your research.
  • It also depends on the types of variables involved in the analysis.
  • Some of the tests that are commonly used in quantitative research in the social sciences are summarised next.
Quantitative Data Analysis
[Tables summarising commonly used statistical tests – not reproduced here]
Quantitative Data Analysis

Correlation and regression
• These terms refer to the relationship between two scale variables, which can be represented by a straight line (the line of best fit, or trendline).
• A number called the 'correlation coefficient' (sometimes referred to as 'r') indicates how closely the two variables are linked (in terms of the line of best fit) and takes values from +1 to -1.

Types of correlation:
• Simple correlation: between two variables (one considered as dependent and the other as independent).
• Multiple correlation: among more than two variables (one dependent variable and the others as independent variables).
• Partial correlation: measures the association between an independent and a dependent variable, allowing for the variation associated with specified other independent variables.
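A minimal sketch of simple (Pearson) correlation between two made-up scale variables using numpy:

```python
import numpy as np

# Hypothetical paired observations of two scale variables.
hours_studied = np.array([2, 4, 5, 7, 8, 10])
exam_score    = np.array([52, 58, 60, 71, 74, 83])

# Pearson correlation coefficient r, between +1 and -1.
r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 3))   # close to +1 => strong positive linear association
```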
Quantitative Data Analysis

• Regression analysis: provides further information about the line of best fit, and also indicates, via the usual p-value, whether the relationship between the variables is significant.
  • Often called econometric analysis.
  • It estimates a response variable as a function of a set of independent variables.
• The choice of an appropriate econometric model is guided by:
  – Economic theory
  – The nature of the data at hand
  – Previous studies
• Designing the regression equation for testing the different hypotheses relating to the various issues of the research:
  • Non-parametric
  • Parametric
    – OLS, logit, probit, tobit, etc.
The Regression Model: example

  Costs = α + β1·Mileage + β2·Age + β3·Make + ε

  – Costs: the dependent variable
  – Mileage, Age, Make: the explanatory (independent) variables
  – α: the intercept; β1, β2, β3: the coefficients
  – ε: the residual term
  – The model has a linear mathematical structure; the statistical significance of each coefficient is then assessed.
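A hedged sketch of estimating the example cost model by OLS with statsmodels on simulated data; the variable names follow the slide, Make is treated as a 0/1 dummy, and all numbers are made up:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Made-up vehicle data for illustration only.
n = 50
df = pd.DataFrame({
    "Mileage": rng.uniform(5_000, 60_000, n),
    "Age":     rng.integers(1, 12, n),
    "Make":    rng.integers(0, 2, n),      # dummy: 0/1 for two makes
})
eps = rng.normal(0, 300, n)                # residual term
df["Costs"] = 200 + 0.01 * df["Mileage"] + 40 * df["Age"] + 150 * df["Make"] + eps

# Fit Costs = alpha + b1*Mileage + b2*Age + b3*Make + e by OLS.
X = sm.add_constant(df[["Mileage", "Age", "Make"]])
model = sm.OLS(df["Costs"], X).fit()

# Coefficients and p-values indicate the statistical significance
# of each explanatory variable.
print(model.params)
print(model.pvalues)
```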
Data analysis and presentation

• After the data have been collected and organized, the next step is presentation.
• The tools for classification of data are the frequency distribution, cumulative frequency distribution, relative frequency distribution and charts.
• Charts are graphical representations of data.
• Charts are of different types: pie chart, bar chart, histogram, ...
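A small sketch of building frequency, relative frequency and cumulative frequency distributions with pandas (the responses are hypothetical):

```python
import pandas as pd

# Hypothetical categorical responses.
responses = pd.Series(["yes", "no", "yes", "yes", "no", "undecided", "yes"])

freq = responses.value_counts()                     # frequency distribution
rel_freq = responses.value_counts(normalize=True)   # relative frequency
cum_freq = freq.cumsum()                            # cumulative frequency

table = pd.DataFrame({"frequency": freq,
                      "relative": rel_freq,
                      "cumulative": cum_freq})
print(table)

# A chart is a graphical representation of the same information, e.g.:
# table["frequency"].plot(kind="bar")   # requires matplotlib
```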
Data analysis and presentation

Frequency distributions:
• Continuous distribution
  – Normal curve
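For reference, a couple of standard normal quantities computed with scipy (purely illustrative):

```python
from scipy import stats

# Standard normal curve: mean 0, standard deviation 1.
z = 1.96
print(stats.norm.pdf(0.0))                      # height of the curve at the mean
print(stats.norm.cdf(z) - stats.norm.cdf(-z))   # about 0.95 of values within ±1.96 SD
```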
Data analysis and presentation

Frequency distributions:
• Discrete distribution
  – Binomial distribution: it is used when there are exactly two mutually exclusive outcomes of a trial. These outcomes are appropriately labeled "success" and "failure".
Data analysis and presentation

Frequency distributions:
• Discrete distribution
  – Poisson distribution: it is used to model the number of events occurring within a given time interval.
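A brief illustration of both discrete distributions with scipy.stats; the parameter values (n = 10, p = 0.3, λ = 4) are arbitrary:

```python
from scipy import stats

# Binomial: number of "successes" in n independent trials,
# each with two mutually exclusive outcomes (success/failure).
n, p = 10, 0.3
print(stats.binom.pmf(3, n, p))    # P(exactly 3 successes out of 10)

# Poisson: number of events occurring within a given time interval.
lam = 4.0                          # mean number of events per interval
print(stats.poisson.pmf(2, lam))   # P(exactly 2 events in the interval)
```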
Data analysis and interpretation

• Validation of results
• Comparing your results with previous empirical work
• Testing of hypotheses is already built in through a significance level
  – Parametric

• Interpretation of results and conclusion

• The last stage in a statistical investigation is interpretation, i.e., drawing conclusions from the analyzed data.

• The interpretation of data is a difficult task and needs a high degree of skill and experience.
Data analysis and interpretation

• If the data that have been analyzed are not properly interpreted, the whole objective of the investigation may be defeated, and fallacious conclusions may be drawn.

• Correct interpretation may lead to a valid conclusion of the study, and thus can aid in decision making.
