Reliability and Validity in Research

Fidelity, Reliability and Validity in Research
Fidelity = a measure of the realism of a model or simulation: the degree to which a model reproduces the state and
behavior of a real-world object, feature, phenomenon etc. It can be assessed through theoretical analysis and expert
judgment of goodness of fit.
Reliability = the extent to which a measure (an instrument) produces consistent results on similar subjects under
similar conditions. It is analogous to the precision of a measurement.
Internal consistency reliability = how well the individual measures included in the research combine into a
composite measure. In other words, it represents the degree of correlation between our research instrument, which is
supposed to measure what we want to measure, and a hypothetical ideal instrument (scale or construct) that
measures exactly what we want but does not exist in reality.
Types of reliability testing:

- Internal consistency reliability - Cronbach's alpha coefficient, believed to indirectly indicate the degree to
which a set of items measures a single unidimensional latent construct. In classical test theory, reliability is
the squared correlation between observed scores and true scores, that is, the ratio of true score variance to
observed score variance. The theory behind it is that the observed score is equal to the true score plus the
measurement error (Y = T + E). For example, one student knows 80% of the material but scores 85% because
of lucky guessing. In this case, the observed score is 85 while the true score is 80. The additional five
points are due to the measurement error. A reliable test should minimize the measurement error, so that
error accounts for only a small share of the observed score variance and the relationship between true
score and observed score is strong. Cronbach's alpha examines this relationship.
- Equivalent reliability - Split-half reliability or Spearman-Brown coefficient; sometimes you will find
Parallel forms reliability, although this is slightly different. In split-half reliability we randomly divide all
items that purport to measure the same construct into two sets. We administer the entire instrument to a
sample of people and calculate the total score for each randomly divided half. The split-half reliability
estimate is the correlation between these two total scores; because each half is shorter than the full
instrument, this correlation is usually stepped up with the Spearman-Brown formula to estimate the
reliability of the full-length instrument. For the parallel forms, we first create a large set of questions
that address the same construct and then randomly divide the questions into two sets. We administer both
instruments to the same sample of people. The correlation between the two parallel forms is the estimate
of reliability.
- Stable reliability - Test-retest reliability. We estimate test-retest reliability when we administer the same
test to the same sample on two different occasions. This approach assumes that there is no substantial
change in the construct being measured between the two occasions. The amount of time allowed between
measures is critical. If we measure the same thing twice, the correlation between the two observations
depends in part on how much time elapses between the two measurement occasions. The shorter the time
gap, the higher the correlation; the longer the time gap, the lower the correlation. This is because the two
observations are related over time: the closer in time they are, the more similar the factors that
contribute to error. Since this correlation is the test-retest estimate of reliability, you can obtain
considerably different estimates depending on the interval.
- Homogeneous reliability - Inter-rater or inter-observer reliability. If your measurement consists of
categories (the raters are checking off which category each observation falls in), you can calculate the
percent of agreement between the raters; when the measure is a continuous one, you can calculate the
correlation between the ratings of the two observers. For instance, they might be rating the overall level
of activity in a classroom on a 1-to-7 scale. You could have them give their rating at regular time intervals
(e.g., every 30 seconds). The correlation between these ratings would give you an estimate of the
reliability or consistency between the raters. This type of reliability can be seen as "calibrating" the
observers.
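The coefficients described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the small data set are hypothetical, the alpha formula is the standard one based on item variances and total score variance, and the split uses odd and even items instead of a random division so that the result is reproducible.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """scores: one row per respondent, one column per item."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def split_half(scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    # Assumption: odd/even split instead of the random split described above.
    half_a = [sum(row[0::2]) for row in scores]
    half_b = [sum(row[1::2]) for row in scores]
    r = pearson(half_a, half_b)
    return r, 2 * r / (1 + r)


def percent_agreement(rater_a, rater_b):
    """Share of observations that two raters put in the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical data: 5 respondents answering 4 items on a 1-to-5 scale.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print("Cronbach's alpha:", round(cronbach_alpha(scores), 3))
r_half, r_full = split_half(scores)
print("Split-half r:", round(r_half, 3), "Spearman-Brown:", round(r_full, 3))

# Hypothetical categorical codes from two raters for 5 observations.
rater_a = ["on", "off", "on", "on", "off"]
rater_b = ["on", "off", "off", "on", "off"]
print("Percent agreement:", percent_agreement(rater_a, rater_b))
```

The same pearson function covers the remaining cases: applied to the scores from two occasions it gives a test-retest estimate, and applied to two observers' continuous ratings it gives an inter-rater estimate.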
Reliability can be increased by:

- increasing the sample size
- triangulation (several different research methods, in order to reduce systematic errors)
- calibration (an increase of the homogeneity of answers, through repeated discussions of the terms,
concepts, questionnaire pretesting etc.)
Validity = the extent to which the instrument measures what we intended to measure. It is analogous to the
accuracy of a measurement or research. Types of validity: internal and external. Internal validity can be:
- content (face) validity = the content of the research is related to the variables to be studied and is logically
coherent;
- criterion validity (concurrent validity) = how meaningful the chosen research criteria are relative to other
possible criteria; predictive validity is a variant of criterion validity;
- construct validity (factorial validity) = checks what underlying construct is being measured; it has three
parts:
  - convergent validity = the degree to which two measures designed to measure the same construct
  are related; convergence is found if the two measures are highly correlated
  - discriminant validity = the degree to which two measures designed to measure similar, but
  conceptually different, constructs are related; a low to moderate correlation is considered evidence
  of discriminant validity
  - nomological validity = the degree to which predictions from a formal theoretical network
  containing the concept under scrutiny are confirmed; that is, constructs that are theoretically
  related are empirically related as well.
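As a rough illustration of the convergent and discriminant checks, the sketch below correlates three hypothetical scales. The scale names and scores are invented for the example, and no fixed cutoff for "high" or "low to moderate" is implied.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical scores for the same 5 respondents on three scales.
measures = {
    "anxiety_scale_a": [10, 14, 8, 16, 12],    # same construct as scale b
    "anxiety_scale_b": [11, 15, 9, 15, 12],
    "extraversion_scale": [7, 6, 8, 7, 5],     # different construct
}

# Print the correlation for every pair of scales.
for (name_1, v1), (name_2, v2) in combinations(measures.items(), 2):
    print(f"{name_1} vs {name_2}: r = {pearson(v1, v2):.2f}")
```

A high correlation between the two anxiety scales would count as evidence of convergence, while a weaker correlation of either with the extraversion scale would count as evidence of discriminant validity.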
External validity checks if the results of the research can be generalized, extrapolated for a whole population, for
all similar situations etc. Externally valid results can be extended or applied to contexts outside those in which the
research took place.
Validity, in general, is an indication of how sound a piece of research is and applies to both its design and its
methods. Validity implies reliability, but the converse is not true; this means that a valid measurement is reliable,
but a reliable measurement is not necessarily valid.
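A small numeric illustration of "reliable but not valid", using an invented example of a miscalibrated scale: the readings agree closely with each other (high precision) but sit far from the true value (low accuracy). The true weight and the readings are hypothetical.

```python
from statistics import mean, stdev

true_weight = 70.0                            # kg, hypothetical true value
readings = [75.1, 74.9, 75.0, 75.2, 74.8]     # repeated readings, same object

spread = stdev(readings)                      # small spread: consistent results
bias = mean(readings) - true_weight           # large bias: systematically wrong

print(f"spread = {spread:.2f} kg, bias = {bias:.2f} kg")
```

The small spread corresponds to a reliable instrument, while the large bias shows that it does not measure what was intended.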
Internal validity is affected by subject variability, size of subject population, time given for the data collection,
history, attrition, maturation, instrument sensitivity.
External validity is affected by population characteristics, interaction of subject selection and research, descriptive
explicitness of the independent variable, the effect of the research environment, researcher or investigator effects,
data collection methodology, time effects.
Particularities for qualitative research
1) Instead of internal validity we speak of credibility, built up through prolonged engagement in the field,
persistent observation and triangulation of data
2) Instead of external validity we speak of transferability, possible when we provide a detailed portrait of the
setting in which the research is conducted, aiming to give readers enough information to judge the
applicability of the findings to other settings
3) Instead of reliability we speak of dependability; it encourages researchers to provide an audit trail (the
documentation of data, methods and decisions about the research) which can be laid open to external
scrutiny; researcher triangulation is also needed, if possible
Good research, quantitative or qualitative, has to be objective. In its purest sense, the idea of objectivity assumes
that a truth or independent reality exists outside of any investigation or observation. The researcher's task in this
model is to uncover this reality without contaminating it in any way. This notion, that a researcher can observe or
uncover phenomena without affecting them, is increasingly rejected, especially in the social sciences but also in
the natural sciences. In qualitative research, a realistic aim is for the researcher to remain impartial; that is, to be
impartial to the outcome of the research, to acknowledge their own preconceptions and to operate in as unbiased
and value-free a way as possible. So, instead of objectivity we speak of confirmability, possible through audit and
reflexivity: the researcher can offer a self-critical, reflexive analysis of the research methodology, which experts
will judge. Triangulation of data, researcher and context is also a good way of increasing confirmability.
