Validity refers to the degree to which a measurement procedure measures the variable that it claims to measure (strength, usefulness, quality, appropriateness, etc.). Does the measurement process accurately capture the variable/construct that it is supposed to measure? Basically, it is the agreement between a test score and the characteristic it is believed to measure.

Aspects of Validity

Face Validity
Face validity is the simplest and least scientific form of validity. It is demonstrated when a measurement appears, at face value, to measure what it is supposed to measure.
Do the test items appear related to the perceived purpose of the test?

Examples of Face Validity
• An IQ test containing items that measure memory, mathematical ability, verbal reasoning, and abstract reasoning has good face validity.
• An IQ test containing items that measure depression and anxiety has poor face validity.
• A self-esteem rating scale with items like “I know I can do what other people can do.” and “I usually feel that I would fail on a task.” has good face validity.

Content Validity
Content validity is concerned with the extent to which the test is representative of a defined body of content consisting of topics and processes.
Content validation is not done by statistical analysis but by the inspection of items. A panel of experts can review the test items and rate them in terms of how closely they match the objective or domain specification. This review considers the adequacy of representation of the conceptual domain the test is designed to cover. If the test items adequately represent the domain of possible items for a variable, then the test has adequate content validity. Determination of content validity is often made by expert judgment.

• Construct underrepresentation: failure to capture important components of a construct (e.g., an English test that contains only vocabulary items and no grammar items will have poor content validity).
• Construct-irrelevant variance: occurs when scores are influenced by factors irrelevant to the construct (e.g., test anxiety, reading speed, reading comprehension, illness).

Quantification of Content Validity
Lawshe (1975) proposed a structured and systematic way of establishing the content validity of a test. He developed the formula for the content validity ratio (CVR).
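
The notes name Lawshe's CVR but do not show the formula. As commonly stated, each panelist rates an item as "essential", "useful but not essential", or "not necessary", and CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists who rated the item essential and N is the total number of panelists. The sketch below computes it; the function name and the panel data are hypothetical and used only for illustration.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's content validity ratio: (n_e - N/2) / (N/2)."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts; values are the number of experts
# who rated each item "essential" (made-up data).
essential_counts = {"item_1": 10, "item_2": 5, "item_3": 2}

for item, n_e in essential_counts.items():
    print(item, content_validity_ratio(n_e, n_panelists=10))
```

CVR ranges from -1 (no panelist rates the item essential) to +1 (all do), and equals 0 when exactly half of the panel rates the item essential.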
Criterion Validity
Criterion validity involves the relationship or correlation between the test scores and scores on some measurement representing an identified criterion. The correlation coefficient can be computed between the scores on the test being validated (predictor) and the scores on the criterion. The correlation coefficient used (Pearson r) is called the validity coefficient.

Types of Criterion Validity

Predictive Validity
It is demonstrated when scores obtained from a measure accurately predict behavior (the criterion) according to a theory.
Examples:
• College entrance tests can predict whether a student can meet the demands and standards of the college/university. These tests are good correlates of academic performance.
• Job application exams can predict the job performance and attitude of applicants.

Concurrent Validity
It is established when the scores of a measure (predictor) are correlated with the scores of a different measure (criterion) taken at the same time. The two measures may be measuring the same construct, but often they measure two different yet related constructs. A newly created psychological test (predictor) must correlate with existing and well-established psychological tests (criterion) measuring a related construct.
Examples:
• A test that measures learning disabilities should be significantly and negatively correlated with a test measuring school performance.
• A test that measures anger is expected to be significantly and positively correlated with a test measuring violent and aggressive behavior.
• An individual who gets a high score on a newly constructed test that measures depression is expected to get a high score on the Beck Depression Inventory-II.

Construct Validity
A test has good construct validity if there is an existing psychological theory that can support what the test items are measuring. Establishing construct validity involves both logical analysis and empirical data.
Example: In measuring aggression, you have to check all past research and theories to see how researchers measure that variable/construct.
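
As a minimal sketch of the validity coefficient described under Criterion Validity, the code below computes Pearson r between predictor scores and criterion scores. The helper name `pearson_r` and the entrance-test and GPA values are made up for illustration; they are not data from any real study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predictor (entrance test) and criterion (first-year GPA)
entrance_scores = [52, 61, 70, 74, 83, 90]
first_year_gpa = [2.1, 2.4, 2.9, 2.7, 3.4, 3.6]

validity_coefficient = pearson_r(entrance_scores, first_year_gpa)
print(round(validity_coefficient, 2))
```

For predictive validity the criterion scores are collected later than the test scores; for concurrent validity both are collected at the same time. The computation is the same in both cases.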
Types of Construct Validity

Convergent Validity
It involves comparing two different methods of measuring the same construct, and it is demonstrated by a strong relationship between the scores obtained from the two methods. This can be demonstrated through:
• A test measuring the same things as other tests used for the same purpose.
• Demonstration of specific relationships that we can expect if the test is really doing its job.
Examples:
• Your newly created psychological test measuring life satisfaction should be strongly and positively correlated with the “Satisfaction with Life Scale” by Ed Diener, Ph.D.
• In measuring children’s aggression, you may observe their behavior directly and you may also ask their parents to complete an aggression rating scale.

Divergent Validity or Discriminant Validity
This refers to the demonstration of the uniqueness of the test. It is effectively demonstrated when a test has a low correlation with measures of unrelated constructs. It could simply mean that the measure does not represent a construct other than the one for which it was devised.
Examples:
• In measuring children’s aggression, you have to distinguish between the children’s general activity and real aggression.
• Your newly constructed psychological test about optimism should have a weak correlation with a test that measures gender identity.
• A test that measures spelling ability should have a low correlation with a test that measures abstract reasoning.
Relationship between Reliability and Validity
• Reliability and validity are partially related and partially independent.
• Reliability is a prerequisite for validity, meaning a measurement cannot be valid unless it is reliable.
• It is not necessary for a measurement to be valid for it to be considered reliable.
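
One way to make the "prerequisite" point concrete: under classical test theory, a validity coefficient cannot exceed the square root of the product of the reliabilities of the test and the criterion. This bound is not stated in the notes above; it is added here as a supporting illustration, and the function name is hypothetical.

```python
import math


def max_validity_coefficient(test_reliability, criterion_reliability=1.0):
    """Upper bound on the validity coefficient under classical test theory:
    sqrt(reliability of test * reliability of criterion)."""
    for r in (test_reliability, criterion_reliability):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return math.sqrt(test_reliability * criterion_reliability)


# A test with reliability 0.81 and a perfectly reliable criterion
print(max_validity_coefficient(0.81))
# A completely unreliable test cannot correlate with any criterion
print(max_validity_coefficient(0.0))
```

The bound works in one direction only: low reliability caps validity, but high reliability does not guarantee that the test measures the intended construct.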