Guidance Reviewer - CARINAL


University of the Philippines Visayas

College of Arts and Sciences Division of Professional Education


General Luna St., Iloilo City, Iloilo, 5000

EDCO 215: PSYCHOLOGICAL TESTING

GUIDANCE REVIEWER

Prepared by:

ROBE ANN J. CARINAL


M.ED GUIDANCE II

Submitted to:

HELEN GRACE CONCEPCION Q. FERNANDEZ


Course Instructor

Course Outline
I. Introduction
a. History of Psychological Testing
b. Defining Psychological Testing
c. Basic Concepts of Psychological Testing

II. The Science of Psychological Measurement


a. Norms and Basic Statistics for Testing
b. Reliability and Validity
c. Writing and Evaluating Test Items (Test Construction)
d. Test Administration

III. Types of Psychological Tests and Their Uses


a. Intelligence, Aptitude, Achievement Tests
b. Personality Tests
c. Other Tests

IV. Applications and Issues in Psychological Testing


a. Application of Psychological Testing in Various Settings
b. Testing and the Law
c. Ethics and the Future of Psychological Testing

Unit I Introduction

HISTORY OF PSYCHOLOGICAL TESTING

1000 BC: Testing in the Chinese civil service
1850-1900: Civil service examinations in the United States
1900-1920: Development of individual and group tests of cognitive
ability; development of psychometric theory
1920-1940: Development of factor analysis; development of projective
tests and standardized personality inventories
1940-1960: Development of vocational interest measures and
standardized measures of psychopathology
1960-1980: Development of item response theory and neuropsychological
testing
1980-present: Large-scale implementation of computerized adaptive
tests

i. DEFINING PSYCHOLOGICAL TESTING

PSYCHOMETRICS

The branch of psychology concerned with the quantification and measurement


of mental attributes, behavior, performance, and the like, as well as with the design,
analysis, and improvement of the tests, questionnaires, and other instruments used
in such measurement.

PSYCHOLOGICAL TESTING

The process of measuring psychology-related variables by means of devices or


procedures designed to obtain a sample of behavior.

PSYCHOLOGICAL ASSESSMENT
Gathering and integration of psychology-related data for the purpose of making a
psychological evaluation that is accomplished through the use of tools such as tests,
interviews, case studies, behavioral observation, and specially designed apparatuses
and measurement procedures.

ii. BASIC CONCEPTS OF PSYCHOLOGICAL TESTING

TESTING
• Objective
Typically, to obtain some gauge, usually numerical in nature, with regard to an
ability or attribute.
• Process

Testing may be individual or group in nature. After test administration, the


tester will typically add up “the number of correct answers or the number of
certain types of responses with little if any regard for the how or mechanics of
such content” (Maloney & Ward, 1976, p. 39).

• Role of Evaluator

The tester is not key to the process; practically speaking, one tester may be
substituted for another tester without appreciably affecting the evaluation.

• Skill of Evaluator

Testing typically requires technician-like skills in terms of administering and


scoring a test as well as in interpreting a test result.
• Outcome
Typically, testing yields a test score or series of test scores.

ASSESSMENT

• Objective
Typically, to answer a referral question, solve a problem, or arrive at a decision
through the use of tools of evaluation.
• Process
Assessment is typically individualized. In contrast to testing, assessment more
typically focuses on how an individual processes rather than simply the results
of that processing (Cohen and Swerdlik, 2009).
• Role of Evaluator
The assessor is key to the process of selecting tests and/or other tools of
evaluation as well as in drawing conclusions from the entire evaluation.
• Skill of Evaluator
Assessment typically requires an educated selection of tools of evaluation, skill
in evaluation, and thoughtful organization and integration of data.
• Outcome
Typically, assessment entails a logical problem-solving approach that brings to
bear many sources of data designed to shed light on a referral question.

Unit II THE SCIENCE OF PSYCHOLOGICAL MEASUREMENT

a) PSYCHOLOGICAL TESTING NORMS & BASIC STATISTICS

STATISTICS
A branch of mathematics dedicated to organizing, depicting, summarizing,
analysing, and otherwise dealing with numerical data.

WHY DO WE NEED STATISTICS?


Statistics are used for purposes of description. Statistics are also used to make
inferences, which are logical deductions about events that cannot be observed directly.

• Descriptive statistics
are methods used to provide a concise description of a collection of quantitative
information; numbers and graphs used to describe, condense, or represent
data.
• Inferential statistics
are methods used to make inferences from observations of a small group of
people known as a sample to a larger group of individuals known as a
population; to estimate population values based on sample values or to test
hypotheses.

TYPES OF SCALES

• Nominal scales
Are really not scales at all; their only purpose is to name objects. Nominal scales
are used when the information is qualitative rather than quantitative.

• Ordinal scale
This scale allows you to rank individuals or objects but not to say anything
about the meaning of the differences between the ranks.
• Interval scale
Has the property of magnitude and equal intervals but not absolute 0.
• Ratio scale
A scale that has all three properties (magnitude, equal intervals, and an
absolute 0); any mathematical operation is permissible on it.

FREQUENCY DISTRIBUTION
A distribution of scores summarizes the scores for a group of individuals. The
frequency distribution displays scores on a variable or a measure to reflect how
frequently each value was obtained.
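
As a minimal sketch of the idea above, the snippet below tallies how often each score occurs. The scores are hypothetical, made up purely for illustration.

```python
from collections import Counter

# Hypothetical raw scores for ten test takers (illustrative data only).
scores = [7, 9, 7, 10, 8, 9, 7, 8, 9, 7]

# Count how frequently each score value was obtained.
frequency = Counter(scores)

# Print the distribution from the lowest to the highest score.
for value in sorted(frequency):
    print(value, frequency[value])
```

Each printed row pairs a score value with the number of test takers who obtained it.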

MEASURES OF CENTRAL TENDENCY

Mean
• arithmetic average
Median
• the value that divides a distribution that has been arranged in order of
magnitude into two halves
Mode
• most frequently occurring value in a distribution, is useful primarily when
dealing with qualitative or categorical variables

MEASURES OF VARIABILITY

Range
• distance between two extreme points—the highest and lowest values in a
distribution

Variance
• the sum of the squared differences or deviations between each value (X) in
a distribution and the mean of that distribution (M), divided by N

Standard deviation
• the square root of the variance; it provides a single value that is
representative of the individual differences or deviations in a data set
computed from a common reference point, namely, the mean
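
The measures of central tendency and variability defined above can be sketched with Python's standard `statistics` module. The scores are hypothetical. Because the variance is defined here as the sum of squared deviations divided by N, the sketch uses the population functions `pvariance` and `pstdev`.

```python
import statistics

# Hypothetical scores (illustrative data only).
scores = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # middle of the ordered scores
mode = statistics.mode(scores)            # most frequently occurring value
score_range = max(scores) - min(scores)   # highest minus lowest value
variance = statistics.pvariance(scores)   # sum of squared deviations / N
sd = statistics.pstdev(scores)            # square root of the variance

print(mean, median, mode, score_range, variance, sd)
```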

THE NORMAL CURVE


• also known as the bell curve. Its baseline, equivalent to the X-axis of the
distribution, shows the standard deviation (σ) units; its vertical axis, or
ordinate, usually does not need to be shown because the normal curve is
not a frequency distribution of data but a mathematical model of an ideal
or theoretical distribution.

PROPERTIES OF THE NORMAL CURVE MODEL

• It is bell shaped, as its nickname indicates.


• It is bilaterally symmetrical, which means its two halves are identical (if we split
the curve into two, each half contains 50% of the area under the curve).
• It has tails that approach but never touch the baseline, and thus its limits
extend to ± infinity (±∞), a property that underscores the theoretical and
mathematical nature of the curve.
• It is unimodal; that is, it has a single point of maximum frequency or maximum
height.
• It has a mean, median, and mode that coincide at the center of the distribution
because the point where the curve is in perfect balance, which is the mean, is
also the point that divides the curve into two equal halves, which is the median,
and the most frequent value, which is the mode.

USES OF THE NORMAL CURVE

• The normal curve model is used descriptively to locate the position of scores
that come from distributions that are normal. In a process known as
normalization, the normal curve is also used to make distributions that are not
normal—but approximate the normal—conform to the model, in terms of the
relative positions of scores.
• The normal curve model is applied inferentially in the areas of
(a) Reliability, to derive confidence intervals to evaluate obtained
scores and differences between obtained scores, and
(b) Validity, to derive confidence intervals for predictions or estimates
based on test scores.
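
To illustrate the descriptive use of the normal curve, the sketch below locates one score within a normal distribution. The scale (mean 100, standard deviation 15) and the score of 115 are assumptions chosen for illustration; they do not come from the text.

```python
from statistics import NormalDist

# Assumed score scale: mean 100, standard deviation 15 (hypothetical).
scale = NormalDist(mu=100, sigma=15)

score = 115

# Position of the score in standard deviation units (z score).
z = (score - scale.mean) / scale.stdev

# Proportion of the area under the curve that falls below the score.
proportion_below = scale.cdf(score)

print(z, round(proportion_below, 4))
```

A score one standard deviation above the mean has roughly 84% of the area under the curve below it.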

SHAPE OF DISTRIBUTIONS

Kurtosis
• refers to the flatness or peakedness of a distribution

Platykurtic, leptokurtic, and mesokurtic
• platykurtic distributions have the greatest amount of dispersion, manifested
in tails that are more extended, and leptokurtic distributions have the least.
The normal distribution is mesokurtic, meaning that it has an intermediate
degree of dispersion.

Skewness (Sk)
• of a distribution refers to a lack of symmetry. As we have seen, the normal
distribution is perfectly symmetrical, with Sk = 0; its bulk is in the middle
and its two halves are identical.

A skewed distribution is asymmetrical. If most of the values are at the top end
of the scale and the longer tail extends toward the bottom, the distribution is
negatively skewed (Sk < 0); on the other hand, if most of the values are at the
bottom and the longer tail extends toward the top of the scale, the distribution is
positively skewed (Sk > 0).
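
The sign convention for skewness can be checked numerically. The sketch below uses one common moment-based definition (the mean of the cubed z scores), which is an assumption, since the text does not give a formula for Sk. All three data sets are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: the mean of the cubed z scores."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical score sets (illustrative data only).
easy_test = [2, 7, 8, 9, 9, 10, 10]  # bulk at the top, long tail toward the bottom
hard_test = [0, 0, 1, 1, 2, 3, 8]    # bulk at the bottom, long tail toward the top
symmetric = [1, 2, 3, 4, 5]          # two identical halves

print(skewness(easy_test))  # negative value
print(skewness(hard_test))  # positive value
print(skewness(symmetric))  # approximately zero
```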

ESSENTIALS OF CORRELATION AND REGRESSION

The degree of relationship between two variables is indicated by the number in

the coefficient, whereas the direction of the relationship is indicated by the sign. A

correlation coefficient of –0.80, for example, indicates exactly the same degree of

relationship as a coefficient of +0.80. Whether positive or negative, a correlation is

low to the extent that its coefficient approaches zero.

Correlation, even if high, does not imply causation. If two variables, X and Y,

are correlated, it may be because X causes Y, because Y causes X, or because a third

variable, Z, causes both X and Y. This truism is also frequently ignored; moderate to

high correlation coefficients are often cited as though they were proof of a causal

relationship between the correlated variables.

High correlations allow us to make predictions. While correlation does not imply

causation, it does imply a certain amount of common or shared variance. Knowledge

of the extent to which things vary in relation to one another is extremely useful.

Through regression analyses we can use correlational data on two or more variables

to derive equations that allow us to predict the expected values of a dependent

variable (Y), within a certain margin of error, based on the known values of one or

more independent variables (X1, X2 , . . . Xk ), with which the dependent variable is

correlated.
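
A minimal sketch of these ideas follows, assuming a Pearson correlation and a least-squares line with a single predictor. The function names and the data (hours of study and test scores) are hypothetical and only illustrate the arithmetic.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / ss_x
    return slope, mean_y - slope * mean_x

# Hypothetical data: hours of study (X) and test scores (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # expected score for 6 hours of study

# Reversing the direction of Y flips the sign but not the size of r.
r_reversed = pearson_r(hours, [-s for s in scores])

print(round(r, 3), round(r_reversed, 3), round(predicted, 1))
```

The prediction is only an expected value; actual scores will scatter around it, which is the margin of error mentioned above.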

RELIABILITY AND VALIDITY

RELIABILITY

A good test or, more generally, a good measuring tool or procedure is reliable. The

criterion of reliability involves the consistency of the measuring tool: the precision with which

the test measures and the extent to which error is present in measurements. In theory, the

perfectly reliable measuring tool consistently measures in the same way. As you might expect,

however, reliability is a necessary but not sufficient element of a good test. In addition to

being reliable, tests must be reasonably accurate. In the language of psychometrics, tests

must be valid.

In its broadest sense, error refers to the component of the observed test score that

does not have to do with the test taker’s ability. If we use X to represent an observed score,

T to represent a true score, and E to represent error, then the fact that an observed score

equals the true score plus error may be expressed as follows:

X=T+E

A statistic useful in describing sources of test score variability is the variance (σ²),

the standard deviation squared. Variance from true differences is true variance, and variance from irrelevant,

random sources is error variance.
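
The relation X = T + E and the split of variance into true and error components can be sketched with a small example. True scores and errors are never observed directly in practice, so the values below are hypothetical, and the errors were deliberately chosen to be uncorrelated with the true scores so that the two variances add up exactly.

```python
import statistics

# Hypothetical true scores (T) and errors (E), chosen so that
# E is uncorrelated with T (illustrative data only).
true_scores = [10, 12, 14, 16]
errors = [1, -1, -1, 1]

# Observed score = true score + error (X = T + E).
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = statistics.pvariance(true_scores)
error_var = statistics.pvariance(errors)
total_var = statistics.pvariance(observed)

# Proportion of the total variance that is true variance.
proportion_true = true_var / total_var

print(true_var, error_var, total_var, round(proportion_true, 3))
```

In this sketch five sixths of the total variance is true variance and one sixth is error variance.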

SOURCES OF ERROR VARIANCE

Test construction. One source of variance during test construction is item sampling

or content sampling, terms that refer to variation among items within a test as well as to

variation among items between tests. From the perspective of a test creator, a challenge in

test development is to maximize the proportion of the total variance that is true variance and

to minimize the proportion of the total variance that is error variance.



Test administration. Sources of error variance that occur during test administration

may influence the test taker’s attention or motivation. The test taker’s reactions to those

influences are the source of one kind of error variance. Examples of untoward influences

during administration of a test include factors related to the test environment: the room

temperature, the level of lighting, and the amount of ventilation and noise, for instance.

Test scoring and interpretation. The advent of computer scoring and a growing

reliance on objective, computer-scorable items have virtually eliminated error variance caused

by scorer differences in many tests. If subjectivity is involved in scoring, then the scorer (or

rater) can be a source of error variance.

Other sources of error. Systematic underreporting or overreporting by respondents is another

source of error. Females, for example, may underreport abuse because of fear, shame, or social

desirability factors and overreport abuse if they are seeking help. Males may underreport abuse

because of embarrassment and social desirability factors and overreport abuse if they are

attempting to justify the report.

RELIABILITY ESTIMATES

• Test-Retest Reliability Estimates

One way of estimating the reliability of a measuring instrument is by using the


same instrument to measure the same thing at two points in time. In psychometric
parlance, this approach to reliability evaluation is called the test-retest method,
and the result of such an evaluation is an estimate of Test-Retest Reliability.

• Parallel-Forms and Alternate-Forms Reliability Estimates

The degree of the relationship between various forms of a test can be evaluated
by means of an alternate-forms or parallel-forms coefficient of reliability, which is often
termed the coefficient of equivalence. Parallel forms of a test exist when, for each
form of the test, the means and the variances of observed test scores are equal. In
theory, the means of scores obtained on parallel forms correlate equally with the true
score. More practically, scores obtained on parallel tests correlate equally with other
measures.

Alternate forms are simply different versions of a test that have been

constructed so as to be parallel. Although they do not meet the requirements for the

legitimate designation “parallel,” alternate forms of a test are typically designed to be

equivalent with respect to variables such as content and level of difficulty. Obtaining

estimates of alternate-forms reliability and parallel-forms reliability is similar in two

ways to obtaining an estimate of test-retest reliability:

1. Two test administrations with the same group are required, and

2. test scores may be affected by factors such as motivation, fatigue, or

intervening events such as practice, learning, or therapy (although not as much as

when the same test is administered twice).

An additional source of error variance, item sampling, is inherent in the

computation of an alternate- or parallel-forms reliability coefficient. Test takers may

do better or worse on a specific form of the test not as a function of their true ability

but simply because of the particular items that were selected for inclusion in the test.

• Split-Half Reliability Estimates

An estimate of split-half reliability is obtained by correlating two pairs of scores

obtained from equivalent halves of a single test administered once. The computation

of a coefficient of split-half reliability generally entails three steps:

Step 1. Divide the test into equivalent halves.

Step 2. Calculate a Pearson r between scores on the two halves of the test.

Step 3. Adjust the half-test reliability using the Spearman-Brown formula.

Simply dividing the test in the middle is not recommended because it’s likely this

procedure would spuriously raise or lower the reliability coefficient. Different amounts of

fatigue for the first as opposed to the second part of the test, different amounts of test anxiety,

and differences in item difficulty as a function of placement in the test are all factors to

consider.

• The Spearman-Brown formula

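The Spearman-Brown formula allows a test developer or user to estimate

internal consistency reliability from a correlation of two halves of a test. The general

Spearman-Brown ($r_{SB}$) formula is

\[ r_{SB} = \frac{n \, r_{xy}}{1 + (n - 1) \, r_{xy}} \]

where $r_{xy}$ is the Pearson r obtained on the original-length test (here, the correlation

between the two halves) and $n$ is the number of items in the revised version divided by the

number of items in the original version. For a whole test made up of two halves, $n = 2$.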
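
The three split-half steps can be sketched as follows. The odd-even split is an assumption (one common way to form halves without dividing the test in the middle), the item responses are hypothetical, and `spearman_brown` implements the standard form of the correction, which for two halves reduces to 2r / (1 + r).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_brown(r_xy, n=2):
    """Estimate reliability for a test n times as long as the one giving r_xy."""
    return n * r_xy / (1 + (n - 1) * r_xy)

# Hypothetical responses: rows are test takers, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Step 1: divide the test into halves (1st, 3rd, 5th items vs. 2nd, 4th, 6th).
first_half = [sum(row[0::2]) for row in responses]
second_half = [sum(row[1::2]) for row in responses]

# Step 2: correlate the scores on the two halves.
r_half = pearson_r(first_half, second_half)

# Step 3: adjust the half-test correlation to the full test length.
r_full = spearman_brown(r_half, n=2)

print(round(r_half, 3), round(r_full, 3))
```

With a positive half-test correlation, the adjusted estimate is always higher than the half-test value, because the whole test is longer than either half.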

• Other Methods of Estimating Internal Consistency

Inter-item consistency refers to the degree of correlation among all the


items on a scale. A measure of inter-item consistency is calculated from a single
administration of a single form of a test. An index of inter-item consistency, in
turn, is useful in assessing the homogeneity of the test. Tests are said to be
homogeneous if they contain items that measure a single trait.
In contrast to test homogeneity, heterogeneity describes the degree to
which a test measures different factors. A heterogeneous (or
nonhomogeneous) test is composed of items that measure more than one trait.
A test that assesses knowledge only of color television repair skills could be
expected to be more homogeneous in content than a test of electronic repair.
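
One simple index of inter-item consistency is the average of the correlations between every pair of items. The sketch below assumes that index; the text does not name a specific statistic, and the ratings are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical ratings: rows are test takers, columns are three items
# from a single administration of a single form.
responses = [
    [4, 5, 4],
    [2, 1, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]

# Turn rows of responses into one tuple of scores per item.
items = list(zip(*responses))

# Correlate every pair of items, then average the correlations.
pair_rs = [pearson_r(a, b) for a, b in combinations(items, 2)]
mean_inter_item_r = sum(pair_rs) / len(pair_rs)

print(round(mean_inter_item_r, 3))
```

A high average suggests the items are measuring a single trait; a low average suggests a more heterogeneous test.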

MEASURES OF INTER-SCORER RELIABILITY

Variously referred to as scorer reliability, judge reliability, observer reliability,


and inter-rater reliability, inter-scorer reliability is the degree of agreement or
consistency between two or more scorers (or judges or raters) with regard to a
particular measure. If, for example, the problem is a lack of clarity in scoring criteria,
then the remedy might be to rewrite the scoring criteria section of the manual to
include clearly written scoring rules.
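
A very simple way to express agreement between two scorers is the proportion of cases on which they give the same rating. This is only one possible index and is used here as an assumption; the ratings are hypothetical.

```python
# Hypothetical ratings of the same eight responses by two scorers.
rater_a = ["pass", "fail", "pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]

# Count the responses on which both scorers gave the same rating.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))

# Proportion of agreement across all rated responses.
agreement_rate = agreements / len(rater_a)

print(agreements, agreement_rate)
```

A low proportion would point to a problem such as unclear scoring criteria.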

USING AND INTERPRETING A COEFFICIENT OF RELIABILITY


We have seen that, with respect to the test itself, there are basically three
approaches to the estimation of reliability:
1. test-retest
2. alternate or parallel forms, and
3. internal or inter-item consistency
Another question that is linked in no trivial way to the purpose of the test is,
“How high should the coefficient of reliability be?” Perhaps the best “short answer” to
this question is: “On a continuum relative to the purpose and importance of the
decisions to be made on the basis of scores on the test”.

THE PURPOSE OF THE RELIABILITY COEFFICIENT


• If a specific test of employee performance is designed for use at various
times over the course of the employment period, it would be reasonable to
expect the test to demonstrate reliability across time. It would thus be
desirable to have an estimate of the instrument’s test-retest reliability.
• For a test designed for a single administration only, an estimate of internal
consistency would be the reliability measure of choice.
• It is possible, for example, that a portion of the variance could be accounted
for by transient error, a source of error attributable to variations in the test
taker’s feelings, moods, or mental state over time.

VALIDITY
Validity, as applied to a test, is a judgment or estimate of how well a test
measures what it purports to measure in a particular context. More specifically, it is a
judgment based on evidence about the appropriateness of inferences drawn from test
scores.

Validation

is the process of gathering and evaluating evidence about validity. Both the test

developer and the test user may play a role in the validation of a test for a specific purpose.

Such local validation studies may yield insights regarding a particular population of test takers

as compared to the norming sample described in a test manual. Local validation studies are

absolutely necessary when the test user plans to alter in some way the format, instructions,

language, or content of the test.

One way measurement specialists have traditionally conceptualized validity is

according to three categories (Trinitarian view):

• content validity
• criterion validity
• construct validity

In this classic conception of validity, referred to as the trinitarian view, it might be

useful to visualize construct validity as being “umbrella validity” since every other variety of

validity falls under it. Three approaches to assessing validity (associated, respectively, with

content validity, criterion-related validity, and construct validity) are:

1. scrutinizing the test’s content

2. relating scores obtained on the test to other test scores or other measures

3. executing a comprehensive analysis of

a. how scores on the test relate to other test scores and measures

b. how scores on the test can be understood within some theoretical framework

for understanding the construct that the test was designed to measure.

FACE VALIDITY

- relates more to what a test appears to measure to the person being tested than

to what the test actually measures. Face validity is a judgment concerning how

relevant the test items appear to be.

- In contrast to judgments about the reliability of a test and judgments about the

content, construct, or criterion-related validity of a test, judgments about face

validity are frequently thought of from the perspective of the test taker, not the

test user.

- A test’s lack of face validity could contribute to a lack of confidence in the

perceived effectiveness of the test—with a consequential decrease in the test

taker’s cooperation or motivation to do his or her best.

CONTENT VALIDITY

• Content validity describes a judgment of how adequately a test samples

behavior representative of the universe of behavior that the test was designed

to sample.

• From the pooled information (along with the judgment of the test developer),

a test blueprint emerges for the “structure” of the evaluation; that is, a plan

regarding the types of information to be covered by the items, the number of

items tapping each area of coverage, the organization of the items in the test,

and so forth.

THE QUANTIFICATION OF CONTENT VALIDITY



The measurement of content validity is important in employment settings,

where tests used to hire and promote people are carefully scrutinized for their relevance

to the job, among other factors. Courts often require evidence that employment tests

are work related.

Lawshe. One method of measuring content validity, developed by C. H.

Lawshe, is essentially a method for gauging agreement among raters or judges

regarding how essential a particular item is. Lawshe proposed that each rater respond

to the following question for each item: “Is the skill or knowledge measured by this

item

1. essential,

2. useful but not essential, or

3. not necessary

to the performance of the job?”
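
Lawshe's ratings are usually summarized for each item with a content validity ratio (CVR). The formula is not given in the text above; the sketch assumes the standard form, CVR = (n_e - N/2) / (N/2), where n_e is the number of raters marking the item "essential" and N is the total number of raters. The panel size and counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_raters):
    """Lawshe's CVR for one item: (n_e - N/2) / (N/2)."""
    half = n_raters / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 raters judging three items.
print(content_validity_ratio(9, 10))  # most raters say "essential"
print(content_validity_ratio(5, 10))  # exactly half say "essential"
print(content_validity_ratio(2, 10))  # few raters say "essential"
```

The ratio is positive when more than half of the raters call the item essential, zero when exactly half do, and negative when fewer than half do.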

CULTURE AND THE RELATIVITY OF CONTENT VALIDITY

As incredible as it may sound to Westerners, students in Bosnia and

Herzegovina are taught different versions of history, art, and language depending

upon their ethnic background. Such a situation illustrates in stark relief the influence

of culture on what is taught to students as well as on aspects of test construction,

scoring, interpretation, and validation.

CRITERION-RELATED VALIDITY

Criterion-related validity is a judgment of how adequately a test score can be

used to infer an individual’s most probable standing on some measure of interest, the

measure of interest being the criterion. Concurrent validity is an index of the degree

to which a test score is related to some criterion measure obtained at the same time

(concurrently). Predictive validity is an index of the degree to which a test score

predicts some criterion measure.

CONCURRENT VALIDITY

If test scores are obtained at about the same time that the criterion measures

are obtained, measures of the relationship between the test scores and the criterion

provide evidence of concurrent validity. Statements of concurrent validity indicate the

extent to which test scores may be used to estimate an individual’s present standing

on a criterion.

A test with satisfactorily demonstrated concurrent validity may therefore be

appealing to prospective users because it holds out the potential of savings of money

and professional time.

PREDICTIVE VALIDITY

Measures of the relationship between the test scores and a criterion measure

obtained at a future time provide an indication of the predictive validity of the test;

that is, how accurately scores on the test predict some criterion measure.

Test scores may be obtained at one time and the criterion measures obtained

at a future time, usually after some intervening event has taken place. The intervening

event may take varied forms, such as training, experience, therapy, medication, or

simply the passage of time.

THE VALIDITY COEFFICIENT

• The validity coefficient is a correlation coefficient that provides a measure of

the relationship between test scores and scores on the criterion measure.

INCREMENTAL VALIDITY

• an aspect of validity that refers to what an additional assessment or predictive

variable can add to the information provided by existing assessments or

variables

CONSTRUCT VALIDITY

• Construct validity is a judgment about the appropriateness of inferences drawn

from test scores regarding individual standings on a variable called a construct.

C. WRITING AND EVALUATING TEST ITEMS (TEST CONSTRUCTION)

Importance of Well-Constructed Tests

The importance of well-constructed tests in the field of psychology and education is

substantial. Here are key reasons highlighting their significance:

1. Reliability
2. Validity
3. Fairness
4. Objectivity
5. Useful Feedback
6. Efficiency
7. Predictive Ability
8. Applicability Across Settings
9. Legal and Ethical Compliance
10. Continuous Improvement
11. Credibility and Trustworthiness

In summary, well-constructed tests are fundamental in providing accurate, fair,

and reliable assessments in various fields, contributing to informed decision-making,

effective learning, and personal and professional development.



Key Principles in Test Writing

By adhering to these key principles, test writers can contribute to the creation

of fair, valid, and reliable assessments that effectively measure the intended

knowledge, skills, or behaviors.

• Clarity. Test items and instructions should be clear and easily understood by

the test-takers. Ambiguous or confusing language can lead to

misinterpretation and inaccurate responses.

• Relevance. Test items should directly measure the targeted knowledge, skills,

or behaviors. Irrelevant or off-topic questions can undermine the validity of

the test.

• Fairness. Tests should be fair to all individuals, regardless of their

background, culture, or socioeconomic status. Avoiding language or content

biases ensures equitable assessment.

• Objectivity. Test items and scoring should minimize subjective judgments.

Objectivity in scoring enhances reliability, ensuring that different scorers arrive

at similar results.

• Consistency. Test items should be consistent in terms of difficulty and

relevance. Consistency helps maintain reliability, allowing for accurate

comparisons between different test-takers.



• Variety. Include a variety of question types (e.g., multiple-choice, short

answer, essay) to assess different cognitive skills. This variety provides a more

comprehensive evaluation of the test-taker's abilities.

• Avoiding Ambiguity. Test items should be free from ambiguity or

vagueness. Ambiguous questions can lead to confusion and varied

interpretations, compromising the validity of responses.

• Avoiding Double-Barreled Questions. Questions should focus on a single

idea or concept to avoid confusion. Double-barreled questions that address

multiple issues can lead to unclear or inaccurate responses.

GUIDELINES IN WRITING TEST ITEMS (KAPLAN & SACCUZZO, PSYCHOLOGICAL

TESTING, 9TH EDITION)

1. Define clearly what you want to measure. To do this, use substantive theory as a
guide and try to make items as specific as possible.

2. Generate an item pool. Theoretically, all items are randomly chosen from a universe
of item content. In practice, however, care in selecting and developing items is
valuable. Avoid redundant items. In the initial phases, you may want to write three or
four items for each one that will eventually be used on the test or scale.

3. Avoid exceptionally long items. Long items are often confusing or misleading.

4. Keep the level of reading difficulty appropriate for those who will complete the
scale.

5. Avoid “double-barreled” items that convey two or more ideas at the same time. For
example, consider an item that asks the respondent to agree or disagree with the
statement, “I vote Democratic because I support social programs.” There are two
different statements with which the person could agree: “I vote Democratic” and “I
support social programs.”

6. Consider mixing positively and negatively worded items. Sometimes, respondents
develop the “acquiescence response set.” This means that the respondents will tend
to agree with most items. To avoid this bias, you can include items that are worded
in the opposite direction.
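
Guideline 6 has a scoring consequence: items worded in the opposite direction must be
reverse-scored before the item scores are added up. The sketch below shows one common way
to do this. It assumes a 1-to-5 agreement scale; the item labels (q1 to q4) and function names
are made up for illustration.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a response on a rating scale (e.g., 5 becomes 1 on a 1-5 scale)."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the scale range")
    return scale_max + scale_min - response

def total_score(responses, reversed_items, scale_min=1, scale_max=5):
    """Sum item responses, reverse-scoring the negatively worded items first."""
    total = 0
    for item, response in responses.items():
        if item in reversed_items:
            response = reverse_score(response, scale_min, scale_max)
        total += response
    return total

# Hypothetical respondent who answers "strongly agree" (5) to every item.
responses = {"q1": 5, "q2": 5, "q3": 5, "q4": 5}
reversed_items = {"q2", "q4"}  # assumed to be the negatively worded items
print(total_score(responses, reversed_items))  # prints 12, not the maximum of 20
```

A respondent who simply agrees with everything ends up near the middle of the score range
rather than at the top, which is how mixed wording limits the effect of acquiescence.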

ITEM FORMATS

Dichotomous Format

Description: This format presents items with two response options, typically
"yes/no" or "true/false." It is commonly used for straightforward and clear-cut
assessments.

Polytomous Format

Description: Unlike dichotomous formats, polytomous formats have more than two
response options for each item. They allow for a graded response, providing a range of
choices.

Likert Format

Description: Likert scales involve presenting a statement and asking
respondents to indicate their level of agreement or disagreement on a scale. The format is
widely used in attitudinal assessments.

Category Format

Description: This format involves categorizing items based on predefined
criteria. Respondents choose the category that best fits their response.

Checklists and Q-sorts

Description: A checklist involves marking items on a list that are present or
observed. It's a simple way to record the presence or absence of specific behaviors or
characteristics. In a Q-sort, the respondent sorts a set of statements into piles according
to how well each statement describes the person being rated.

TEST ADMINISTRATION

The Relationship between Examiner and Test Taker

Both the behavior of the examiner and the examiner's relationship to the test taker can
affect test scores.

The examiner's rapport with the test taker can influence results.

For younger children, a familiar examiner may make a difference.

Fuchs and Fuchs found that test performance was approximately .28 standard deviation
(roughly 4 IQ points) higher when the examiner was familiar with the test taker than when
not.
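
The "roughly 4 IQ points" figure follows from multiplying the effect size by the standard
deviation of the IQ scale. The short sketch below shows the arithmetic. It assumes an IQ
standard deviation of 15, the convention on Wechsler-type scales; the function name is
illustrative.

```python
IQ_SD = 15  # assumption: IQ scale with a standard deviation of 15

def sd_units_to_iq_points(effect_in_sd, iq_sd=IQ_SD):
    """Convert an effect expressed in standard deviation units to IQ points."""
    return effect_in_sd * iq_sd

# .28 standard deviation, the familiarity effect cited above
print(round(sd_units_to_iq_points(0.28), 1))  # prints 4.2
```

On a scale with a different standard deviation, the same .28 effect would translate into a
different number of points.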

Familiarity with the test taker, and perhaps preexisting notions about the test taker’s
ability, can either positively or negatively bias test results.

Attitudinal surveys - respondents may give the response that they perceive to be expected
by the interviewer.

Rapport might be influenced by subtle processes such as the level of performance expected
by the examiner.

THE RACE OF THE TESTER

Some groups feel that their children should not be tested by anyone EXCEPT a member of
their own race.

According to Sattler, there is little evidence that the race of the examiner significantly
affects intelligence test scores.

Race of the examiner has nonsignificant effects on test performance for both African American
and white children.

This early result occurred for both the Stanford-Binet scale and the Peabody Picture
Vocabulary Test.

Few studies have shown an effect attributable to the race of the examiner: only 4 of
29 studies found one.

However, the procedures for properly administering an IQ test are so specific that,
regardless of race, test administrators should act almost identically.

Deviation from these procedures might produce differences in performance associated with
the examiner's race.

Sattler has shown that the race of the examiner affects scores in some situations.

Examiner effects tend to increase when examiners are given more discretion about the use
of the tests.

TRAINING OF TEST ADMINISTRATORS

Different assessment procedures require different levels of training.

Many behavioral assessment procedures require training and evaluation but not a formal
degree or diploma.

The Structured Clinical Interview for DSM-IV (SCID) is used for psychiatric diagnosis.

SCID users are licensed psychiatrists or psychologists with additional training on the test.

There are no standardized protocols for training people to administer complicated tests
such as the Wechsler Adult Intelligence Scale-Revised.

EXPECTANCY EFFECTS / ROSENTHAL EFFECTS

A well-known line of research in psychology has shown that data sometimes can be affected
by what an experimenter expects to find.

Results show that subjects actually provide data that confirm the experimenter’s expectancies.

In a study in Israel, women supervisors were told that some women officer cadets had
exceptional potential. The selection was made randomly rather than on the basis of any
evidence. No expectancy effect was shown.

A follow-up study showed that expectancy effects appear for men and women supervised by
men but not for women led by women.

The expectancy effect exists in some but not all situations.

Expectancies shape our judgments.

Two Aspects of Expectancy Effect with Standardized Tests

Expectancy effects in Rosenthal’s experiments were obtained when experimenters followed
standardized scripts.

• may come from subtle nonverbal communication between experimenter and subject

• experimenter may not even notice

Expectancy effects - small, subtle effect on scores; occurs in some situations and not
others.

• careful studies on particular tests needed

Expectancy effect may impact intelligence test scoring

Expectancy effect can also occur for non-ambiguous responses.

Studies of expectancies in test administrators (who give rather than merely score tests)
have yielded somewhat inconsistent results. Some show an expectancy effect, some show none.

In spite of inconsistent results, you should pay attention to the potentially biasing
effect of expectancy.

Rosenthal’s critics do not deny the possibility of this effect.

EFFECT OF REINFORCING RESPONSES

Positive reinforcement is a process that strengthens the likelihood of a particular response
by adding a stimulus after the behavior is performed.

Negative reinforcement also strengthens the likelihood of a particular response, but by
removing an undesirable consequence.

✓ Several studies show that reward can significantly affect test performance
✓ Reinforcement and feedback guide the examinee toward a preferred response.
✓ Random reinforcement destroys the accuracy of performance and decreases the
motivation to respond (Eisenberger & Cameron, 1998)

Computer-assisted Test Administration

Interactive testing involves the presentation of test items on a computer terminal or personal
computer and the automatic recording of test responses. The computer can also be
programmed to instruct the test taker and to provide instruction when parts of the testing
procedure are not clear.

As early as 1970, Cronbach recognized the value of computers as test administrators. Here
are some of the advantages that computers offer:

• excellence of standardization,
• individually tailored sequential administration,
• precision of timing responses,
• release of human testers for other duties,
• patience (test taker not rushed), and
• control of bias.

Examples of CATs

Conventional Testing - examinees receive the same test questions in the same order,
usually one question at a time.

Branched or Response-Contingent Testing - a problem situation is presented to the
examinee with a number of alternatives.

Sequential Testing - typically used to make a classification decision (e.g., to hire or not
to hire, to graduate or not to graduate, or whether someone is or is not depressed) using one
or more prespecified cutoff scores.

Subject Variables

Subject variables are characteristics that vary across participants and cannot be
manipulated by the one administering the test. They are often a serious source of error.

Illness affects test scores. When you have a cold or the flu, you might not perform as well as
when you are feeling well. Many variations in health status affect performance in behavior
and in thinking (Kaplan, 2004).

Medical drugs are now evaluated according to their effects on the cognitive process (Spilker,
1996).

Behavioral Assessment Methodology

Facts about human behavior:

✓ Good morning and good night text messages activate the part of the brain responsible
for happiness.
✓ Feeling ignored causes the same chemical effect as that of an injury
✓ Some of us are actually afraid of being so happy because of the fear that something
tragic might happen next.

Behavioral traits are the observable patterns of behavior that are relatively consistent
across various situations.

Behavioral assessment focuses on the interactions between situations and behaviors for the
purpose of effecting behavioral change.

Types of Behavioral Assessment

✓ Personality Assessments
✓ Situational Judgment Tests (SJTs)
✓ Behavioral Interviews
✓ Work Sample

Pros and Cons of Behavioral Assessments

Pros: Behavioral assessments can provide objective data to help hiring managers evaluate
their candidates.

Cons: While tests can help reduce bias in the hiring process, they are not immune to bias
themselves. Determining which traits are “valuable” or “risky” is not, itself, an objective
process.

Predictive Value

Pros: Behavioral assessments can be effective in predicting job performance and identifying
candidates who are likely to succeed in the role.

Cons: These tests are not foolproof. Why does an employee succeed at one company but fail
at another? The employee is the same but the company’s product, support, culture, territory,
etc. (and the economy in general) all serve to complicate employee success.

Time

Pros: Behavioral assessments can help filter out candidates who are not a good fit for the
job, saving time and resources in the hiring process.

Cons: On the other hand, these assessments take time to administer and evaluate, which can
bog down the hiring process.

Bias

Pros: Using behavioral assessments can help ensure that all candidates are evaluated on the
same criteria, which can help reduce bias and ensure fairness in the hiring process.

Cons: No assessment can be truly free from bias. It’s important for hiring managers to be
aware of any potential biases and to use assessments in conjunction with other evaluation
methods.

Costs/Benefits

Pros: Behavioral assessments can provide insight into a candidate’s work style,
communication skills, and problem-solving abilities, which can help managers make more
informed hiring decisions.

Cons: Some tests can be expensive, which may be a barrier for smaller companies or those
with limited budgets.

Reactivity

“observes the observers”

Reactivity occurs when individuals change their behavior because they are aware that their
behavior is being or will be measured. Their behavior might become more positive or
negative, depending on the situation and the people involved.

Drift

-refers to the tendency for observers in behavioral studies to stray from the definitions they
learned during training and to develop their own idiosyncratic definitions of behaviors despite
observing the same behavior.

Expectancies

Another potential source of bias is the expectancies of the observers regarding the subject's
behavior and the feedback observers receive from the experimenter in relation to that
behavior.

Deception

The act of misleading or wrongly informing someone about the true nature of a situation.

Statistical Control of Rating Errors

The halo effect is the tendency to ascribe positive attributes independently of
the observed behavior. Some psychologists have argued that this effect can be controlled
through partial correlation, in which the correlation between two variables is found while
variability in a third variable is controlled.
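
A first-order partial correlation can be computed from the three pairwise correlations. The
sketch below uses the standard formula; the variable roles (x and y as two rated attributes,
z as an overall impression rating standing in for the halo) and the sample correlations are
illustrative assumptions, not data from the reviewer.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the third variable z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical ratings: two attributes correlate .60 with each other,
# and each correlates .70 with the rater's overall impression.
print(round(partial_correlation(0.60, 0.70, 0.70), 3))  # prints 0.216
```

In this made-up case most of the apparent relationship between the two ratings disappears
once the overall impression is controlled, which is the pattern a halo effect would produce.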

UNIT III

TYPES OF PSYCHOLOGICAL TESTING

When administered and evaluated properly, psychological tests are accurate tools used
to diagnose and treat mental health conditions. When you hear the words “psychological
testing,” all kinds of questions and thoughts may run through your mind.

Psychological testing is the basis for mental health treatment. These tools are often used
to measure and observe a person’s behaviors, emotions, and thoughts. Tests are performed
by a psychologist who will evaluate the results to determine the cause, severity, and duration
of your symptoms. This will guide them in creating a treatment plan that meets your needs.

Tests can either be objective or projective:

• Objective testing involves answering questions with set responses like yes/no or
true/false.

• Projective testing evaluates responses to ambiguous stimuli in the hopes of
uncovering hidden emotions and internal conflicts.

Psychologists use testing to examine a variety of factors, including emotional intelligence,
personality, mental aptitude, and neurological functioning.

Here’s a more in-depth look at the types of testing available and the most commonly used
tests for each category.

Personality tests

Measure behaviors, emotions, attitudes, and behavioral and environmental characteristics

Test names: Basic Personality Inventory (BPI), 16 Personality Factor Questionnaire

Achievement tests

Measure respondents’ intellectual interests, achievements, and cognitive abilities

Test names: Woodcock-Johnson Psychoeducational Battery, Kaufman Test of Educational
Achievement (K-TEA)

Attitude tests

Measure views of respondents based on how much they agree or disagree with a statement
Test names: Likert Scale, Thurstone Scale

Aptitude tests

Measure capabilities, skill sets, and projection of future success


Test names: Visual Reasoning Test, Abstract Reasoning Test

Emotional Intelligence tests

Measure emotional responses such as anger, sadness, happiness, and impulsivity

Test names: Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), Emotional and
Social Competence Inventory

There are a number of core principles that form the foundation for psychological assessment:

• Tests are samples of behavior.

• Tests do not directly reveal traits or capacities, but may allow inferences to be made
about the person being examined.

• Tests should have adequate reliability and validity.

• Test scores and other test performances may be adversely affected by temporary
states of fatigue, anxiety, or stress; by disturbances in temperament or personality; or
by brain damage.

A psychological evaluation can be a key part of your therapy journey. It gathers
information about how you think, feel, behave, and much more.

A psychological evaluation is often thought of as the first line of defense in diagnosing and
treating a mental health condition. Performed by a psychologist, it helps them gain an
understanding of the severity and duration of your symptoms. Tests and assessments are the
two main components used in an evaluation. Testing typically includes using formal tests, or
“norm-referenced” tests. These are standardized tests that measure an individual’s ability to
learn and understand several concepts.

Common components of an assessment include:

• psychological tests

• surveys and tests

• interviews

• observational data

• medical and school history

• medical evaluation

PERSONALITY TEST

Personality

McClelland (1951, p. 69) defined personality as “the most adequate conceptualization
of a person’s behavior in all its detail.”

Menninger (1953, p. 23) defined it as “the individual as a whole, his height and weight
and love and hates and blood pressure and reflexes; his smiles and hopes and bowed legs
and enlarged tonsils.”

Cohen and Swerdlik define personality as an individual’s unique constellation of
psychological traits and states.

Personality Assessment

Personality assessment may be defined as the measurement and evaluation of psychological
traits, states, values, interests, attitudes, worldview, acculturation, personal identity, sense of
humor, cognitive and behavioral styles, and/or related individual characteristics.

Personality Traits, Types, and States

Guilford (1959, p. 6) defined personality trait as “Any distinguishable, relatively
enduring way in which one individual varies from another.”

A personality type is a constellation of traits and states that is similar in pattern to
one identified category of personality within a taxonomy of personalities.

A personality state is an inferred psychodynamic disposition designed to convey the
dynamic quality of id, ego, and superego in perpetual conflict.

Personality Traits, Types, and States


Personality Trait

- Physical Aggression subscale of the Aggression Questionnaire

Personality Type

- Myers-Briggs Type Indicator (MBTI; Myers & Briggs, 1943/1962)

Personality State

- State-Trait Anxiety Inventory (STAI; Spielberger et al., 1980)

Personality Assessment Methods


Objective Methods

- objective methods of personality assessment characteristically contain short-answer items
for which the assessee’s task is to select one response from the two or more provided.

Projective Methods

- the projective hypothesis holds that an individual supplies structure to unstructured
stimuli in a manner consistent with the individual’s own unique pattern of conscious and
unconscious needs, fears, desires, impulses, conflicts, and ways of perceiving and responding.

Inkblots as Projective Stimuli

The Rorschach

Hermann Rorschach developed what he called a “form interpretation test”
using inkblots as the forms to be interpreted.

Pictures as Projective Stimuli

Your story should have a beginning, a middle, and an end.

Pictures used as projective stimuli may be photos of real people, animals, objects, or anything.
They may be paintings, drawings, etchings, or any other variety of picture.

The Thematic Apperception Test

In the TAT manual, Murray (1943) also advised examiners to attempt to find out the source
of the examinee’s story. It is noteworthy that the noun apperception is derived from the verb
apperceive, which may be defined as to perceive in terms of past perceptions.

Word association tests

-a word association test may be defined as a semistructured, individually administered,
projective technique of personality assessment that involves the presentation of a list
of stimulus words, to each of which an assessee responds verbally or in writing with
whatever comes to mind first upon hearing the word.

Sentence completion tests

-developed for use in specific types of settings (such as school or business) or for
specific purposes. Sentence completion tests may be relatively atheoretical or linked
very closely to some theory.

Projective
Sounds as Projective Stimuli
Auditory Projective Test

This inspired Skinner to think of an application for sound, not only in behavioral terms but in
the elicitation of “latent” verbal behavior that was significant “in the Freudian sense” (Skinner,
1979, p. 175).

Auditory Apperception Test (Stone, 1950)

-the subject’s task was to respond by creating a story based on three sounds played on a
phonograph record.

Other auditory projective tests

One was called the Auditory Sound Association Test (Wilmer & Husni, 1951) and the other
was referred to as an auditory apperception test (Ball & Bernardoni, 1953). Henry Murray
also got into the act with his Azzageddi test (Davids & Murray, 1955), named for a Herman
Melville character. Unlike other auditory projectives, the Azzageddi presented subjects
with spoken paragraphs.

Projective
The Production of Figure Drawings

A relatively quick, easily administered projective technique is the analysis of drawings.
Drawings can provide the psychodiagnostician with a wealth of clinical hypotheses to be
confirmed or discarded as the result of other findings.

Figure-drawing tests

A figure drawing test may be defined as a projective method of personality assessment
whereby the assessee produces a drawing that is analyzed on the basis of its content and
related variables.

Personality Projection in the Drawing of the Human Figure by Karen Machover (1949).

Machover wrote that the human figure drawn by an individual who is directed to “draw a
person” [is] related intimately to the impulses, anxieties, conflicts, and compensations
characteristic of that individual. In some sense, the figure drawn is the person, and the paper
corresponds to the environment.

The House-Tree-Person test (HTP; Buck, 1948)

- is another projective figure-drawing test. As the name of the test implies, the test taker’s
task is to draw a picture of a house, a tree, and a person.

OTHER TESTS

Attitude Test

-Attitude testing is done to measure people's attitudes. The purpose is to quantify people's
beliefs and behaviors to inform decisions, understand human differences, and gain knowledge
about personality types. Attitude testing can be done directly or indirectly.

-Attitude is fundamental to the success or failure that we experience in life. There is little
difference in people physically or intellectually. But what does make the difference is the
attitude.

Emotional Intelligence Tests

-are widely used to screen candidates for various jobs. Employers are often interested
in figuring out which applicants are likely to be resilient, self-motivated, and good at
cooperating with others, and many turn to EQ tests as a way to assess these traits.

-Emotional intelligence can significantly impact various aspects of your life, including
behavior in family, friendships, and workplace relationships.

Neuropsychological Tests

-refer to a number of tests that healthcare providers use to get information about
how your brain works.

Projective Tests

-are used to measure personality. Subjects are shown ambiguous images or asked
open-ended questions, and their answers give interviewers insights into the person's
unconscious attitudes and beliefs.

-A projective test is a personality test in which subjects are shown ambiguous images
and asked to interpret them. The subjects are to project their own emotions, attitudes,
and impulses onto the image, and then use these projections to explain an image, tell
a story, or finish a sentence.

-The Rorschach Inkblot Test is the best known projective test, and it is also the
first test of its kind developed. Subjects are shown a series of cards with inkblot images
and asked what the images could be.

Direct Observation Test

-are a type of psychological test that involves observing people in a structured way,
either in a laboratory or natural setting, as they carry out various pre-determined
activities. These tests are used mainly to study children's behavior, including how they
interact with other family members.

-There are some well-known examples of direct observation tests. One is the Parent-Child
Interaction Assessment (PCIA), which helps psychologists understand how parents
and children interact through language and behavior when they are playing.

The Parent-Child Early Relational Assessment is a direct observation test used
as a family assessment tool. It involves the video recording of parents with their
preschool-age children in four distinct activities: feeding, a familiar task, a familiar
game, and a novel game that requires teaching.

Unit IV Applications and Issues in Psychological Testing

TESTING AND THE LAW

PSYCHOLOGICAL TESTING of government employees has become widespread
in recent years. The use by the federal government of these tests, particularly those
which purport to measure and categorize "personality," poses a unique challenge to
both Congress and the courts in their effort to protect individual constitutional rights.
How the problems raised by these and other devices are dealt with may be an
important indicator of the effectiveness of judicial and legislative control over that
complex bureaucracy which is our federal government. Additionally, solutions to these
problems may measure the ability of Congress and the courts to cope with the
demands which technological advances have placed upon our system of government,
upon the very fabric of our cultural life, and upon the concept of individual rights in a
democratic society.

THE ROLE OF CONGRESS: INVESTIGATION BY THE SUBCOMMITTEE


A. Scope of the Subcommittee Inquiry
In the course of investigating the rights of federal employees, the Constitutional Rights
Subcommittee over a two-year period received and investigated numerous complaints that
federal employees were being subjected to mind-probing sessions with government
psychiatrists and psychologists in general screening programs, such as that used by the Peace
Corps, or for hiring, firing, and promotion purposes. The investigation shows that supervisors
may suggest or require "fitness for duty examinations" which may include psychological
testing, under subtle threats of disciplinary action for insubordination or loss of a job.

B. Alleged Authority to Test

It has been suggested that psychological testing and the procedures under which such
tests are administered violate the concept of the merit system, and may be used to circumvent
the procedural guarantees established by Congress in the basic civil service laws as interpreted
by the courts. Nevertheless, as authority for their procedure concerning mental fitness exams
and personality testing, government officials cite executive orders and civil service laws
recognizing presidential authority over selection procedures.

C. Current Practices and Uses of Tests

1. The Need for Testing


In defense of governmental use of psychological tests, officials argue that testing is
necessary, effective, and constitutes no real invasion of privacy. Civil Service Commission
Chairman John Macy testified that the Government is thereby able to screen its work force to
disqualify individuals with "demonstrable emotional or behavioral disorders that would create
a hazard both to the government and to the employee."
According to a representative of the State Department, psychiatric evaluations and
psychological testing are two necessary and effective means of assuring that its employees
are fit for employment from both the medical and security standpoints. The Department
considers virtually every one of its positions, both in Washington and overseas, to be a
sensitive job requiring access to classified information.

2. Testing Procedures
The State Department representative described existing procedures under which
employees of the Department and twelve other agencies are examined. In each instance the
medical staff determines whether an individual should have a psychiatric evaluation or
undergo psychological testing, or both. These steps are taken whenever a staff physician
believes an employee may have an emotional or psychological problem which would require
treatment or impair his judgment and reliability or be aggravated by an overseas assignment.
If the employee agrees to a psychiatric examination, he is given a choice of one of the
Department's four consulting psychiatrists.

3. Agency Control of Testing


There is evidently a general disclaimer of responsibility by the agencies for fairness
and effectiveness of methods utilized in medical and psychiatric evaluations and for the
adequacy of qualifications of personnel involved. This is probably the primary reason for the
lack of uniformity of standards and procedures among agencies, and for variations in the way
different cases are handled in one department.

4. Agency Uses of Testing


Psychological tests are given by the various agencies of the Government for a wide
variety of purposes. The Peace Corps, for example, makes use of the MMPI and other
personality tests as an integral part of the selection process. It considers the MMPI the "only
objective personality inventory which helps identify persons who may have or develop serious
personality disorders."

THE CONSTITUTIONAL CASE AGAINST TESTING

As we have pointed out, there exists no body of case law concerning psychological
testing as a condition or incident of government employment. Therefore, any guidelines
which the courts may in the future lay down in this area must evolve from one or more present
trends of constitutional development. The first of these is the law regulating the employment
relationship where the Government is the employer. The inquiry here relates to the
Government's power to impose conditions upon that relationship and the extent to which this
power is circumscribed by the due process clause of the fifth amendment. The second trend
concerns recent developments which define, however vaguely, a constitutional right of
privacy.
A view widely held among psychologists, administrators and even members of
Congress is that federal employment is not a "right" but a "privilege." This leads to the
immediate and facile conclusion that personality testing, or any other requirement for that
matter, may be made a condition of public employment regardless of any adverse
consequences to the individual.

A. Reasonably Related to the Desired Goal

In the first instance, it must be pointed out that the "reasonableness" test has
most frequently been applied to legislative action. However, where departments and
agencies rely upon general statutes for rule-making powers over their employees, it
would be logically inconsistent to suggest that the legislature is constrained by notions
of due process but that the various departments have a completely free hand to act.
If in accordance with traditional due process concepts the agencies may only act in a
manner reasonably calculated to achieve their legitimate ends, it could be argued that
psychological testing is purely arbitrary and therefore does not meet this criterion.
Even if some nexus can be shown between promoting the efficiency of the federal
service and the use of psychological tests, the serious infringement on personal liberty
which results from such tests would compel that the nexus be clearly indicated.

B. Right to Rebut Test Evidence

If it be argued that psychological testing may have some usefulness as a
screening device but that it is by no means an accurate indicator in every instance, the
question arises as to whether the employee should be able to present his own rebutting
psychological data, in effect to "cross-examine" the tests. The Supreme Court has given rather
little guidance to indicate which procedures are necessary to insure that the
requirements of due process are met. Traditionally the courts have treated admission,
promotion, and dismissal from the civil service as matters to be dealt with by the
executive branch. Therefore, rather than set down standards of its own, the Supreme
Court in recent years has contented itself with scrutinizing the details of particular
cases to make certain that the various departments have rigidly adhered to whatever
procedural rules they may have enacted. Thus the constitutional issue has been
avoided.

C. The Analogy to Involuntary Confessions and Self-Incrimination

There are those who take an even dimmer view of psychological testing and
would ban it completely as a government personnel screening device. The argument
may be expressed in the following terms: Because of the social stigma attached to
adverse test results, the employee should be given the same rights as he would have
in a criminal trial. The search and seizure of the contents of men's minds by a forced
submission to psychological testing should be denounced as offensive to those canons
of decency and fairness which express the notions of justice of English-speaking
peoples. A comparison can be made to the pumping of a man's stomach in order to
obtain evidence of illegal narcotics possession, a practice which was condemned by
the Court in Rochin v. California. To the extent that the analogy to criminal
proceedings can be maintained, it is obvious that there are also self-incrimination
objections to the utilization of test scores involuntarily received as a basis for adverse
action against the employee.

D. The "Right of Privacy"


The final constitutional blow to be struck against psychological testing derives
from the evolving notion of a "right of privacy." This newest of constitutional rights
was initially an aspect of the fourth amendment's search and seizure clause and the
self-incrimination provision of the fifth amendment. However, it received an
independent status in Griswold v. Connecticut, grounded on the penumbras of the
specific guarantees of the Bill of Rights, the concept of "liberty" contained within the
due process clause of the fourteenth amendment, and the ninth amendment.

PROPOSED SOLUTIONS TO THE QUESTION OF TESTING

A successful constitutional attack on psychological testing in a court
action appears to be a real possibility in the near future. The testimony received by the
Constitutional Rights Subcommittee shows that existing procedures for psychiatric evaluations
and psychological testing are deficient in terms of protection of employee rights. The necessity
for a court test, however, could be eliminated by changes in the current testing practices and
procedures used by the Government. Various alterations in the present situation were
suggested to the Subcommittee. One solution to at least part of the problem is to afford the
employee, and perhaps the applicant, an effective means of challenging the psychological
reports and the expertise of the psychologist.

In the final analysis, a thorough-going reform of existing procedures relating to
psychological testing is a matter which must be confronted by Congress. Congress must decide
whether, in light of the evolving law surrounding the right of privacy and the employment
relationship, a government employee's rights are inferior to those of any other citizen.
Congressional hearings on testing have pointed the way to solutions. They have, from all
indications, also initiated a much-needed dialogue between lawyers and others concerned
with individual rights and the scientists, technicians, and professional medical men responsible
for the new scientific instruments and devices. In the private sector, observance of the
individual's rights will depend to a very great extent upon the intensity and continuity of that
debate. However, insofar as a citizen's relations with his Government are concerned, Congress
has it within its power to insure that individual rights and liberties are not seconded to
technology.

c. Ethics and the Future of Psychological Testing


Theoretical Concerns
The dependability (reliability) of test results is one of the most significant considerations
underlying tests (Thomas & Selthon, 2003; Tryon & Bernstein, 2003). Reliability is defined as the
degree to which test scores are consistent and dependable, that is, the extent to which tests are
relatively free of measurement error (Abe, 2012). Reliability places an upper limit on validity. A
test that is totally unreliable is meaningless. There may be exceptions to this statement, but general
application demands that tests possess some form of reliability or stability. As a direct consequence,
whatever is being measured must itself have reliability. To say that a test has reliability implies
that test results are attributable to a systematic source of variance which is itself reliable
(American Educational Research Association, American Psychological Association, & National
Council on Measurement in Education [AERA, APA, & NCME], 1999; APA, 2002).
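The claim that reliability places an upper limit on validity can be illustrated numerically. The
sketch below assumes a classical test theory result that the reviewer itself does not state: an
observed validity coefficient cannot exceed the square root of the product of the test's reliability
and the criterion's reliability. The function name and the sample values are hypothetical and chosen
only for illustration.

```python
import math


def max_validity(test_reliability, criterion_reliability=1.0):
    """Return the ceiling on a test-criterion correlation.

    Assumption (classical test theory): the validity coefficient cannot
    exceed sqrt(test reliability * criterion reliability).
    """
    for r in (test_reliability, criterion_reliability):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliability coefficients must lie in [0, 1]")
    return math.sqrt(test_reliability * criterion_reliability)


# Hypothetical reliabilities: a totally unreliable test has a ceiling of zero.
for r_xx in (0.0, 0.49, 0.81, 1.0):
    print(f"reliability={r_xx:.2f} -> validity ceiling={max_validity(r_xx):.2f}")
```

Under this assumption a test with zero reliability cannot correlate with any criterion, which is one
way to read the statement that a totally unreliable test is meaningless.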
Most psychological tests today measure a presumably stable entity: either the person as he or
she currently functions or some temporally stable characteristic of the person. In describing
current functioning, psychologists suggest that the individual functions this way in a fairly
stable, though perhaps short-term, manner that is independent of the situation or environment.
In other words, psychologists assume that they can describe the individual in absolute terms, as
if in a vacuum. Psychologists may opine that the individual is emotionally unstable or that the
individual is out of contact with reality, or provide a diagnostic label such as schizophrenic or
neurotic.

Moral Issues in Psychological Testing


The field of psychological testing is being shaped by moral issues such as human rights, labelling, and
privacy invasion.
Human Rights
Several different kinds of human rights are recognized in psychological testing. Test takers are
usually treated with courtesy, irrespective of gender, ethnicity, state or nation of origin, religious
affiliation, age, and so on. Test takers are usually tested with measures that meet appropriate
professional standards, and they receive an explanation, prior to testing, of the kind(s) of tests to be
conducted. Individuals who do not want to subject themselves to testing should not, and ethically
cannot, be forced to do so; hence the individual's freedom to decline and freedom to withdraw are
highly respected, except in situations where the testing is mandated by law or government (APA, 2002).
Labelling
There is nothing inherently wrong in diagnosing people with a kidney problem or disease, but labelling
people with certain medical diseases, such as Acquired Immunodeficiency Syndrome (AIDS), or with
psychiatric disorders can be damaging. For example, a sizeable share of the general public has
little understanding of schizophrenia. When diagnosing this kind of disorder, it is advisable
to use the least stigmatizing label consistent with accurate representation. Labels have the capacity
to affect one's access to help. For instance, chronic schizophrenia is not curable; as such, labelling
someone a chronic schizophrenic may be harmful (McReynolds, Ward, & Singer, 2002).
Privacy Invasion
When people respond to psychological tests, they have little idea what is being
revealed. But in many cases they feel that their privacy has been invaded in a way that is not justified
by the tests' benefits (Brayfield, 1965). Dahlstrom (1969) stated that the issue of privacy invasion is
based on a serious misunderstanding. He maintained that because tests have been oversold, the public
does not know their limits. The ambiguity of the notion of invasion of privacy is an important issue in
psychological testing. There is nothing inherently wrong or detrimental in trying to find out about a
person. It is only the wrong application or use of the information gathered from the person that
amounts to an invasion of the person's privacy.
Test Constructors and Test Users’ Responsibility
The testing profession has become increasingly stringent and precise in defining the ethics
and responsibility of test designers and test users. This is because even the best test can be misused.
In the right circumstance, almost any test can be useful, but when inappropriately used, even the best
test can be dangerous to the individual (APA, 2002). A major concern is the utilization of tests with
different populations. A test that is reliable and valid for group A may not be valid and reliable for
group B. In light of this issue, psychologists who administer tests are instructed to employ instruments
whose validity and reliability have been established for use with members of the population being
tested and to utilize assessment techniques that are most appropriate to a person's preferred
language.
Issues of Social Concern
In psychological testing, social issues such as dehumanization, usefulness of tests and access
to psychological testing services are of essential importance. This discussion is limited to
dehumanization and the usefulness of tests only.
Dehumanization
Some forms of testing remove any human element from the judgement-making process. This is seen as
becoming more widespread with the increase in computer-based testing. For instance, some
corporations provide computerized analysis of the Minnesota Multiphasic Personality Inventory
(MMPI-2) and other test results (Kaplan & Saccuzzo, 2009). Such technology tends to reduce test takers'
freedom and uniqueness. With high-speed information and communication technology (computers) and
centralized data banks, the probability that computers will someday provide important evaluation
judgements about human lives is on the increase.
Usefulness of Tests
The important issue in testing is not whether the tests are perfect but whether they are useful
to the individual or to society. Tests need not be perfect in every area. Society often finds uses for
initially rough or simple instruments that become precise with research and development
(McKnow, 2007; Meyer et al., 2003; Sawyer, 2007). For instance, when scientists believed that the sun
revolved around the earth, the available methods and principles were useful in that they led to
some precise predictions, even though the underlying theories were incorrect. In like manner, the
assumptions beneath today's tests may be fundamentally incorrect and the resulting test instruments
far from perfect. The tests, however, may still be useful as long as they provide data that lead to
better predictions and understanding than can otherwise be obtained.
Current Fashions in Psychological Testing
Among the current fashions or issues in psychological testing are the development of new
tests (higher standards, improved technology, and objectivity), increase in public awareness and
influence, and computer and internet application.
The Development of New Tests
Studies have shown that hundreds of new tests are being published each year. The impetus
for developing these new tests comes from professional disagreement over the best strategies for
measuring human behaviour, the nature of these behaviours, and theories of these human
characteristics. An example is the 2004 revision of the Kaufman Assessment Battery for
Children (KABC-II), an individual ability test for children between 3 and 18 years of age. The
test consists of 18 subtests combined into five global scales called sequential processing, simultaneous
processing, learning, planning, and knowledge (Kaufman & Kaufman, 2004a).
Increased Public Awareness and Influence on Testing
Increased public awareness of the nature and usefulness of tests has led to increasing external
influence on testing. Before this time, the public had little or no knowledge about psychological tests.
Today, there is widespread awareness among the general public of the need for and importance of
psychological tests and other forms of tests.
Computer-Based Testing
One of the major trends in testing is the use of computers. Computers are being used in many
different ways. For example, in adaptive computerized testing, different sets of test questions are
administered by computer to different test takers, depending on each test taker's standing on the trait
being measured (Mills, Potenza, Fremer, & Ward, 2003; Weiss, 1983, 1985). Likewise in ability testing,
the computer adjusts the level of item difficulty according to the test taker's response. If the test
taker's answer is incorrect, then an easier item is given; if correct, then a more difficult item
appears next.
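The adjustment rule described above can be sketched in a few lines of Python. This is a simplified
illustration only: the 1-to-10 difficulty scale, the one-level step, and the function names are
assumptions made for the example, and operational adaptive tests generally select items with
statistical models rather than a fixed step.

```python
def next_difficulty(current, was_correct, lowest=1, highest=10):
    """Move one level harder after a correct answer, one level easier otherwise.

    The 1-10 scale and the step size of one level are hypothetical choices.
    The result is clamped so it never leaves the available item levels.
    """
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_adaptive_session(responses, start=5):
    """Return the difficulty level presented at each point in the session."""
    levels = [start]
    for was_correct in responses:
        levels.append(next_difficulty(levels[-1], was_correct))
    return levels


# Hypothetical response pattern: True means the item was answered correctly.
print(run_adaptive_session([True, True, False, True, False, False]))
# prints [5, 6, 7, 6, 7, 6, 5]
```

The sequence rises after correct answers and falls after incorrect ones, so the items presented tend
to settle near the level at which the test taker answers correctly about half the time.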
The Hope of New and Improved Tests
Psychologists believe that the dominant role of some of the popular tests, such as the
Stanford-Binet and Wechsler tests, is far from secure. These two intelligence scales are probably as
technically adequate as they will ever be. They can be improved through minor revisions to update
test stimuli and to provide larger and even more representative normative samples, with special norms
for particular groups, and through additional research to extend and support validity evidence.
All psychological tests are based on theories of human functioning. The validity of these
theories and their underlying assumptions is far from proven. Moreover, there seems to be no consensus
or generally agreed assumption about the essence of human personality, normal or abnormal. With
growing awareness among test users and the demand they create for testing, the existing
psychological tests need to be improved, as some of the tests today may not be able to meet the
psychological needs of individuals, considering the changes in body chemistry that sometimes
have psychological implications for human personality or traits. As the Kaufman Assessment
Battery for Children, structured personality tests, and the MMPI-2 are already pioneering the 21st
century, psychologists should be more creative in building new tests that will meet the future testing
needs of a fast-growing population and be persistent in modifying the existing tests while
accomplishing the goals of psychological testing.
