Salgado JOOP 2014 — The Validity of Ipsative and Quasi-Ipsative FC Personality Inventories
Practitioner points
Personality inventories have been widely used in personnel selection, but it was thought that their
predictive validity was small. We found that they are substantially more valid than was previously
thought.
The traditional opinion among researchers in I/O psychology is that single-stimulus personality
inventories (e.g., normative Likert-type scales) have superior predictive validity to FC personality
questionnaires (e.g., ipsative inventories), but our research findings suggest that this is not true for
quasi-ipsative inventories.
In comparison with ipsative and normative personality inventories, quasi-ipsative personality
inventories showed higher predictive validity regardless of occupational group.
Based on our results, we recommend the use of quasi-ipsative FC personality measures in personnel
selection decisions regardless of the occupational group being recruited for.
Several meta-analyses conducted over the last 20 years have shown that the Five-Factor
Model (FFM) of personality predicts a wide range of performance outcomes, including job
performance, training proficiency, counterproductive behaviours, accidents, job satis-
faction, leadership, and innovative behaviours in the workplace (Barrick & Mount, 1991;
Bartram, 2005; Clarke & Robertson, 2005; Feist, 1998; Hough, 1992; Hülsheger,
*Correspondence should be addressed to Jesus F. Salgado, Facultad de Relaciones Laborales, University of Santiago de
Compostela, Campus Vida, 15782 Santiago de Compostela, Spain (email: jesus.salgado@usc.es).
DOI:10.1111/joop.12098
Anderson, & Salgado, 2009; Hurtz & Donovan, 2000; Judge & Bono, 2001; Judge, Rodell,
Klinger, Simon, & Crawford, 2013; Salgado, 1997, 2002, 2003; Tett, Rothstein, & Jackson,
1991). Research has also demonstrated that the FFM is a robust framework for grouping
the large variety of personality measures developed within the various theoretical
approaches (Barrick & Mount, 1991; Hough, 1992; Hurtz & Donovan, 2000; Salgado,
1997; Tett et al., 1991). Across these meta-analytic efforts, conscientiousness and emotional stability were consistently found to be valid predictors of job performance for all occupations, whereas the other three personality dimensions were valid predictors for specific criteria and specific occupations (Barrick, Mount, & Judge, 2001).
All the meta-analyses mentioned above were carried out almost exclusively with
validity studies that used single-stimulus (SS) personality inventories, where individuals
evaluate one item at a time (e.g., Likert scales, yes/no, or true/false items), that is, inventories where individuals evaluate each item separately from the other items. Respondents therefore make absolute judgments about the extent to which the item describes their personality (Brown & Maydeu-Olivares, 2013).
In contrast to SS inventories, another type of personality inventory used in personnel selection is the forced-choice (FC) inventory. Tett, Christiansen, Robie, and Simonet (2011) found that 30% of companies used FC personality inventories. Despite the widespread use of this type of inventory among practitioners,
they have received relatively little attention from researchers in comparison with SS
inventories, in part due to the fact that many of them result in ipsative scores that produce
controversial statistical dilemmas (see, for instance, Bartram, 1996; Furnham, Steele, &
Pendleton, 1993; Johnson, Wood, & Blinkhorn, 1988; Saville & Willson, 1991; Tenopyr,
1988). Only recently has the validity of FC inventories begun to be investigated meta-
analytically (e.g., Bartram, 2005, 2007; Salgado & Tauriz, 2014).
The purpose of the present study is therefore to shed light on the validity of the FC
personality inventories, more specifically, on two important issues that have generally
been overlooked in existing research. The first is to examine whether occupational group
(i.e., job type) is a moderator of the validity of the two main types of FC inventories (i.e.,
ipsative and quasi-ipsative inventories). The second issue is to compare the overall results
of the current meta-analysis of ipsative and quasi-ipsative inventories with the validity of
SS inventories as reported in the previous meta-analyses of Barrick and Mount (1991),
Salgado (1997), and Hurtz and Donovan (2000). Our main contribution lies in highlighting
the role that these two issues play in the validity of personality inventories for predicting
job performance. A final issue examined is whether the type of score (ipsative vs. quasi-
ipsative) has effects on the validity of FC personality inventories.
The second type, purely ipsative scores, includes those measures that totally meet Clemans' (1966) criterion of ipsativity, according to
which ‘any score matrix is said to be ipsative when the sum of the scores obtained over the
attributes measured for each respondent is constant’. The Occupational Personality
Questionnaire (OPQ; SHL, 2006), the Edwards Personal Preferences Schedule (EPPS;
Edwards, 1957), and the Description en Cinq Dimensions (D5D; Rolland & Mogenet,
2001) are three good examples of ipsative inventories. The third type, quasi-ipsative
scores, includes measures that do not totally meet the criterion of pure ipsativity
suggested by Clemans (1966), because, for example, not all alternatives ranked by
respondents are scored or the scales have different numbers of items. The Gordon
Personal Profile-Inventory (GPP-I, Gordon, 1993) and the IPIP-MFC (Heggestad, Morrison,
Reeve, & McCloy, 2006) are two examples of quasi-ipsative inventories.
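To make the distinction concrete, the following minimal sketch (an illustration added for clarity, not part of the original scoring procedures; the function name is_purely_ipsative and the toy score matrices are hypothetical) checks Clemans' constant-sum criterion on small score matrices:

```python
import numpy as np

def is_purely_ipsative(score_matrix: np.ndarray) -> bool:
    """Clemans' (1966) criterion: the scores summed over the measured
    attributes equal the same constant for every respondent."""
    row_sums = score_matrix.sum(axis=1)
    return bool(np.allclose(row_sums, row_sums[0]))

# Fully ranked blocks (ranks 1-5 assigned to five scales): every row sums to 15
ranked = np.array([[5, 4, 3, 2, 1],
                   [1, 5, 2, 4, 3],
                   [3, 1, 5, 2, 4]])
print(is_purely_ipsative(ranked))   # True -> purely ipsative

# Scales scored with different numbers of items: row sums differ
partial = np.array([[4, 3, 1, 2, 0],
                    [2, 5, 0, 1, 1],
                    [3, 2, 2, 0, 1]])
print(is_purely_ipsative(partial))  # False -> at most quasi-ipsative
```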
Each score type has important metric characteristics. For example, in the case of
normative scoring, the scores of an individual are statistically dependent on other
individuals in the population and independent of other scores of the assessed individual
(e.g., scores in other attributes) (see Bartram, 1996; Clemans, 1966; Hicks, 1970).
Consequently, normative scoring allows the comparison of individuals or groups on each
measured variable (i.e., they are interindividual scores).
In the case of ipsative measurement, the scores in a variable are dependent on the level
of the individual in other variables that are assessed and statistically independent of the
scores of other individuals in the population. In other words, a high score in one attribute
(e.g., conscientiousness) is necessarily accompanied by a low score in another attribute
(e.g., extroversion) due to the statistical dependence between the two scores. Therefore,
ipsative scores allow comparison of the individual's level across variables (i.e., they are intra-individual scores), but they may be less appropriate for comparisons among
individuals (Cattell & Brennan, 1994; Clemans, 1966; Hicks, 1970). Additionally, ipsative
scores may have some characteristics which are seen as problematic from the
psychometric point of view, including negative correlations with other measures
(Meglino & Ravlin, 1998), inflated reliability coefficients (Johnson et al., 1988; Tenopyr,
1988; see also Bartram, 1996; and Thompson, Levitov, & Miederhoff, 1982; for a different
conclusion), and being inappropriate for factor analysis (Cattell & Brennan, 1994;
Cornwell & Dunlap, 1994; Dunlap & Cornwell, 1994; Meade, 2004).
According to Hicks (1970), Horn (1971), Gleser (1972), Gordon (1993), Cattell and
Brennan (1994), and Meade (2004), among others, quasi-ipsative scores are obtained if
any of the following conditions applies: (1) the summed attributes vary between
individuals over a certain range of score units of the tests, (2) the inventory yields attribute scores that do not sum to the same constant for all individuals, even if the scores in these tests possess properties in common with ipsative tests, and (3) the score elevation on one
attribute does not necessarily produce a score depression on other attributes. These
conditions for data to be quasi-ipsative may be achieved by means of several strategies. For
example, the following six strategies produce quasi-ipsativization: (1) individuals only
partially order the items, rather than ordering them completely; (2) scales have different
numbers of items; (3) not all alternatives ranked by respondents are scored; (4) scales are
scored differentially for individuals with different characteristics or involve different
normative transformations on the basis of respondent characteristics; (5) scored
alternatives are weighted differentially; and (6) the questionnaire has normative sections.
In view of this, quasi-ipsative scores share some psychometric characteristics with both
normative and purely ipsative scores. For example, they allow the comparison of
individuals and groups but simultaneously some degree of dependence can be found
among the scales of the questionnaire (Horn, 1971). Quasi-ipsative measures do not have
the psychometric limitations of ipsative scores (Cattell & Brennan, 1994; Hicks, 1970;
Horn, 1971).
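As an illustration of strategy (3) above, the following toy simulation (a hedged sketch; the two-alternative blocks with one unscored filler alternative are an assumption made for illustration, not a description of any specific inventory) shows that when only keyed alternatives are scored, the row sums vary across respondents, so Clemans' constant-sum criterion is no longer met:

```python
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_blocks, n_scales = 200, 20, 5

# Each block pairs one keyed item (belonging to one of the five scales)
# with an unscored filler alternative; only keyed choices earn a point.
block_scale = rng.integers(n_scales, size=n_blocks)

scores = np.zeros((n_respondents, n_scales))
for i in range(n_respondents):
    for b in range(n_blocks):
        if rng.random() < 0.5:              # respondent picks the keyed alternative
            scores[i, block_scale[b]] += 1  # the filler alternative is never scored

row_sums = scores.sum(axis=1)
print(row_sums.min(), row_sums.max())  # sums differ across respondents,
                                       # so the constant-sum criterion fails
```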
One aim of the present meta-analysis is to examine and comprehensively report the moderator
effects of occupational group on the validity of FC inventories. Based on the previous
findings, we state the following hypothesis:
Hypothesis 1: The validity of personality measures will be moderated by occupational
category. More specifically, as measured by FC personality inventories,
Emotional Stability and Conscientiousness are valid predictors of job
performance in all occupational categories, Extraversion is a valid
predictor of job performance in sales and managerial occupations, and
Agreeableness is a valid predictor in customer service and health
occupations.
Hypothesis 2: The validity of quasi-ipsative personality inventories will be larger than the validity of ipsative personality inventories.
Hypothesis 3: The validity of SS inventories will be larger than the validity of quasi-ipsative and ipsative inventories.
Method
Literature search and coding of studies
Computer-based and manual literature searches were conducted to identify published
and unpublished studies carried out up until 2012. To cover the literature on FC
personality measures as comprehensively as possible, and to prevent any bias in the
inclusion of studies, we adopted a series of search strategies. First, we identified the most
popular FC inventories. They included, for example, the OPQ, EPPS, MBTI, D5D, Survey of
Interpersonal Values Inventory (SIV), and GPP-I. Second, the PsycInfo, Social Sciences
Citation Index, and ABI/Inform databases were searched to identify studies on the
relationship between FC measures and organizational criteria. Several keywords were
used for the computer-based literature search (e.g., ipsative, FC, ipsativity, job
performance), as well as the acronyms of the most popular FC personality inventories.
Third, Internet searches using Google were carried out systematically to look for articles,
unpublished manuscripts, and master’s and doctoral dissertations not included in the
most common databases. Fourth, a manual article-by-article search was carried out in a
number of top-tier journals, including Applied Psychology: An International Review,
Educational and Psychological Measurement, the European Journal of Work and
Organizational Psychology, Human Performance, International Journal of Selection
and Assessment, Journal of Applied Psychology, Journal of Business and Psychology,
Journal of Occupational and Organizational Psychology, Journal of Organizational
Behavior, and Personnel Psychology. Fifth, the reference sections of classic meta-analyses
were reviewed to identify articles not covered in our computer-based search. Sixth, we
contacted a number of researchers and asked for both published articles and unpublished
papers on the topic. Seventh, the technical manuals of the most popular FC personality inventories were examined to find validity coefficients. By means of these search strategies,
a preliminary database of 115 documents (i.e., articles, manuals, technical reports,
unpublished papers, dissertations, and so on) was established for further inspection. Of
these, 26 studies were excluded for various reasons: (1) some studies reported only the
significant correlations, (2) a number of studies only reported multiple correlation results,
(3) several of them did not report correlations or enough information to calculate the
effect size, and (4) several studies reported findings for the same data set. As a result, the present meta-analysis was conducted with 97 independent samples and a
total sample size of 18,593 individuals.
The next step was to classify the scales from the inventories into the Big Five
personality dimensions. A number of studies used a Big Five measure or estimates of the
Big Five (e.g., Bartram, 2007; McDaniel, Yost, Ludwick, Hense, & Hartman, 2004; Nyfield,
Gibbons, Baron, & Robertson, 1995; Robertson, Baron, Gibbons, MacIver, & Nyfield,
2000; SHL, 2006; Warr, Bartram, & Brown, 2005), and the coefficients of these studies
were used directly. With the rest of the studies, we used the following method for
classifying the coefficients within the FFM. First, an exhaustive description of the Big Five
was written and given to the coders (based on the definitions of the Big Five given by
Barrick & Mount, 1991; Hough, 1992; Hough & Ones, 2001; McCrae & Costa, 1990;
Salgado, 1997; among other sources). Next, a list and the definition of the personality
scales from each questionnaire were provided for each coder, with instructions to assign
each scale to the most appropriate factor. Furthermore, some studies reporting factor
analyses of the questionnaires were also used as a basis for the decision (e.g., Matthews,
Stanton, Graham, & Brimelow, 1990; McCrae & Costa, 1990; Piedmont, Costa, & McCrae,
1992; SHL, 2006) because these factor analyses were informative about the Five-Factor
structure of some FC inventories (e.g., OPQ, EPPS). Finally, we also checked the coding
list used by Ones (1993; Hough & Ones, 2001) and Salgado (2003). Two researchers
served as coders, working independently to code every study. One of them was a full
professor of work psychology with extensive experience in the area of personality at work
and meta-analysis. The second coder was a PhD student conducting research on this topic. If the coders agreed on a dimension, the scale was coded in that dimension. Disagreements (<10%) were resolved through discussion until the coders agreed on a dimension.
All the scales were assigned to a single dimension.
For each study, the following information was recorded, if available: (1) sample
characteristics, (2) occupation and related information, (3) personality measures used, (4)
criterion type, (5) reliability of personality measures, (6) criterion reliability, (7) range
restriction (RR) value or data for calculating this value, (8) statistics concerning the
relation between personality measures and criterion, and (9) correlation among the
personality measures when more than one was used. When a study contained conceptual
replications (i.e., two or more measures of the same construct were used in the same
sample), linear composites with unit weights for the components were formed. Linear
composites provide more construct-valid estimates than the use of the average correlation. As demonstrated by Warr, Bartram, and Brown (2005), the average validity
corrected with Mosier’s formula for composite reliability produces very accurate
estimates when the appropriate intercorrelations are used.
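For readers unfamiliar with these computations, a minimal sketch of the standard unit-weighted composite validity and Mosier composite reliability formulas is given below; the function names and the numerical values are illustrative only and are not taken from the studies in the database:

```python
import numpy as np

def composite_validity(r_xy, R):
    """Validity of a unit-weighted composite of standardized components.
    r_xy: correlations of each component with the criterion (length k).
    R:    k x k intercorrelation matrix of the components (unit diagonal)."""
    return np.sum(r_xy) / np.sqrt(np.sum(R))

def mosier_reliability(reliabilities, R):
    """Mosier's formula for the reliability of a unit-weighted composite."""
    R_true = np.array(R, dtype=float)
    np.fill_diagonal(R_true, reliabilities)   # replace 1s with component reliabilities
    return np.sum(R_true) / np.sum(R)

# Two conceptual replications of the same Big Five dimension in one sample
r_xy = np.array([0.20, 0.25])        # validities of the two measures
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])           # intercorrelation of the two measures
rel = np.array([0.80, 0.85])         # reliabilities of the two measures

print(round(composite_validity(r_xy, R), 3))   # ~0.252
print(round(mosier_reliability(rel, R), 3))    # ~0.891
```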
An important difference between this meta-analysis and previous meta-analyses is that
we used the type of FC measure (i.e., normative, ipsative, and quasi-ipsative) for grouping
the validity coefficients. This is especially relevant because different degrees of ipsativity
can result in different validity levels (Clemans, 1966; Hicks, 1970; Radcliffe, 1963). Using
Hicks’ (1970) taxonomy, we divided the personality inventories into three categories:
Purely ipsative, quasi-ipsative, and normative FC questionnaires. To classify the
questionnaires, each one was inspected in terms of the scoring method and the format
of items. Furthermore, we used the technical manuals of the questionnaires when
available, and other articles which were not relevant for this meta-analysis because they
did not include validity data but did include relevant information about the questionnaire
characteristics and scoring system. The initial agreement level of the coders was 95%, and
the disagreements were resolved through discussion until the coders agreed on a questionnaire category.
We classified the questionnaires as being purely ipsative if the sum of the scores obtained over the measured scales was constant. Examples of purely ipsative question-
naires in our database are the EPPS, the OPQ, and the D5D.
We classified FC inventories as quasi-ipsative if any of the strategies and alternatives
mentioned by Hicks (1970) applied. Examples of quasi-ipsative questionnaires in our
database are the GPP-I, the Self-Description Inventory (Ghiselli, 1954), the ESQ (Jackson,
2002), and Assessment Individual Motivation (AIM; Knapp, Heggestad, & Young, 2004;
White, 2002).
Next, we classified a questionnaire as normative FC if it yielded scores that possess the empirical properties of absolute measures. This is the case, for example, of the
inventories in which items representing different degrees of a personality dimension are
never paired with items representing another personality dimension. The MBTI and the
questionnaire of Need of Achievement (Fineman, 1975) are representative examples of
normative FC questionnaires. The list of the FC inventories used in the validity studies
included in the database can be obtained from the first author upon request.
Finally, we examined whether the nature of studies (published vs. unpublished) and
publication year were potential moderators of the validity size. With regard to the nature
of the studies, comparing the overall results of ipsative measures, we found the following results for published and unpublished studies: ES (.02 vs. .05), E (.06 vs. .06), O (.06 vs. .09), A (.00 vs. .09), and C (.09 vs. .11). With regard to the quasi-ipsative studies, the results for published and unpublished validities were: ES (.06 vs. .07), E (.14 vs. .14), A (.06 vs. .01), and C (.16 vs. .14). Therefore, we can confidently conclude that the nature (published/unpublished) of the studies has no practical effect on the overall results. With regard to the second potential moderator, publication year, we examined the correlation between validity size and publication year using the totality of coefficients. We found a very small correlation of .10, which indicates that publication year explains only 1% of the variance of the validity estimates. We also calculated the correlation between publication year and validity for each personality dimension. The correlations between year and validity size for the Big Five were .2, .15, .12, .18, and .08 for ES, E, O, A, and C, respectively; the average correlation was .06. Therefore, our conclusion is that publication year can be discarded as a moderator of the validity.
Meta-analytic procedure
The following step was to apply the psychometric meta-analysis method of Hunter and Schmidt (1990, 2004). Psychometric meta-analysis estimates how much of the observed variance
of findings across studies is due to artefactual errors. The artefacts considered here were
sampling error, criterion and predictor reliability, and indirect range restriction (IRR) in
personality scores. To correct the observed validity for these last three artefacts, the most
common strategy was to develop specific distributions for each of them. Some of these
artefacts reduce the correlations below their operational value (e.g., criterion reliability
and RR), and all of them produce artefactual variability in the observed validity (Carretta & Ree, 2000, 2001). In this meta-analysis, we report the weighted average of
observed validity, the variance and SD of observed validity, the operational validity, the
theoretical validity, the SD of the theoretical validity, and the percentage of variance
accounted for by artefactual errors. We also calculated the 90% credibility value (the lower limit of the 80% credibility interval) and the 95% confidence interval of ρ. The two intervals serve different purposes (Hunter & Schmidt, 2004; Judge & Bono, 2001). The confidence interval provides an estimate of the variability around the estimated mean ρ; that is, if the 95% confidence interval does not include zero, we can be 95% confident that ρ is nonzero. The credibility interval is an estimate of the variability of the individual correlations across studies, and its lower limit means that 90% of the individual correlations are equal to or greater than this value.
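The following bare-bones sketch (illustrative only; it omits the artefact-distribution corrections described above and uses common Hunter-Schmidt approximations for the sampling-error variance and the standard error of the mean, which may differ in detail from the original computations) shows how the weighted mean validity, residual SD, 90% credibility value, and 95% confidence interval are obtained:

```python
import numpy as np

def bare_bones_meta(rs, ns):
    """Bare-bones Hunter-Schmidt aggregation (no artifact corrections):
    weighted mean r, residual SD, % variance due to sampling error,
    90% credibility value, and a 95% confidence interval for the mean."""
    rs, ns = np.asarray(rs, float), np.asarray(ns, float)
    k = len(rs)
    r_bar = np.sum(ns * rs) / np.sum(ns)
    var_obs = np.sum(ns * (rs - r_bar) ** 2) / np.sum(ns)
    var_err = (1 - r_bar ** 2) ** 2 / (np.mean(ns) - 1)   # expected sampling-error variance
    sd_res = np.sqrt(max(var_obs - var_err, 0.0))          # residual SD of r
    pct_ve = 100 * min(var_err / var_obs, 1.0) if var_obs > 0 else 100.0
    cv_90 = r_bar - 1.282 * sd_res                         # lower limit of 80% credibility interval
    se_mean = np.sqrt(var_obs / k)
    ci_95 = (r_bar - 1.96 * se_mean, r_bar + 1.96 * se_mean)
    return r_bar, sd_res, pct_ve, cv_90, ci_95

# Illustrative coefficients and sample sizes, not values from the database
print(bare_bones_meta(rs=[0.10, 0.22, 0.15, 0.05], ns=[150, 220, 90, 310]))
```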
Predictor reliability
The reliability of the personality dimensions was estimated from the coefficients reported
in the studies included in the meta-analysis. As in previous meta-analyses, we used internal
consistency coefficients as estimates of reliability and a reliability distribution was
estimated for each personality dimension. We also examined the question of whether the
three types of FC scoring systems showed different levels of reliability, but they proved to
be very similar. For example, the average reliability estimates were .73, .81, and .81 for normative, ipsative, and quasi-ipsative FC inventories, respectively. Because ipsative and quasi-ipsative inventories showed the same average reliability, we pooled the coefficients for these two formats, created a distribution for each personality dimension, and used these distributions in the meta-analysis. Table 1 presents a summary of these artefact
distributions. Our estimates of the average reliability for the three types of the FC
inventories are very similar to the estimates used in previous meta-analyses with SS
personality inventories (e.g., Barrick & Mount, 1991; Hurtz & Donovan, 2000; Salgado,
1997, 2002, 2003). For example, our figures are very similar to the values found by Viswesvaran and Ones (2000) in their reliability generalization study of the
FFM. Consequently, the empirical evidence we were able to find appears to suggest that
the internal consistency of FC and SS personality inventories is practically the same. This is
an important finding, as some researchers had suggested that FC inventories would show a different level of reliability than SS inventories, with some authors claiming that in these cases the reliability is overestimated (inflated) (Johnson et al., 1988; Tenopyr, 1988). Our
findings agree with Bartram’s (1996) perspective on the reliability of ipsative measures.
However, it should be taken into account that, to be totally conclusive about whether SS
and FC personality inventories have equal internal consistency, further studies are
needed, specifically comparing the number of items. Typically, FC inventories have a much larger number of items than SS inventories. The mean and the SD of the reliability distributions served different purposes in this meta-analysis: the mean was used to correct the observed validity coefficients, and the SD was used to estimate the corrected standard deviations of the operational validity and ρ.
Criterion reliability
The studies included in our database used four types of measures of job performance: (1)
job performance ratings, (2) productivity data (e.g., sales), (3) training proficiency, and (4)
overall job performance. From the literature (e.g., Hunter, 1986; Nathan & Alexander,
1988; Ones, Viswesvaran, & Schmidt, 1993; Salgado & Moscoso, 1996; Salgado et al.,
2003; Schmidt & Zimmerman, 2004; Viswesvaran, Ones, & Schmidt, 1996), it is well
known that each type of job performance measure shows a different degree of reliability.
For example, Viswesvaran et al. (1996) found that the average inter-rater reliability of job
performance ratings was .52. Salgado et al. (2003) found exactly the same value with an
independent database. The reliability of productivity data was .80 in the meta-analysis by
Schmidt and Zimmerman (2004). For training proficiency, Salgado et al. (2003) found an
average reliability of .60 for the ratings of training performance and Hunter (1986)
reported a reliability of .80 for objective measures of training proficiency. In the current
study, the number of studies does not allow the calculation of separate meta-analyses for the triple combination (FC measure type × occupational group × job performance measure type), and not all studies provided information regarding job performance reliability. Therefore, we developed an empirical distribution of criterion reliability for each FC measure type × occupational group combination. To do this, we collapsed all studies into a single job performance measure and estimated the average reliability for the total. If the study used job performance ratings, the coefficient of interest was inter-rater reliability when a random-effects meta-analysis is performed (Hunter, 1986; Sackett, 2003; Schmidt & Hunter, 1996), because if this type of reliability is used in the correction for attenuation, it will correct most of the unsystematic errors in supervisor ratings (Hunter & Hirsh, 1987), although not all researchers agree with this point of view (e.g., Murphy & DeShon, 2000). We found 11 studies reporting inter-rater coefficients of job performance
ratings. The average coefficient was .52 (SD = 0.05) which is exactly the figure found by
Viswesvaran et al. (1996). For the studies using training success, we found two studies
reporting reliability. The average coefficient was .80 (SD = 0.09), which corresponds
with the estimate of Hunter (1986). For the studies using objective productivity measures,
seven studies reported reliability coefficients and the average reliability was .83, which is
practically the same as that found by Schmidt and Zimmerman (2004). Pooling together
the reliability coefficients and weighting for the number of studies using each criterion
type, we calculated the average reliability coefficient for job performance. This was .61
(SD = 0.13). Next, we estimated the criterion reliability to be used for each FC measure type × occupational group combination. To do this, we used the number of studies and the type of performance measures to calculate the average reliability within each
combination. These estimates appear in Table 2. As a whole, the average reliability
estimates of job performance were .62, .58, and .59 for the ipsative, quasi-ipsative, and normative personality measures, respectively. These values are in line with the meta-analyses of the
inter-rater reliability in existing personnel selection literature (i.e., Salgado & Moscoso,
1996; Salgado et al., 2003; Viswesvaran et al., 1996). Therefore, the differences in job
performance reliability across the types of FC inventories are minimal and without
practical effects.
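A minimal sketch of this study-count-weighted pooling is shown below; the counts are illustrative placeholders, not the exact tallies from the database:

```python
# Study-count-weighted pooling of criterion reliabilities by measure type
# (reliabilities as reported above; counts are hypothetical for illustration)
rel_by_type = {"ratings": (0.52, 30), "training": (0.80, 5), "productivity": (0.83, 10)}
pooled = sum(r * k for r, k in rel_by_type.values()) / sum(k for _, k in rel_by_type.values())
print(round(pooled, 2))  # weighted average criterion reliability
```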
RR distributions
The distributions for RR were based on the following three strategies: (a) some RR
values were obtained from the studies that reported both restricted and unrestricted
standard deviation data, (b) a second group of RR values was obtained using the
reported selection ratio (we applied the formula derived by Schmidt, Hunter, & Urry,
1976), and (c) another group of RR values was obtained using the SD reported in the studies (restricted SD) and, as the unrestricted SD, the SD reported in the manual of the specific inventory. This last strategy of using national norms for the SD is
warranted, as Ones and Viswesvaran (2003) found that the SDs of the personality
measures of job applicants are about 2–9% less than those based on the national
norms. In the present study, the number of RR values based on the national norms is
small. We obtained four norm-based RR values for EX, O, A, and C, and five values for ES. Additionally, we obtained 19 SR-based RR values for ES and A, 20 for EX,
16 for O, and 25 for C. Next, we compared the average RR values using national
norms with the average RR values obtained with the strategies ‘a’ and ‘b’ mentioned
above, and we did not find statistically significant differences. In the present case,
the differences were 6–10% less than those based on the national norms. Therefore,
the triple strategy produced a large number of RR estimates, and these were grouped
according to the personality dimensions. The average RRs (u) were .87 for emotional
stability, .90 for extraversion, .92 for openness to experience, .90 for agreeableness,
and .88 for conscientiousness. These RR values are very similar to the figures used in
previous meta-analyses (e.g., Barrick & Mount, 1991; Hurtz & Donovan, 2000;
Salgado, 1997, 2003) and are in accordance with the observation by Schmidt, Shaffer,
and Oh (2008) that the RR of personality measures is remarkably smaller than the RR
found in the validity studies of cognitive ability tests. A summary of these
distributions appears in Table 3.
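The following sketch illustrates strategies (a)/(c) and (b). The selection-ratio computation uses the standard truncated-normal result that, on one reading, underlies the Schmidt, Hunter, and Urry (1976) approach; it should be treated as an approximation rather than the exact formula applied in the original studies:

```python
import numpy as np
from scipy.stats import norm

def u_from_sds(sd_restricted, sd_unrestricted):
    """Strategies (a)/(c): direct ratio of restricted to unrestricted SDs."""
    return sd_restricted / sd_unrestricted

def u_from_selection_ratio(p):
    """Strategy (b): u implied by top-down selection of a proportion p from a
    normally distributed applicant pool (truncated-normal approximation)."""
    c = norm.ppf(1 - p)             # cut score in z units
    lam = norm.pdf(c) / p           # mean of the selected group in z units
    return np.sqrt(1 - lam * (lam - c))

print(round(u_from_sds(9.0, 10.0), 2))          # 0.9
print(round(u_from_selection_ratio(0.30), 2))   # ~0.51 under direct truncation
```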
Meta-analytic software
We used a software program developed by Schmidt and Le (2004). This is the only
available software which includes recent advances to correct for IRR. According to
Hunter, Schmidt, and Le (2006), IRR is the most common case of RR and it is present in all
concurrent validity studies and in practically all predictive validity studies conducted in
personnel selection (some studies in military selection research are the exception).
Consequently, correction for direct range restriction (DRR), rather than IRR, results in an
underestimation of the operational and true validity coefficients and in an overestimation
of the true variance. In a series of studies, Schmidt and his colleagues have demonstrated
the effects of IRR correction on the validity of the Big Five (Schmidt, Oh, & Le, 2006;
Schmidt et al., 2008). They found that IRR produces slightly larger validity sizes in the case
of personality measures.
We are interested in the relationship between the Big Five and performance, both as
theoretical constructs and as operational predictors, and therefore, we will report both
the operational validity and the true correlation. In summary, we will correct the observed
validity for criterion reliability and IRR for obtaining the operational validity, and for
predictor unreliability for obtaining the true correlation. The observed variance will be
corrected for by artefactual errors: Sampling error, criterion and predictor reliability, and
IRR.
Results
We carried out two groups of meta-analyses, one for ipsative personality inventories and
another for quasi-ipsative FC personality measures. We also conducted a third group of
meta-analyses with the studies that used normative FC personality inventories. However,
due to the fact that the number of available studies for this category is very small, the
findings can only serve informative purposes. The results for the normative FC can be
consulted in Appendix B. The results of the meta-analyses of personality–occupation
combinations appear in Tables 4 and 5. The validity coefficients were pooled across
personality dimensions and occupations. In the two tables, from left to right, the first four
columns represent the number of independent coefficients (K), the total sample size (N), the observed validity weighted by the study sample size (rw), and the standard deviation of the observed validity (SDr). The next three columns show the observed validity corrected for measurement error in the criterion and IRR in the predictor (operational validity, rc), the fully corrected correlation (true validity, ρ), and the standard deviation of ρ. Finally, the last four columns are the percentage of variance explained by the four artefactual errors (sampling error, predictor reliability, criterion reliability, and IRR), the 90% credibility value (90%CV), and the lower and upper limits of the 95% confidence interval of ρ. We report both rc and ρ in these tables because they serve different purposes. Operational validity is the coefficient used for predicting the criterion in applied settings (e.g., making decisions about employees). True validity is the theoretical correlation between the personality dimension and the criterion; this coefficient is used for modelling the theoretical relationship between dependent and independent variables. Although we are interested in both coefficients, we will concentrate on ρ in our comments.
We were able to create nine occupational categories, using the Dictionary of
Occupational Titles (D.O.T.) (USES, 1991) as a reference. However, not all were available
for the three types of FC measures. In the following two subsections, we comment on the
results separately for ipsative and quasi-ipsative FC personality measures.
Table 4. Results of the meta-analyses of the validity of ipsative measures of Big Five for occupational
groups
Personality dimension K N rw SDr rc ρ SDρ %VE 90%CV CIU CIL
Emotional stability
Customer service 2 344 .04 0.03 .06 .07 0.00 100 .07 .18 .04
Managerial 21 4,222 .04 0.08 .07 .08 0.09 68 .03 .11 .05
Police 3 700 .04 0.05 .07 .08 0.00 100 .08 .15 .01
Sales occupations 4 401 .02 0.03 .03 .03 0.00 100 .03 .07 .13
Supervisory 4 423 .01 0.04 .01 .01 0.10 100 .01 .11 .09
Mean (across occupations) .03 0.07 .06 .07 0.07 94 .02 .10 .04
Extraversion
Customer service 2 344 .05 0.02 .08 .09 0.00 100 .09 .20 .02
Managerial 20 4,154 .09 0.09 .14 .15 0.08 68 .04 .18 .12
Police 3 700 .02 0.07 .03 .04 0.02 96 .01 .11 .03
Sales occupations 5 594 .08 0.02 .10 .11 0.00 100 .11 .19 .03
Supervisory 4 423 .04 0.08 .07 .07 0.00 100 .07 .03 .17
Mean (across occupations) .07 0.08 .11 .12 0.06 93 .04 .14 .10
Openness to experience
Customer service 2 344 .05 0.04 .07 .08 0.00 100 .08 .19 .03
Managerial 20 4,154 .05 0.08 .07 .08 0.07 72 .01 .11 .05
Police 3 700 .08 0.14 .11 .12 0.20 22 .13 .19 .05
Sales occupations 4 401 .02 0.15 .03 .03 0.15 44 .15 .07 .13
Supervisory 4 423 .01 0.06 .02 .02 0.00 100 .02 .12 .08
Mean (across occupations) .05 0.09 .06 .07 0.08 68 .02 .10 .04
Agreeableness
Managerial 20 4,154 .00 0.09 .00 .00 0.09 60 .11 .03 .03
Police 2 475 .03 0.02 .04 .05 0.00 100 .05 .14 .04
Sales occupations 5 594 .15 0.13 .20 .21 0.12 52 .05 .29 .13
Supervisory 4 423 .06 0.09 .09 .10 0.00 100 .10 .19 .01
Mean (across occupations) .02 0.09 .02 .03 0.08 78 .02 .06 .00
Conscientiousness
Customer service 3 551 .12 0.06 .19 .21 0.00 100 .21 .29 .13
Managerial 21 4,401 .08 0.10 .13 .14 0.12 54 .01 .17 .11
Police 3 700 .07 0.12 .12 .13 0.11 58 .00 .20 .06
Sales occupations 5 665 .18 0.06 .24 .27 0.00 100 .24 .34 .20
Supervisory 4 423 .08 0.09 .14 .16 0.00 100 .16 .25 .07
Mean (across occupations) .10 0.09 .13 .14 0.09 82 .02 .16 .12
Note. K, number of independent samples; N, total sample size; rw, observed validity; SDr, standard deviation of observed validity; rc, operational validity; ρ, validity corrected for criterion reliability and indirect range restriction in predictor; SDρ, standard deviation of ρ; %VE, percentage of variance accounted for by artefactual errors; 90%CV, 90% credibility value based on ρ; CIU, upper limit of the 95% confidence interval of ρ; CIL, lower limit of the 95% confidence interval of ρ.
Table 5. Results of the meta-analyses of the validity of quasi-ipsative measures of Big Five for occupational groups
Personality dimension K N rw SDr rc ρ SDρ %VE 90%CV CIU CIL
Emotional stability
Health 4 803 .05 0.20 .08 .09 0.35 13 .36 .16 .02
Managerial 10 1,528 .01 0.10 .02 .02 0.10 69 .10 .03 .07
Military 9 2,799 .09 0.08 .14 .16 0.08 60 .05 .20 .12
Sales occupations 4 472 .11 0.12 .18 .21 0.13 65 .04 .30 .12
Skilled 5 1,350 .30 0.19 .46 .51 0.26 21 .18 .55 .47
Supervisory 3 171 .37 0.07 .61 .68 0.00 100 .68 .76 .60
Mean (across occupations) .11 0.12 .17 .20 0.13 54 .03 .22 .18
Extraversion
Clerical 2 288 .14 0.21 .23 .25 0.33 17 .18 .36 .14
Health 3 773 .10 0.11 .16 .18 0.16 36 .03 .25 .11
Managerial 11 1,685 .21 0.05 .31 .34 0.00 100 .34 .38 .30
Military 10 3,007 .07 0.10 .10 .11 0.12 37 .05 .15 .07
Sales occupations 4 472 .05 0.20 .07 .08 0.29 23 .28 .17 .01
Skilled 4 1,018 .17 0.13 .25 .28 0.17 32 .06 .34 .22
Mean (across occupations) .07 0.10 .11 .12 0.12 41 .03 .14 .10
Openness to experience
Clerical 2 288 .27 0.24 .41 .44 0.34 14 .01 .53 .35
Health 3 773 .17 0.04 .27 .29 0.00 100 .29 .35 .23
Managerial 8 1,085 .21 0.06 .29 .32 0.00 100 .32 .37 .27
Military 6 748 .15 0.05 .21 .23 0.00 100 .23 .30 .16
Sales occupations 2 236 .11 0.09 .16 .17 0.00 100 .17 .29 .05
Mean (across occupations) .14 0.07 .20 .22 0.02 83 .19 .25 .19
Agreeableness
Clerical 2 288 .25 0.01 .40 .44 0.00 100 .44 .53 .25
Health 3 773 .17 0.07 .28 .31 0.00 100 .31 .37 .25
Managerial 8 1,085 .04 0.14 .06 .07 0.17 38 .15 .13 .01
Military 8 1,760 .05 0.12 .07 .07 0.07 32 .11 .12 .02
Sales occupations 2 236 .07 0.14 .11 .12 0.17 44 .10 .25 .01
Skilled 2 796 .28 0.03 .42 .45 0.00 100 .45 .51 .39
Mean (across occupations) .10 0.10 .15 .16 0.07 83 .07 .19 .13
Conscientiousness
Clerical 3 357 .16 0.09 .28 .31 0.00 100 .31 .40 .22
Customer service 2 398 .29 0.05 .45 .50 0.00 100 .50 .57 .43
Health 3 773 .21 0.09 .35 .40 0.05 88 .33 .46 .34
Managerial 9 1,132 .10 0.09 .15 .17 0.00 100 .17 .23 .11
Military 10 3,007 .12 0.10 .19 .21 0.12 46 .05 .24 .18
Sales occupations 4 472 .22 0.04 .35 .39 0.00 100 .39 .47 .31
Skilled 8 2,338 .43 0.09 .64 .71 0.00 100 .71 .73 .69
Supervisory 3 171 .09 0.10 .16 .18 0.00 100 .18 .33 .03
Mean (across occupations) .22 0.09 .34 .38 0.05 92 .32 .40 .36
Note. K, number of independent samples; N, total sample size; rw, observed validity; SDr, standard deviation of observed validity; rc, operational validity; ρ, validity corrected for criterion reliability and indirect range restriction in predictor; SDρ, standard deviation of ρ; %VE, percentage of variance accounted for by artefactual errors; 90%CV, 90% credibility value based on ρ; CIU, upper limit of the 95% confidence interval of ρ; CIL, lower limit of the 95% confidence interval of ρ.
For the quasi-ipsative inventories, the validity of conscientiousness ranged from .17 for managerial occupations to .71 for skilled labour occupations. All the 90%CVs were
positive and very different from zero, which is supporting evidence of validity
generalization for predicting job performance. On average, the validity of conscien-
tiousness was .38, a coefficient remarkably larger than the coefficients found in
previous meta-analyses. The validity for clerical, customer service, health care, sales, and skilled labour occupations was especially noteworthy, as the coefficients were larger than .30 in all cases. These results fully supported Hypothesis 1. Compared to the
average validity of ipsative measures of conscientiousness, the average validity of quasi-
ipsative inventories is nearly three times larger (.38 vs. .14). This finding fully
supported Hypothesis 2.
Emotional stability was a valid predictor of job performance for military, sales, skilled,
and supervisory occupations and showed validity generalization for these four occupa-
tions. The validity of emotional stability ranged from .02 to .68, which is consistent with
Hypothesis 1. The average ρ was .20, which is practically three times larger than the
validity of ipsative inventories of this personality dimension. Therefore, Hypothesis 2 was
also confirmed for emotional stability.
With regard to extraversion, this was a valid predictor for managerial and skilled
labour occupations and showed validity generalization in both cases. Extraversion
also showed a small but relevant validity coefficient for clerical and health care
occupations, but there was no evidence supporting validity generalization in these
cases, as the 90%CVs were negative. The values of ρ ranged from .28 (skilled
labour occupations) to .34 (managerial occupations), which is consistent with
Hypothesis 1. The average validity of the quasi-ipsative inventories was .12, which is
the same value found for the ipsative inventories. The average validity for the
occupational groups mentioned in Hypothesis 1 (i.e., managerial and sales) was .28.
Comparing this last value with the validity of the ipsative measures for the same
occupations, the quasi-ipsative inventories showed more than twice the validity of
the ipsative inventories. Therefore, Hypothesis 2 was also supported for extraver-
sion.
No hypothesis was advanced for openness to experience with regard to specific
occupational groups. However, we found that openness consistently predicted job
performance across all occupations, with coefficients ranging from -.44 to .32. Openness
predicted job performance negatively for clerical occupations and positively for health
care, managerial, military, and sales occupations. These results confirmed Hypothesis 1
about the moderator effects of the occupational group on the validity of personality
measures. Overall, the validity of the quasi-ipsative measures of openness was .20, which
is about three times larger than the validity of the ipsative measures. This finding
supported Hypothesis 2.
Finally, agreeableness was a valid predictor of job performance for clerical, health
care, and skilled labour occupations, with coefficients ranging from .31 to .45. The
result for health care occupations was predicted by Hypothesis 1. Agreeableness did
not predict job performance for managerial and sales occupations, which is
consistent with the findings of the meta-analyses by Barrick and Mount (1991),
Salgado (1997), and Hurtz and Donovan (2000). The validity coefficients ranged from
.07 to .45, supporting Hypothesis 1. The average validity was .16 for the totality of
the occupational groups. This value clearly contrasts with the average validity of
.03 found for the ipsative inventories. Consequently, Hypothesis 2 was also
confirmed for agreeableness.
Table 6. Comparison among the meta-analyses of the validity of SS and FC personality inventories
Variable B&M-91 (K N ρ) SAL-97 (K N ρ) H&D-00 (K N ρ) Average-SS (K N ρ) FC-QI (K N ρ) FC-IP (K N ρ)
ES 124 19,507 .08 32 3,877 .19 37 5,671 .14 193 29,055 .11 35 7,123 .20 34 6,090 .07
E 123 18,719 .13 30 3,806 .12 39 6,453 .10 192 28,978 .12 34 7,243 .12 34 6,215 .12
O 82 14,236 .04 18 2,722 .09 35 5,525 .07 135 22,483 .05 21 3,130 .22 33 6,022 .07
A 112 17,520 .07 26 3,466 .02 40 6,447 .13 178 27,433 .08 25 4,938 .16 31 5,646 .03
C 123 19,721 .22 24 3,295 .25 45 8,083 .22 192 31,103 .22 43 8,648 .38 36 6,740 .14
Note. K, number of coefficients; N, total sample size; ρ, theoretical validity; B&M-91, Barrick and Mount (1991); SAL-97, Salgado (1997); H&D-00, Hurtz and Donovan (2000); Average-SS, sample-weighted average validity of single-stimulus personality inventories; FC-QI, forced-choice quasi-ipsative personality inventories; FC-IP, forced-choice ipsative personality inventories.
Table 7. Comparison among the meta-analyses of SS, quasi-ipsative, and ipsative validities
Variable B&M-91 (K N ρ) SAL-97 (K N ρ) H&D-00 (K N ρ) Average-SS (K N ρ) FC-QI (K N ρ)
Manager
ES 55 10,324 .08 6 987 .12 4 495 .13 65 11,806 .08 10 1,528 .10
E 59 11,335 .18 6 987 .05 4 495 .13 69 12,817 .17 11 1,685 .34
O 37 7,611 .08 5 787 .03 4 495 .03 46 8,893 .07 8 1,085 .32
A 47 8,597 .10 6 987 .04 4 495 .04 57 10,079 .08 8 1,085 .07
C 52 10,058 .22 6 987 .16 4 495 .19 62 11,540 .21 9 1,132 .17
Sales
ES 19 2,486 .07 6 576 .07 7 799 .15 32 3,861 .06 4 472 .21
E 22 2,316 .15 6 576 .11 8 1,044 .16 36 3,936 .11 4 472 .08
O 12 1,566 .02 6 732 .04 18 2,298 .00 2 236 .17
A 16 2,344 .00 6 576 .02 8 959 .06 30 3,879 .02 2 236 .12
C 21 2,263 .23 6 576 .18 10 1,369 .29 37 4,208 .24 4 472 .39
Skilled
ES 26 3,694 .12 12 1,264 .25 11 1,874 .09 49 6,832 .14 5 1,350 .51
E 23 3,888 .01 11 1,209 .08 12 2,385 .01 46 7,482 .02 4 1,018 .28
O 16 3,219 .01 7 1,208 .17 11 1,874 .02 34 6,301 .03
A 28 4,585 .06 6 876 .05 12 2,385 .11 46 7,846 .07 2 796 .45
C 25 4,588 .21 8 1,264 .23 14 3,841 .17 47 9,693 .20 8 2,338 .71
Note. K, number of coefficients; N, total sample size; ρ, theoretical validity; B&M-91, Barrick and Mount (1991); SAL-97, Salgado (1997); H&D-00, Hurtz and Donovan (2000); Average-SS, sample-weighted average validity of single-stimulus personality inventories; FC-QI, forced-choice quasi-ipsative personality inventories.
For sales occupations, the quasi-ipsative inventories were also more valid than the SS inventories in the case of conscientiousness (.24 vs. .39). Finally, with regard to skilled jobs, the quasi-ipsative
inventories showed larger validity than the SS inventories for the four factors with
estimates of validity, that is emotional stability (.14 vs. .51), extraversion (.02 vs. .28),
agreeableness (.07 vs. .45), and conscientiousness (.20 vs. .71).
Discussion
This meta-analysis makes some unique contributions that should be noted. The first was to
examine meta-analytically the validity of the FFM as assessed with ipsative and quasi-
ipsative measures derived from FC personality inventories for predicting job performance
in nine occupational categories. Globally, quasi-ipsative FC measures of the five
personality factors proved to be valid predictors of job performance and showed validity
generalization for openness, agreeableness, and conscientiousness. With regard to
ipsative FC inventories, only extraversion and conscientiousness showed validity
generalization, although the validity size was very small.
Second, our findings show that occupation is a potent moderator of validity for both
ipsative and quasi-ipsative FC measures. This finding concurs with previous meta-analytic
findings which showed that occupational group moderated the validity of SS personality
inventories (e.g., Barrick & Mount, 1991; Hurtz & Donovan, 2000; Salgado, 1997, 1998a).
Therefore, the role of the occupation being assessed seems to be similar for SS and FC
personality inventories.
Third, across occupations, quasi-ipsative inventories of personality predict job
performance substantially better than ipsative FC inventories. This held true across the
five personality dimensions of the FFM. Globally, the validity of quasi-ipsative measures
was about three times larger than the ipsative equivalent for emotional stability, openness,
agreeableness, and conscientiousness. This is a substantial improvement and has
important ramifications for the design and use of personality inventories for employee
selection. In essence, our findings demonstrate unequivocally that a quasi-ipsative scaling
format produces substantially higher validity than the ipsative formats that have typically
been more widely used in the past.
Fourth, a unique contribution of this meta-analysis was to compare the validity of
ipsative and quasi-ipsative measures of personality with the validity of SS inventories. The
results of this comparison showed that SS inventories were better predictors of job
performance than ipsative FC inventories. Therefore, Hicks’ (1970) hypothesis that SS
inventories would show larger validity than ipsative inventories was supported in our
study. However, the quasi-ipsative inventories were better predictors than the SS ones,
contrary to the prediction of Hypothesis 3. The difference between quasi-ipsative and SS
inventories was especially remarkable in the case of conscientiousness. The estimate of
the average validity of conscientiousness from the independent meta-analyses of Barrick
and Mount (1991), Hurtz and Donovan (2000), and Salgado (1997) was .22 (cumulated
N = 30,999), while the validity found here for the quasi-ipsative measures of conscien-
tiousness was .38 pooled across occupations (N = 8,648). The quasi-ipsative measures of
emotional stability, openness, and agreeableness were also more valid than their
counterpart SS measures, although the magnitude of the validity was smaller. Therefore, as
a whole, this may be the most important finding of this meta-analytic effort, and it suggests
that quasi-ipsativity matters.
Fifth, an additional contribution was to show that openness to experience may be a
more important predictor of job performance than has been thought. Previous meta-
analyses generally agreed that openness to experience was a predictor of training success,
but openness was irrelevant for predicting job performance overall and across
occupations (Barrick & Mount, 1991; Hurtz & Donovan, 2000; Salgado, 1997, 2003).
The present findings suggest that, if openness to experience is assessed by quasi-ipsative
personality inventories, its validity may be relevant for a number of occupations, including
clerical, health, managerial, military, and sales occupations. Nevertheless, although the
global finding is relatively robust as it is based on 21 studies and 3,130 individuals, the
results for the specific occupations must be treated with caution as several validity
coefficients were estimated with only two or three samples. There is no clear explanation
for the differential effect found for openness to experience at present. We conjecture two
possible explanations. The first explanation is based on the possible relationship between
openness to experience and general mental ability (GMA). Previous research has shown
that the correlation between these two variables is around .22 when openness is assessed
with SS personality measures (e.g., Judge, Jackson, Shaw, Scott, & Rich, 2007). It is
possible that quasi-ipsative measures of openness may be loaded more strongly by GMA
than SS measures of openness to experience, and consequently, an indirect effect of GMA
might explain the criterion validity of openness. However, there is currently no
conclusive empirical evidence on this. The second potential explanation is based on the
possible relationship between openness and conscientiousness. The Big Five personality
dimensions are orthogonal when the factor space is examined but show moderate to large
correlations among them when raw scores are used for their estimation (see Costa & McCrae, 1992). Ones (1993; see also Ones et al., 1996) found that the correlation
between openness and conscientiousness was .06. However, recent research has
shown that this correlation may be substantially larger. Hogan, Barrett, and Hogan (2007),
using a very large data set of applicants, found that the observed correlation between
openness and conscientiousness was .31 and .36 (N = 5,266) in two evaluations of the
same applicants, 6 months apart. The correlations corrected for measurement error in
openness and conscientiousness are .43 and .48 (average correlation = .45). At present,
the correlation between quasi-ipsative measures of openness and conscientiousness
remains to be estimated. If the correlation in this last case was similar to the one found by
Hogan et al. (2007) for SS measures, then the validity of openness could be partially due to
conscientiousness. Nevertheless, with regard to this second conjecture, it should be borne in mind that other research finds that conscientiousness, agreeableness, and emotional stability tend to show moderate intercorrelations, as do extraversion and openness (e.g., Chang, Connelly, & Geeza, 2012). Future research is therefore needed into these and other explanatory alternatives.
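As a point of reference for the corrected correlations cited above, the correction for attenuation divides an observed correlation by the square root of the product of the two scale reliabilities. A minimal sketch, assuming illustrative reliabilities of about .72 for both scales (the article does not report the exact values used), is:

```python
# Correction for attenuation (disattenuation) of an observed correlation.
# The reliabilities below are illustrative assumptions, not values reported
# in the article, so the output only approximates the corrected figures above.
import math

def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correct r_xy for measurement error in both variables."""
    return r_xy / math.sqrt(rel_x * rel_y)

if __name__ == "__main__":
    rel_openness, rel_conscientiousness = 0.72, 0.72  # assumed reliabilities
    for r_obs in (0.31, 0.36):
        r_corr = disattenuate(r_obs, rel_openness, rel_conscientiousness)
        print(f"observed r = {r_obs:.2f} -> corrected r = {r_corr:.2f}")
```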
A first question for future research concerns the relationship between GMA and FC personality measures, as the FC format may demand more cognitive resources than the SS format because respondents must compare the items included as alternatives. Therefore, based on the volume of cognitive demands, quasi-ipsative measures could show a higher correlation with GMA than SS measures. However, the empirical results are not conclusive. For example, in one experimental study, Vasilopoulos et al. (2006) found that GMA correlated .36 with FC measures of openness and conscientiousness in an applicant condition but not in an honest condition (.03 and .09, respectively). Converse et al. (2008) found that GMA correlated .01, .12, and .01 with the FC measures of time management, extraversion, and resilience, respectively. An implication of the potential relation between GMA and FC personality measures is that their incremental validity may be comparatively smaller than the incremental validity of SS inventories (Vasilopoulos et al., 2006). The potential relationships between GMA and FC inventories may also have implications for test fairness and adverse impact (Converse et al., 2008): the larger the correlation between GMA and FC inventories, the greater the potential for adverse impact.
Another relevant question is related to applicant reactions to FC measures. The meta-
analysis by Anderson, Salgado, and Hülsheger (2010) found that personality inventories are positively evaluated in general. However, this may not be the case for FC inventories. For example, Converse et al. (2008) found that the FC format affected self-reported test-taking ease, test-taking anxiety, and positive affect, and that applicants may react less positively to FC measures than to SS personality inventories. Future studies should
therefore examine applicant reactions to quasi-ipsative inventories.
A third research question is about the relationships among the Big Five as measured
with quasi-ipsative personality inventories. This question is relevant for estimating the validity of a composite of personality measures (e.g., a composite of emotional stability, conscientiousness, and agreeableness) and for comparing it with the validity of other personality compounds frequently used in selection decisions (i.e., integrity tests).
It would also be interesting to explore issues of incremental validity of one type of format
(e.g., SS inventories) over another (e.g., quasi-ipsative or normative inventories).
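To illustrate the kind of calculation such estimates would feed into, the validity of a unit-weighted composite follows from the individual validities and the scale intercorrelations through the standard composite-correlation formula. A minimal sketch with hypothetical values (not estimates from this meta-analysis):

```python
# Validity of a unit-weighted composite of standardized predictors:
# R = (sum of individual validities) / sqrt(k + 2 * sum of intercorrelations).
# All numbers below are hypothetical and only illustrate the calculation.
import math
from itertools import combinations

def composite_validity(validities, intercorrelations):
    """validities: list of r(scale, criterion); intercorrelations: dict {(i, j): r}."""
    k = len(validities)
    sum_r_ij = sum(intercorrelations[pair] for pair in combinations(range(k), 2))
    return sum(validities) / math.sqrt(k + 2 * sum_r_ij)

# Hypothetical composite of emotional stability, conscientiousness, and agreeableness
validities = [0.19, 0.30, 0.15]
intercorrelations = {(0, 1): 0.25, (0, 2): 0.30, (1, 2): 0.20}
print(round(composite_validity(validities, intercorrelations), 2))  # ~0.30
```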
There are two further research issues that future studies should examine. The first is to
achieve a better estimation of the validity of normative FC personality inventories with a
larger database. According to Hicks (1970), these should show greater validity than quasi-
ipsative and ipsative inventories. However, the number of studies included in our database precludes firm conclusions on this point. The second is to ascertain whether the FC format is more resistant to faking than the SS format. In theory, resistance to faking should be similar for the three types of FC personality inventories, as protection against faking is related to the format of the inventory rather than to the type of scores the format produces. Nevertheless, this prediction must be tested empirically before conclusions can be drawn.
Finally, recent psychometric work deserves mention: item response theory models now make it possible both to build FC inventories that yield normative scores and to recover normative scores from already existing ipsative and quasi-ipsative FC inventories (Brown & Maydeu-Olivares, 2013). These two advances may change some previously negative views of the validity of personality inventories and reinforce the more positive views of practitioners (e.g., Anderson, 2005).
Conclusion
In summary, quasi-ipsative FC measures of personality, mainly of conscientiousness, were valid predictors across all occupational categories. In general, their validities were about twice as large as those of ipsative FC personality measures. Furthermore, quasi-ipsativity appears to have positive effects on the predictive validity of other personality dimensions, such as openness, which until now had shown low efficiency for predicting job performance. This initial meta-analysis suggests that quasi-ipsative FC measures may be more valid predictors of job performance than SS personality inventories. Future research should be devoted to examining other aspects of ipsativity not studied in this meta-analysis, and new studies should be carried out for other occupations. Also, we recommend that new personality inventories be developed using the quasi-ipsative FC format.
Acknowledgements
The research reported in this article was partially supported by Grant PSI2011-27943 from the
Ministry of Economy and Competitiveness (Spain) to Jesús F. Salgado and by Grant IN-2012-095
from the Leverhulme Trust (U.K.) to Neil Anderson.
References
Articles with an asterisk are included in the meta-analysis
*Adkins, C. L., & Naumann, S. E. (2001). Situational constraints on the achievement-performance
relationship: A service sector study. Journal of Organizational Behavior, 22, 453–465. doi:10.
1002/job.96
Anderson, N. (2005). Relationships between practice and research in personnel selection: Does the
left hand know what the right is doing? In A. Evers, N. Anderson & O. Smit-Voskuyl (Eds.), The Blackwell handbook of personnel selection (pp. 1–24). Oxford, UK: Blackwell.
Anderson, N., Salgado, J. F., & Hülsheger, U. R. (2010). Applicant reactions in selection: Comprehensive meta-analysis into reaction generalization versus situational specificity.
International Journal of Selection and Assessment, 18, 291–304. doi:10.1111/j.1468-2389.
2010.00512.x
*Antler, L., Zaretsky, H. H., & Ritter, W. (1967). The practical validity of the Gordon Personal Profile
among United States and foreign medical residents. Journal of Social Psychology, 72, 257–263.
doi:10.1080/00224545.1967.9922323
*Balch, D. E. (1977). Personality trait differences between successful and non-successful police
recruits at a typical police academy and veteran police officers (Unpublished Doctoral
Dissertation). U.S. International University.
Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A
meta-analysis. Personnel Psychology, 44, 1–26. doi:10.1111/j.1744-6570.1991.tb00688.x
Barrick, M. R., Mount, M. K., & Judge, T. (2001). Personality and performance at the beginning of the
new millennium: What do we know and where do we go next? International Journal of
Selection and Assessment, 9, 9–30. doi:10.1111/1468-2389.00160
Bartram, D. (1996). The relationship between ipsatized and normative measures of personality.
Journal of Occupational and Organizational Psychology, 69, 25–39. doi:10.1111/j.2044-
8325.1996.tb00597.x
*Bartram, D. (2005). The great eight competencies: A criterion-centric approach to validation.
Journal of Applied Psychology, 90, 1185–1203. doi:10.1037/0021-9010.90.6.1185
*Bartram, D. (2007). Increasing validity with forced-choice criterion measurement formats.
International Journal of Selection and Assessment, 15, 263–272. doi:10.1111/j.1468-2389.
2007.00386.x
Bennett, M. (1977). Testing management theories cross-culturally. Journal of Applied Psychology,
62, 578–581. doi:10.1037/0021-9010.62.5.578
*Brown, A., & Bartram, D. (2009, April 2–4). Doing less but getting more: improving forced-choice
measures with IRT. Paper presented at the 24th Annual Conference of the Society for Industrial
and Organizational Psychology, New Orleans.
Brown, A., & Maydeu-Olivares, A. (2013). How IRT can solve problems of ipsative data in forced-
choice questionnaires. Psychological Methods, 18, 36–52. doi:10.1037/a0030641
Carretta, T., & Ree, M. J. (2001). Pitfalls of ability research. International Journal of Selection and
Assessment, 9, 325–335. doi:10.1111/1468-2389.00184
Carretta, T., & Ree, M. J. (2000). General and specific cognitive and psychomotor abilities in
personnel selection: The prediction of training and job performance. International Journal of
Selection and Assessment, 8, 227–236. doi:10.1111/1468-2389.00152
Cattell, R. B., & Brennan, J. (1994). Finding personality structure when ipsative measurements are
the unavoidable basis of the variables. American Journal of Psychology, 107, 261–274. doi:10.
2307/1423040
Chang, L., Connelly, B. S., & Geeza, A. A. (2012). Separating method factors and higher order traits of
the Big Five: A meta-analytic multitrait–multimethod approach. Journal of Personality and
Social Psychology, 102, 408–426. doi:10.1037/A0025559
Christiansen, N. D., Burns, G. N., & Montgomery, G. E. (2005). Reconsidering forced-choice item
formats for applicant personality assessment. Human Performance, 18, 267–307. doi:10.1207/
s15327043hup1803_4
Clarke, S., & Robertson, I. T. (2005). A meta-analytic review of the Big Five personality factors and
accidents involvement in occupational and non-occupational settings. Journal of Occupational
and Organizational Psychology, 78, 355–376. doi:10.1348/096317905X26183
Clemans, W. V. (1966). An analytical and empirical examination of some properties of ipsative
measures. Psychometric Monographs, 14. Richmond, VA: Psychometric Society.
*Clevenger, J., Pereira, G. M., Wiechman, D., Schmitt, N., & Schmidt-Harvey, V. (2001). Incremental
validity of situational judgment tests. Journal of Applied Psychology, 86, 410–417. doi:10.1037/
0021-9010.86.3.410
Converse, P. D., Oswald, F. L., Imus, A., Hedricks, C., Roy, R., & Butera, H. (2008). Comparing
personality tests and warnings: Effects on criterion-related validity and test-taker reactions.
International Journal of Selection and Assessment, 16, 155–169. doi:10.1111/j.1468-2389.
2008.00420.x
*Conway, J. M. (2000). Managerial performance development constructs and personality correlates.
Human Performance, 13, 23–46. doi:10.1207/S15327043HUP1301_2
Cornwell, J. M., & Dunlap, W. P. (1994). On the questionable soundness of factoring ipsative data: A
response to Saville and Wilson (1991). Journal of Occupational and Organizational
Psychology, 67, 89–100. doi:10.1111/j.2044-8325.1994.tb00553.x
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-
Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment
Resources.
De Fruyt, F., & Salgado, J. F. (2003). Applied personality psychology: Lessons learned from the IWO
field. European Journal of Personality, 17, 123–131. doi:10.1002/per.486
Dudley, N. M., Orvis, K. A., Lebiecki, J. E., & Cortina, J. M. (2006). A meta-analytic investigation of
conscientiousness in the prediction of job performance: Examining the intercorrelations and the
incremental validity of narrow traits. Journal of Applied Psychology, 91, 40–57. doi:10.1037/
0021-9010.91.1.40
Dunlap, W. P., & Cornwell, J. M. (1994). Factor analysis of ipsative measures. Multivariate
Behavioral Research, 29, 115–126. doi:10.1207/s15327906mbr2901_4
Edwards, A. L. (1957). Manual for the Edwards Personal Preference Schedule. New York, NY:
Psychological Corporation.
Feist, G. J. (1998). A meta-analysis of personality in scientific and artistic creativity. Personality and
Social Psychology Review, 2, 290–309. doi:10.1207/s15327957pspr0204_5
*Fine, S., & Dover, S. (2005). Cognitive ability, personality, and low fidelity simulation measures in
predicting training performance among customer service representatives. Applied HRM
Research, 10, 103–106.
*Fineman, S. (1975). The work preference questionnaire: A measure of managerial need for
achievement. Journal of Occupational Psychology, 48, 11–32. doi:10.1111/j.2044-8325.1975.
tb00293.x
*Francis-Smythe, J., Tinline, G., & Allender, C. (2002). Identifying high potential police officers and
role characteristics. Paper presentation at the Division of Occupational Psychology Conference,
British Psychological Society, Bournemouth.
*Furnham, A. (1994). The validity of the SHL Customer Service Questionnaire (CSQ). International
Journal of Selection and Assessment, 2, 157–165. doi:10.1111/j.1468-2389.1994.tb00136.x
Furnham, A., Steele, H., & Pendleton, D. (1993). A psychometric evaluation of the Belbin Team-Role
Self-Perception Inventory. Journal of Occupational and Organizational Psychology, 66, 245–
257. doi:10.1111/j.2044-8325.1993.tb00535.x
*Furnham, A., & Stringfield, P. (1993). Personality and work performance: Myers-Briggs Type
Indicator correlates of managerial performance in two cultures. Personality and Individual
Differences, 14, 145–153. doi:10.1016/0191-8869(93)90184-5
García-Izquierdo, A. L. (2001). Validación orientada al criterio de procedimientos de selección de personal de oficio para el pronóstico del rendimiento laboral y formativo en el sector de la construcción [Criterion-oriented validation of personnel selection procedures for predicting training and job performance in the building trade industry] (Unpublished doctoral dissertation). University of Murcia, Spain.
Ghiselli, E. E. (1954). The forced-choice technique in self-description. Personnel Psychology, 7,
201–208. doi:10.1111/j.1744-6570.1954.tb01593.x
Gleser, L. J. (1972). On bounds for the average correlation between subtest scores in ipsatively
scored tests. Educational and Psychological Measurement, 32, 759–765. doi:10.1177/
001316447203200314
*Goffin, R. D., Jan, I., & Skinner, E. (2011). Forced-choice and conventional personality assessment:
Each may have unique value in pre-employment testing. Personality and Individual
Differences, 5, 840–844. doi:10.1016/j.paid.2011.07.012
*Gordon, L. V. (1993). Gordon Personal Profile-Inventory. Manual 1993 revision. San Antonio,
TX: Pearson educ.
*Graham, W. K., & Calendo, J. T. (1969). Personality correlates of supervisory ratings. Personnel
Psychology, 22, 483–487. doi:10.1111/j.1744-6570.1969.tb00349.x
*Grimsley, G., & Jarret, H. F. (1973). The relation of past managerial achievement to test measures
obtained in the employment situation: Methodology and results. Personnel Psychology, 26, 31–
48. doi:10.1111/j.1744-6570.1973.tb01115.x
*Guller, M. (2003). Predicting performance of law enforcement personnel using the candidate
and officer personnel survey and other psychological measures (Unpublished Doctoral
Dissertation). Seton Hall University.
Heggestad, E. D., Morrison, M., Reeve, C. L., & McCloy, R. A. (2006). Forced-choice assessments of
personality for selection: Evaluating issues of normative assessment and faking resistance.
Journal of Applied Psychology, 91, 9–24. doi:10.1037/0021-9010.91.1.9
Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures.
Psychological Bulletin, 74, 167–184. doi:10.1037/h0029780
Hogan, J., Barrett, P., & Hogan, R. (2007). Personality measurement, faking, and employment
selection. Journal of Applied Psychology, 92, 1270–1285. doi:10.1037/0021-9010.9
Hogan, J., & Holland, B. (2003). Using theory to evaluate personality and job performance relations:
A socioanalytic perspective. Journal of Applied Psychology, 88, 100–112. doi:10.1037/0021-
9010.88.1.100
Horn, J. L. (1971). Motivation and dynamic calculus concepts from multivariate experiment. In R. B.
Cattell (Ed.), Handbook of multivariate experimental psychology (2nd printing, pp. 611–641).
Chicago, IL: Rand McNally.
Hough, L. M. (1992). The “Big Five” personality variables-construct confusion: Description versus
prediction. Human Performance, 5, 139–155. doi:10.1080/08959285.1992.9667929
Hough, L. M., & Ones, D. S. (2001). The structure, measurement, validity, and use of personality
variables in industrial, work and organizational psychology. In N. R. Anderson, D. S. Ones, H. K.
Sinangil & C. Viswesvaran (Eds.), Handbook of industrial, work, and organizational
psychology. Vol 1: Personnel Psychology (pp. 233–276). London, UK: Sage.
*Hughes, J. L., & Dood, W. E. (1961). Validity versus stereotype: Predicting sales performance by
ipsative scoring of a personality test. Personnel Psychology, 15, 343–355. doi:10.1111/j.1744-
6570.1961.tb01241.x
*Hughes, G. L., & Prien, E. P. (1986). An evaluation of alternate scoring methods for the mixed
standard. Personnel Psychology, 39, 839–847. doi:10.1111/j.1744-6570.1986.tb00598.x
Hülsheger, U. R., Anderson, N., & Salgado, J. F. (2009). Team-level predictors of innovation at work:
A comprehensive meta-analysis spanning three decades of research. Journal of Applied
Psychology, 94, 1128–1145. doi:10.1037/a0015978
Hunter, J. E. (1986). Cognitive ability, cognitive aptitudes, job knowledge, and job performance.
Journal of Vocational Behavior, 29, 340–362. doi:10.1016/0001-8791(86)90013-8
Hunter, J. E., & Hirsh, H. R. (1987). Applications of meta-analysis. In C. L. Cooper & I. T. Robertson
(Eds.), International review of industrial and organizational psychology (Vol. 2, pp. 321–
357). Chichester, UK: Wiley.
Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis. Correcting error and bias in
research findings. Newbury Park, CA: Sage.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis. Correcting error and bias in
research findings. (2nd ed.) Newbury Park, CA: Sage.
Hunter, J. E., Schmidt, F. L., & Le, H. (2006). Implications of direct and indirect range restriction for
meta-analysis methods and findings. Journal of Applied Psychology, 91, 594–612. doi:10.1037/
0021-9010.91.3.594
Hurtz, G. M., & Donovan, J. J. (2000). Personality and job performance: The big five revisited.
Journal of Applied Psychology, 85, 869–879. doi:10.1037/0021-9010.85.6.869
*Iliescu, D., Ilie, A., & Ispas, D. (2011). Examining the criterion-related validity of the Employee
Screening Questionnaire: A three-sample investigation. International Journal of Selection and
Assessment, 19, 222–228. doi:10.1111/j.1468-2389.2011.00550.x
Jackson, D. N. (2002). Employee screening questionnaire manual. Port Huron, MI: Sigma
Assessment Systems.
*Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on employment
tests: Does forced choice offer a solution? Human Performance, 13, 371–388. doi:10.1207/
S15327043HUP1304_3
Johnson, C. E., Wood, R., & Blinkhorn, S. F. (1988). Spuriouser and spuriouser: The use of ipsative
personality tests. Journal of Occupational Psychology, 61, 153–162. doi:10.1111/j.2044-8325.
1988.tb00279.x
Judge, T., & Bono, J. E. (2001). Relationship of core self-evaluations traits – self-esteem, generalized
self-efficacy, locus of control, and emotional stability – with job satisfaction and job
performance: A meta-analysis. Journal of Applied Psychology, 86, 80–92. doi:10.1037/0021-
9010.86.1.80
Judge, T., Jackson, C. L., Shaw, J. C., Scott, B. A., & Rich, B. L. (2007). Self-efficacy and work-related
performance: The integral roles of individual differences. Journal of Applied Psychology, 92,
107–127. doi:10.1037/0021-9010.92.1.107
Judge, T. A., Rodell, J. B., Klinger, R. L., Simon, L. S., & Crawford, E. R. (2013). Hierarchical
Representations of the Five-Factor Model of personality in predicting job performance:
Integrating three organizing frameworks with two theoretical perspectives. Journal of Applied
Psychology, 98, 875–925. doi:10.1037/a0033901
Knapp, D., Heggestad, E., & Young, M. (2004). Understanding and improving the Assessment of
Individual Motivation (AIM) in the Army’s GED Plus Program (Study Note 2004–03).
Alexandria, VA: US Army Research Institute for the Behavioral and Social Sciences.
*Kriedt, P. H., & Dawson, R. I. (1961). Response set and the prediction of clerical job performance.
Journal of Applied Psychology, 45, 175–178. doi:10.1037/h0041918
*Kusch, R. I., Deller, J., & Albrecht, A. G. (2008, July 20–25). Predicting expatriate job performance:
using the normative NEO-PI-R or the ipsative OPQ32i? Paper presented at the 29th
International Congress of Psychology, Berlin, Germany.
*Lievens, F., Harris, M. M., Keer, E. V., & Bisqueret, C. (2003). Predicting cross-cultural training
performance: The validity of personality, cognitive ability, and dimensions measured by an
assessment center and a behavior description interview. Journal of Applied Psychology, 88,
476–489. doi:10.1037/0021-9010.88.3.476
Matthews, G., Stanton, N., Graham, N. C., & Brimelow, C. (1990). A factor analysis of the scales of the
occupational personality questionnaire. Personality and Individual Differences, 11, 591–596.
doi:10.1016/0191-8869(90)90042-P
McCloy, R. A., Heggestad, E. D., & Reeve, C. L. (2005). A silk purse from the sow’s ear: Retrieving
normative information from multidimensional forced-choice items. Organizational Research
Methods, 8, 222–248. doi:10.1177/1094428105275374
McCrae, R. R., & Costa, P. T. (1990). Personality in adulthood. New York, NY: Guilford Press.
*McDaniel, M. A., Yost, A. P., Ludwick, M. H., Hense, R. L., & Hartman, N. S. (2004, April).
Incremental validity of a situational judgment test. Paper presented at the 19th Annual
Conference of the Society for Industrial and Organizational Psychology, Chicago.
Meade, A. W. (2004). Psychometric problems and issues involved with creating and using ipsative
measures for selection. Journal of Occupational and Organizational Psychology, 77, 531–
552. doi:10.1348/0963179042596504
Meglino, B. M., & Ravlin, E. C. (1998). Individual values in organizations: Concepts, controversies,
and research. Journal of Management, 24, 351–389. doi:10.1177/014920639802400304
Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N. (2007a).
Reconsidering the use of personality tests in personnel contexts. Personnel Psychology, 60,
683–729. doi:10.1111/1744-6570.2007.00089.x
Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N.
(2007b). Are we getting fooled again? Coming to terms with limitations in the use of personality
tests for personnel selection. Personnel Psychology, 60, 1029–1049. doi:10.1111/j.1744-6570.
2007.00100.x
Murphy, K. R., & De Shon, R. (2000). Inter-rater correlations do not estimate the reliability of job
performance ratings. Personnel Psychology, 53, 873–900. doi:10.1111/j.1744-6570.2000.
tb02421.x
Myers, I. B., McCaulley, M. H., Quenk, N., & Hammer, A. (1998). MBTI handbook: A guide to the
development and use of the Myers-Briggs Type Indicator. (3rd ed.). Palo Alto, CA: Consulting
Psychologists Press.
Nathan, B. R., & Alexander, R. A. (1988). A comparison of criteria for test validation: A meta-analytic
investigation. Personnel Psychology, 41, 517–535. doi:10.1111/j.1744-6570.1988.tb00642.x
*Nelson, C. A. (2008). Job type as a moderator of the relationship between situational judgment
and personality (Unpublished doctoral dissertation). Capella University.
*Neuman, G. A. (1991). Autonomous work group selection. Journal of Business and Psychology, 6,
283–291. doi:10.1007/BF01126715
*Neuman, G. A., & Kickul, J. R. (1998). Organizational citizenship behaviors: Achievement
orientation and personality. Journal of Business and Psychology, 13, 263–279. doi:10.1023/
A:1022963108025
Nguyen, N. T., & McDaniel, M. A. (2000, April). Faking and forced-choice scales in applicant
screening: A meta-analysis. Paper presented at the 15th Annual Conference of the Society for
Industrial and Organizational Psychology, New Orleans, LA.
*Nyfield, G., Gibbons, P. J., Baron, H., & Robertson, I. (1995). The cross-cultural validity of
management assessment methods. Surrey, UK: Saville & Holdsworth.
Ones, D. S. (1993). The construct of integrity tests (Unpublished doctoral dissertation). University of
Iowa, Iowa City.
Ones, D. S., & Viswesvaran, C. (2003). Job-specific applicant pools and national norms for
personality scales: Implications for range-restriction corrections in validation research. Journal
of Applied Psychology, 88, 570–577. doi:10.1037/0021-9010.88.3.570
Ones, D. S., Viswesvaran, C., & Reiss, A. (1996). Role of social desirability in personality testing for
personnel selection: The red herring. Journal of Applied Psychology, 81, 660–679. doi:10.1037/
0021-9010.81.6.660
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test
validities: Findings and implications for personnel selection and theories of job performance.
Journal of Applied Psychology, 78, 679–703.
*Perkins, A. M., & Corr, P. J. (2005). Can worriers be winners? The association between worrying and
job performance. Personality and Individual Differences, 38, 25–31. doi:10.1016/j.paid.2004.
03.008
Piedmont, R. L., Costa, P. T., & McCrae, R. R. (1992). An assessment of the Edwards Personal
Preference Schedule from the perspective of the Five-Factor Model. Journal of Personality
Assessment, 58, 67–78. doi:10.1207/s15327752jpa5801_6
Radcliffe, J. (1963). Some properties of ipsative score matrices and their relevance for some
current interest tests. Australian Journal of Psychology, 15, 1–11. doi:10.1080/
00049536308255468
*Robertson, I. T., Baron, H., Gibbons, P., MacIver, R., & Nyfield, G. (2000). Conscientiousness and
managerial performance. Journal of Occupational and Organizational Psychology, 73, 171–
180. doi:10.1348/096317900166967
*Robertson, I., Gibbons, P., Baron, H., MacIver, R., & Nyfield, G. (1999). Understanding management
performance. British Journal of Management, 10, 5–12. doi:10.1111/1467-8551.00107
*Rolland, J. P., & Mogenet, J. L. (2001). Système de description en cinq dimensions (D5D). Paris, France: Les Éditions du Centre de Psychologie Appliquée.
Rust, J. (1999). The validity of the Giotto integrity test. Personality and Individual Differences,
27, 755–768. doi:10.1016/S0191-8869(98)00277-3
Sackett, P. R. (2003). The status of validity generalization research: Key issues in drawing inferences
from cumulative research studies. In K. R. Murphy (Ed.), Validity generalization: A critical
review (pp. 91–114). Mahwah, NJ: Lawrence Erlbaum.
*Sackett, P. R., Gruys, M. L., & Ellingson, J. E. (1998). Ability–personality interactions when
predicting job performance. Journal of Applied Psychology, 83, 545–556. doi:10.1037/0021-
9010.83.4.545
*Salgado, J. F. (1991). Validity of the Preferences and Perception Inventory (PAPI) in a financial
service company (Unpublished technical report). Department of Social Psychology, University
of Santiago de Compostela.
Salgado, J. F. (1997). The five-factor model of personality and job performance in the European
Community. Journal of Applied Psychology, 82, 30–43. doi:10.1037/0021-9010.82.1.30
Salgado, J. F. (1998a). The Big Five personality dimensions and job performance in army and civil
occupations: A European perspective. Human Performance, 11, 271–288. doi:10.1080/
08959285.1998.9668034
Salgado, J. F. (1998b). Sample size in validity studies of personnel selection. Journal of Occupational
and Organizational Psychology, 71, 161–164. doi:10.1111/j.2044-8325.1998.tb00669.x
Salgado, J. F. (2002). The Big Five personality dimensions and counterproductive behaviors.
International Journal of Selection and Assessment, 10, 117–125. doi:10.1111/1468-2389.
00198
Salgado, J. F. (2003). Predicting job performance using FFM and non-FFM personality measures.
Journal of Occupational and Organizational Psychology, 76, 323–346. doi:10.1348/
096317903769647201
Salgado, J. F., Anderson, N., Moscoso, S., Bertua, C., De Fruyt, F., & Rolland, J. P. (2003). A meta-
analytic study of GMA validity for different occupations in the European Community. Journal of
Applied Psychology, 88, 1068–1081. doi:10.1037/0021-9010.88.6.1068
Salgado, J. F., & Moscoso, S. (1996). Meta-analysis of the interrater reliability of job performance
ratings in validity studies of personnel selection. Perceptual and Motor Skills, 83, 1195–1201.
doi:10.2466/pms.1996.83.3f.1195
Salgado, J. F., & Tauriz, G. (2014). The Five-Factor Model, forced-choice personality inventories and
performance: A comprehensive meta-analysis of academic and occupational validity studies.
European Journal of Work and Organizational Psychology, 23, 3–30. doi:10.1080/
1359432X.2012.716198
*Saville, P., Sik, G., Nyfield, G., Hackston, J., & MacIver, R. (1996). A demonstration of the validity of
the Occupational Personality Questionnaire (OPQ) in the measurement of job competencies
across time and in separate organizations. Applied Psychology, 45, 243–262. doi:10.1111/j.
1464-0597.1996.tb00767.x
Saville, P., & Willson, E. (1991). The reliability and validity of normative and ipsative approaches
in the measurement of personality. Journal of Occupational Psychology, 64, 219–238.
doi:10.1111/j.2044-8325.1991.tb00556.x
*Schippmann, J. S., & Prien, E. P. (1989). An assessment of the contributions of general mental ability
and personality characteristics to management success. Journal of Business and Psychology, 3,
423–437. doi:10.1007/BF01020710
Schmidt, F. L., & Hunter, J. E. (1996). Measurement error in psychological research: Lessons from 26
research scenarios. Psychological Methods, 1, 199–223. doi:10.1037/1082-989X.1.2.199
Schmidt, F. L., Hunter, J. E., & Urry, V. W. (1976). Statistical power in criterion-related validation
studies. Journal of Applied Psychology, 61, 473–485. doi:10.1037/0021-9010.61.4.473
Schmidt, F. L., & Le, H. (2004). Software for the Hunter-Schmidt meta-analysis methods. Iowa City,
IA: Department of Management and Organizations, University of Iowa.
Schmidt, F. L., Oh, I.-S., & Le, H. (2006). Increasing the accuracy of corrections for range restriction:
Implications for selection procedure validities and other research practices. Personnel
Psychology, 59, 281–305. doi:10.1111/j.1744-6570.2006.00065.x
Schmidt, F. L., Shaffer, J. A., & Oh, I.-S. (2008). Increased accuracy for range restriction corrections:
Implications for the role of personality and general mental ability in job and training
performance. Personnel Psychology, 61, 827–868. doi:10.1111/j.1744-6570.2008.00132.x
Schmidt, F. L., & Zimmerman, R. D. (2004). A counterintuitive hypothesis about employment
interview validity and some supporting evidence. Journal of Applied Psychology, 89, 553–561.
doi:10.1037/0021-9010.89.3.553
Shen, W., Kiger, T. B., Davies, S. E., Rasch, R. L., Simon, K. M., & Ones, D. S. (2011). Samples in
applied psychology: Over a decade of research in review. Journal of Applied Psychology, 96,
1055–1064. doi:10.1037/a0023322
*SHL (2006). OPQ32 manual and user’s guide. Surrey, UK: Author.
*Slocum, Jr, J. W., & Hand, H. H. (1971). Prediction of job success and employee satisfaction for
executives and foremen. Training and Development Journal, 25, 28–36.
*Sommerfeld, D. (1997). Maintenance worker selection: High validity and low adverse impact.
Michigan municipal league. Employment Testing Consortium Project.
Stark, S., Chernyshenko, O., Drasgow, F., & Williams, B. (2006). Examining assumptions about item
responding in personality assessment: Should ideal point methods be considered for scale
development and scoring? Journal of Applied Psychology, 91, 25–39. doi:10.1037/0021-9010.
91.1.25
Tenopyr, M. (1988). Artifactual reliability of forced-choice scales. Journal of Applied Psychology,
73, 749–751. doi:10.1037/0021-9010.73.4.749
Tett, R. P., Christiansen, N. D., Robie, C., & Simonet, D. V. (2011, May 25–28). International
survey of personality test use: An American baseline. Paper presented at the 15th
Conference of the European Association of Work and Organizational Psychology, Maastricht,
The Netherlands.
Tett, R., Rothstein, M. G., & Jackson, D. J. (1991). Personality measures as predictors of
job performance: A meta-analytic review. Personnel Psychology, 44, 703–742. doi:10.1111/j.
1744-6570.1991.tb00696.x
Thompson, B., Levitov, J. E., & Miederhoff, P. A. (1982). Validity of the Rokeach value
survey. Educational and Psychological Measurement, 42, 899–905. doi:10.1177/
001316448204200325
United States Department of Labor (USES) (1991). Dictionary of occupational titles. (4th ed.)
Washington, DC: U.S. Government Printing Office.
Vasilopoulos, N. L., Cucina, J. M., Dyomina, N. V., Morewitz, C. L., & Reilly, R. R. (2006). Forced-
choice personality tests: A measure of personality and cognitive ability? Human Performance,
19, 175–199. doi:10.1207/s15327043hup1903_1
Viswesvaran, C., & Ones, D. (2000). Measurement error in “Big Five” personality assessment:
Reliability generalization across studies and measures. Educational and Psychological
Measurement, 60, 224–235. doi:10.1177/00131640021970475
Viswesvaran, C., Ones, D., & Schmidt, F. L. (1996). Comparative analysis of the reliability of job
performance ratings. Journal of Applied Psychology, 81, 557–574. doi:10.1037/0021-9010.81.
5.557
Viswesvaran, C., Schmidt, F. L., & Ones, D. S. (2002). The moderating influence of job performance
dimensions on convergence of supervisor and peer ratings of job performance: Unconfounding
construct-level convergence and rating difficulty. Journal of Applied Psychology, 87, 345–354.
doi:10.1037/0021-9010.87.2.345
Warr, P., Bartram, D., & Brown, A. (2005a). Big Five validity: Aggregation method matters. Journal of
Occupational and Organizational Psychology, 78, 377–386. doi:10.1348/096317905X53868
*Warr, P., Bartram, D., & Martin, T. (2005b). Personality and sales performance: Situational variation
and interactions between traits. International Journal of Selection and Assessment, 13, 87–91.
doi:10.1111/j.0965-075X.2005.00302.x
Whetzel, D. L., McDaniel, M. A., Yost, A., & Kim, N. (2010). Linearity of personality–performance
relationships: A large-scale examination. International Journal of Selection and Assessment,
18, 310–320. doi:10.1111/j.1468-2389.2010.00514.x
*White, L. A. (2002). A quasi-ipsative temperament measure for assessing future leaders. 44th
Annual Conference of the International Military Testing Association (p. 169).
*Willingham, W. W., & Ambler, R. K. (1963). The relation of the Gordon Personal Inventory to
several external criteria. Journal of Consulting Psychology, 27, 460. doi:10.1037/h0040658
*Willingham, W. W., Nelson, P., & O’Connor, W. (1958). A note on the behavioral validity of the
Gordon Personal Profile. Journal of Consulting Psychology, 22, 378. doi:10.1037/h0045815
*Witt, L. A., & Jones, J. W. (1999). Very particular people quit first. Paper presented at the Annual
Conference of the Society for Industrial and Organizational Psychology, Atlanta, GA.
Young, M., & Dulewicz, V. (2007). Relationship between emotional and congruent self-awareness
and performance in the British Royal Navy. Journal of Managerial Psychology, 22, 465–478.
Appendix A
References N Inventory Type Job Criterion ES EX O A C
Bartram (2007) S. Africa 68 OPQ I MAN JPR .02 .07 .02 .09 .04
Bartram (2007) USA 86 OPQ I MAN JPR .04 .25 .09 .18 .05
Bennett (1977) 45 SDI Q MAN JPR .40 – – – .05
Bennett (1977) 49 SDI Q MAN JPR .46 – – – .07
Brown and Bartram (2009)* 835 OPQ I MAN SFR .12 .13 .08 .01 .15
Christiansen et al. (2005) 60 IPIP Q MIX JPR – – – – .46
Christiansen et al. (2005) 62 IPIP Q MIX JPR – – – – .17
Clevenger, Pereira, Wiechman, 207 OPQ I CUS JPR – – – – .16
Schmitt, and Schmidt-Harvey (2001)
Conway (2000) 1,567 MBTI N MAN JPR – .06 .06 .06 .06
Fine and Dover (2005) 193 ICS I SO TRA – .06 – .02 –
Fineman (1975) 293 WPQ N MAN JPR – – – – .21
Fineman (1975) 246 WPQ N MAN SAL – – – – .20
Fineman (1975) 84 WPQ N MAN PRG – – – – .25
Francis-Smythe, Tinline, 225 OPQ I POL JPR .08 .08 .28 – .16
and Allender (2002)*
Furnham (1994) 176 CSQ I CUS CWB .01 .03 .00 – .06
Furnham (1994) 176 CSQ I CUS JPR .12 .10 .17 – .23
Furnham and Stringfield (1993) 222 MBTI N MAN JPR – .10 .09 .05 .01
Furnham and Stringfield (1993) 148 MBTI N MAN JPR – .05 .02 .02 .12
García-Izquierdo (2001) 84 OPQ I SKW TRA .04 .02 .26 – .28
Goffin, Jan, and Skinner (2011) 114 ESQ Q SKW CWB – – – – .21
Gordon (1993) T4.18 94 GPP Q MAN PRG .18 .28 .05 .17 .10
Guller (2003) DD 375 EPPS I POL JPR .06 .04 .00 .04 .01
Hughes and Dood (1961) 90 GPI I SO SLS .22 .17 – – .14
Hughes and Dood (1961) 90 GPI Q SO SLS .06 .08 – – .08
Hughes and Prien (1986) 49 GPI Q SKW JPR – – – .15 .03
Iliescu, Ilie, and Ispas (2011) 475 ESQ Q SKW CWB – – – – .43
Saville, Sik, Nyfield, Hackston, 440 OPQ I MAN JPR .03 .03 .07 .00 .02
and MacIver (1996)
Saville et al. (1996) 270 OPQ I MAN JPR .03 .12 .12 .02 .08
Schippmann and Prien (1989) 148 EPPS I MAN JPR .02 .05 .09 .05 .09
Schippmann and Prien (1989) 148 GPP + SDI Q MAN JPR .06 .15 .27 .12 .14
SHL (2006) – val19 79 OPQ I SUP JPR .06 .13 .11 .03 .24
SHL (2006) – val22 120 OPQ I MAN JPR .03 .06 .14 .05 .01
SHL (2006) – val30 114 OPQ I SO SLS .04 .10 .06 .28 .06
SHL (2006) – val32 36 OPQ I PRO TRA .05 .19 .03 .15 .20
Slocum and Hand (1971) 57 EPPS I MAN JPR .17 .07 .05 .02 .17
Slocum and Hand (1971) 37 EPPS I SUP JPR .05 .17 .05 .20 .02
Sommerfeld (1997) 332 GPI Q SKW JPR .01 – – – .32
Warr, Bartram, and Brown (2005), 119 CCSQ I SO SLS .03 .10 .10 .19 .26
Warr, Bartram, and Martin (2005)
Warr, Bartram, and Brown (2005), 78 CCSQ I SO SLS .04 .05 .17 .15 .20
Warr, Bartram, and Martin (2005)
Warr, Bartram, and Brown (2005), 90 CCSQ I SO SLS .04 .08 .25 .32 .21
Warr, Bartram, and Martin (2005)
Whetzel et al. (2010) 1,152 OPQ I PRO JPR .02 .07 .08 .04 .08
White (2002)* 613 AIM Q MIL JPR .06 .22 – .01 .20
White (2002)* 399 AIM Q MIL JPR .07 .06 – .01 .04
Willingham, Nelson, 1,039 GPI Q MIL TRA .05 .00 – – .03
and O’Connor (1958)
Willingham and Ambler (1963) 208 GPI Q MIL CWB – .17 – – .19
Witt and Jones (1999) 168 OPQ I CUS JPR .01 .07 .01 .01 .03
Young & Dulewicz (2007) 261 OPQ I MIL JPR .16 .14 .11 .17 .20
Note. CWB, counterproductive work behaviour; JPR, job performance rating; PRG, progress; PRR, peer rating; SAL, salary; SFR, self-rating; SLS, sales; TRA, training; I, ipsative; Q, quasi-ipsative; N, normative.
Table B1. Meta-analysis of the validity of normative FC measures of the FFM for managerial occupations
Managers
K N rw SDr rc ρ SDρ %VE 90%CV CIU CIL
Extraversion 3 1,937 .05 0.03 .08 .09 0.00 100 .09 .13 .05
Openness to experience 3 1,937 .06 0.02 .09 .10 0.00 100 .10 .14 .06
Agreeableness 3 1,937 .06 0.01 .09 .09 0.00 100 .09 .13 .05
Conscientiousness 6 2,560 .09 0.07 .15 .17 0.08 60 .07 .21 .13
Note. K, number of independent samples; N, total sample size; rw, observed validity; SDr, standard deviation of observed validity; rc, operational validity; ρ, validity corrected for criterion reliability and indirect range restriction in the predictor; SDρ, standard deviation of ρ; %VE, percentage of variance accounted for by artefactual errors; 90%CV, 90% credibility value based on ρ; CIU, upper limit of the 95% confidence interval of ρ; CIL, lower limit of the 95% confidence interval of ρ.
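For readers unfamiliar with the quantities defined in the table note, the following is a minimal, schematic sketch of a Hunter-Schmidt style aggregation: it computes the sample-size-weighted mean validity, the observed and sampling-error variances, the percentage of variance explained, and a validity corrected for criterion unreliability, together with a 90% credibility value. It is not the authors' exact procedure (which additionally corrects for indirect range restriction), and the input data and the .52 interrater reliability (a commonly used estimate for job performance ratings; Viswesvaran, Ones, & Schmidt, 1996) are purely illustrative.

```python
# Schematic Hunter-Schmidt style aggregation of validity coefficients.
# This is NOT the authors' exact procedure: it omits the correction for
# indirect range restriction and uses hypothetical input data.
import math

def bare_bones_meta(rs, ns, criterion_rel=0.52):
    """Return rw, SDr, %VE, rho (corrected for criterion unreliability), SDrho, 90%CV."""
    total_n = sum(ns)
    k = len(rs)
    r_bar = sum(r * n for r, n in zip(rs, ns)) / total_n            # rw
    var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
    var_err = ((1 - r_bar ** 2) ** 2) / (total_n / k - 1)           # sampling-error variance
    pct_ve = 100.0 if var_obs == 0 else min(100.0, 100.0 * var_err / var_obs)
    var_res = max(0.0, var_obs - var_err)                           # residual (true) variance of r
    rho = r_bar / math.sqrt(criterion_rel)                          # correct mean for criterion unreliability
    sd_rho = math.sqrt(var_res) / math.sqrt(criterion_rel)
    cv90 = rho - 1.28 * sd_rho                                      # 90% credibility value
    return r_bar, math.sqrt(var_obs), pct_ve, rho, sd_rho, cv90

# Hypothetical example: three validity coefficients for one personality dimension
print(bare_bones_meta(rs=[0.12, 0.20, 0.16], ns=[150, 300, 250]))
```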