Pre-employment tests are used by many employers as part of the selection process. What steps should employers take to ensure that the tests used are legal and valid?
Chapter 6
Personnel Selection

OBJECTIVES
After reading this chapter, you should be able to
Copyright © 2013 The McGraw-Hill Companies. All rights reserved.
1. Understand the concepts of reliability, validity, and utility.
2. Understand the validity evidence for various selection methods.
3. Discuss approaches to the more effective use of application blanks, reference checks, biographical data, testing, and various other selection methods in order to increase the validity and legal defensibility of each.
4. Discuss the approaches available for drug testing.
5. Describe the validity of different approaches to interviewing.
6. Explain how the various types of job candidate information should be integrated and evaluated.
OVERVIEW
Use of validated selection models: an HPWS characteristic
It sounds simple: Match employees with jobs. Researchers have made this task easier by developing selection methods that successfully predict employee effectiveness. Still, there is a void between what research indicates and how organizations actually do personnel selection. Real-world personnel selection is replete with examples of methods that have been proven to be ineffective or inferior.
Personnel selection (and retention) is key to organizational effectiveness. The most successful firms use methods that accurately predict future performance. The use of validated selection models is another of the High-Performance Work Practices linking this HR process to corporate financial performance. Organizations are, or should be, interested in selecting employees who not only will be effective but who will work as long as the organization needs them and, of course, will not engage in counterproductive behaviors such as violence, substance abuse, avoidable accidents, and employee theft.
A multiple-hurdle process involving an application, reference and background checks,
various forms of standardized testing, and some form of interview is the typical chronology
of events for selection, particularly for external hiring decisions. Internal decisions, such
as promotions, are typically done with less formality. Personnel selection is the process
185
ber29163_ch06_185-236.indd 185
17/02/12 2:38 PM
2 / Acquiring Human Resource Capability
First Step is Work analysis
of gathering and assessing information about job candidates in order to make decisions
about personnel. The process applies to entry-level personnel and promotions, transfers,
and even job retention in the context of corporate downsizing efforts. This chapter introduces you to personnel selection, describes some of the most popular types of hiring/
screening procedures, reviews the research evidence on each, and discusses the social and
legal implications of the various options.
The chapter begins with an overview of measurement issues related to personnel selection and staffing. Next the various selection methods are introduced in their usual order
of use. Application blanks, background checks, and reference checks are discussed first.
Then the various forms of standardized tests that purport to assess applicants’ suitability
or KASOCs are reviewed. The use, validity, and possible adverse impact of various types
of selection methods are considered, including general mental ability tests and personality
tests. The final sections of the chapter discuss employment interviews and methods that
have been shown to increase their validity, the use of more sophisticated (and expensive) selection procedures such as assessment centers, performance testing and work samples, and drug and medical tests in the preemployment selection process. The context of the discussion is the legal implications of the various personnel practices, pointing out where there are clear discrepancies between what typically happens in practice and what academic research indicates should happen. This is one chapter where the distance between academic research findings and recommendations and actual selection practices is great. The good news is that the gap is closing.
Wackenhut Security (recently acquired by G4S) had its share of selection challenges. Although recruitment efforts and a sluggish economy attracted a large number of applicants for its entry-level armed and unarmed security guard positions, there was concern about the quality of those hired and high voluntary employee turnover. The turnover rate for some positions exceeded 100 percent—meaning that the quit rate in 1 year exceeded the number of available positions. Wackenhut Security also was dissatisfied with the quality of its supervisory personnel.
The company contracted with BA&C (Behavioral Analysts and Consultants), a Florida psychological consulting firm that specializes in staffing problems and personnel selection. Wackenhut asked BA&C to develop a new personnel selection system for entry-level guards and supervisors. Underlying this request was a need for Wackenhut to improve its competitive position in this highly competitive industry by increasing sales and contracts, decreasing costs, and, most important, making certain its security personnel do the job.
D
The company, which already compensated its guards and supervisors more than others
in the industry, wanted to avoid any
R increase in compensation. The company estimated
that the cost of training a new armed guard was about $1,800. With several hundred guards
A
quitting in less than a year, the company often failed to even recover training costs in sales.
Wackenhut needed new selection methods that could increase the effectiveness of the
guards and supervisors and identify
2 those guard applicants who not only performed well
but would be most likely to stay with the company.
You will recall from Chapter 4 that work analysis should identify the knowledge, abilities, skills, and other characteristics (KASOCs) or competencies that are necessary for successful performance and retention on the job. In this case, BA&C first conducted a job analysis of the various guard jobs to get better information on the KASOCs required for the work. After identifying the critical KASOCs, BA&C developed a reliable, valid, and job-related weighted application blank, screening test, and interview format.
The process of selection varies substantially within this industry. While Wackenhut initially used only a high school diploma as a job specification, an application blank, a background check, and an interview by someone in personnel, competitors used more complex
methods to select employees. American Protective Services, for example, the company that
handled security for the Atlanta Olympics, used a battery of psychological and aptitude
tests along with a structured interview. Wackenhut wanted selection systems that were
even more valid and useful than what their major competitors were using. Their marketing
strategy would then emphasize their more sophisticated screening methods.
As with the job analysis and the recruitment process, personnel selection should be directly linked to the HR planning function and the strategic objectives of the company. For
6 / Personnel Selection
Figure 6-1
Steps in the Development and Evaluation of a Selection Procedure
JOB ANALYSIS/HUMAN RESOURCE PLANNING
Identify knowledge, abilities, skills, and other characteristics (KASOCs) (aka: competencies).
Use a competency model tied to organizational objectives.
RECRUITMENT STRATEGY: SELECT/DEVELOP SELECTION PROCEDURES
Review options for assessing applicants on each of the KASOCs:
Standardized tests (cognitive, personality, motivational, psychomotor).
Application blanks, biographical data, background and reference checks, accomplishment record.
Performance tests, assessment centers, interviews.
DETERMINE VALIDITY FOR SELECTION METHODS
Criterion-related validation or validity generalization.
Expert judgment (content validity).
DETERMINE WEIGHTING SYSTEM FOR DATA FROM SELECTION METHODS
example, the mission of the Marriott Corporation is to be the hotel chain of choice of frequent travelers. As part of this strategy, the company developed a successful selection system to identify people who could be particularly attentive to customer demands. Wackenhut Security also had a major marketing strategy aimed at new contracts for armed security guards who would be extremely vigilant. The new selection system would be designed to identify people more likely to perform well in this capacity.
Figure 6-1 presents a chronology of our recommended strategy for selection system development and the major options available for personnel selection. The previous chapters on work analysis, planning, and recruitment have gotten us to the point of selecting job candidates based on relevant and job-related information from one or more selection methods. Each of these methods is reviewed in this chapter. But keep in mind that the focus should be on selecting or developing tools that will provide valid assessments on the critical KASOCs, competencies, and job specifications most important for strategy execution. The work analysis should identify the strategically important KASOCs or competencies from which the job specifications will be derived. Then particular selection methods (selection tools) should be adopted to assess people in terms of these particular job specifications.
SELECTION METHODS: ARE THEY EFFECTIVE?
This review includes a summary of the validity of each major approach to selection and an assessment of the relative cost to develop and administer each method. Three key terms related to effectiveness are reliability, validity, and utility. While these terms are strongly related to one another, the most important criterion for a selection method is validity.
Remember the discussion of the research on High-Performance Work Practices. One of the HR practices shown to be related to corporate financial performance was the percentage of employees hired using "validated selection methods."1 The essence of the term validity is the extent to which scores on a selection method predict one or more important criteria. While the most typical criterion of interest to selection and staffing specialists is job performance, companies also may be interested in other criteria such as how long an employee may stay on the job or whether the employee will steal from the organization, be violent, or be more likely to be involved in work-related accidents. But before addressing the validity of a method, let's look at one of the necessary conditions for validity: the reliability of measurement.
What Is Reliability?
The primary purpose of personnel selection is measuring the attributes of job candidates.
A necessary condition for a selection method to be valid is that it first be reliable. Reliability concerns the degree of consistency or the agreement between two sets of scores
Good reliability: .8 or higher
on some measurement device. Reliability refers to freedom from unsystematic errors of
measurement. The consistency in measurement applies to the scores that derive from the
selection method. These scores can come from a paper-and-pencil test, a job interview,
a performance appraisal, or any other method that is used to measure characteristics and
make decisions about people. The CIA uses a very long multiple-choice test as an initial
screening device for job applicants to be agents. If applicants were to take the test twice
3 weeks apart, their scores on the test would stay pretty much the same (the same thing can
be said for SAT scores). These tests can be considered reliable. The level of reliability can
be represented by a correlation coefficient. Correlations from 0 to 1.0 show the extent of
the reliability. Generally, reliable methods have reliability coefficients that are .8 or higher,
indicating a high degree of consistency in scores. No selection method achieves perfect
reliability, but the goal should be to reduce error in measurement as much as possible and
achieve high reliability. If raters are a part of the selection method, such as job interviewers
or on-the-job performance evaluators, the extent to which different raters agree also can represent the reliability (or unreliability) of the method.
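The test–retest reliability described above is just the correlation between two sets of scores from the same applicants. The sketch below uses invented scores for eight hypothetical applicants; a real reliability study would use far more.

```python
# Test-retest reliability: correlate two administrations of the same test.

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

time1 = [72, 85, 90, 64, 78, 88, 70, 95]   # scores at first administration
time2 = [74, 83, 91, 66, 75, 90, 69, 93]   # same applicants, 3 weeks later

reliability = pearson_r(time1, time2)
print(round(reliability, 2))  # consistent scores across administrations -> r near 1
```

Because the two administrations track each other closely, the coefficient here lands well above the .8 benchmark the text cites for good reliability.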
I use of graphology (or handwriting analysis) for personRemember our criticism about the
nel selection we discussed in Chapter
L 1? Handwriting analysis is used by some U.S. companies and even more European firms as a method of selection. But this method is first of all not
L
even reliable, much less valid. If the same handwriting sample were given to two grapholoI on the levels or scores on various employment-related
gists, they would not necessarily agree
attributes (e.g., drive, judgment, creativity, intelligence), supposedly measured based on a
S
handwriting sample. Thus the method has low reliability as an assessment of these attributes.
, agree on relative levels of some attribute, this agree(But even if the two graphologists did
ment would not necessarily mean that their assessments are valid.)
Reliable methods tend to be long. One of the reasons the SAT, the GRE, the GMAT, and the LSAT seem to take forever to complete is so these tests will have very high levels of reliability (and they do). Reliabilities for "high stakes" tests such as the GMAT, the SAT, and the LSAT are quite high. For example, the average reliability estimates are .92, .90, and .89 for the GMAT total score, the Verbal score, and the Quantitative score, respectively.2 But while high reliability is a necessary condition for high validity, high reliability does not ensure that a method is valid. The GMAT may be highly reliable, but do scores on the GMAT actually predict success in business school? This question addresses the validity of the method.
What Is Validity?
Validity is close in meaning to "job relatedness"
Criterion-related validity

The objective of the Wackenhut Security consultants was to develop a reliable, valid, legally defensible, user-friendly, and inexpensive test that could predict both job performance and long job tenure for security guards. The extent to which the test was able to predict an important criterion such as performance was an indication of the test's validity. The term validity is close in meaning but not synonymous with the critical legal term job relatedness, which is discussed in Chapters 3 and 4. Empirical or criterion-related validity involves the statistical relationship between scores on some predictor or selection method (e.g., a test or an interview) and performance on some criterion measure such as on-the-job effectiveness (e.g., sales, supervisory ratings, job turnover, employee theft). At Wackenhut, a study was conducted in which scores on the new screening test were correlated with job performance and job tenure. Given a certain level of correlation, such a study would support a legal argument of job relatedness.
The statistical relationship is usually reported as a correlation coefficient. This describes the relationship between scores on the predictor and measures of effectiveness (also called criteria). Correlations from −1 to +1 show the direction and strength of the relationship. Higher correlations indicate stronger validity. Assuming that the study was conducted properly, a significant correlation between the scores on a method and scores (or data) on some important criterion could be offered as a strong argument for the job relatedness of the method. Under certain circumstances, correlation coefficients even in the .20s can signify a useful method. However, higher correlations are clearly better. In general, an increase in the validity of a selection method will translate into a proportional increase in the average dollar value of the annual output from employees who are selected with this method.
Validity Generalization
While higher correlations are generally better, the size of the sample (and other factors) is very important for achieving statistical significance. Validity studies with small sample sizes will often not achieve significance, mainly because of the error in the study. Many selection methods have average validities between .20 and .40. Samples of a minimum of 100 scores are strongly recommended in order to empirically validate in a particular setting.3 So, do scores on the GMAT predict success in business school? Clearly, they do, with an average validity of about .5 across hundreds of studies.
Another key issue that will have an impact on the results and interpretation of empirical studies is the conceptual match between a particular criterion of interest (e.g., some
element of job performance) and any particular predictor. Cognitively loaded predictors
(those correlated with general mental ability [GMA]) are the strongest predictors of task
performance, while so-called noncognitive predictors such as personality and motivational
measures are better predictors of contextual performance/citizenship behavior (e.g., effects
on co-workers) and counterproductive behavior (e.g., employee theft).
A critical concept related to validity is generalizability. This term refers to the extent to which the validity of a selection method can generalize to other employment settings or situations. At the most basic level, generalizability concerns whether the validity of a selection method established based on a study or studies in other situations can be inferred for a new situation in which no new correlational data are collected. Validity generalization (VG) invokes evidence from past studies on a selection method that is then applied to a new and similar setting. Many studies have used appropriate scientific methods to establish the validity and generalizability of constructs, such as cognitive or general mental ability and emotional intelligence, and also particular instruments and methods developed to measure these constructs. Meta-analytic techniques are used to establish VG for a method. Meta-analysis is a methodology for quantitatively accumulating results across studies. Meta-analytic findings are generally more reliable than results obtained from an individual study and help researchers draw conclusions. Like other areas of scientific inquiry, meta-analytic methods have evolved and new refinements continue to emerge. These improvements have increased the accuracy of meta-analytic methods and estimates of the validity of these particular selection tests and methods.4
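The simplest step of the meta-analytic accumulation described above is a sample-size-weighted average of validity coefficients across studies. The studies below are hypothetical, and a full "bare-bones" meta-analysis adds corrections (e.g., for criterion unreliability and range restriction) omitted here.

```python
# Sample-size-weighted mean validity across hypothetical studies.
studies = [
    {"n": 120, "r": 0.31},
    {"n": 450, "r": 0.26},
    {"n": 80,  "r": 0.42},
    {"n": 300, "r": 0.29},
]

total_n = sum(s["n"] for s in studies)
mean_r = sum(s["n"] * s["r"] for s in studies) / total_n

# Sample-size-weighted variance of observed correlations around the mean;
# meta-analysts compare this to the variance expected from sampling error.
var_r = sum(s["n"] * (s["r"] - mean_r) ** 2 for s in studies) / total_n

print(f"weighted mean validity: {mean_r:.3f}")
print(f"observed variance:      {var_r:.5f}")
```

Weighting by sample size lets large, stable studies count more than small, noisy ones, which is why the pooled estimate is more trustworthy than any single study.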
VG is an excellent alternative to empirical validation for selection methods when a criterion-related validation study cannot be done because of inadequate sample sizes or other reasons. Employers could invoke an appropriate VG study to argue that a particular test or method is valid for their setting as well. This approach is recommended if there are insufficient data to allow for an empirical study by this employer (i.e., at a minimum, fewer than 100 pairs of scores on an instrument correlated with performance data on the same individuals).
A VG argument for validity can be invoked if an organization can first locate previously conducted empirical studies showing that the same or similar methods (e.g., tests) are valid for a particular job or purpose. The organization should then produce an analysis showing that the job for which the method is used (or will be used) for selection is the same as, or very similar to, the job(s) that were involved in the empirical studies of the VG study and that the criterion measures used in the VG studies are also important for the organization. Does an accredited MBA program need to do another study showing the validity of the GMAT for that particular program? Almost certainly not; there is plenty of evidence documenting the VG of this test for predicting business school success.
Figure 6-2 presents a summary of the meta-analytic evidence for the most popular selection tools, plus the relative cost of their development and administration. An obvious and critical question is "How large must a correlation be?" Correlations of .20 to .30 are often discounted because they account for less than 10 percent of the variance in performance. However, as a matter of fact, a correlation of, say, .30 for a selection method is sufficiently large that hiring applicants who score better on this particular measure can actually double the rate of successful performance. For example, with validity at .30, 67 percent of individuals who score in the top 20 percent on a measure would have above-average performance versus only 33 percent of individuals who score in the bottom 20 percent.
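The 67-percent-versus-33-percent claim above can be checked with a quick simulation: draw predictor and performance scores correlated at .30, then compare the share of above-average performers in the top and bottom fifths of the predictor. All data here are simulated.

```python
import random

random.seed(42)
VALIDITY = 0.30
N = 100_000

# Generate (predictor, performance) pairs with correlation ~.30
# using the standard bivariate-normal construction.
pairs = []
for _ in range(N):
    x = random.gauss(0, 1)
    y = VALIDITY * x + (1 - VALIDITY**2) ** 0.5 * random.gauss(0, 1)
    pairs.append((x, y))

pairs.sort(key=lambda p: p[0])       # sort by predictor score
bottom = pairs[: N // 5]             # bottom 20% on the predictor
top = pairs[-(N // 5):]              # top 20% on the predictor

top_rate = sum(y > 0 for _, y in top) / len(top)
bottom_rate = sum(y > 0 for _, y in bottom) / len(bottom)

print(f"above-average performers, top 20% scorers:    {top_rate:.0%}")
print(f"above-average performers, bottom 20% scorers: {bottom_rate:.0%}")
# With r = .30 these land near 67% and 33%, matching the text.
```

Even a "modest" validity coefficient, applied to enough hiring decisions, roughly doubles the odds of landing an above-average performer.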
Figure 6-2
Selection Tools, Validity, and Cost for Development and Administration

General mental ability (GMA) tests measure mental abilities such as reading comprehension, verbal or math skills. Validity: .5–.7 (note 3). Costs: low/low.
Structured interviews measure a variety of skills and abilities using a standard set of questions. Validity: .4–.45. Costs: high/high.
Unstructured interviews measure a variety of skills using questions that vary from candidate to candidate and interviewer to interviewer. Validity: .2–.3. Costs: low/high.
Work samples/performance tests measure job skills using the actual performance of tasks as on the job. Validity: .3–.4. Costs: high/high.
Job knowledge tests measure bodies of knowledge required by a job. Validity: .4–.5. Costs: high/low.
Personality testing (note 4): Conscientiousness, .25–.3; Extraversion, .15–.35 (note 5); Emotional Stability, .1–.3; Agreeableness (note 6), .1–.2; Openness to Experience, .1–.2. Costs: low/low for each.
Biographical information measures a variety of skills and personal characteristics through questions about education, training, work experience, and interests. Validity: .3–.4. Costs: high/low.
Measures of work experience (e.g., "behavioral consistency"). Validity: .3–.4. Costs: high/low.
Situational judgment tests measure a variety of skills with short scenarios (either in written or video format) asking test takers what their most likely response would be. Validity: .3–.4. Costs: high/low.
Integrity tests measure attitudes and experiences related to a person's honesty, dependability, and trustworthiness. Validity: .3–.4. Costs: low/low.
Assessment centers measure KASOCs through a series of work samples/exercises with trained assessors (may include GMA and other tests). Validity: .3–.45. Costs: high/high.
Reference checks provide information about an applicant's past performance or measure the accuracy of applicants' statements on their résumés. Validity: .2–.3. Costs: low/low.

Notes:
1. Validities range from 0 to 1.0; higher numbers indicate better prediction of job performance. Ranges are reported here.
2. References to high or low costs (development/administration) are based on relative comparisons to other methods.
3. Validities for more complex jobs tend to be higher for GMA.
4. Validities for personality measures tend to vary with the job. FFM self-report validity ranges are reported here. Much stronger validities (.5–.6 range) for peer-based (versus self-reported) measures of personality.
5. Stronger validity in predicting managerial and/or leadership performance; weak validities for jobs involving less interaction.
6. Low validity for managerial jobs (.10); higher validities for team-based settings.

Sources: Adapted from W. F. Cascio and H. Aguinis (2011). Applied Psychology in Human Resource Management. Upper Saddle River, NJ: Prentice Hall; and A. M. Ryan and N. T. Tippins (2004). Attracting and Selecting: What Psychological Research Tells Us. Human Resource Management, 43, 307–308.
Content validity assesses the degree to which the contents of a selection method (i.e., the actual test or instrument items or components) represent (or assess) the requirements of the job. This approach to validation is of course ideal when the employer lacks an adequate sample size to be able to empirically validate a method. Subject matter experts are typically used to evaluate the compatibility of the content of the method with the actual requirements of a job (e.g., is the knowledge or skill assessed on the test compatible with the knowledge or skill required on the actual job?). Such a study or evaluation by experts also can be offered as evidence of job relatedness, but the study should follow the directions provided by the Supreme Court in Albemarle v. Moody (see Chapter 3) and, just to be safe, comply with the Uniform Guidelines on Employee Selection Procedures (UGESP). (See www.eeoc.gov for details on the UGESP.)
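One common way to quantify subject matter experts' judgments of this kind is Lawshe's content validity ratio (CVR), computed per item from how many experts rate the item "essential." This is a standard index from the content-validation literature, not one the chapter itself prescribes, and the panel ratings below are invented.

```python
# Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2),
# where n_e = number of experts rating the item "essential"
# and N = total number of experts on the panel.

def cvr(n_essential, n_experts):
    half = n_experts / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 SMEs rating three test items
ratings = {"item_1": 9, "item_2": 5, "item_3": 2}
for item, n_e in ratings.items():
    print(f"{item}: CVR = {cvr(n_e, 10):+.1f}")
# CVR ranges from -1 (no expert says essential) to +1 (all do);
# items with low or negative CVR are candidates for removal.
```

Summarizing expert agreement numerically makes the content-validity evidence easier to document if the method is later challenged.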
A knowledge-based test for “Certified Public Accountant” could be considered to have
content validity for an accounting job. Many organizations now use job simulations or
work samples where an applicant is instructed to play the role of a job incumbent and
perform tasks judged to be directly related to the job. Content validation is ideal for these
types of methods. Of course, with this approach to validation, it is assumed that job candidates have the essential KASOCs at the time of assessment. Another possible problem is
that content validation relies on the judgments of humans regarding “job relatedness” or
the validity of these methods and the underlying items of the method. This approach is also
inappropriate for tests of basic constructs such as cognitive or general mental ability or
personality characteristics.
What Is Utility?
Low SR is needed for high utility
The validity correlation coefficient can also be used to calculate the financial value of a selection method, using a utility formula that converts correlations into dollar savings or profits that can be credited to a particular selection method. A method's utility depends not only on its validity but on other issues as well. For example, recall the discussion of selection ratio in Chapter 5. Selection ratio is the number of positions divided by the number of applicants for those positions. A test with perfect validity will have no utility if the selection ratio is 1.0 (one applicant per position). This is why an organization's reputation, its recruitment programs, and other HR issues such as compensation are so important for personnel selection. Valid selection methods have great utility for an organization only when that organization can be selective based on the scores on that method.
Utility (U), or expected return based on using a particular selection method, is typically derived from the formula U = (Ns)(rxy)(SDy)(Zx) − (NT)(C), where Ns = number of job applicants selected; rxy = the validity coefficient for the method; SDy = standard deviation of job performance in dollars; Zx = average score on the selection method for those hired (a measure of the quality of recruitment); NT = number of applicants assessed with the selection method; and C = cost of assessing each job candidate with the selection method.
In general, the higher the validity of a method, the higher its utility. Any increase in the validity of a selection method translates into an increase in the average dollar value of the annual productivity of employees who are selected with the method. Even a small percentage increase can translate into a substantial gain in annual output per employee and thus large financial gains.
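The utility formula above can be applied directly. Every figure below is invented for illustration (the per-applicant cost loosely echoes the $10–$15 range mentioned later in the text), so treat this as a sketch of the arithmetic rather than a real analysis.

```python
# Utility estimate from the formula in the text:
# U = Ns * rxy * SDy * Zx - NT * C

Ns = 50        # applicants hired
rxy = 0.30     # validity coefficient of the method
SDy = 9_000    # std. dev. of job performance in dollars (hypothetical)
Zx = 1.0       # mean standardized predictor score of those hired
NT = 500       # applicants assessed with the method
C = 12         # cost of assessing each applicant, in dollars

U = Ns * rxy * SDy * Zx - NT * C
print(f"expected annual gain from the selection method: ${U:,.0f}")  # $129,000
```

Note how the assessment cost scales with everyone tested (NT) while the gain accrues only through those hired (Ns), which is why a low selection ratio and cheap administration both raise utility.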
Selection methods with high validity that are relatively inexpensive are the ideal in terms of utility. Before contracting with BA&C, Wackenhut Security had studied the options and was not impressed with the validity or utility evidence reported by the test publishers, particularly in the context of the $10–$15 cost per applicant. This was the main reason Wackenhut decided to develop its own selection battery.
BA&C investigated the validity of its proposed new selection systems using both criterion-related and content-validation procedures. This dual approach to validation provides stronger evidence for job relatedness and is more compatible with the Uniform Guidelines issued by the EEOC. The BA&C study recommended that new methods of personnel selection should be used if the company hoped to increase its sales and decrease the costly employee turnover. The resulting analysis showed substantial financial benefit to the company if it adopted the new methods in lieu of the old ineffective procedures. The first method that BA&C considered was the application blank.
APPLICATION BLANKS AND BIOGRAPHICAL DATA
Like most companies, Wackenhut first required applicants to complete an application blank requesting standard information, such as previous employment history, experience, and education. Often used as an initial screening method, the application blank, when properly used, can provide much more than a first cut. However,
application blanks, as with any other selection procedure used for screening people, fall
under the scrutiny of the courts and state regulatory agencies for possible EEO violations.
HR managers should be cautious about using information on an application blank that disproportionately screens out protected class members, and they must be careful not to ask
illegal questions. The Americans with Disabilities Act (ADA) stipulates that application
blanks should not include questions about an applicant’s health, disabilities, and worker’s
compensation history.
Application blanks obviously can yield information relevant to an employment decision. Yet, it is often the weight—or lack of weight—assigned to specific information
by particular decision makers that can undermine their usefulness. Decision makers
often disagree about the relative importance of information on application blanks. For
instance, they might disagree about the amount of education or experience required.
Wackenhut required a bachelor’s degree in business or a related discipline for the supervisory job. This criterion alone, however, should not carry all the weight. Wackenhut’s
personnel staff made no effort to develop a uniform practice of evaluating the information on the forms. They did not take into consideration indicators such as the distance an
applicant lived from the workplace. A great distance might indicate that, relative to other
responses, the candidate is more likely to quit as soon as another job comes along that is
closer to home.
A Discrepancy between Research and Practice: The Use of Application Blanks and Biographical Data
What companies do to evaluate application blank data and biographical information and what research suggests they should do are worlds apart. Scholarly research shows that when adequate data are available, the best way to use and interpret application blank information is to derive an objective scoring system for responses to application blank questions.5 The system is based on a criterion-related validation study, resulting in a weighted application blank (WAB), with the weights derived from the results of the research. A criterion-related validation study means that the responses from the application blanks are statistically related to one or more important criteria (e.g., job tenure or turnover) such that the critical predictive relationships between WAB responses and criterion outcomes (e.g., performance, turnover) can be identified. For example, BA&C was able to show that where a security guard lived relative to his assigned duties was indeed a significant predictor of job turnover. Another useful predictor was the number of jobs held by the applicant during the past 3 years.
Figure 6-3 shows some examples from a WAB. The number and sign in parentheses is the predictive weight for a response. For example, you would lose five points if you had to travel 21 or more miles to work (see #2). The process of statistically weighting the information on an application blank enhances use of the application blank's information and improves the validity of the whole process. The WAB is simply an application blank that has a multiple-choice format and is scored—similar to a paper-and-pencil test. A WAB provides a predictive score for each job candidate and makes it possible to compare the score with that of other candidates. For example, the numbers in parentheses for the WAB examples in Figure 6-3 were derived from an
Figure 6-3
Examples of WAB and BIB

WAB EXAMPLES
1. How many jobs have you held in the last five years? (a) none (0); (b) 1 (+5); (c) 2–3 (+1); (d) 4–5 (−3); (e) over 5 (−5)
2. What distance must you travel from your home to work? (a) less than 1 mile (+5); (b) 1–5 miles (+3); (c) 6–10 miles (0); (d) 11–20 miles (−3); (e) 21 or more miles (−5)

BIB EXAMPLES
How often have you made speeches in front of a group of adults?
How many close friends did you have in your last year of formal education? A. None that I would call "close." (−0.5); B. 1 or 2 (−0.2); C. 3 or 4 (0); D. 5 or 6 (0.2); E. 7 or 8 (0.5); F. 9 or 10 (0.7); G. More than 10 (1.0)
How often have you set long-term goals or objectives for yourself?
How often have other students come to you for advice?
How often have you had to persuade someone to do what you wanted?
How often have you felt that you were an unimportant member of a group?
How often have you felt awkward about asking for help on something?
How often do you work in "study groups" with other students?
How often have you had difficulties in maintaining your priorities?
How often have you felt "burnt out" after working hard on a task?
How often have you felt pressured to do something when you thought it was wrong?
Source: Adapted from C. J. Russell, J. Matson, S. E. Devlin, and D. Atwater, "Predictive Validity of Biodata Items Generated from Retrospective Life Experience Essays," Journal of Applied Psychology 75 (1990), pp. 569–580. Copyright © 1990 by the American Psychological Association. Reproduced with permission.
192
ber29163_ch06_185-236.indd 192
17/02/12 2:38 PM
6 / Personnel Selection
actual study showing that particular responses were related to job tenure (i.e., coded as either stayed with the company for over 1 year or not). Thus, applicants who had only one job in the last 5 years (#1 in Figure 6-3) were more likely to stay over a year, while applicants who indicated that they had had over five jobs in the last 5 years were much less likely to remain on the job for a year or longer.
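To make the scoring arithmetic concrete, the WAB approach can be sketched in a few lines of code. The items and weights below mirror the Figure 6-3 examples but are hypothetical; a real WAB would use weights derived from the employer's own criterion-related validation study.

```python
# Sketch of WAB scoring: each multiple-choice response carries an
# empirically derived weight, and a candidate's total score is simply
# the sum of the weights for the responses chosen.
WAB_WEIGHTS = {
    "jobs_last_5_years": {"none": 0, "1": 5, "2-3": 1, "4-5": -3, "over 5": -5},
    "miles_to_work": {"<1": 5, "1-5": 3, "6-10": 0, "11-20": -3, "21+": -5},
}

def score_applicant(responses):
    """Sum the predictive weights for an applicant's responses."""
    return sum(WAB_WEIGHTS[item][answer] for item, answer in responses.items())

# Two hypothetical applicants, scored on the same uniform system.
stable_applicant = {"jobs_last_5_years": "1", "miles_to_work": "1-5"}
risky_applicant = {"jobs_last_5_years": "over 5", "miles_to_work": "21+"}

print(score_applicant(stable_applicant))  # 8
print(score_applicant(risky_applicant))   # -10
```

Because every applicant is scored on the same weighted items, the resulting numbers are directly comparable across candidates, which is the main advantage over holistic reading of application blanks.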
Biographical information blanks (BIBs) are similar to WABs except that the items of a BIB tend to be more personal, with questions about personal background and life experiences. Figure 6-3 shows examples of items from a BIB for the U.S. Navy. BIB research has shown that the method can be an effective tool in the prediction of job turnover, job choice, and job performance. In one excellent study conducted at the Naval Academy, biographical information was derived from life-history essays, reflecting life experiences that were then written in multiple-choice format (see Figure 6-3).6 BIB scoring is usually derived from a study of how responses relate to important criteria such as job performance. Asking job candidates to elaborate on responses to BIBs with details of experiences, such as dates and people involved in the events, appears to enhance the effectiveness of the method by reducing response faking (and embellishment). For example, applicants for a sales manager job might be asked to provide the names and dates of past sales teams and the specific accomplishments of those teams.
WABs and BIBs have been used in a variety of settings for many types of jobs. WABs are used primarily for clerical and sales jobs. BIBs have been used successfully in the military and the insurance industry, with an average validity of .35. Many insurance companies, for example, use a very lengthy BIB to screen their applicants. Check out www.e-Selex.com for an online biodata testing service.
The accomplishment record is an approach similar to a BIB. Job candidates are asked to write examples of their actual accomplishments, illustrating how they had mastered job-related problems or challenges. Obviously, the problems or challenges should be compatible with the problems or challenges facing the organization. The applicant writes these accomplishments for each of the major components of the job. For example, in a search for a new business school dean, applicants were asked to cite a fund-raising project they had successfully organized. HRM specialists evaluate these accomplishments for their predictive value or importance for the job to be filled. Accomplishment records are particularly effective for managerial, professional, and executive jobs.7 In general, research indicates that methods such as BIBs and accomplishment records are more valid as predictors of future success than credentials or crude measures of job experience. For example, having an MBA versus only a bachelor's degree is not a particularly valid predictor of successful management performance. What an applicant has accomplished in past jobs or assignments is a more valid approach to assessing managerial potential.
How Do You Derive WAB, BIB, or Accomplishment Record Weights?
To derive the weights for WABs or BIBs, you ideally need a large (at least 100) representative sample of application or biographical data and criterion data (e.g., job tenure and/or performance) for the employees who have occupied the position under study. You can then correlate responses to individual parts of the instrument with the criterion data. If effective and ineffective (or long-tenure versus short-tenure) employees responded to an item differently, responses to this item would then be given different weights, depending on the magnitude of the relationship. Weights for the accomplishment record are usually derived by expert judgment for the various problems or challenges.
Research supports the use of WABs, BIBs, and the accomplishment record in selection. The development of the scoring system requires sufficient data and some research
expertise, but it is worthwhile because the resulting decisions are often superior to those
typically made based on a subjective interpretation of application blank information.
What if you can’t do the empirical validation study? Might you still get better results
using a uniform weighted system, in which the weights are based on expert judgment?
Yes. This approach is superior to one in which there is no uniform weighting system and
each application blank or résumé is evaluated in a more holistic manner by whoever is
evaluating it.
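The correlation step described above can be sketched as follows. The sample data are invented for illustration, and a real study would require a far larger sample (100 or more) plus cross-validation before weights are put into use.

```python
# Minimal sketch of empirically keying one item: correlate responses with
# a 0/1 criterion (stayed more than 1 year), then weight the item in
# proportion to the strength of the relationship.
from statistics import mean

def pearson_r(xs, ys):
    """Plain Pearson correlation; with a 0/1 criterion this is
    equivalent to the point-biserial correlation."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical sample: number of jobs held in the last 5 years vs. whether
# the employee stayed with the company for over a year (1 = stayed).
jobs_held = [1, 1, 2, 5, 6, 1, 4, 2, 6, 1]
stayed_1yr = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1]

r = pearson_r(jobs_held, stayed_1yr)
print(round(r, 2))  # strongly negative: frequent job changers stayed less
```

An item with a near-zero correlation would receive little or no weight, while an item with a strong relationship (positive or negative) would carry a correspondingly larger weight in the scoring key.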
REFERENCE CHECKS AND BACKGROUND CHECKS
The vast majority of employers now conduct background checks on job applicants. The
goal is to gain insight about the potential employee from people who have had previous
experience with him or her. An important role of the background check is to simply verify
the information provided by the applicant regarding previous employment and experience.
This is a good practice, considering research indicates that between 20 and 25 percent of
job applications include at least one fabrication.8
Many organizations are now "Googling" applicants' names and searching Facebook and MySpace for information about job candidates as part of a preliminary background check. Over a third of executive recruiters indicated in a recent survey that they had eliminated job candidates based only on information that they found in web searches of the candidates' "digital dossiers." A great deal of this information is attributable to social networking sites such as Facebook and LinkedIn.9 In some states, administrators hiring teachers routinely search the web for potentially embarrassing (or worse) material. In some states, teachers have been removed for risqué web pages and videos. "I know for a fact that when a superintendent in Missouri was interviewing potential teachers last year, he would ask, 'Do you have a Facebook or MySpace page?' " said Todd Fuller, a spokesman for the Missouri State Teachers Association. The association is now warning its members to audit their web pages. "If the candidate said yes, then the superintendent would say, 'I've got my computer up right now. Let's take a look.' " The largely unregulated background check industry may be one of the fastest growing (and most profitable) of all HR areas today. These specialty firms often compile "digital dossiers" on individuals based on many sources, including web searches, interviews with past employers and co-workers, criminal and driving histories, and credit ratings.10 Obviously, people need to closely monitor their web "presence" or "digital footprint" and exercise as much caution as possible to avoid future incriminating (or embarrassing) information.
Fear of negligent hiring lawsuits is a related reason that employers do reference and background checks. A negligent hiring lawsuit is directed at an organization accused of hiring incompetent (or dangerous) employees. Lawsuits for negligent hiring attempt to hold an organization responsible for the behavior of employees when there is little or no attempt by the organization to assess critical characteristics of those who are hired. There may be no limit to the liability an employer can face for this negligence. One health management organization was sued for $10 million when a patient under the care of a psychologist was committed to a psychiatric institution and it was later revealed that the psychologist was unlicensed and had lied about his previous experience.
Organizations also conduct reference checks to assess the potential success of the candidate for the new job. Reference checks provide information about a candidate's past performance and are also used to assess the accuracy of information provided by candidates. However, HR professionals should be warned: lawsuits have engendered a reluctance on the part of evaluators to provide anything other than a statement as to when a person was employed and in what capacity. These lawsuits have been directed at previous employers for defamation of character, fraud, and intentional infliction of emotional distress. One jury awarded a man $238,000 for defamation of character because a past employer erroneously reported that "he was late most of the time, regularly missed two days a week."11 This legal hurdle has prompted many organizations to stop employees from providing any information about former employees other than dates of employment and jobs held. Turnabout is fair play—at least litigiously. Organizations are being sued and held liable if they do not give accurate information about a former employee when another company makes such a request. At least one web-based company will check on what references say about you. At Badreferences.Com, for $87.95, you can receive a reference report from former employers, contractors, even professors. For more money, the same company will help prepare a "cease and desist" order and, for $120 per hour, provide court testimony on your behalf. The bottom line appears simple: Tell the truth about former employees. Several states have laws that protect employers and former managers who provide candid and valid evaluations of former employees.
What Is the Validity of Reference Checks?
One of the problems with letters of reference is that they are almost always very positive. While there is some validity, it is low in general (in the .20–.30 range). One approach to getting more useful (and valid) distinctions among applicants is to construct a "letter of reference" or recommendation that is essentially a performance appraisal form. One can construct a rating form and request that the evaluator indicate the extent to which the candidate was effective in performing a list of job tasks. This approach offers the added advantage of deriving comparable data for both internal and external job candidates, since the performance appraisal, or reference data, can be completed for both internal and external candidates. One study found that reference checks significantly predicted subsequent supervisory ratings (0.36) when they were conducted in a structured and telephone-based format.12 With this approach, both internal and external evaluators must evaluate performances on the tasks that are most important for the position to be filled.
An alternative method asks the evaluator to rate the extent of job-related knowledge, skill, ability, or competencies of a candidate. These ratings can then be weighted by experts based on the relative importance of the KASOCs or competencies for the position to be filled. This approach makes good sense whenever past performance is a strong predictor of future performance. For example, when selecting a manager from a pool of current or former managers, a candidate's past performance as a manager is important. Performance appraisals or promotability ratings, particularly those provided by peers, are a valid source of information about job candidates. However, promotability ratings made by managers are not as valid as other potential sources of information about candidates, such as scores on GMA or performance tests and assessment centers. The validity of reference checking can be enhanced by gathering information from a larger number of references (10 to 12 if possible) and obtaining this information from sources other than those recommended by the job candidates.13
Employers should do their utmost to obtain accurate reference information about external candidates despite the difficulties. If for no other reason, a good-faith effort to obtain verification of employment history can make it possible for a company to avoid (or win) negligent hiring lawsuits.
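The expert-weighting idea described above can be illustrated with a small sketch. The competency labels, importance weights, and ratings are hypothetical, not taken from the study cited:

```python
# Hypothetical importance weights set by experts for the KASOCs or
# competencies of a position (weights sum to 1.0 so the composite stays
# on the same 1-5 scale as the individual ratings).
IMPORTANCE = {"budgeting": 0.5, "fund_raising": 0.3, "communication": 0.2}

def weighted_reference_score(ratings):
    """Importance-weighted composite of one reference's 1-5 ratings."""
    return sum(IMPORTANCE[comp] * rating for comp, rating in ratings.items())

def pooled_score(all_references):
    """Average the composite across several references; the text suggests
    gathering 10 to 12 references when possible."""
    scores = [weighted_reference_score(r) for r in all_references]
    return sum(scores) / len(scores)

references = [
    {"budgeting": 4, "fund_raising": 5, "communication": 3},
    {"budgeting": 5, "fund_raising": 4, "communication": 4},
]
print(round(pooled_score(references), 2))  # 4.3
```

Because internal and external candidates are rated on the same weighted competencies, their composite scores are directly comparable.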
What Are the Legal Implications of Doing Background Checks and Reference Checks on Job Candidates?
Employers often request consumer reports or more detailed "investigative consumer reports" (ICVs) from a consumer credit service as part of the background check. If they do this, employers need to be aware of state laws related to background checks and the Fair Credit Reporting Act (FCRA), a federal law that regulates how such agencies provide information about consumers. State laws vary considerably on background checks. Experts maintain that it is legally safest to comply with the laws of the states where the job candidate resides, where the reporting agency is incorporated, and where the employer has its principal place of business. In general, in order to abide by the FCRA or state law, four steps must be followed by the employer: (1) give the job candidate investigated notice in writing that you may request an investigative report, and obtain a signed consent form; (2) provide a summary of rights under federal law (individuals must request a copy); (3) certify to the investigative company that you will comply with federal and state laws by signing a form it should provide; and (4) provide a copy of the report in a letter to the person investigated if a copy has been requested or if an adverse action is taken based on information in the report.
White-collar crime, including employee theft and fraud, is an increasingly serious and costly problem for organizations. One bad hire could wipe out a small business. Enter Ken Springer, a former FBI agent and now the president of Corporate Resolutions, a fast-growing personnel investigation company with offices in New York, London, Boston, Miami, and Hong Kong. Many of Springer's clients are private equity firms that request management background checks at companies the equity firms are evaluating for possible purchase. Springer also does prescreening for management and executive positions.
Springer's major recommendation is to carefully screen all potential employees (because even entry-level employees can do major damage to an organization) and to carefully research and verify all information on their résumés. He believes that if a single lie is detected, the applicant should be rejected. In addition, Springer says to be wary of claims that
are difficult to verify, to carefully research all gaps in applicants’ employment histories and
vague descriptions of what they did, and to require and contact at least three references to
verify as much information as possible. Springer also recommends that after verifying all
facts in a job candidate’s résumé, a thorough background check should be done.
Among other companies doing basic job candidate screening, with prices ranging from
$100 to $400, are Taleo, Automatic Data Processing, HireRight, and National Applicant
Screening. Google “employment screening” and you’ll find numerous other companies
doing preemployment screening and background checks for employers. It is advisable for
employers to consult with the National Association of Professional Background Screeners
(NAPBS) regarding firms to use for background and reference checks. The NAPBS was
founded to promote ethical business practices, to comply with the Fair Credit Reporting
Act, and to foster awareness of issues related to consumer protection and privacy rights
within the background screening industry.
GMA tests are valid for
virtually all jobs
PERSONNEL TESTING
Many organizations use general mental ability (GMA) tests (also known as cognitive ability tests) to screen applicants, bolstered by considerable research indicating that GMA tests are valid for virtually all jobs in the U.S. economy. The dilemma facing organizations is this: While GMA tests have been shown to be valid predictors of job performance, they can create legal problems because minorities tend to score lower. GMA tests are ideal for jobs where considerable learning or training on the job is required and where a more "job-related" knowledge-based test is inappropriate or unavailable.14
Corporate America also is increasing its use of various forms of personality or motivational testing—in part due to the body of evidence supporting the use of certain methods, concern over employee theft, the outlawing of the polygraph test, and potential corporate liability for the behavior of employees. Domino's Pizza settled a lawsuit in which one of its delivery personnel was involved in a fatal accident. The driver had a long and disturbing psychiatric history and a terrible driving record before he was hired.
The paper-and-pencil and online tests most frequently used today for employment purposes are GMA tests. These tests attempt to measure the verbal, quantitative, mechanical, or sensory capabilities of job applicants. You are probably familiar with these "high stakes" cognitive ability tests: the Scholastic Aptitude Test (SAT), the American College Test (ACT), the Graduate Management Admissions Test (GMAT), the Graduate Record Examination (GRE), and the Law School Admissions Test (LSAT).
Cognitive ability tests, most of which are administered in a paper-and-pencil or computerized format under standardized conditions of test administration, are controversial. On average, African Americans and Hispanics score lower than Whites on virtually all of these tests; thus, use of these tests for selection purposes can cause legal problems and difficulties for an organization seeking greater diversity in its workforce. The critical issue of test score differences as a function of ethnicity is discussed later in the chapter. Let's begin with a definition of GMA testing and brief descriptions of some of the most popular tests. Next, the validity evidence for these tests is reviewed.
What Is a Cognitive (or General Mental) Ability Test?
Cognitive ability or general mental ability (GMA) tests measure one's aptitude or mental capacity to acquire knowledge based on the accumulation of learning from all possible sources. Standardized tests of GMA are based on research that has focused on understanding individuals' ability to reason, plan, solve problems, think abstractly, learn and adapt, and process and comprehend complex ideas and information.
Such tests should be distinguished from achievement tests, which attempt to measure the effects of knowledge obtained in a standardized environment (e.g., your final exam in this course could be considered a form of achievement test). Cognitive ability or GMA tests are typically used to predict future performance. The SAT and ACT, for example, were developed to measure ability to master college-level material. Having made this
The Wonderlic
and the NFL
GMA tests more valid
for more complex jobs
distinction between achievement tests and cognitive ability tests, however, in practice there
isn’t a clear distinction between these two classes of tests. Achievement tests can be used
to predict future behavior, and all tests measure some degree of accumulated knowledge.
Knowledge-based tests assess a sample of what is required on the job. If you are hiring
a computer programmer, a cognitive ability test score might predict who will learn to be
a computer programmer; but a better approach is an assessment of actual programming
knowledge. Knowledge-based tests are easier to defend in terms of job relatedness and are
quite valid (.48) and recommended for identifying those job candidates who can be highly
effective the very first day of work (i.e., no training on the critical knowledge of the job
required). However, knowledge tests can be expensive to develop.15
There are hundreds of GMA tests available. In addition to the "high stakes" tests, some of the most frequently used tests are the Wechsler Adult Intelligence Scale, the Wonderlic Personnel Test, and the Armed Services Vocational Aptitude Battery. In addition, many of the largest U.S. companies have developed their own batteries of cognitive ability tests. AT&T evaluates applicants for any of its nonsupervisory positions on the basis of scores on one or more of its 16 mental ability subtests. McClatchy, the communications giant, has a battery of 10 mental ability tests that are weighted differently for different jobs.
The Wechsler Adult Intelligence Scale is one of the most valid and heavily researched of all tests. A valid and more practical test is the Wonderlic Personnel Test. The publisher of this test, first copyrighted in 1938, has data from more than 3 million applicants. The Wonderlic consists of 50 questions covering a variety of areas, including mathematics, vocabulary, spatial relations, perceptual speed, analogies, and miscellaneous topics. Here is an example of a typical mathematics question: "A watch lost 1 minute 18 seconds in 39 days. How many seconds did it lose per day?" A typical vocabulary question might be phrased as follows: "Usual is the opposite of: a. rare, b. habitual, c. regular, d. stanch, e. always." An item that assesses ability in spatial relations would require the test taker to choose among five figures to form depicted shapes. Applicants have 12 minutes to complete the 50 items. The Wonderlic will cost an employer from $1.50 to $3.50 per applicant, depending on whether the employer scores the test. The Wonderlic is used by the National Football League to provide data for potential draft picks (the average score of draftees is one point below that of the national population).16
You may remember the Wonderlic from the discussion of the Supreme Court rulings in Griggs v. Duke Power (discussed in Chapter 3) and Albemarle v. Moody. In Griggs, scores on the Wonderlic had an adverse impact against African Americans (a greater proportion of African Americans failed the test than did whites), and Duke Power did not show that the test was job related. Despite early courtroom setbacks and a decrease in use following the Griggs decision, according to the test's publisher, the use of the Wonderlic has increased in recent years.
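The sample Wonderlic mathematics item quoted above is simple rate arithmetic, which can be checked directly:

```python
# "A watch lost 1 minute 18 seconds in 39 days.
#  How many seconds did it lose per day?"
total_seconds_lost = 1 * 60 + 18   # 1 min 18 s = 78 seconds
seconds_per_day = total_seconds_lost / 39
print(seconds_per_day)  # 2.0
```

The difficulty of such items lies less in the computation than in the 12-minute limit for all 50 questions.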
Current interest in cognitive ability tests was spurred by the research on validity generalization, which strongly supported the validity of these tests for virtually all jobs and projected substantial increases in utility for organizations that use the tests. Scores on GMA tests are strongly related to success in occupational training in both civilian and military jobs, with meta-analytic estimates ranging from the high .30s to .70s and averaging around .50. GMA scores are also related to overall job performance, objective leadership effectiveness, and assessments of creativity. The strength of the relationship between test scores and performance increases as training and jobs become more cognitively complex and mentally challenging. Validities also tend to be even higher for jobs that are dynamic, are fast changing, and require adaptability. Differences in GMA and in specific GMA ability patterns also predict differences in educational, occupational, and creative outcomes years later; that is, the relationships among an individual's math, verbal, and spatial abilities predict outcomes in education, job performance, and creative endeavors 10 or more years later. Also, a convincing argument can be made that the validities for most employment selection methods are higher than previously thought. Using an appropriate statistical adjustment, increases in validity estimates were found to be greater for GMA than for self-report personality measures. In addition, the incremental validity of the personality measures over that provided by GMA scores alone was found to be smaller (but still significant) than previously estimated in past studies.17
Despite abundant research indicating the importance of GMA for complex jobs, it is interesting to note that over half of the top executive MBA programs, as rated by BusinessWeek magazine in 2005, had actually dropped the GMAT (Graduate Management Admissions Test) for admissions to their programs. Also, according to one study, after controlling for GMA, the MBA degree itself may not be a good predictor of long-term executive success.18
Figure 6-4 presents some myths regarding the use and interpretation of GMA tests. One of the more popular myths about GMA is that once a person reaches a certain threshold of GMA (e.g., a score on a GMA test), differences in GMA above that threshold do not matter; that is, these differences are not related to better performance. For example, Malcolm Gladwell writes in his best seller Outliers: The Story of Success that "The relationship between success and IQ works only up to a point. Once someone has an IQ of somewhere around 120, having additional IQ points doesn't seem to translate into any measurable real-world advantage."19 In fact, abundant research indicates that even within the top 1 percent of GMA, a higher level of GMA is related to higher performance.20
What Are Tests of Specific Mental Abilities?
A variety of tests have also been developed to measure specific abilities, including specific cognitive abilities or aptitudes such as verbal comprehension, numerical reasoning, and verbal fluency, as well as tests assessing mechanical and clerical ability and physical or psychomotor ability, including coordination and sensory skills. The most widely used mechanical ability test is the Bennett Mechanical Comprehension Test (BMCT). First developed in the 1940s, the BMCT consists mainly of pictures depicting mechanical situations with questions pertaining to those situations. The respondent describes relationships between physical forces and mechanical issues. The BMCT is particularly effective in the prediction of success in mechanically oriented jobs.
While there are several tests available for the assessment of clerical ability, the most popular is the Minnesota Clerical Test (MCT). The MCT requires test takers to quickly compare either names or numbers and to indicate pairs that are the same. The name
Figure 6-4
Myths about the Usefulness of General Mental Ability
1. There is no relationship with important outcomes such as creativity or leadership.
FINDING: Scores on GMA tests are strongly related to success in academic domains and job training for both civilian and military jobs, with meta-analytic estimates from the high .30s to .70s. GMA scores also predict important outcomes in all jobs, including overall job performance, leadership effectiveness, and assessments of creativity.
2. There is predictive bias when using GMA tests.
FINDING: Research on the fairness of ability tests has drawn the conclusion that tests are not biased against women and minority groups. More informal hiring practices are much more likely to be biased.
3. There is a lack of predictive independence from a test taker's socioeconomic status (SES).
FINDING: SES is related to test scores but to only a modest degree. SES variables do not eliminate the predictive power of GMA tests. SES does not explain the relationship between test scores and subsequent performance.
4. There are thresholds beyond which scores cease to matter.
FINDING: More ability is associated with greater performance (e.g., college GPA is linearly related to SAT test scores across the entire range of scores). Supervisors' ratings of employees' job performance are also linearly related to GMA.
5. Other characteristics, especially personality, are more valid than GMA.
FINDING: Measures of personality, habits, and attitudes can produce useful incremental validity in predicting performance, but validities of GMA (versus self-report measures of non-cognitive factors) are higher.
Adapted from the following sources: Robertson, K. F., Smeets, S., Lubinski, D., & Benbow, C. P. (2010). Beyond the threshold hypothesis: Even among
the gifted and top math/science graduate students, cognitive abilities, vocational interests, and lifestyle preferences matter for career choice, performance,
and persistence. Current Directions in Psychological Science, 19, 346–351; Connelly, B. S., & Ones, D. S. (2010). Another perspective on personality: Metaanalytic integration of observers’ accuracy and predictive validity. Psychological Bulletin, 136, 1092–1122; Coward, W. M., & Sackett, P. R. (1990). Linearity
of ability performance relationships: A reconfirmation. Journal of Applied Psychology, 75, 297–300; Kuncel, N. R., & Hezlett, S. A. (2010). Fact and fiction in
cognitive ability testing for admissions and hiring decisions. Current Directions in Psychological Science, 19, 339–345; Kuncel, N. R., & Hezlett, S. A. (2007).
Standardized tests predict graduate student success (supplementary material). Science, 315, 1080–1081; Ones, D. S., Viswesvaran, C., & Dilchert, S. (2005).
Cognitive ability in personnel selection decisions. In A. Evers, N. Anderson, & O. Voskuijl (Eds.), The Blackwell handbook of personnel selection. Oxford, UK:
Blackwell; Sackett, P. R., Borneman, M. J., & Connelly, B. S. (2008). High stakes testing in higher education and employment. American Psychologist, 63,
215–227; Sackett, P. R., Kuncel, N. R., Arneson, J., Cooper, S. R., & Waters, S. (2009). Socio-economic status and the relationship between admissions tests
and post-secondary academic performance. Psychological Bulletin, 135, 1–22.
198
ber29163_ch06_185-236.indd 198
17/02/12 2:38 PM
6 / Personnel Selection
comparison part of the test has been shown to be related to reading speed and spelling
accuracy, while the number comparison is related to arithmetic ability.
Research on the use of specific abilities versus GMA favors the use of GMA in the prediction of training success and (probably) job performance as well. A meta-analysis
concluded that “weighted combinations of specific aptitudes tests, including those that give
greater weight to certain tests because they seem more relevant to the training at hand, are unnecessary at best. At worst, the use of such tailored tests may lead to a reduction in validity.”21
Are There Racial Differences in Test Performance?
Copyright © 2013 The McGraw-Hill Companies. All rights reserved.
Griggs v. Duke Power
Many organizations discontinued the use of cognitive ability tests because of the Supreme Court ruling in Griggs. Despite fairly strong evidence that the tests are valid and their increased use by U.S. businesses, the details of the Griggs case illustrate the continuing problem with the use of such tests. The Duke Power Company required new employees either to have a high school diploma or to pass the Wonderlic Personnel Test and the Bennett Mechanical Comprehension Test. Fifty-eight percent of whites who took the tests passed, while only 6 percent of African Americans passed. According to the Supreme Court, the Duke Power Company was unable to provide sufficient evidence to support the job relatedness of the tests or the business necessity for their use. Accordingly, based on the “disparate impact” theory of discrimination, the Supreme Court ruled that the company had discriminated against African Americans under Title VII of the 1964 Civil Rights Act. As discussed in Chapter 3, the rationale for the Supreme Court’s decision gave rise to the theory of disparate impact.
The statistical data presented in the Griggs case are not unusual. African Americans, on average, score significantly lower than whites on GMA tests; Hispanics, on average, fall about midway between average African American and white scores.22 Thus, under the disparate impact theory of discrimination, plaintiffs are likely to establish adverse impact based on the proportion of African Americans versus whites who pass such tests. If the Griggs case wasn’t enough, the 1975 Supreme Court ruling in Albemarle Paper Company v. Moody probably convinced many organizations that the use of cognitive ability tests was too risky. In Albemarle, the Court applied detailed guidelines to which the defendant had to conform in order to establish the job relatedness of any selection procedure (or job specification) that caused adverse impact in staffing decisions. The Uniform Guidelines on Employee Selection Procedures, as issued by the Equal Employment Opportunity Commission, also established rigorous and potentially costly methods to be followed by an organization to support the job relatedness of a test if adverse impact should result.
Some major questions remain regarding the validity generalization results for cognitive ability tests: Are these tests the most valid method of personnel selection across all job situations, or are other methods, such as biographical data and personality tests, more valid for some jobs that were not the focus of previous research? Are there procedures that can make more accurate predictions than cognitive ability tests for some job situations? Are cognitive ability tests the best predictors of sales success, for example? (Remember the Unabomber? He had a near perfect SAT score and a PhD in math from the University of Michigan. How would he do in sales?) Another issue is the extent to which validity can be inferred for jobs involving bilingual skills. Would the Wonderlic administered in English have strong validity for a job, such as a customs agent, requiring the worker to speak in two or more languages? Bilingual job specifications are increasing in the United States. Invoking the “validity generalization” argument for this type of job based on research involving only the use of English is somewhat dubious. The validity of such tests to predict performance for these jobs is probably not as strong as .5.
Another issue concerns the extent to which other measures can enhance predictions
beyond what cognitive ability tests can predict. Generally, human performance is thought
to be a function of a person’s ability, motivation, and personality. The average validity of
cognitive ability tests is about 0.50. This means that 25 percent of the variability in the
criterion measure (e.g., performance) can be accounted for by the predictor, or the test.
That leaves 75 percent unaccounted for. Industrial psychologists think the answer lies in
measures of one’s motivation to perform, personality, or the compatibility of a person’s job
preferences with actual job characteristics.
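The arithmetic behind the “25 percent” figure is simply the squared validity coefficient. A minimal sketch (the function name is ours, not a standard API):

```python
def variance_explained(validity: float) -> float:
    """Proportion of criterion variance accounted for by a predictor (r squared)."""
    return validity ** 2

# Average validity of cognitive ability tests is about .50:
r = 0.50
print(variance_explained(r))      # 0.25 -> 25 percent explained
print(1 - variance_explained(r))  # 0.75 -> 75 percent unaccounted for
```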
199
ber29163_ch06_185-236.indd 199
17/02/12 2:38 PM
2 / Acquiring Human Resource Capability
What is incremental validity?
Would a combination of methods—perhaps a cognitive ability test and a personality or motivational test—result in significantly better prediction than the GMA test alone? Research indicates that a combination of cognitive and non-cognitive assessments (e.g., measures of a job candidate’s motivation or personality) may lead to a more comprehensive assessment of an individual and potentially higher validity than any method by itself.23 Motivational or personality assessments through tests, questionnaires, interviews, or other methods add what is known as incremental validity in the prediction of job performance. In general, GMA and job knowledge tests are highly valid, but additional (and valid) tools can improve the validity of personnel decisions and also have the potential to reduce adverse impact. Measures of personality, work habits or preferences, and attitudes demonstrate low to zero correlations with GMA and, therefore, produce very useful incremental validity in predicting performance across most jobs.24 Accordingly, the use of other selection methods that address the non-cognitive components of human performance, in addition to a GMA/cognitive ability or knowledge-based test, can help an organization make better decisions (and with less adverse impact). These measures are discussed shortly.
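One way to see why low predictor intercorrelation yields incremental validity is the standard two-predictor multiple correlation formula. A sketch with hypothetical correlations (the specific values are illustrative, not taken from the chapter’s sources):

```python
def multiple_r_squared(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 when a criterion y is regressed on two predictors,
    computed from the three pairwise correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Hypothetical values: GMA validity .50, a personality measure with
# validity .30, and a near-zero correlation (.05) between the predictors.
r2_gma_alone = 0.50 ** 2
r2_combined = multiple_r_squared(0.50, 0.30, 0.05)
print(round(r2_combined - r2_gma_alone, 3))  # incremental validity (delta R^2)
```

Because the two predictors are nearly uncorrelated, most of the noncognitive measure’s validity converts into added explained variance rather than overlapping with GMA.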
Why Do Minorities Score Lower than Whites on GMA Tests?
This question has interested researchers for years, yet there appears to be no clear answer. Most experts now generally take the view that these differences are not created by the tests but are most related to inferior educational experiences. But the problem is not a defect
or deficiency in the tests per se. The critical issue for HRM experts is not how to modify the test itself, but how to use the test in the most effective way. A panel of the National Academy of Sciences concluded that cognitive ability tests have limited but real ability to predict how well job applicants will perform, and these tests predict minority group performance as well as they predict the future performance of nonminorities. In other words, the tests themselves are not to blame for differences in scores. Obviously, the dilemma for organizations is the potential conflict in promoting diversity while at the same time using valid selection methods that have the potential for causing adverse impact. As one recent review concluded, “Although the evidence indicates that the group differences reflected by standardized cognitive tests are not caused by the tests themselves, we need to decide how to address the causes of group differences and wrestle with their consequences. We should continue to strive to further understand the nature and development of cognitive abilities and seek additional assessments that supplement cognitive ability test scores to improve decision-making accuracy.”25
How Do Organizations Deal with Race Differences on Cognitive Ability Tests?
GMA has a linear relationship with performance
What is banding?
The use of top-down selection decisions based strictly on scores on cognitive ability tests is likely to result in adverse impact against minorities. One solution to this problem is to set a cutoff score on the test so as not to violate the 80 percent rule, which defines adverse impact. Scores above the cutoff score are then ignored and selection decisions are made on some other basis. The major disadvantage of this approach is that there will be a significant decline in the utility of a valid test because people could be hired who are at the lower end of the scoring continuum, making them less qualified than people at the upper end of the continuum who may not be selected. Virtually all of the research on cognitive ability test validity indicates that the relationship between test scores and job performance is linear; that is, higher test scores go with higher performance and lower scores go with lower performance. Thus, setting a low cutoff score and ignoring score differences above this point can result in the hiring of people who are less qualified. So, while use of a low cutoff score may enable an organization to comply with the 80 percent adverse impact rule, the test will lose considerable utility.
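The 80 percent rule itself is a simple ratio of selection rates. A sketch using the pass rates reported in the Griggs case (58 percent of whites versus 6 percent of African Americans):

```python
def impact_ratio(protected_rate: float, comparison_rate: float) -> float:
    """Ratio of the protected group's selection rate to the comparison
    group's rate; a value below 0.80 fails the 4/5ths (80 percent) rule."""
    return protected_rate / comparison_rate

ratio = impact_ratio(protected_rate=0.06, comparison_rate=0.58)
print(f"ratio = {ratio:.2f}; fails 80 percent rule: {ratio < 0.80}")
```

Here the ratio is roughly .10, far below the .80 threshold, which is why plaintiffs had little trouble establishing adverse impact.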
Another approach to dealing with potential adverse impact is to use a banding procedure
that groups test scores based on data indicating that the bands of scores are not significantly
different from one another. The decision maker then may select anyone from within this
band of scores. Banding is not unlike grade distributions where scores from 92–100 percent
all receive an “A,” 82–91 receive a “B,” and so on. Where banding can get contentious is
when an organization invokes an argument that scores within a band are “equal” and then
selection is made based on a protected class characteristic to promote diversity or as part
of an affirmative action program. Unfortunately, research shows that banding procedures
have a big effect on adverse impact only when minority preference within a band is used
for selection. This approach is controversial and may be illegal.26
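Banding computations in the measurement literature are typically based on the standard error of the difference (SED) between two scores: scores closer together than the band width are treated as not reliably different. A sketch with hypothetical test statistics (the SD, reliability, and top score are made up for illustration):

```python
import math

def band_width(sd: float, reliability: float, z: float = 1.96) -> float:
    """Band width from the standard error of the difference between two scores.
    SEM = SD * sqrt(1 - reliability); SED = SEM * sqrt(2)."""
    sem = sd * math.sqrt(1 - reliability)   # standard error of measurement
    return z * sem * math.sqrt(2)           # scores within this range are "equal"

# Hypothetical test: standard deviation 10, reliability .90, top score 96.
width = band_width(sd=10, reliability=0.90)
print(f"scores from {96 - width:.1f} to 96 fall within one band")
```

With these numbers the band is almost nine points wide, which illustrates why banding can sweep in many candidates and why the “scores within a band are equal” argument becomes contentious.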
The use of cognitive ability tests obviously presents a dilemma for organizations. Evidence indicates that such tests are valid predictors of job performance and academic performance and that validity is higher for jobs that are more complex (see again Figure 6-2). Employers that use such tests enjoy economic utility with greater productivity and considerable cost savings. However, selection decisions that are based solely on the scores of such tests will result in adverse impact against African Americans and Hispanics. Such adverse impact could entangle the organization in costly litigation and result in considerable public relations problems. If the organization chooses to avoid adverse impact, the question becomes one of either throwing out a test that has been shown to be useful in predicting job performance or keeping the test and somehow reducing or eliminating the level of adverse impact. But does such a policy leave a company open to reverse discrimination lawsuits by whites who were not selected for employment since their raw scores on the test were higher than scores obtained by some minorities who were hired? Many organizations, particularly in the public sector, have abandoned the use of cognitive ability tests in favor of other methods, such as interviews or performance tests, which result in less adverse impact and are more defensible in court. However, many other cities and municipalities have opted to keep such tests and then have employed some form of banding in the selection of their police and firefighters, primarily in order to make personnel decisions that do not result in statistical adverse impact.
Researchers and practitioners are very interested in how to select the most effective candidates while meeting diversity goals and minimizing (or eliminating) adverse impact. There have been some criticisms of the tests themselves, with suggestions to remove the “culturally biased” questions. However, research does not support this recommendation. Research also does not support dropping the use of GMA or knowledge-based tests. While many approaches have been proposed and have been taken to reduce statistical adverse impact against minorities, research indicates that some recommendations can be made.
S
1. Target recruitment A
strategies toward “qualified” minorities.
2. Focus on predictingNall aspects of job performance, including citizenship behavior,
helping co-workers, teamwork, and counter-productive behavior.
D
3. Augment GMA test use with noncognitive methods such as personality tests, peer
assessments, interviews,
R and job preference instruments.
4. Use tools with lessA
adverse impact early in the process and GMA tests later
providing the selection ratio is low.
5. Use accomplishment records, performance tests, or work samples in lieu of GMA
2
tests.
What Are Physical or Psychomotor Tests?
Physical, psychomotor, and sensory/perceptual tests are classifications of ability tests used when the job requires particular abilities. Physical ability tests are designed to assess a candidate’s physical attributes (e.g., muscular tension and power, muscular endurance, cardiovascular endurance, flexibility, balance, and coordination). Scores on physical ability tests have been linked to accidents and injuries, and the criterion-related validity for these tests is strong. One study found that railroad workers who failed a physical ability test were much more likely to suffer an injury at work. Psychomotor tests assess processes such as eye–hand coordination, arm–hand steadiness, and manual dexterity. Sensory/perceptual tests are designed to assess the extent to which an applicant can detect and recognize differences in environmental stimuli. These tests are ideal for jobs that require workers to edit or enter data at a high rate of speed and are also valid for the prediction of vigilant behavior. Recall our discussion earlier that Wackenhut Security was seeking more vigilant armed security guards. Researchers focused on tests that assessed this skill and found evidence that sensory/perceptual tests could predict this particular attribute.
As discussed in Chapter 3, based on the ADA and Title VII, the validity of physical
ability tests has been under close scrutiny. For example, many Title VII lawsuits have been
filed on behalf of female applicants applying for police and firefighter jobs who had failed
some type of physical ability test that purports to assess physically demanding attributes of
the job. In fact, the probability is high for adverse impact against women when a physical
ability test is used to make selection decisions. For example, the strength tests will probably have adverse impact against women (almost two-thirds of all males score higher than
the highest scoring female on muscular tension tests).27 Job analysis data are clearly needed to establish this attribute as an essential element of the job and to ensure that the attribute is stated in the job description.
Sensory ability testing concentrates on the measurement of hearing and sight acuity,
reaction time, and psychomotor skills, such as eye and hand coordination. Such tests have
been shown to be related to quantity and quality of work output and accident rates.28
What Is Personality/Motivational/Dispositional Testing?
Predicting counterproductive behavior
While research supports the use of GMA tests for personnel selection, performance is a function of both ability and motivation. Scores on GMA or other ability or knowledge-based tests say little or nothing about a person’s motivation or personality to do the job. We can all think of examples of very intelligent individuals who were unsuccessful in many situations (we’re back to the Unabomber, or perhaps you remember Bobby Fischer, the great but troubled chess player!). Most of us can remember a classmate who was very bright but received poor grades due to low motivation. The validity of GMA tests for predicting sales success is significant but low, and we can definitely improve on prediction by using other assessment tools in addition to a GMA test.29
Most personnel selection programs attempt an informal or formal assessment of an applicant’s personality, motivation, attitudes, or disposition through psychological testing, reference checks, or a job interview. Some of these so-called noncognitive assessments are based on scores from standardized tests, performance testing such as job simulations, or assessment centers. Others are more informal, derived from an interviewer’s gut reaction or intuition. This section reviews the abundant literature on the measurement and prediction of motivation, dispositions, and personality characteristics using various forms of assessment. Without question, some approaches are more valid than others and some are not valid at all for use in staffing decisions.
There is an increased use of various types and formats for personality or motivational testing, including on-line assessment and video and telephone testing. There is also increasing evidence that many of these methods are valid predictors of job performance and other important criteria such as job tenure or turnover and counterproductive work behavior (CWB) such as employee theft, aberrant or disruptive behaviors, and interpersonal and organizational deviance.
Some organizations place great weight on personality testing for employment decisions. A 2006 survey indicated that 35 percent of U.S. companies use personality tests for personnel selection.30 The increase in usage may be partially a function of the trend toward more interdependent, team-based, and project-based organizations with an increased importance placed on the compatibility of the team members. Team members’ personalities are clearly related to this compatibility. Research shows that certain traits can predict how people behave and perform in groups.31 We’ll review this literature after we define personality and describe some of the most popular tests that measure personality traits.
Although the criterion-related validity evidence made available to the public is rather limited, one of the most popular personality assessment tools is the “Caliper Profile,” developed by the Caliper Corporation (www.calipercorp.com). Its website claims 25,000 clients. BMW, Avis, and GMAC are among the companies that use the Caliper Profile to hire salespeople. The profile has also been used by numerous sports teams for player personnel issues such as potential trades and drafts. The Chicago Cubs, the Detroit Pistons, and the New York Islanders are among the sports teams that have used the profile for drafting and trade considerations (not exactly a ringing endorsement). Many companies have hired consultants to screen job candidates for their “emotional intelligence” (EI), probably influenced far less by sound research than by the popularity of the approach, the
plethora of consulting in this area, and the 1995 best-seller “Emotional Intelligence” by Daniel Goleman, who claimed that emotional intelligence is a stronger predictor of job performance than GMA (it isn’t, at least as reported in peer-reviewed, scholarly journals).32 Try HayGroup.com for one of the most popular firms specializing in EI. Sears, IBM, and AT&T have used personality tests for years to select, place, and even promote employees. Many companies today use some form of personality test to screen applicants for risk factors related to possible counterproductive behavior.
There are literally thousands of personality tests and questionnaires available that purport to measure hundreds of different traits or characteristics. (Go to www.unl.edu/buros/
for a sample.) The basic categories of personality testing are reviewed next. Figure 6-5
presents a list of some of the most popular tests and methods.
Let’s start with a definition of personality and provide brief descriptions of some of the more popular personality tests. The validity of the major personality tests is reviewed along with an overview of relevant legal and ethical issues. The section concludes with a description of some relatively new “noncognitive” tests that have shown potential as selection and placement devices.
What Is Personality?
The “Big Five” or FFM
While personality has been defined in many ways, the most widely accepted definition is that personality refers to an individual’s consistent pattern of behavior. This consistent pattern is composed of psychological traits. While a plethora of traits have been labeled and defined, most academic researchers subscribe to a five-factor model (FFM) to describe personality.33 These so-called Big Five personality factors are as follows: (1) Emotional stability (also known as Neuroticism); (2) Extraversion (outgoing, sociable); (3) Openness to experience (imaginative, curious, experimenting); (4) Agreeableness (friendliness, cooperative vs. dominant); and (5) Conscientiousness (dependability, carefulness). There are several questionnaires or inventories that measure the FFM. (Try http://users.wmin.ac.uk/~buchant/ for a free online “Big Five” test.) There is research supporting the validity of the FFM in the prediction of a number of criteria (e.g., performance, sales, counterproductive behaviors) for a variety of jobs. This validity evidence is reviewed in a later section.
S
Two relatively new characterizations of personality are Emotional Intelligence (EI)
A (CSE). A 2008 count found 57 consulting firms devoted priand Core Self-Evaluations
marily to EI and about 90
N firms specializing in training or assessment of EI, 30 EI certification programs, and five EI “universities.”34 EI is considered to be a multidimensional
D
form or subset of social intelligence or a form of social literacy. EI has been the object of
criticism because of differences
in definitions of the contruct and the claims of validity and
R
incremental validity. One definition is that EI is a set of abilities that enable individuals
A
Figure 6-5 Some Examples of Personality/Dispositional/Motivational Tests

PROJECTIVE TECHNIQUES AND INSTRUMENTS
Thematic Apperception Test (TAT)
Miner Sentence Completion Scale (MSCS)
Graphology (handwriting analysis)
Rorschach Inkblot Test

SELF-REPORT INVENTORIES—EXAMPLES
The NEO-PI-R Personality Inventory (measures FFM and facets of each)
Personal Characteristics Inventory
DiSC Profile
Myers-Briggs Type Indicator
Minnesota Multiphasic Personality Inventory (MMPI)
California Personality Inventory (CPI)
Sixteen Personality Factors Questionnaire (16 PF)
Hogan Personality Inventory
Job Compatibility Questionnaire (JCQ)
Emotional Intelligence (e.g., EI Scale)
Core Self-Evaluations Scale (CSES)
Caliper Profile
Incremental validity
How Do We Measure Personality?
to recognize and understand their own emotions and those of others in order to guide their thinking and behavior to help them cope with the environment. One review concluded that “we are still far from being at the point of rendering a decision as to the incremental value of EI for selection purposes.”35
CSE is a broad and general personality trait composed of four heavily researched traits: (1) self-esteem (the overall value that one places on oneself as an individual); (2) self-efficacy (an evaluation of how well one can perform across situations); (3) neuroticism (the tendency to focus on the negative); and (4) locus of control (the extent to which one believes s/he has control over life’s events). The core self-evaluation is a basic assessment of one’s capability and potential.36
Some research has investigated the extent to which EI and CSE scores add incremental validity in the prediction of performance beyond the Big Five or other selection tools. In general, this research indicates useful incremental validity for both the EI construct and CSE.37
Personality tests can be sorted into two broad categories: projective tests and self-report inventories. Of course, we also can use the interview and data from other sources such as peer ratings or references as a means for assessing personality characteristics or competencies as well. Projective tests have many common characteristics, the most significant of which is that the purpose and scoring procedure of the tests are disguised from the test taker.38
Much concern has been expressed about the ability of job candidates to fake a self-report personality inventory in order to provide a more favorable impression to an employer. Projective tests make it very difficult to fake responses since the test taker has little or no idea what a favorable response is. One of the most famous projective tests is the Rorschach Inkblot Test, which presents a series of inkblots to respondents who must then tell a story of what they see in each one.
While numerous projective tests exist, the Miner Sentence Completion Scale (MSCS) is one of the few such tests specifically designed for use in the employment setting and with some validity evidence to back its use. Its aim is to measure managers’ motivation to manage others.39 The test appears to work. The test consists of 40 incomplete sentences, such as “My family doctor . . . ,” “Playing golf . . . ,” and “Dictating letters. . . .” The test taker is instructed to complete each sentence. According to the developer of these tests, the way in which an applicant completes the sentences reflects his or her motivation along seven areas. These areas are capacity to deal with authority figures, dealing with competitive games, handling competitive situations, assertiveness, motivation to direct others, motivation to stand out in a group, and desire to perform day-to-day administrative tasks. On the downside, the MSCS is expensive and there isn’t a great deal of validity evidence to support its use.
Another projective test that has been used occasionally for employment purposes is the Thematic Apperception Test, or TAT, a test that typically consists of 31 pictures that depict a variety of social and interpersonal situations. The subject is asked to tell a story about each picture to the examiner. Of the 31 pictures, 10 are gender-specific while 21 others can be used with adults of either sex. Test takers are asked to describe who the people are in each picture and what is happening in the situation, which is clearly open to interpretation. The test taker then “projects” the outcome of the situation. Although a variety of scoring systems have been developed for interpreting a test taker’s responses, one of the most popular approaches involves rating the responses with regard to the test taker’s need for power (i.e., the need to control and influence others), achievement (i.e., the need to be successful), and affiliation (i.e., the need for emotional relationships). Like the MSCS, the TAT has been used for managerial selection and the limited research indicates some validity as a predictor of managerial and entrepreneurial success. AT&T has been using the TAT for years as a part of its assessment center to identify high-potential managerial talent.40
One form of projective test (discussed earlier) that has received considerable attention recently is graphology, or handwriting analysis. With this approach, a sample of
your handwriting is mailed to a graphologist who (for anywhere from $10 to $50) provides an assessment of your intelligence, creativity, emotional stability, negotiation skills,
problem-solving skills, and numerous other personal attributes. According to some writers,
graphology is used extensively in Europe as a hiring tool. The Wall Street Journal and Inc.
magazine have reported an increase in the use of the method in the United States since
1989. One handwriting analysis company reports that “With the government pulling the
plug on the polygraph, and employers clamming up on job references and liabilities from
negligent hiring, it is one alternative managers are exploring in an effort to know whom
they are hiring.”41 While the use of the method may be increasing, there is no compelling
evidence that the method does anything but provide an assessment of penmanship. The
only peer-reviewed and published studies on the validity of graphology have found no
validity for the approach.42
Self-Report Personality Inventories
NEO-PI-R (FFM)
Myers-Briggs
Self-report inventories, which purport to measure personality or motivation with the respondent knowing the purpose and/or the scoring procedure of the test, are much more common than projective techniques. Some instruments screen applicants for aberrant or deviant behavior (e.g., the MMPI), others attempt to identify potentially high performers, and others, particularly more recently developed tests, are directed at specific criteria such as employee theft, job tenure/turnover, accident proneness, or customer orientation.
Self-report inventories typically consist of a series of short statements concerning one’s behavior, thoughts, emotions, attitudes, past experiences, preferences, or characteristics. The test taker responds to each statement using a standardized rating scale. During the testing, respondents may be asked to indicate the extent to which they are “happy” or “sad,” “like to work in groups,” “prefer working alone,” and so forth.
One of the most popular and respected personality tests is the Minnesota Multiphasic Personality Inventory (MMPI). The MMPI is used extensively for jobs that concern the public safety or welfare, including positions in law enforcement, security, and nuclear power plants. The MMPI is designed to identify pathological problems in respondents, not to predict job effectiveness. The revised version of the MMPI consists of 566 statements (e.g., “I am fearful of going crazy”; “I am shy”; “Sometimes evil spirits control my actions”; “In walking, I am very careful to step over sidewalk cracks”; “Much of the time, my head seems to hurt all over”). Respondents indicate whether such statements are true, false, or they cannot say. The MMPI reveals scores on 10 clinical scales, including depression, hysteria, paranoia, and schizophrenia, as well as four “validity” scales, which enable the interpreter to assess the credibility or truthfulness of the answers. Millions of people from at least 46 different countries, from psychotics to Russian cosmonauts, have struggled through the strange questions.43
Litigation related to negligent hiring often focuses on whether an organization properly screened job applicants. For example, failure to use the MMPI (or ignoring MMPI results) in filling public-safety jobs has been cited in legal arguments as an indication of negligent hiring—although not always persuasively. Unfortunately, some companies are damned if they do and damned if they don’t. Target stores negotiated an out-of-court settlement based on a claim of invasion of privacy made by a California job candidate who objected to a few questions on the MMPI being used to hire armed guards. Had one of the armed guards who was hired used his or her weapon inappropriately (and Target had not used the MMPI), Target could have been slapped with a negligent hiring lawsuit.
Another popular instrument is the 16 Personality Factors Questionnaire (16PF), which provides scores on the factors of the FFM, plus others. In addition to predicting performance, the test is used to screen applicants for counterproductive work behavior, such as potential substance abuse or employee theft. AMC Theaters, C&S Corporation of Georgia, and the U.S. State Department are among the many organizations that use the 16PF to screen job candidates. An advantage of the 16PF over other self-report inventories is that one of its factors provides a reliable and valid measure of GMA, in addition to scores on the Big Five factors and "Big-Five subfactors," or facets (discussed later).44
Although there are many instruments available, the NEO Personality Inventory is
one of the most reliable and valid measures of the FFM.45 Another popular instrument
for employee development and team diagnostics rather than for selection purposes is the
Myers-Briggs Type Indicator (MBTI).46
2 / Acquiring Human Resource Capability
What Is the Validity of Personality Tests?
MSCS validity = .35
Conscientiousness and emotional stability have validity for all jobs
Extraversion has validity for managerial jobs
Use FFM subfactors to increase validity
Potentially useful personality tests exist among a great number of bad ones, making it difficult to draw general conclusions about their validity. Some instruments and the factors they measure have shown adequate (and useful) validity, while others show little or no validity for employment decisions. In general, validity is lower for self-report personality inventories than for cognitive ability tests. However, personality assessments made by others (e.g., peers) appear to have strong validity.47
The one projective instrument with a fairly good but limited track record for selecting
managers is the MSCS. A review of 26 studies involving the MSCS found an average validity coefficient of .35.48 However, almost all of this research was conducted by the test
publisher and not published in peer-reviewed journals.
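A validity coefficient such as the .35 reported here is simply the Pearson correlation between applicants' scores on the predictor (the test) and a later criterion measure of job performance (e.g., supervisor ratings). As a minimal sketch, using hypothetical test scores and ratings rather than data from any study cited in this chapter:

```python
import math

def validity_coefficient(test_scores, performance):
    """Pearson r between a predictor (test scores) and a criterion (performance)."""
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(performance) / n
    # Covariance term (numerator) and standard-deviation terms (denominator)
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, performance))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in test_scores))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in performance))
    return cov / (sd_x * sd_y)

# Hypothetical data: eight applicants' test scores and later supervisor ratings
scores = [52, 61, 45, 70, 58, 66, 49, 63]
ratings = [3.1, 3.8, 2.9, 4.2, 3.3, 4.0, 3.0, 3.6]
print(round(validity_coefficient(scores, ratings), 2))
```

A coefficient near 1.0 (as in this contrived sample) would mean the test ranks applicants almost exactly as their later performance does; real selection instruments rarely exceed the .3 to .5 range discussed in this section.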
The latest review of the FFM found that self-reported Conscientiousness and Emotional Stability had useful predictive validity across all jobs but that Conscientiousness had the highest validity (.31). Extraversion, Agreeableness, and Openness to Experience had useful predictive validity but for only certain types of jobs.49 For example, extraverts are more effective in jobs with a strong social component, such as sales and management. Extraversion is not a predictor of job success for jobs that do not have a strong social component (e.g., ...