ber87251_ch06_135-170 1/27/06 15:48 Page 135
CHAPTER 6
PERSONNEL SELECTION*
OVERVIEW
It sounds simple: Match employees with jobs. Researchers have made this task easier by developing selection methods that successfully predict employee effectiveness. Still, there is a void between what research indicates and how organizations actually do personnel selection. Real-world personnel selection is replete with examples of methods that have been proven to be ineffective or inferior to other methods.
Personnel selection (and retention) is a key to organizational effectiveness. The most successful firms tend to use methods that accurately predict future performance. We also are interested in selecting employees who will not only be effective, but who will work for us as long as we need them, and who will not engage in counterproductive behavior such as violence, substance abuse, avoidable accidents, and employee theft.
A multiple-hurdle process involving an application, reference and background checks, various forms of standardized testing, and some form of interview is the typical chronology of events for selection, particularly for external hiring decisions. Internal decisions, such as promotions, are typically done with less formality.
Selection is the process of gathering and
assessing information about job
candidates and ultimately making
decisions about personnel. The process
applies to both entry-level personnel
decisions and decisions regarding
promotions, transfers, and even job
retention as part of corporate downsizing
efforts.
This chapter will introduce you to
selection, describe some of the most
popular types of screening procedures,
review the research evidence on each, and
discuss the social and legal implications
of selection methods. We will first provide
an overview of selection and the typical
steps employed in the process. We will
then introduce you to the various selection
approaches in their usual order of use:
First, we will review application and
biographical blanks; next, we will review
the use of background and reference
checks; then we will review the various
forms of standardized tests that purport to
assess applicants’ KASOCs. Finally, the
chapter will conclude with a discussion of
the use of more sophisticated selection
procedures, such as assessment centers,
performance testing and work samples,
and drug and medical tests in the
preemployment selection process. Our
context for this discussion will be the
legal implications of the various personnel
practices and areas where there are clear
discrepancies between what typically
happens and what academic research
indicates. This is certainly one chapter
where the discrepancies between findings
in academic research and actual practice
are quite great.
OBJECTIVES
After reading this chapter, you should be able to
1. Understand the concepts of reliability, validity, and utility.
2. Understand the validity evidence for various selection methods.
3. Discuss approaches to the more effective use of application blanks, reference checks, biographical data, and the interview in order to increase the validity and legal defensibility of each.
4. Discuss the approaches available for drug testing.
5. Review the validity of different approaches to interviewing.
6. Discuss how the various types of candidate information should be integrated and evaluated.
*An early version of this chapter was written by Michael M. Harris and Barbara K. Brown.
PART II Acquiring Human Resource Capability
Wackenhut Security had its share of selection challenges.
Although recruitment efforts and a sluggish economy
attracted a large number of applicants for its entry-level
armed and unarmed security guard positions, new contract opportunities developed after the September 11,
2001, tragedy and new concern was raised about the
quality of its personnel. The turnover rate for some positions exceeded 100 percent—meaning, the quit rate in
one year exceeded the number of positions. Wackenhut
Security also was dissatisfied with the quality of its
supervisory personnel.
The company contracted with BA&C (Behavioral
Analysts and Consultants), a Florida psychological
consulting firm that specializes in selection problems
and personnel selection. Wackenhut asked BA&C to
develop a new personnel selection system for entry-level
guards and supervisors. Underlying this request was a
need for Wackenhut to improve its competitive position
in this highly competitive industry by increasing sales
and contracts, decreasing costs, and, perhaps most important, making certain their security personnel could
measure up.
The company, which already compensated its guards
and supervisors more than others in the industry, wanted
to avoid an increase in compensation in these areas. The
company estimated that the cost of training a new armed
guard was about $1,800. With several hundred guards
quitting in less than a year, the company often failed to
even recover training costs in sales. Wackenhut needed
new selection methods that could increase the effectiveness of the guards and supervisors and identify guard applicants most likely to stay with the company.
You will recall from Chapter 4 that job analysis
should identify the knowledge, abilities, skills, and other
characteristics (KASOCs) that are necessary for successful performance and retention on the job. In this case,
BA&C first conducted a job analysis of the various guard
jobs to get better information on the KASOCs required
for the work. After identifying the critical KASOCs,
BA&C developed a reliable, valid, and job-related
weighted application blank, screening test, and interview
format.
The process of selection varies substantially from company to company. While Wackenhut initially used only a high school diploma as a job specification, an application blank, a background check, and an interview by someone in personnel, other companies have used more complex methods to select employees. American Protective Services, for example, the company that handled security for the Olympics, used a battery of psychological and aptitude tests along with a structured interview.
As with the job analysis and the recruitment process, personnel selection should be directly linked to the HR planning function and the strategic objectives of the company. The mission goal of the Marriott Corporation is to be the hotel chain of choice of frequent travelers. As part of this strategy, the company developed a selection system designed to identify people who could be particularly attentive to customer demands. Wackenhut also had a major marketing strategy aimed at new contracts for armed security guards who would be extremely vigilant. They needed a legal selection system that could identify people most likely to perform well in this capacity.
FIGURE 6-1 Steps in the Development and Evaluation of a Selection Procedure
JOB ANALYSIS/HUMAN RESOURCE PLANNING
Identify knowledge, abilities, skills, and other characteristics (KASOCs) (aka: competencies)
Use a competency model tied to strategy orientation
RECRUITMENT STRATEGY: SELECT/DEVELOP SELECTION PROCEDURES
Review options for assessing applicants on each of the KASOCs:
Standardized tests (cognitive, personality, motivational, psychomotor)
Application blanks, biographical data, background, reference checks, accomplishment record
Performance tests, assessment centers, interviews
DETERMINE VALIDITY FOR SELECTION METHODS
Criterion-related validation
Expert judgment (content validity)
Validity generalization
DETERMINE WEIGHTING SYSTEM FOR SELECTION METHODS AND RESULTANT DATA
Figure 6-1 presents a chronology of events in the selection process and the major options available for personnel selection. The previous chapters on job analysis,
planning, and recruitment have gotten us to the point of
selecting job candidates based on information from one
or more selection methods. We will review each of these
methods in this chapter. But keep in mind the focus
should be on selecting or developing tools that will provide valid assessments on the critical KASOCs, competencies, or job specifications most important for strategy
execution. So, the job analysis should identify the strategically important KASOCs or competencies. Then, particular selection methods (or tools) should be adopted to
assess these job specifications.
SELECTION METHODS: ARE THEY EFFECTIVE?
Our review includes a summary of the validity of each
major approach to selection and an assessment of the relative cost to develop and administer each method. Three
key terms related to effectiveness are reliability, validity,
and utility. While these terms are strongly related to one
another, the most important criterion for a selection
method is validity. Remember our discussion of the
research on High-Performance Work Systems. One of
the HR practices shown to be related to corporate
performance was the percentage of employees hired
using “validated selection methods.”1 The essence of the
term validity is the extent to which a selection method
predicts one or more important criteria. While the most
typical criterion of interest to selection specialists is job
performance, companies also may be very interested in
other criteria such as how long an employee may stay on
the job or whether the employee will steal, be violent, or
be involved in accidents. But before we address the validity of a method, let’s look at one of the necessary conditions for validity: the reliability of measurement.
What Is Reliability?
A necessary condition in order for a selection method to
be valid is that it first be reliable. Reliability concerns the
consistency of measurement. This consistency applies to
the scores that derive from the selection method. These
scores can come from a paper-and-pencil test, a job interview, a performance appraisal, or any other method that is
used to make decisions about people. The CIA uses a
very long multiple-choice test as an initial screening device for applicants who want to become agents. If applicants were to
take the test twice three weeks apart, their scores on the
test would stay pretty much the same (the same thing can
be said for SAT scores). The level of reliability can be
represented by a correlation coefficient. Correlations
from 0 to 1.0 show the extent of the reliability. Generally,
reliable methods have reliability coefficients that are .8 or
higher, indicating a high degree of consistency in scores.
No selection method achieves perfect reliability, but the
goal should be to reduce error in measurement as much as
possible. If raters are a part of the selection method, such
as job interviewers or on-the-job performance evaluators,
the extent to which different raters agree also can represent the reliability (or unreliability) of the method.
Remember how we cast serious doubts upon
graphology (or handwriting analysis) in Chapter 1? This
method of selection is used by some U.S. companies and
even more European firms. One problem with this
method is that it is not even reliable, much less valid. If
the same handwriting sample were given to two graphologists, they would generally not agree on scores on various employment-related attributes (e.g., drive, creativity, intelligence). Even if they did agree, this would not necessarily mean that their assessments are valid.
More reliable tests tend to be longer. One of the reasons the SAT, the GRE, the GMAT, and the LSAT seemingly take forever to complete is so these tests will have
very high reliability (and they do). But while high reliability is a necessary condition for high validity, high reliability does not ensure that a method is valid. The SAT
may be highly reliable, but do scores on the SAT predict
anything important such as how well you actually will
perform in college? This question addresses the validity
of the method.
What Is Validity?
The objective of the Wackenhut Security Consultants was
to develop a reliable, valid, legally defensible, user-friendly, and inexpensive test that could predict both job
performance and long job tenure for security guards. The
extent to which the test was able to predict an important
criterion was an indication of the test’s validity. The term
validity is close in meaning but not synonymous with that
critical legal term job relatedness, which we discussed in
Chapters 3 and 4. Empirical or criterion-related validity involves the statistical relationship between performance or scores on some predictor or selection method
(e.g., a test or an interview) and performance on some criterion measure such as on-the-job effectiveness (e.g.,
sales, supervisory ratings, job turnover, employee theft).
At Wackenhut, a study was conducted in which scores on
their proposed screening test were correlated with job
performance and job tenure. Such a study would strongly
support an argument of job relatedness.
The statistical relationship is usually reported as a
correlation coefficient. This describes the relationship between the predictor and measures of effectiveness (also
called criteria). Correlations from -1 to +1 show the direction and strength of the relationship. Higher correlations indicate stronger validity. Assuming that the study
was conducted properly, a significant correlation between
a method’s scores and some important criterion could be
offered as a strong argument for the job relatedness of the
method if the method resulted in adverse impact against a
protected class. Figure 6-2 presents a summary of the empirical validity evidence for the various selection tools,
the cost of their development and administration, and
group differences by ethnicity.
The higher the correlation, the more predictive (and
valid) the selection method. The correlation also can be
used to calculate the financial value of a selection
method, using a utility formula, which can convert correlations into dollar savings or profits that can be credited
to a particular selection method. A method’s utility depends on its validity and on other issues as well. For example, recall our discussion of selection ratio in Chapter 5. Selection ratio is the number of positions divided by the number of applicants for those positions. A test with perfect validity will have no utility if the selection ratio is 1.0 (one applicant per position). This is why recruitment and other HR issues such as compensation are so important for personnel selection. Valid selection methods only have great utility for an organization when that organization can be selective based on the scores on that method. There’s almost no point in developing and administering a highly valid selection method if you have to hire anyone who took the test. This was a problem for the military in 2005 and perhaps later.
FIGURE 6-2 Selection Tools, Cost for Development and Administration, and Group Differences
Each entry lists the tool, its validity (a), its costs for development/administration (b), and group differences (c).
Cognitive ability tests measure mental abilities such as logic, reading comprehension, verbal or mathematical reasoning, and perceptual abilities, typically with paper-and-pencil or computer-based instruments.
Validity: .51. Costs: Low/low. Group differences: B/W: 1.0; H/W: .5; A/W: .2; W/M: 0.
Structured interviews measure a variety of skills and abilities, particularly noncognitive skills (e.g., interpersonal skills, leadership style, etc.) using a standard set of questions and behavioral response anchors to evaluate the candidate.
Validity: .51. Costs: High/high. Group differences: B/W: .23; H/W: .17.
Unstructured interviews measure a variety of skills and abilities, particularly noncognitive skills (e.g., interpersonal skills, leadership style, etc.) using questions that vary from candidate to candidate and interviewer to interviewer for the same job. Often, specific standards for evaluating responses are not used.
Validity: .31. Costs: Low/high. Group differences: B/W: .32; H/W: .71.
Work samples measure job skills (e.g., electronic repair, planning and organizing), using the actual performance of tasks that are similar to those performed on the job. Typically, work samples use multiple, trained raters and detailed rating guides to classify and evaluate behaviors.
Validity: .54. Costs: High/high. Group differences: B/W: .38.
Job knowledge tests measure bodies of knowledge (often technical) required by a job, often using formats such as multiple-choice questions or essay-type items.
Validity: .48. Costs: High/low. Group differences: B/W: .38.
Conscientiousness measures the personality trait “conscientiousness,” typically with multiple-choice or true/false formats.
Validity: .31. Costs: Low/low. Group differences: B/W: .06; H/W: .04; A/W: .08; W/M: .08.
Biographical information measures a variety of noncognitive skills and personal characteristics (e.g., conscientiousness, achievement orientation) through questions about education, training, work experience, and interests.
Validity: .35. Costs: High/low. Group differences: B/W: .78 for grades; B/W: .27 biodata; H/W: .08 biodata; W/M: .15 biodata.
Situational judgment tests measure a variety of noncognitive skills by presenting individuals with short scenarios (either in written or video format) and asking what their most likely response would be or what they see as the most effective response.
Validity: .34. Costs: High/low. Group differences: B/W: .61 on paper and pencil; B/W: .43 on video; H/W: .26 on paper and pencil; H/W: .39 on video; W/M: .26 on paper and pencil; W/M: .19 on video.
Integrity tests measure attitudes and experiences related to a person’s honesty, dependability, trustworthiness, and reliability, typically with multiple-choice or true/false formats.
Validity: .41. Costs: Low/low. Group differences: B/W: .04; H/W: .14; A/W: .04; W/M: .16.
Assessment centers measure knowledge, skills, and abilities through a series of work samples/exercises that reflect job content and types of problems faced on the job, cognitive ability tests, personality inventories, and/or job knowledge tests.
Validity: .37. Costs: High/high. Group differences: Varies by exercise; .02 to .58.
Reference checks provide information about an applicant’s past performance or measure the accuracy of an applicant’s statements on the résumé or in interviews by asking individuals who have previous experience with a job candidate to provide an evaluation.
Validity: .26. Costs: Low/low. Group differences: ??
a Validity values range from 0 to 1.0, with higher numbers indicating better prediction of job performance.
b The labels “high” and “low” are designations relative to other tools rather than based on some specific expense level.
c Values are effect sizes expressed in standard deviation units. Higher numbers indicate a greater difference; negative values mean the first group scores lower. B/W is black/white difference; H/W is Hispanic/white difference; A/W is Asian/white difference; W/M is female/male difference.
Source: Adapted from Ryan, A. M. & Tippins, N. T. (2004). “Attracting and selecting: What psychological research tells us.” HUMAN RESOURCE MANAGEMENT, 43, pp. 307–308. Reprinted with permission of John Wiley & Sons.
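The group differences reported in Figure 6-2 are effect sizes in standard deviation units. One common way to compute such a value is the standardized mean difference with a pooled standard deviation. The sketch below assumes that definition, which the figure's notes do not spell out, and uses invented scores and a hypothetical function name.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Return (mean of A - mean of B) divided by the pooled standard deviation.

    A negative result means group A scores lower on average than group B.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Invented test scores for two applicant groups (illustrative only).
scores_group_a = [48, 52, 50, 46, 54]
scores_group_b = [53, 57, 55, 51, 59]

effect_size = standardized_difference(scores_group_a, scores_group_b)
print(f"Standardized difference: {effect_size:.2f}")
```

A value near zero means the two groups score about the same on the tool. Larger absolute values signal a greater risk of adverse impact when the tool is used for top-down selection.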
Content validity assesses the degree to which the
content of a selection method represents (or assesses) the
requirements of the job. A knowledge-based test for “Certified Public Accountant” could be considered to have content validity for an accounting job. Subject matter experts
are typically used to evaluate the compatibility of the content of a test with the actual requirements of a job (e.g., is
the knowledge or skill assessed on the test compatible with
the knowledge or skill required on the actual job?). Such a
study also can be offered as evidence of job relatedness, but
the study should follow the directions provided by the
Supreme Court in Albemarle v. Moody (see Chapter 3) and,
just to be safe, comply with the Uniform Guidelines on Employee Selection Procedures (UGESP). (See www.eeoc.gov
for details on the UGESPs.) Validity generalization invokes evidence from past studies on a selection method that
is then applied to new and similar jobs and settings.
What Is Utility?
Utility concerns the economic gains derived from using a
particular selection method. The basic formula involves
estimating the increase in revenue as a function of the use
of the selection method after subtracting the cost of the
method. As we said above, good utility requires low
selection ratios and thus is related to the ability of the
organization to attract a large number of qualified applicants for each position they need to fill.
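The chapter describes the utility formula only in words. One widely used formulation is the Brogden-Cronbach-Gleser model, sketched below as an assumption rather than as the formula the authors necessarily have in mind. All dollar figures, the function names, and the assumption of normally distributed test scores with strict top-down hiring are illustrative.

```python
from statistics import NormalDist


def mean_z_of_selected(selection_ratio):
    """Average standardized test score of those hired under top-down selection.

    Assumes normally distributed scores. With a selection ratio of 1.0
    everyone is hired, so the test cannot raise the average quality of hires.
    """
    if selection_ratio >= 1.0:
        return 0.0
    standard_normal = NormalDist()
    cutoff = standard_normal.inv_cdf(1.0 - selection_ratio)
    return standard_normal.pdf(cutoff) / selection_ratio


def utility_gain(n_hired, validity, sd_performance_dollars,
                 selection_ratio, cost_per_applicant, tenure_years=1.0):
    """Estimated dollar gain from using the method, net of testing costs."""
    n_applicants = n_hired / selection_ratio
    benefit = (n_hired * tenure_years * validity
               * sd_performance_dollars * mean_z_of_selected(selection_ratio))
    return benefit - n_applicants * cost_per_applicant


# Hypothetical case: 100 hires, validity .50, a $5,000 standard deviation in
# the yearly dollar value of performance, and $10 per applicant tested.
selective = utility_gain(100, 0.50, 5000, selection_ratio=0.5, cost_per_applicant=10)
not_selective = utility_gain(100, 1.00, 5000, selection_ratio=1.0, cost_per_applicant=10)
print(f"Selection ratio 0.5: ${selective:,.0f}")
print(f"Selection ratio 1.0: ${not_selective:,.0f}")
```

In this sketch even a perfectly valid test loses money at a selection ratio of 1.0, because the organization pays for testing but cannot act on the scores.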
Selection methods with high validity but that cost
relatively little are the ideal for utility. Before contracting
with BA&C, Wackenhut Security had studied the options
and was not impressed with the validity or utility evidence reported by the test publishers, particularly in
the context of the $10–$15 cost per applicant. This was
the main reason Wackenhut decided to develop its own
methods.
BA&C investigated the validity of its proposed new
selection systems using both criterion-related and
content-validation procedures. This dual approach to validation provides stronger evidence for job relatedness.
The BA&C study strongly suggested that new methods of
personnel selection should be used if the company hoped
to increase its sales and decrease the costly employee
turnover. The resulting analysis showed substantial financial benefit to the company if it adopted the new methods
for use in lieu of the old ineffective procedures. The first
method BA&C considered was the application blank.
APPLICATION BLANKS AND BIOGRAPHICAL DATA
Like most companies, Wackenhut first required a completed application blank requesting standard information
about the applicant, such as previous employment history, experience, and education. Often used as an initial
screening method, the application blank, when properly
used, can provide much more than a first cut. However,
application blanks, as with any other selection procedure used for screening people, fall under the scrutiny of
the courts for possible EEO violations. HR managers
should be cautious about using information on an application blank that disproportionately screens out protected
class members, and they must be careful not to ask illegal
questions. The Americans with Disabilities Act (ADA), for example, provides that application
blanks should not include questions about an applicant’s
health, disabilities, and worker’s compensation history.
Application blanks obviously can yield information
relevant to an employment decision. Yet, it is often the
weight—or lack of weight—assigned to specific information by particular decision makers that can seriously undermine their usefulness. Decision makers often disagree
about the relative importance attached to information on
application blanks. For instance, they might disagree
about the amount of education or experience required.
Wackenhut required a bachelor’s degree in business or a
related discipline for the supervisory job. This criterion
alone, however, should not carry all the weight. Wackenhut’s personnel staff made no effort to develop a uniform
practice of evaluating the information on the forms. They
did not take into consideration indicators such as the fact
that an applicant lives 20 miles from the workplace. This
may indicate that, relative to other responses, the candidate is more likely to quit as soon as another job comes
along that is closer to home.
A Discrepancy between Research and Practice:
The Use of Application and Biographical Data
What companies do to evaluate application blank data
and biographical information and what research suggests
are worlds apart. Decision makers rarely use a uniform
approach to evaluate data. Scholarly research clearly
shows, with adequate data available, the best way to use
and interpret application blank information is to derive
an objective weighting system.2 The system is based on
an empirical research study, resulting in a weighted application blank (WAB), with the weights derived from
the results of the research. By empirical study, we mean
the responses from the application blanks are statistically
related to one or more important criteria such that the
critical predictive relationships can be identified. For example, BA&C was able to show that where a security
guard lived relative to his assigned duties was indeed a
significant predictor of job turnover. Another useful
predictor was the number of jobs held by the applicant
during the past three years. Figure 6-3 shows some examples from a WAB.
The process of statistically weighting the information on an application blank enhances use of the application blank’s information and improves the validity of the whole process. The WAB simply is an application blank that is scored, much like a paper-and-pencil test. It provides a score for each job candidate and makes it possible to compare the score with that of other candidates. For example, the numbers in parentheses for the WAB examples in Figure 6-3 were derived from an actual study showing particular responses were related to job tenure.
FIGURE 6-3 Examples of WAB and BIB
WAB EXAMPLES
How many jobs have you held in the last five years?
(a) none (0); (b) 1 (5); (c) 2–3 (1); (d) 4–5 (3); (e) over 5 (5)
What distance must you travel from your home to work?
(a) less than 1 mile (5); (b) 1–5 miles (3); (c) 6–10 miles (0); (d) 11–20 miles (-3); and (e) 21 or more miles (5)
BIB EXAMPLES
How often have you made speeches in front of a group of adults?
How often have you set long-term goals or objectives for yourself?
How often have other students come to you for advice?
How often have you had to persuade someone to do what you wanted?
How often have you felt that you were an unimportant member of a group?
How often have you felt awkward about asking for help on something?
How often do you work in “study groups” with other students?
How often have you had difficulties in maintaining your priorities?
How often have you felt “burnt out” after working hard on a task?
How often have you felt pressured to do something when you thought it was wrong?
Source: C. J. Russell, J. Matson, S. E. Devlin, and D. Atwater, “Predictive Validity of Biodata Items Generated from Retrospective Life Experience Essays,” Journal of Applied Psychology 75 (1990), pp. 569–580. Copyright © 1990 by the American Psychological Association. Reproduced with permission.
Biographical information blanks (BIB) are similar to WABs except the items of a BIB tend to be more personal, with questions about personal background and life
experiences. Figure 6-3 shows examples of items from a
BIB for the U.S. Navy. BIB research has shown the method
can be an effective tool in the prediction of job turnover,
job choice, and job performance. In one excellent study
conducted at the Naval Academy, biographical information
was derived from life-history essays, reflecting accomplishments that were then written in multiple-choice format (see Figure 6-3).3 BIB scoring is also derived from a
study of how responses relate to important criteria.
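Scoring a weighted application blank amounts to looking up the weight attached to each response and summing. The sketch below uses hypothetical item names and weights that only echo the style of Figure 6-3. They are not the weights from the study the figure draws on, and the negative signs are an assumption made for illustration.

```python
# Hypothetical weights for two application-blank items (illustrative only).
WAB_WEIGHTS = {
    "jobs_last_5_years": {"none": 0, "1": 5, "2-3": 1, "4-5": -3, "over 5": -5},
    "commute_miles": {"under 1": 5, "1-5": 3, "6-10": 0, "11-20": -3, "21+": -5},
}


def score_application(responses, weights=WAB_WEIGHTS):
    """Sum the weight attached to each answer on a weighted application blank."""
    return sum(weights[item][answer] for item, answer in responses.items())


# Two invented applicants.
applicants = {
    "Applicant A": {"jobs_last_5_years": "1", "commute_miles": "1-5"},
    "Applicant B": {"jobs_last_5_years": "over 5", "commute_miles": "21+"},
}

scores = {name: score_application(answers) for name, answers in applicants.items()}

# Rank candidates from highest to lowest WAB score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Because every application is scored with the same table, two decision makers reviewing the same form arrive at the same number. That uniformity is what unweighted, holistic review lacks.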
WABs and BIBs have been used in a variety of settings for many types of jobs. WABs are used primarily for
clerical and sales jobs. BIBs have been used successfully
in the military and the insurance industry. Many insurance companies, for example, use a very lengthy BIB to
screen their applicants. Check out www.e-Selex.com for
an online biodata testing service.
The accomplishment record is an approach similar to
a BIB. Job candidates are asked to write examples of their
actual accomplishments, illustrating how they had mastered
job-related problems or challenges. Obviously, the problems or challenges should be compatible with the problems
or challenges facing the organization. The applicant writes
these accomplishments for each of the major components
of the job. For example, in a search for a new business
school dean, applicants were asked to cite a fund-raising
project they had successfully organized. HRM specialists
evaluate these accomplishments for their predictive value
or importance for the job to be filled. Accomplishment
records are particularly effective for managerial, professional, and executive jobs.4 In general, research indicates
that methods such as BIBs and accomplishment records
are more valid than credentials. For example, having an
MBA versus only a Bachelor’s degree is not a particularly
valid predictor of successful management performance.
What an applicant has accomplished in past jobs or assignments is a more valid approach to “leadership” assessment.
How Do You Derive WAB or BIB or
Accomplishment Record Weights?
To derive the weights for WABs or BIBs, you ideally need
a large (at least 150) representative sample of application
or biographical data and criterion data (e.g., job tenure
and/or performance) of the employees in the position under
study. You then can correlate responses to individual parts
of the instrument with the performance data. If effective
and ineffective employees responded to an item differently,
responses to this item would then be given different
weights, depending on the magnitude of the relationship.
Weights for the accomplishment record are usually derived
by expert judgment for various problems or challenges.
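One simple way to turn "effective and ineffective employees responded differently" into weights is to compare the share of each group that chose each response. The sketch below assumes that percentage-difference approach, which is only one of several weighting schemes and is not necessarily the one BA&C used. The function name, the scale factor, and the tiny data set are all hypothetical. A real study would need the large sample described above.

```python
from collections import Counter


def derive_item_weights(responses, is_effective, scale=10):
    """Weight each response option by how much more often effective employees
    chose it than ineffective employees did, scaled and rounded to an integer."""
    effective = [r for r, ok in zip(responses, is_effective) if ok]
    ineffective = [r for r, ok in zip(responses, is_effective) if not ok]
    if not effective or not ineffective:
        raise ValueError("Both effective and ineffective employees are required.")
    effective_counts = Counter(effective)
    ineffective_counts = Counter(ineffective)
    weights = {}
    for option in sorted(set(responses)):
        effective_share = effective_counts[option] / len(effective)
        ineffective_share = ineffective_counts[option] / len(ineffective)
        weights[option] = round(scale * (effective_share - ineffective_share))
    return weights


# Invented data: commute answers for 16 past hires, and whether each stayed
# at least a year (True) or quit sooner (False).
commute_answers = ["near"] * 6 + ["far"] * 2 + ["near"] * 2 + ["far"] * 6
stayed_a_year = [True] * 8 + [False] * 8

commute_weights = derive_item_weights(commute_answers, stayed_a_year)
print(commute_weights)
```

Options chosen about equally often by both groups end up with weights near zero, so they contribute little to an applicant's total score.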
Research supports the use of WABs, BIBs, and the
accomplishment record in selection. The development
of the scoring system requires considerable work, but it
is worthwhile because the resulting decisions are often
superior to those typically made based on a subjective interpretation of application blank information. However,
since you need a large sample size to validate results, the
WAB technique will probably be useful only for jobs with
many incumbents.
What if you can’t do the empirical validation study?
Might you still get better results using a weighted system,
in which the weights are based on expert judgment? Yes.
This approach is superior to one in which there is no uniform weighting system and each application blank or
résumé is evaluated in a more holistic manner by whoever
is evaluating it.
REFERENCE CHECKS AND BACKGROUND CHECKS
More than 80 percent of companies do some form of reference or background check.5 The goal is to gain insight
about the potential employee from people who have had
previous experience with him or her. An important role of
the background check is to simply verify the information
provided by the applicant regarding previous employment and experience. This is a good practice, considering
research indicates that between 20 and 25 percent of job
applications include at least one fabrication.6 Fear of
negligent hiring lawsuits is a related reason employers
do reference and background checks. A negligent hiring
lawsuit is directed at an organization accused of hiring
incompetent (or dangerous) employees. One HMO was
sued for $10 million when a patient under the care of a
psychologist was committed to a psychiatric institution
and it was later revealed that the psychologist was unlicensed and had lied about his previous experience.
A second purpose for reference checks is to assess the
potential success of the person for the new job. Reference
checks provide information about a candidate’s past
performance and are also used to assess the accuracy of
information provided by candidates. However, HR professionals should be warned: a proliferation of lawsuits has
engendered a great reluctance on the part of evaluators to
provide anything other than a statement as to when a person was employed and in what capacity. These lawsuits
have been directed at previous employers for defamation
of character, fraud, and intentional infliction of emotional
distress. This legal hurdle has prompted many organizations to stop employees from providing any information
about former employees other than dates of employment
and jobs. Turnaround is fair play—at least litigiously.
Organizations are being sued and held liable if they do not
give accurate information about a former employee when
another company makes such a request. The bottom line
appears simple: Tell the truth. There are laws in several
states that provide protection for employers who provide
candid and valid evaluations of former employees.
What Are the Legal Implications of Doing
Background Checks on Job Candidates?
Employers often request consumer reports or more detailed “investigative consumer reports” (ICRs) from a
consumer credit service as part of the background
check. Employers that do so need to be aware of state
laws related to background checks and of the Fair Credit
Reporting Act (FCRA), amended in 2005, a federal law
that regulates how such agencies provide information
about consumers. State laws vary considerably on background checks. Experts maintain that it is legally safest to
comply with the laws of the states where the job candidate resides, where the reporting agency is incorporated,
and where the employer has its principal place of business. In
general, in order to abide by the FCRA or state law, four
steps must be followed by the employer: (1) Give the job
candidate investigated a notice in writing that you may request an investigative report, and obtain a signed consent
form; (2) Provide a summary of rights under federal law
(individuals must request a copy); (3) Certify to the investigation company that you will comply with federal
and state laws by signing a form they should provide; and
(4) Provide a copy of the report in a letter to the person
investigated if a copy has been requested or if an adverse
action is taken based on information in the report.
One of the problems with letters of reference is that
they are almost always very positive. While there is some
validity, it is low in general (.26). One approach to getting
more useful (and valid) distinctions among applicants is
to construct a “letter of reference” or recommendation
that is essentially a performance appraisal form.7 One can
construct a rating form and request that the evaluator
indicate the extent to which the candidate was effective in
performing a list of job tasks. This approach offers the
added advantage of deriving comparable data for both
internal and external job candidates, since the performance appraisal, or reference data, can be completed for
both internal and external candidates.
With this approach, both internal and external evaluators must evaluate performances on the tasks that are most
important for the position to be filled. An alternative approach asks the evaluator to rate the extent of job-related
knowledge, skill, ability, or competencies of a candidate.
These ratings can then be weighted by experts based on the
relative importance of the KASOCs or competencies for
the position to be filled. This approach makes good sense
whenever past performance is a strong predictor of future
performance. For example, when selecting a manager from
a pool of current or former managers, a candidate’s past
performance as a manager is important. Performance appraisals or promotability ratings, particularly those provided by peers, are a valid source of information about
job candidates. However, promotability ratings made by
managers are not as valid as other potential sources of information about candidates, such as performance tests and
assessment centers.
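The expert-weighting step described above can be sketched in a few lines. This is a minimal illustration, not a validated scoring system: the competency names, the 1-to-5 rating scale, the weights, and the function name `weighted_reference_score` are all hypothetical and chosen only to show how the same form yields comparable scores for internal and external candidates.

```python
from typing import Dict


def weighted_reference_score(ratings: Dict[str, float],
                             weights: Dict[str, float]) -> float:
    """Return the importance-weighted average of an evaluator's ratings.

    `weights` holds the expert-assigned importance of each competency;
    `ratings` holds the evaluator's rating of the candidate on each one.
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(ratings[name] * w for name, w in weights.items()) / total_weight


# Hypothetical competencies and expert weights for a managerial opening.
weights = {"planning": 3.0, "coaching": 2.0, "budgeting": 1.0}

# Hypothetical ratings (1 = poor, 5 = excellent) from two reference forms.
internal_candidate = {"planning": 4, "coaching": 5, "budgeting": 3}
external_candidate = {"planning": 5, "coaching": 3, "budgeting": 2}

internal_score = weighted_reference_score(internal_candidate, weights)
external_score = weighted_reference_score(external_candidate, weights)
```

Because every evaluator completes the same form and the same weights are applied, the two scores sit on a common scale, which is the advantage the rating-form approach is meant to provide.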
Employers should do their utmost to obtain accurate
reference information despite the difficulties. If for no
other reason, a good-faith effort to obtain verification of
employment history can make it possible for a company
to avoid (or win) negligent hiring lawsuits.
PERSONNEL TESTING
Surveys indicate that between 15 and 20 percent of
organizations use some form of ability or knowledge testing to make selection decisions.8 Many companies now
use aptitude or cognitive ability tests to screen applicants,
bolstered by considerable research indicating the tests are
valid for virtually all jobs in the U.S. economy. The
dilemma facing organizations is this: While mental or
cognitive ability tests have been shown to be valid predictors of job performance, they can create legal problems
because minorities tend to score lower.
Corporate America also is increasing its use of various forms of personality or motivational testing—in part
due to the body of evidence supporting the use of certain
methods, concern over employee theft, the outlawing of
the polygraph test, and potential corporate liability for the
behavior of its employees. Lawsuits for negligent hiring
and negligent retention, for example, attempt to hold an
organization responsible for the behavior of employees
when there is little or no attempt to assess critical characteristics of those who are hired. Domino’s Pizza settled a
lawsuit in which one of its delivery personnel was involved in a fatal accident. The driver had a long and
disturbing psychiatric history and terrible driving record
before he was hired.
Cognitive ability tests are the most frequently used
paper-and-pencil tests today. These tests attempt to
measure mental, clerical, mechanical, or sensory capabilities
in job applicants. You are probably familiar with
these cognitive ability tests: the Scholastic Aptitude Test
(SAT), the American College Test (ACT), and the Graduate
Management Admission Test (GMAT). Cognitive ability tests,
most of which are administered in a paper-and-pencil or
computerized format under standardized conditions of
test administration, are controversial. On the average,
African Americans and Hispanics score lower than
whites on virtually all of these tests; thus, use of these
tests can affect employment and other opportunities for
minorities (see Figure 6-2).
We will address the critical issue of test score differences as a function of ethnicity later in the chapter. Let us
begin our discussion with a definition of cognitive ability
testing and provide brief descriptions of some of the most
popular tests. Then we will review the validity evidence
for these tests. We will conclude with a focus on the legal
aspects of cognitive ability testing in the context of the
latest research, ethnic score differences, and case law.
What Is a Cognitive Ability Test?
Cognitive ability tests measure one’s aptitude or mental
capacity to acquire knowledge based on the accumulation
of learning from all possible sources. Such tests are often
distinguished from achievement tests, which attempt to
measure the effects of knowledge obtained in a standardized environment (e.g., your final exam in this course could
be considered a form of achievement test). Cognitive
ability or aptitude tests are typically used to predict future
performance. Examples are the SAT and ACT, which were
developed to measure ability to master college-level material. Having made this distinction between achievement
tests and cognitive ability tests, however, we hasten to
point out that in practice there isn’t a clear distinction
between these two classes of tests. Achievement tests can
be used to predict future behavior and all tests measure
some degree of accumulated knowledge. Knowledge-based tests assess a sample of what is required on the job.
If you are hiring a computer programmer, a cognitive
ability test score might predict who will learn to be a
computer programmer; yet, you would benefit more from
an assessment of actual programming knowledge.
Knowledge-based tests are easier to defend in terms of
job relatedness and are quite valid (.48). They can be
expensive to develop.
There are hundreds of mental or cognitive ability
tests available. Some of the most frequently used and
highly regarded tests are the Wechsler Adult Intelligence
Scale, the Wonderlic Personnel Test, and the Armed Services Vocational Aptitude Battery. In addition, many of
the largest U.S. companies have developed their own battery of cognitive ability tests. AT&T evaluates applicants
for any of its nonsupervisory positions on the basis of
scores on one or more of its 16 mental ability subtests—
the weights given to a particular test depend on the particular job and the validation results. Knight-Ridder, the
communications giant, has a battery of 10 aptitude tests,
some of which are even used to select newspaper carriers.
Of the hundreds of cognitive ability tests available for commercial use, the Wechsler Adult Intelligence Scale is one of the most valid and heavily
researched. A valid and more practical test is the Wonderlic
Personnel Test. The publisher of this test, first
copyrighted in 1938, has data from more than 3 million
applicants. The Wonderlic consists of 50 questions, covering a variety of areas including mathematics, vocabulary, spatial relations, perceptual speed, analogies, and
miscellaneous topics. Here is an example of a typical
mathematics question: “A watch lost 1 minute 18 seconds
in 39 days. How many seconds did it lose per day?” A
typical vocabulary question might be phrased as follows:
“Usual is the opposite of: a. rare b. habitual c. regular
d. stanch e. always.” An item that assesses ability in spatial relations would require the test taker to choose among
five figures to form depicted shapes. Applicants have
12 minutes to complete the 50 items. The Wonderlic will
cost an employer from $1.50 to $3.50 per applicant
depending on whether the employer scores the test. The
Wonderlic is used by the National Football League to
provide data for potential draft picks (the average score of
draftees is one point below the national population).9
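The sample mathematics item quoted above reduces to a unit conversion followed by a division. A worked solution, offered only to illustrate the kind of reasoning the item calls for:

```latex
\[
1\ \text{min}\ 18\ \text{s} = 78\ \text{s}, \qquad
\frac{78\ \text{s}}{39\ \text{days}} = 2\ \text{s per day}.
\]
\[
\frac{12\ \text{min} \times 60\ \text{s/min}}{50\ \text{items}} = 14.4\ \text{s per item}.
\]
```

The second line shows the time pressure built into the test: with 12 minutes for 50 items, a test taker has on average 14.4 seconds per item.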
You may remember the Wonderlic from our discussion of the Supreme Court ruling in Griggs v. Duke Power
(discussed in Chapter 3) and Albemarle v. Moody. In
Griggs, scores on the Wonderlic had an adverse impact
against African Americans (a greater proportion of
African Americans failed the test than did whites); and
Duke Power did not show that the test was job related.
Despite early courtroom setbacks and a decrease in use
following the Griggs decision, according to the test’s
publisher, the use of the Wonderlic has increased in
recent years.
What Are Tests of Specific Ability?
A variety of tests also have been developed to measure
specific abilities. These include tests of specific cognitive abilities
such as verbal comprehension, numerical reasoning, and
verbal fluency, as well as tests of mechanical or
clerical ability and of physical or psychomotor ability, including
coordination and sensory skills. The most widely used
mechanical ability test is the Bennett Mechanical Comprehension Test (BMCT). First developed in the 1940s,
the BMCT consists mainly of pictures depicting mechanical situations with questions pertaining to the situations.
The respondent describes relationships between physical
forces and mechanical issues. The BMCT is particularly
effective in the prediction of success in mechanically
oriented jobs.
While there are several tests available for the assessment of clerical ability, the most popular is the
Minnesota Clerical Test (MCT). The MCT requires test
takers to quickly compare either names or numbers and to
indicate pairs that are the same. The name comparison
part of the test has been shown to be related to reading
speed and spelling accuracy, while the number comparison is related to arithmetic ability.
Physical, psychomotor, and sensory/perceptual are
classifications of ability tests used when the job requires
particular abilities. Physical ability tests are designed to
assess a candidate’s muscular strength, movement quality, and cardiovascular endurance. Scores on physical
ability tests have been linked to accidents and injuries.
One study found that railroad workers who failed a physical ability test were much more likely to suffer an injury
at work. Psychomotor tests assess processes such as eye-hand coordination, arm-hand steadiness, and manual dexterity. Sensory/perceptual tests are designed to assess the
extent to which an applicant can detect and recognize differences in environmental stimuli. These tests are ideal
for jobs that require workers to edit or enter data at a high
rate of speed. For example, Bank of America uses a battery of these tests to screen applicants for checking account data entry.
As we discussed in Chapter 3, the validity of physical ability tests has been under close scrutiny lately, particularly with regard to their use for public safety jobs.
Many lawsuits have been filed on behalf of female applicants applying for police and firefighter jobs who had
failed some type of physical ability test, such as push-ups,
sit-ups, or chin-ups. In fact, the probability is great for adverse impact against women when a physical ability test
is used to make selection decisions.10 Sensory ability
testing concentrates on the measurement of hearing and
sight acuity, reaction time, and psychomotor skills, such
as eye and hand coordination. Such tests have been
shown to be related to quantity and quality of work output
and accident rates.
Are There Racial Differences
in Test Performance?
Many organizations discontinued the use of cognitive
ability tests because of the Supreme Court ruling in
Griggs. Despite fairly strong evidence that the tests are
valid and increased use by U.S. businesses, the details
of the Griggs case illustrate the continuing problem
with the use of such tests. The Duke Power Company
required new employees either to have a high school
diploma or to pass the Wonderlic Personnel Test and
the Bennett Mechanical Comprehension Test. Fifty-eight
percent of whites who took the tests passed, while only
6 percent of African Americans passed. According to the
Supreme Court, the Duke Power Company was unable to
provide sufficient evidence to support the job relatedness
of the tests or the business necessity for their use.
Accordingly, the High Court ruled that the company had
discriminated against African Americans under Title VII
of the 1964 Civil Rights Act. As we discussed in Chapter 3,
the rationale for the Supreme Court’s decision gave rise to
the theory of disparate impact.
The statistical data presented in the Griggs case are
not unusual. African Americans, on average, score significantly lower than whites on cognitive ability tests; Hispanics, on average, fall about midway between average
African American and white scores.11 See Figure 6-2 for
a summary of group differences as a function of selection
tool options. Thus, under the disparate impact theory of
discrimination, plaintiffs are likely to establish adverse
impact based on the proportion of African Americans
versus whites who pass such tests. If the Griggs case wasn’t
enough, the 1975 Supreme Court ruling in Albemarle
Paper Company v. Moody probably convinced many organizations that the use of cognitive ability tests was too
risky. In Albemarle, the Court applied specific and difficult guidelines to which the defendant had to conform in
order to establish the job relatedness of the particular test.
The Uniform Guidelines on Employee Selection Procedures, as issued by the Equal Employment Opportunity
Commission, also established rigorous and potentially
costly methods to be followed by an organization to support the job relatedness of the test if adverse impact
should result. Current interest in cognitive ability tests was
spurred by the research on validity generalization, which
strongly supported the validity of these tests for virtually
all jobs and projected substantial increases in utility for
organizations that use the tests. The average validity of such
tests was reported to be .51.12 (See Figure 6-2.)
Some major questions remain regarding the validity
generalization results for cognitive ability tests: Are
these tests the most valid method of personnel selection
across all job situations or are other methods, such as
biographical data and personality tests, more valid for
some jobs that were not the focus of previous research?
Are there procedures that can make more accurate predictions than cognitive ability tests for some job situations?
Are cognitive ability tests the best predictors of sales success, for example? (Remember the Unabomber? He had a
Ph.D. in math from the University of Michigan. How
would he do in sales?) Another issue is the extent to which
validity can be inferred for jobs involving bilingual skills.
Would the Wonderlic administered in English have strong
validity for a job, such as a customs agent, requiring the
worker to speak in two or more languages? Bilingual job
specifications are increasing in the United States. Invoking
the “validity generalization” argument for this type of job
based on research involving only the use of English is
somewhat dubious. The validity of such tests to predict
performance for these jobs is probably not as strong as .5.
Another issue concerns the extent to which other
measures can enhance predictions beyond what cognitive
ability tests can predict. Generally, human performance is
thought to be a function of a person’s ability, motivation,
and personality. The highest estimate of the validity
of cognitive ability tests is about .50. This means that
25 percent of the variability in the criterion measure (e.g.,
performance) can be accounted for by the predictor, or the
test. That leaves 75 percent unaccounted for. Industrial
psychologists think the answer lies in measures of one’s
motivation to perform, personality, or the compatibility of
a person’s job preferences with actual job characteristics.
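The 25 percent figure follows from squaring the validity coefficient. Treating the validity as a correlation r between test scores and the criterion, the proportion of criterion variance accounted for is r squared:

```latex
\[
r = .50 \;\Rightarrow\; r^{2} = (.50)^{2} = .25, \qquad 1 - r^{2} = .75 .
\]
```

By the same arithmetic, the .26 validity reported earlier for letters of reference corresponds to roughly 7 percent of criterion variance.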
Would a combination of methods (perhaps a cognitive ability test and a personality or motivational test)
result in significantly better prediction than the cognitive
ability test alone? Research indicates that a combination
of cognitive and motivational tests may lead to a more
comprehensive assessment of an individual.13 These tools
add what is known as “incremental validity” in the prediction of job performance. In general, cognitive ability and
job knowledge tests are valid but additional (and valid)
tools can add validity to the prediction. Accordingly, the
use of other tests that address the motivational components
of human performance, in addition to a cognitive ability or
knowledge-based test, can help an organization make better
decisions. We will discuss these measures shortly.
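Incremental validity can be made concrete with the standard formula for the multiple correlation of a criterion with two predictors. The sketch below is illustrative only: the .50 validity for the cognitive test comes from the text, while the .30 validity for a motivational test and the .10 correlation between the two tests are hypothetical values picked for the example.

```python
import math


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: validity of each predictor against the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


cognitive_validity = 0.50      # from the text
motivational_validity = 0.30   # hypothetical
predictor_overlap = 0.10       # hypothetical correlation between the tests

combined = multiple_r(cognitive_validity, motivational_validity, predictor_overlap)
incremental = combined - cognitive_validity
```

Under these assumed values the combined validity is about .56, a gain of about .06 over the cognitive test alone. The gain shrinks as the two predictors become more highly correlated, which is why a second tool adds the most when it measures something the first does not.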
Why Do Minorities Score Lower than
Whites on Cognitive Ability Tests?
This question has interested researchers for years; yet
there appears to be no clear answer. Most HRM experts
now generally take the view that these differences are not
created by the tests, but are most related to inferior educational experiences. But the problem is not a defect or
deficiency in the tests per se. The critical issue for HRM
experts is not how to modify the test itself, but how to use
the test in the most effective way. A panel of the National
Academy of Sciences concluded that cognitive ability
tests have limited but real ability to predict how well job
applicants will perform, and these tests predict minority
group performance as well as they predict the future performance of nonminorities. In other words, the tests
themselves are not to blame for differences in scores.
Obviously, the dilemma for organizations is the potential
conflict in promoting diversity while at the same time
using valid selection methods that have the potential for
causing adverse impact.14
How Do Organizations Deal with Race
Differences on Cognitive Ability Tests?
The use of top-down selection decisions based strictly on
scores on cognitive ability tests is likely to result in
adverse impact against minorities. One solution to this
problem is to set a cutoff score on the test so as not to
violate the 80 percent rule, which defines adverse impact.
Differences among scores above the cutoff are then ignored and selection decisions are made on some other basis. The major
disadvantage of this approach is that there will be a significant decline in the utility of a valid test because people could be hired who are at the lower end of the scoring
continuum, making them less qualified than people at the
upper end of the continuum who may not be selected. Virtually all of the research on cognitive ability test validity
indicates that the relationship between test scores and job
performance is linear; that is, higher test scores go with
higher performance and lower scores go with lower performance. Thus, setting a low cutoff score and ignoring
score differences above this point can result in the hiring
of people who are less qualified. So, while use of a low
cutoff score may enable an organization to comply with
the 80 percent adverse impact rule, the test will lose
considerable utility.
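The 80 percent rule can be checked with one division: the pass rate of the lower-scoring group divided by the pass rate of the higher-scoring group. The sketch below applies it to the pass rates reported for the Griggs case (6 percent and 58 percent); the function names are hypothetical, and the 50 percent rate in the second check is invented for contrast.

```python
def impact_ratio(focal_rate: float, reference_rate: float) -> float:
    """Ratio of the focal group's pass rate to the reference group's pass rate."""
    if not 0.0 <= focal_rate <= 1.0:
        raise ValueError("focal_rate must be between 0 and 1")
    if not 0.0 < reference_rate <= 1.0:
        raise ValueError("reference_rate must be greater than 0 and at most 1")
    return focal_rate / reference_rate


def shows_adverse_impact(focal_rate: float, reference_rate: float,
                         threshold: float = 0.80) -> bool:
    """True when the impact ratio falls below the 80 percent threshold."""
    return impact_ratio(focal_rate, reference_rate) < threshold


# Pass rates reported for the Griggs case.
griggs_ratio = impact_ratio(0.06, 0.58)
griggs_flagged = shows_adverse_impact(0.06, 0.58)

# Hypothetical contrast: a 50 percent pass rate against the same 58 percent.
contrast_flagged = shows_adverse_impact(0.50, 0.58)
```

With the Griggs pass rates the ratio is roughly 0.10, far below 0.80. A lower cutoff raises both groups' pass rates and tends to pull the ratio toward 1, which is the mechanism the low-cutoff strategy relies on.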
Another approach to dealing with potential adverse
impact is to use a banding procedure that groups test
scores based on data indicating that the bands of scores
are not significantly different from one another. The decision maker then may select anyone from within this band
of scores. Research shows that banding procedures have
less effect on adverse impact than the characteristics of
the applicant pool. Banding only has a big effect on
adverse impact when minority preference within a band is
used for selection. This approach is controversial and
legally questionable.15
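The text does not say how bands are computed. One approach often described in the banding literature sets the band width from the standard error of the difference between two scores. The sketch below assumes that approach; the test standard deviation, the reliability, the 1.96 multiplier, and the applicant scores are all hypothetical.

```python
import math
from typing import List


def band_width(sd: float, reliability: float, z: float = 1.96) -> float:
    """Width of a score band based on the standard error of the difference."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    sem = sd * math.sqrt(1.0 - reliability)   # standard error of measurement
    sed = sem * math.sqrt(2.0)                # standard error of the difference
    return z * sed


def top_band(scores: List[float], sd: float, reliability: float,
             z: float = 1.96) -> List[float]:
    """Scores treated as not significantly different from the top score."""
    if not scores:
        return []
    floor = max(scores) - band_width(sd, reliability, z)
    return sorted((s for s in scores if s >= floor), reverse=True)


# Hypothetical applicant scores on a test with SD = 10 and reliability = .90.
applicant_scores = [88, 95, 80, 92, 86]
width = band_width(sd=10.0, reliability=0.90)
band = top_band(applicant_scores, sd=10.0, reliability=0.90)
```

Under these assumptions the band is about 8.8 points wide, so scores of 95, 92, and 88 are treated as equivalent and the decision maker may choose among those applicants on some other basis.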
The use of cognitive ability tests obviously presents a
dilemma for organizations. Evidence indicates that such
tests are valid predictors of job performance across a wide
array of jobs (see Figure 6-2). Employers who use such
tests enjoy economic utility with greater productivity and
considerable cost savings. However, selection decisions
that are based solely on the scores of such tests will result
in adverse impact against African Americans and Hispanics. Such adverse impact could entangle the organization in
costly litigation and result in considerable public relations
problems. If the organization chooses to avoid adverse impact, the question becomes one of either throwing out a test
that has been shown to be useful in predicting job performance or keeping the test and reducing or eliminating the
level of adverse impact. Does such a policy leave a company open to reverse discrimination lawsuits by whites
who were not selected for employment even though their raw scores
on the test were higher than scores obtained by some
minorities who were hired? Many organizations, particularly in the public sector, have abandoned the use of cognitive ability tests in favor of other methods, such as interviews or performance tests, which result in less adverse
impact and are more defensible in court.
However, many other cities and municipalities have
opted to keep such tests and then employed some form of
banding in the selection of their police and firefighters
primarily in order to make personnel decisions that do not
result in statistical adverse impact.
Researchers and practitioners are very interested in
how to select the most effective candidates while meeting
diversity goals and minimizing (or eliminating) adverse
impact. Figure 6-4 presents a summary of common practices used to reduce adverse impact, the degree of support
in research, and the research findings.
What Is Personality/Motivational/
Dispositional Testing?
While research supports the use of cognitive ability tests for
personnel selection, virtually all HRM professionals regard
performance as a function of both ability and motivation.
Scores on ability tests say little or nothing about a person’s
motivation to do the job. We can all think of examples of
very intelligent individuals who were unsuccessful in many
situations (we’re back to the Unabomber!). Most of us can
remember a classmate who was very bright but received
poor grades due to low motivation. The general validity of
cognitive ability tests for predicting sales success is rather
low and much could be done to improve prediction.
Most personnel selection programs attempt an informal or formal assessment of an applicant’s motivation, attitudes, or disposition through psychological testing or a
job interview. Some of these assessments are based on
scores from standardized tests, performance testing such
as job simulations, or assessment centers. Others are more
informal, derived from an interviewer’s gut reaction or
intuition. This section will review the abundant literature
on the measurement and prediction of motivation, disposition, and personality using various forms of testing.
There is an increased use of various types and formats for personality or motivational testing, including
paper-and-pencil types, video and telephone testing, and,
most recently, online testing. Some organizations place
great weight on personality testing for employment decisions. BA&C, the company working with Wackenhut
Security, does psychological screening for hundreds of
companies using specialized reports based on the five-factor model (FFM) of personality. One of the most popular
personality assessment tools is the “Caliper Profile,” developed by the Caliper Corporation (www.calipercorp.com).
Their Web site claims 25,000 clients. Avis uses the Caliper
Profile to hire salespeople.
FIGURE 6-4 Practices Used to Reduce Adverse Impact
Each entry gives a common practice to reduce adverse impact, followed by the research findings. (The symbols in the Degree of Support for Practice in Literature column are not legible in this copy.)
Practice: Target recruitment strategies toward qualified minorities.
Research findings: Characteristics of the applicant pool (e.g., proportion of minorities, average score levels of minorities) have the greatest effect on rates of adverse impact; changing these characteristics through targeted recruitment should help reduce adverse impact. However, simply increasing numbers of minorities in the pool will not help unless one is increasing numbers of qualified recruits.
Practice: Use a selection system that focuses on predicting performance in areas such as helping coworkers, dedication, and reliability, in addition to task performance.
Research findings: If the overall performance measure weights contextual performance (e.g., helping, reliability) more than task performance and the tests in a battery are uncorrelated, a test battery designed to predict this definition of overall performance will have smaller levels of adverse impact. Weighting task performance less than contextual performance in the overall performance measure will make cognitive ability less important in hiring and will lead to less adverse impact.
Practice: Use a tool with high adverse impact and good validity in combination with a tool with low adverse impact to reduce the overall adverse impact of the system.
Research findings: The degree to which adverse impact is reduced by combining tools with lower adverse impact is greatly overestimated; reductions may be small or the combination may actually increase adverse impact.
Practice: Provide orientation and preparation programs to candidates.
Research findings: Coaching and orientation programs have little effect on size of group differences but are well received by examinees.
Practice: Remove cognitive ability testing from the selection process.
Research findings: Using only noncognitive predictors (e.g., interview, conscientiousness, biodata) will lead to significantly reduced adverse impact, but significant black/white differences will remain. Also, cognitive ability tests are among the most valid predictors of job performance, and their removal may result in a selection system that is less effective.
Practice: Use banding of test scores.
Research findings: The use of banding has less effect on adverse impact than the characteristics of the applicant pool. Substantial reduction of adverse impact through banding only occurs when minority preference within a band is used for selection (i.e., preferential selection is employed).
Practice: Use tools with less adverse impact as screening devices early in the selection process and those with greater adverse impact as later hurdles in the process.
Research findings: Using tools with less adverse impact as screening devices early in the process and those with greater adverse impact later in the process will aid minority hiring if the selection ratio is low, but will not have much effect if the selection ratio is high (i.e., few applicants per position).
Practice: Change the more negative test taking perceptions of minority test takers about test validity, thereby increasing motivation and performance.
Research findings: May provide a very small reduction in adverse impact.
Practice: Identify and remove culturally biased test items.
Research findings: Research suggests that clear patterns regarding what items favor one group or another do not exist and that removal of such items has little effect on test scores; however, item content should not be unfamiliar to those of a particular culture and should not be more verbally complex than warranted by job requirements.
Practice: Use other modes of presenting test stimuli than multiple-choice, paper-and-pencil testing (e.g., video).
Research findings: Changes in format often result in changes in what is actually measured and can be problematic; in cases where a format change was simply that (e.g., changed format without affecting what was measured), there was no strong reduction in group differences.
Practice: Use portfolios, accomplishment records, and performance assessments (work samples) instead of paper-and-pencil measures.
Research findings: Evidence suggests group differences may not be reduced by realistic assessments, and reliable scoring of these methods may be problematic. Well-developed work samples may have good validity and less adverse impact than cognitive ability tests.
Practice: Relax time limits on timed tools.
Research findings: Research indicates that longer time limits do not reduce subgroup differences, and may actually increase them.
Source: Adapted from Ryan, A. M. & Tippins, N. T. (2004). “Attracting and selecting: What psychological research tells us,” HUMAN RESOURCE MANAGEMENT, 43, pp. 312–313. Reprinted with permission of John Wiley & Sons.
Sears, Roebuck and Company, Standard Oil of New
Jersey, and AT&T have used personality tests for years to
select, place, and even promote employees. More companies today use some form of personality test to screen
applicants for risk factors related to possible counterproductive behavior.
We will begin this section with a definition of personality and provide brief descriptions of some of the
more popular personality tests. We will review the validity of these tests and provide an overview of relevant legal
and ethical issues. We will conclude with a description
of four relatively new personality tests that have shown
potential as selection and placement devices.
What Is Personality?
While personality has been defined in many ways, the
most widely accepted definition is that personality refers
to an individual’s consistent pattern of behavior. This
consistent pattern is composed of psychological traits.
Many researchers subscribe to a five-factor model (FFM)
for describing personality.16 These so-called “Big Five”
personality factors are as follows: (1) introversion/extraversion (outgoing, sociable); (2) emotional stability;
(3) agreeableness/likability (friendliness, cooperative);
(4) conscientiousness (dependability, carefulness); and
(5) openness to experience (imaginative, curious, experimenting). There are several tests that measure the
FFM. (Try http://users.wmin.ac./UK/~buchant/ for a free
online “Big Five” test.)
Two relatively new characterizations of personality are
Emotional Intelligence (EI) and Core Self-Evaluations
(CSE). EI is considered to be a multidimensional form or
subset of social intelligence or a form of social literacy.
There are many definitions and many different instruments that purport to measure EI. One definition is that EI
is a set of abilities that enable individuals to recognize
and understand their own emotions and those of others in
order to guide their thinking and behavior to help them
cope with the environment. A recent review of EI revealed an average validity of .23 in the prediction of performance. EI was also found to have low correlations
with cognitive ability (.22), agreeableness (.23), and extraversion (.34) of the Big Five. Obviously, EI is another
construct measure to consider in the development of (or
selection of) a testing battery for job candidates.17
CSE is a broad and general personality trait composed of four heavily researched traits: (1) self-esteem
(the overall value that one places on oneself as an individual); (2) self-efficacy (an evaluation of how well one can
perform across situations); (3) neuroticism (the tendency
to focus on the negative); and (4) locus of control (the
extent to which one believes s/he has control over life’s
events). The core self-evaluation is a basic assessment of
one’s capability and potential.
FIGURE 6-5 Some Examples of Personality/Dispositional/Motivational Tests
PROJECTIVE TECHNIQUES AND INSTRUMENTS
Thematic Apperception Test (TAT)
Miner Sentence Completion Scale (MSCS)
Graphology (Handwriting analysis)
Rorschach Inkblot Test
SELF-REPORT INVENTORIES: EXAMPLES
The NEO Personality Inventory (FFM)
Personal Characteristics Inventory
Gordon Personal Preference Inventory
Myers-Briggs Type Indicator
Minnesota Multiphasic Personality Inventory (MMPI)
California Personality Inventory (CPI)
Sixteen Personality Factors Questionnaire (16 PF)
Hogan Personality Inventory
Job Compatibility Questionnaire (JCQ)
Emotional Intelligence (e.g., EI Scale)
Core Self-Evaluation Scale (CSES)
Caliper Profile
Some research has investigated the extent to
which these new measures add predictive value (or incremental validity) beyond the Big Five or other selection
tools. In general, this research indicates useful incremental
validity for these measures beyond the Big Five and other
selection models or tools. For example, research with a new
instrument that purports to measure CSE shows that scores on
the scale are correlated with job performance and that CSE
has incremental validity over the five-factor model.18
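To make the idea of incremental validity concrete, the short Python sketch below computes how much the squared multiple correlation (R squared) rises when a second predictor is added to a first one. It uses the standard two-predictor formula. The function name is our own, and the three correlations passed in at the bottom are hypothetical values chosen only for illustration; they are not figures from the research cited above.

```python
def incremental_validity(r_y1, r_y2, r_12):
    """Gain in R^2 from adding predictor 2 to a model that already has predictor 1.

    r_y1: correlation between predictor 1 and job performance
    r_y2: correlation between predictor 2 and job performance
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    # Squared multiple correlation for two predictors.
    r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    # Incremental validity is the full-model R^2 minus the R^2 of predictor 1 alone.
    return r2_full - r_y1**2


# Hypothetical inputs (assumptions for illustration only):
# predictor 1 correlates .25 with performance, predictor 2 correlates .23,
# and the two predictors correlate .30 with each other.
gain = incremental_validity(0.25, 0.23, 0.30)
print(f"Additional variance explained: {gain:.4f}")
```

Note that the gain shrinks as the two predictors overlap more: a new measure that largely duplicates what the first one captures adds little, however valid it is on its own.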
There are literally thousands of personality tests available that purport to measure hundreds of different traits or
characteristics. (Go to www.unl.edu/buros/ for a sample.)
We will review the basic categories of personality testing
next. Figure 6-5 presents a list of some of the most popular
tests and methods.
How Do We Measure Personality?
Personality tests can be sorted into two broad categories:
projective tests and self-report inventories. Of course, we
can also use the interview and data from other sources,
such as performance appraisals, as a means of assessing
personality characteristics or competencies. Projective tests have many common characteristics, the most
significant of which is that the purpose and scoring procedure of the test are disguised from the test taker. One of
the most famous projective tests is the Rorschach Inkblot
Test, which presents a series of inkblots to respondents
who must then record what they see in each one.
While numerous projective tests exist, the Miner
Sentence Completion Scale (MSCS) is one of the few
such tests specifically designed for use in the employment
setting. Its aim is to measure managers’ motivation to manage others.19 And the test appears to work. The test
consists of 40 incomplete sentences, such as “My family
doctor . . . ,” “Playing golf . . . ,” and “Dictating letters . . . .”
The test taker is instructed to complete each sentence.
According to the developer of this test, the way in
which an applicant completes the sentences reflects his or
her motivation along seven areas. These areas are capacity
to deal with authority figures, dealing with competitive
games, handling competitive situations, assertiveness,
motivation to direct others, motivation to stand out in a
group, and desire to perform day-to-day administrative
tasks. On the downside, the MSCS is expensive and there
isn’t a great deal of validity evidence to support its use.
Another projective test that has been used occasionally for employment purposes is the Thematic Apperception Test, or TAT, a test that typically consists of a
series of pictures that depict one or more persons in
different situations. Test takers are asked to describe who
the people are and what is happening in the situation,
which is somewhat ambiguous and open to interpretation.
The test taker then determines the outcome of the situation. Although a variety of scoring systems have been developed for interpreting a test taker’s responses, one of the
most popular approaches involves rating the responses
with regard to the test taker’s need for power (i.e., the
need to control and influence others), achievement (i.e., the
need to be successful), and affiliation (i.e., the need for
emotional relationships). Like the MSCS, the TAT has
been used primarily for managerial selection and the
limited research indicates some validity as a predictor of
managerial and entrepreneurial success.
One form of projective test (which we alluded to
previously) that has received considerable attention recently
is graphology, or handwriting analysis. With this approach,
a sample of your handwriting is mailed to a graphologist
who (for anywhere from $10 to $50) provides an assessment of your intelligence, creativity, emotional stability, negotiation skills, problem-solving skills, and numerous other
personal attributes. According to some writers, graphology
is used extensively in Europe as a hiring tool. The Wall
Street Journal and Inc. magazine have reported an increase
in the use of the method in the United States since 1989. As
described in The Wall Street Journal, “With the government
pulling the plug on the polygraph, and employers clamming
up on job references and liabilities from negligent hiring, it
is one alternative managers are exploring in an effort to
know whom they are hiring.” While the use of the method
may be increasing, there is no compelling evidence that the
method does anything but provide an assessment of penmanship. The only published studies on the validity of
graphology have found no validity for the approach.20
Self-Report Personality Inventories
Self-report inventories, which purport to measure personality or motivation with the respondent knowing the
purpose and/or the scoring procedure of the test, are more
popular today than projective techniques. Some instruments
screen applicants for aberrant or deviant behavior (e.g., the
MMPI), others attempt to identify potentially high performers, and others, particularly more recently developed tests,
are directed at specific criteria such as employee theft,
job tenure/turnover, accident proneness, or customer
orientation.21
Self-report inventories typically consist of a series of
short statements concerning one’s behavior, thoughts,
emotions, attitudes, past experiences, preferences, or
characteristics. The test taker responds to each statement
using a standardized rating scale. During the testing,
respondents may be asked to indicate the extent to which
they are “happy” or “sad,” “like to work in groups,” “prefer working alone,” and so forth.
One of the most popular and respected personality
tests is the Minnesota Multiphasic Personality Inventory (MMPI). The MMPI is used extensively for jobs that
concern the public safety or welfare, including positions in
law enforcement, security, and nuclear power plants. The
MMPI is designed to identify pathological problems in respondents, not to predict job effectiveness. The revised version of the MMPI consists of more than 566 statements:
“I am fearful of going crazy.” “I am shy.” “Sometimes evil
spirits control my actions.” “In walking, I am very careful
to step over sidewalk cracks.” “Much of the time, my head
seems to hurt all over.” Respondents indicate whether each
statement is true or false, or that they cannot say. The MMPI reveals
scores on 10 clinical scales, including depression, hysteria,
paranoia, and schizophrenia, as well as four “validity”
scales, which enable the interpreter to assess the credibility
or truthfulness of the answers. Millions of people, from at
least 46 different countries, from psychotics to Russian
cosmonauts, have struggled through the strange questions.
Litigation related to negligent hiring often focuses on
whether an organization properly screened job applicants.
Failure to use the MMPI in filling sensitive jobs has been
cited in legal arguments as an indication of negligent
hiring—although not always persuasively. Unfortunately,
some companies are damned if they do and damned if
they don’t. Target stores negotiated an out-of-court settlement based on a claim of invasion of privacy made by a
California job candidate who objected to a few questions
on the MMPI being used to hire armed guards. Had one of
the armed guards who was hired used his or her weapon inappropriately (and Target had not used the MMPI), Target
could have been slapped with a negligent hiring lawsuit.
Another popular instrument is the 16 Personality Factors (16PF), which provides scores on the factors of the
FFM, plus others. In addition to predicting performance, the
test is used to screen applicants for counterproductive behavior, such as potential substance abuse or employee theft.
AMC Theaters, C&S Corporation of Georgia, and the U.S.
State Department are among the many organizations that
use the 16PF to screen most employees.
Although there are many instruments available, the
NEO Personality Inventory is one of the most reliable
and valid measures of the FFM.22 Another instrument, the Myers-Briggs
Type Indicator (MBTI), is very popular for employee development but is not considered a good selection instrument.23
What Is the Validity of Personality Tests?
Potentially useful personality tests exist among a great
number of bad ones, making it difficult to derive general
comments regarding their validity. Some instruments
have shown adequate validity while others show no validity at all. One instrument with a good track record for
selecting managers is the MSCS. A review of 26 studies
involving the MSCS found an average validity coefficient
of .35.24 In general, the validity is lower for personality
tests than for cognitive ability tests.
The latest review of the FFM found that Conscientiousness and Emotional Stability had useful predictive
validity for all jobs but that Conscientiousness had the highest validity (see Figure 6-2). Extraversion, Agreeableness,
and Openness to Experience had useful predictive validity
but only in certain situations.25 For example, extraverted
workers are more effective in jobs with a strong social
component, such as sales and management. More Agreeable workers are more effective team members. People
with high scores on Openness to Experience are more receptive to new training. A particular combination of FFM
factors also can predict important criteria. Research involving the FFM and managerial performance shows that Conscientiousness (.25), Extraversion (.21), and Emotional Stability (.24) are useful predictors of managerial success.26
Why do personality tests have such low validity relative to cognitive ability? Experts have given a number of
explanations for the low (but useful) validity of such tests
in the employment context. First, applicants can “fake”
personality tests so that their personality as reflected on the
tests is compatible with the requirements of the job.27
Second, some proponents of personality testing have asserted that most of the validity studies involving personality tests are poorly designed, with very small sample
sizes. These experts contend that more carefully designed
research would demonstrate higher validity for personality tests. Research shows the weight given to personality
factors should derive from a job analysis or criterion-related validation research.
Another possible explanation is that behavior is to
a great extent determined situationally, making stable
personality traits poor predictors of criteria such as job performance or employee turnover. Recall some of the
examples of items from personality tests listed earlier in this
chapter—note that most of the examples are not specific to
the workplace; in fact, most of them are quite general.
Research in other areas has found that behavior is dependent
on the situation. A person who is friendly outside of work
might be less sociable in the work setting. In order to enhance predictability, personality assessment should involve
more than one method (e.g., tests, interviews). Personality
assessment could be more specific to the workplace and
target particular criterion measures of interest, such as employee theft, honesty or integrity, or job retention/turnover.
Let us examine these newer approaches next.
Approaches to the Prediction of Particular Criteria
Some forms of personality, dispositional, or motivation assessment attempt to focus on either particular problems or
criteria characteristic of the workplace. Examples are the
prediction of voluntary turnover and the prediction of
employee theft. Another instrument attempts to measure
job compatibility in order to predict turnover. Other new
instruments are designed for particular employment issues,
such as customer service, violence, or accident proneness.
Predicting (and Reducing) Voluntary Turnover
Employee turnover can be a serious and costly problem
for organizations. You may recall the discussion of
Domino’s Pizza. They found that the cost of turnover was
$2,500 each time an hourly employee quit and $20,000
each time a store manager quit. Among other things,
Domino’s implemented a new and more valid test for selecting managers and hourly personnel that was aimed at
predicting both job performance and voluntary turnover.
As of 2005, the program was a great success on all
counts. Turnover was down, store profits were up, and the
stock was doing well in an otherwise difficult market. Attracting and keeping good employees was a key factor in
their turnaround. There are numerous other examples of
companies that have expensive and preventable high levels of turnover that can be reduced with better HR policy
and practice. Recall the discussion of SAS, the North
Carolina software company. Even at the height of the so-called “high-tech” bubble in the late 1990s, SAS had
turnover rates that were well below the industry average.
Attracting and keeping good employees is considered a
key to the SAS success story.
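The arithmetic behind such savings is simple enough to sketch in a few lines of Python. The per-quit costs below are the Domino’s figures quoted above ($2,500 per hourly employee and $20,000 per store manager); the numbers of quits before and after a new selection test are hypothetical, invented only to show the calculation, and the function name is our own.

```python
# Cost of a single quit, taken from the Domino's figures quoted in the text.
COST_PER_QUIT = {"hourly": 2_500, "store_manager": 20_000}


def annual_turnover_cost(quits_by_role):
    """Total yearly turnover cost given the number of quits in each role."""
    return sum(COST_PER_QUIT[role] * quits for role, quits in quits_by_role.items())


# Hypothetical quit counts for one region (assumptions, not data from the text).
before = annual_turnover_cost({"hourly": 120, "store_manager": 6})
after = annual_turnover_cost({"hourly": 90, "store_manager": 4})

print(f"Cost before new test: ${before:,}")
print(f"Cost after new test:  ${after:,}")
print(f"Estimated savings:    ${before - after:,}")
```

A calculation of this kind gives HR a dollar figure to weigh against the cost of purchasing and validating a new selection instrument.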
One recent study revealed guidelines regarding methods that have been shown to be effective at reducing
voluntary turnover.28 A summary of the findings merged
with previous research on turnover is presented in
Figure 6-6. This most recent research drew several conclusions. First, voluntary turnover is less likely if a job
candidate is referred by a current employee or has friends
or family working at the organization. Candidates with
more contacts within the organization are apt to better understand the nature of the job and the organization. Such
candidates probably have a more realistic view of the job
that may provide a “vaccination effect” that lowers expectations, thereby preventing job dissatisfaction and
turnover (realistic job previews can also do this). Also,
current job holders are less likely to refer job candidates
who they feel are less capable or those who (they feel)
would not fit in well with the organization’s culture.
Another argument for an employee referral system is that
having acquaintances within the organization is also
likely to strengthen an employee’s commitment to the
firm and thus reduce the probability that he or she will
leave. Of course, this argument also applies to the employee who made the referral.
Another reliable predictor of voluntary turnover is
tenure in previous jobs. In general, if a person has a history of short-term employment, that person is likely to
quit again. This tendency may also reflect a lower work
ethic (lower Conscientiousness), which is correlated with
organizational commitment and turnover. As discussed
earlier, tenure in previous jobs, measured in a systematic
manner as a part of a weighted application blank (WAB),
is predictive of turnover. Intention to quit is also a solid
predictor of, and perhaps the best predictor of, quitting.
Believe it or not, questions on an application form such as
“How long do you think you’ll be working for this company?” are quite predictive of voluntary turnover. Prehire
dispositions or behavioral intentions, derived from questions such as this one or from interview questions, work
quite well. Measures of the extent of an applicant’s desire
to work for the organization also predict subsequent
turnover. However, almost all of the research on WABs
has involved entry-level and nonmanagerial positions so
its applicability to managerial positions is questionable.
This is not true for biodata (or BIBs). Disguised-purpose
attitudinal scales measuring self-confidence and decisiveness have been shown to predict turnover for higher level
positions as well, including managerial positions.
Answers to questions such as “How confident are you
that you can do this job well?” and “When I make a decision, I tend to stick to it” did predict turnover quite well.
In addition, there was no evidence of adverse impact
against protected classes using these measures. This
research also revealed that disguised-purpose measures
added incremental validity to the prediction of turnover
beyond what could be predicted by biodata alone.
Another example of a disguised-purpose dispositional measure is the Job Compatibility Questionnaire
(JCQ). We discussed the JCQ in Chapters 4 and 5. The
JCQ was developed to determine whether an applicant’s
preferences for work characteristics match the characteristics of the job.29 One theory is that the compatibility of
preference with the job will predict job tenure and
performance. Test takers are presented groups of items
and are instructed to indicate which item is most desirable
and which is least desirable. As we discussed in Chapter 4,
the items are grouped based on a job analysis that identifies those characteristics that are common to the job(s) to
be filled. Here is an example of a sample group: (a) being
able to choose the order of my work tasks, (b) having different and challenging projects, (c) staying physically active on the job, (d) clearly seeing the effects of my hard
work. The items are grouped together in such a way that
the scoring key is hidden from the respondent, reducing
the chance for faking.
FIGURE 6-6 Predictors of Voluntary Turnover
1. Rely on employee referrals
Voluntary turnover is less likely if a job candidate is referred by a current employee or has friends or family working at
the organization.
Candidates with more contacts within the organization are apt to better understand the nature of the job and the
organization.
Having friends or family within the organization prior to hire is likely to strengthen the employee’s commitment to the
firm and reduce the likelihood that he or she will leave.
2. Put weight on tenure in previous jobs
A past habitual practice of seeking out short-term employment predicts future short-term employment.
Short-term employment may reflect a poor work ethic, which is correlated with lack of organizational commitment and
turnover.
3. Measure intent to quit
Intention to quit is one of the best (if not the best) predictors of turnover.
Despite their transparency, expressions of intentions to stay or quit before a person starts a new position are an effective
predictor of subsequent turnover (e.g., “How long do you plan to work for the company?”).
4. Measure the applicant’s desires/motivations for the position
New employees with a strong desire for employment will require less time to be assimilated into the organization’s
culture.
5. Use disguised-purpose dispositional measures
Individuals with high self-confidence should respond more favorably to the challenges of a new environment.
Employees with higher confidence in their abilities are less likely to quit than those who attribute their past performance
to luck.
Decisive individuals are likely to be more thoughtful about their decisions, more committed to the decisions they make,
and less likely to leave the organization.
Decisiveness is a component of the personality trait of Conscientiousness from the five-factor model.
Decisiveness affects organizational commitment and, indirectly, turnover.
Source: Adapted from M. R. Barrick and R. D. Zimmerman, “Reducing Voluntary, Avoidable Turnover through Selection,” Journal of Applied Psychology 90
(2005), pp. 159–166.
Studies involving customer service representatives,
security guards, and theater personnel indicate that the
JCQ can successfully predict employee turnover for low-skilled jobs. In addition, no evidence of adverse impact
has been found. BA&C incorporated the JCQ in their test
for security guards.30 The JCQ has never been used or
validated for managerial positions.
Predicting Employee Theft
More than five million job applicants took some form of
honesty or integrity test in 2005. These tests are commonly used for jobs in which workers have access to
money, such as retail stores, fast-food chains, and banks.
Honesty tests have become more popular since the
polygraph, or lie detector, test was banned in 1988 by the
Employee Polygraph Protection Act. This federal law
outlawed the use of the polygraph for selection and
greatly restricts the use of the test for other employment
situations. There are some employment exemptions to
the law, such as those involving security services, businesses involving controlled substances, and government
employers.
Most honesty tests contain items concerning an applicant’s attitude towards theft. Sample items typically
cover beliefs about the amount of theft that takes place,
asking test takers questions such as the following: “What
percentage of people take more than $1.00 per week from
their employer?” The test also questions punitiveness towards theft: “Should a person be fired if caught stealing
$5.00?” The test takers answer questions reflecting their
thoughts about stealing: “Have you ever thought about
taking company merchandise without actually taking
any?” Other honesty tests include items that have been
found to correlate with theft: “You freely admit your mistakes.” “You like to do things that shock people.” “You
have had a lot of disagreements with your parents.” Many
banks and retail establishments use honesty tests for employee screening.
The validity evidence for honesty tests is fairly strong,
with no adverse impact. Still, critics point to a number of
problems with the validity studies. First, most of the validity studies have been conducted by the test publishers
themselves; there have been very few independent validation studies. Second, very few of the criterion-related
validity studies use employee theft as the criterion. A report
by the American Psychological Association concluded that
the evidence, albeit limited, supports the validity of some
of the most carefully developed and validated honesty
tests. The most recent studies on honesty tests support their
use.31
Predicting Accident Proneness
Accidents are a major problem in the workplace, causing
deaths, injuries, and expense. Preemployment testing is
one strategy some companies have turned to in an effort
to lower accident rates. One test developed to predict (and
prevent) accidents is the Safety Locus of Control (SLC),
which is a paper-and-pencil test containing 17 items assessing attitudes towards safety. A sample item is as follows: “Avoiding accidents is a matter of luck.” The limited validity studies have been encouraging. Such studies
have been conducted in several different industries, including transportation, hotel, and aviation. In addition,
these investigations indicate no adverse impact against
minorities and women.32
Predicting Customer Service
The Service Orientation Index (SOI) was initially developed as a means of predicting the helpfulness of
nurses’ aides in large, inner-city hospitals.33 The test
items were selected from three main dimensions: patient
service, assisting other personnel, and communication.
Here are some examples of SOI items: “I always notice
when people are upset” and “I never resent it when I don’t
get my way.” Several other studies of the SOI involving
clerical employees and truck drivers have reported positive results as well.
How Do You Establish a Testing Program?
Establishing a psychological testing program is a difficult
undertaking—one that should involve the advice of an industrial psychologist. HR professionals should follow
these guidelines before using psychological tests:
1. Most reputable testing publishers provide a test
manual. Study the manual carefully, particularly the
adverse impact and validity evidence. Has the test
been shown to predict success in jobs similar to the
jobs you’re trying to fill? Have adverse impact
studies been performed? What are the findings? Are
there positive, independent research studies in
scholarly journals? Have qualified experts with
advanced degrees in psychology or related fields
been involved in the research?
2. Check to see if the test has been reviewed in Mental
Measurements Yearbook (MMY). Published by the
Buros Institute of the University of Nebraska, the
MMY publishes scholarly reviews of the test by
qualified academics who have no vested interest in
the tests they are reviewing. You can also download
Buros test reviews online at www.unl.edu/buros.
You can retrieve reviews by test name or by
category (e.g., achievement, intelligence,
personality).
3. Ask the test publishers for the names of several
companies that have used the test. Call a sample of
them and determine if they have conducted any
adverse impact and validity studies. Determine if
legal actions have been taken related to the test; if
so, what are the implications for your situation?
4. Obtain a copy of the test from the publisher and
carefully examine all of the test items. Consider
each item in the context of ethical, legal, and
privacy ramifications. Organizations have lost court
cases because of specific items on a test.
Proceed cautiously in the selection and adoption of
psychological tests. Don’t be wowed by a slick test
brochure; take a step back and evaluate the product in the
same manner you would evaluate any product before buying it. Be particularly critical of vendors’ claims and remember that you might be able to assess personality and
motivation by other means. If you decide to adopt a test,
maintain the data so that you can evaluate whether the test
is working. In general, it is always advisable to contact
someone who can give you an objective, expert appraisal.
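One simple way to check whether a test is working, once data have been maintained, is to correlate test scores at hire with later job performance. The Python sketch below computes that validity coefficient as an ordinary Pearson correlation. The function name and the six pairs of scores and ratings are hypothetical, included only to show the mechanics; a sample this small would be far too limited for a real validation study, which should involve an expert.

```python
from math import sqrt


def validity_coefficient(test_scores, performance):
    """Pearson correlation between test scores and a performance criterion."""
    n = len(test_scores)
    if n != len(performance) or n < 2:
        raise ValueError("need two equal-length lists with at least two cases")
    mean_x = sum(test_scores) / n
    mean_y = sum(performance) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, performance))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in performance)
    if sxx == 0 or syy == 0:
        raise ValueError("scores and ratings must each show some variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical records: test score at hire and supervisor rating one year later.
scores = [52, 61, 70, 48, 66, 75]
ratings = [3.1, 3.4, 4.0, 2.9, 3.2, 4.2]

r = validity_coefficient(scores, ratings)
print(f"Observed validity coefficient: {r:.2f}")
```

The same stored records can also be broken out by applicant group to monitor adverse impact over time.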
DRUG TESTING
Drug abuse is one of the most serious problems in the
United States today, with productivity costs in the billions
and on the rise. Drug abuse in the workplace also has been
linked to employee theft, accidents, absences, use of sick
time, and other counterproductive behavior. Detected amphetamine use doubled between 2000 and 2004. Methamphetamine is the most commonly used form of amphetamine today. To combat this growing problem, many
organizations are turning to drug testing for job applicants
and incumbents. One survey found 87 percent of major
U.S. corporations now use some form of drug testing.34
While some of the tests are in the form of paper-and-pencil
examinations, the vast majority of tests conducted
are clinical tests of urine or hair samples. Ninety-six
percent of firms refuse to hire applicants who test positive
for illegal drug use, methamphetamines, and some prescription drugs (e.g., OxyContin). While the most common practice is to test job applicants, drug testing of job
incumbents, either through a randomized procedure or
based on probable cause, is also on the increase.
The most common form of urinalysis testing is the
immunoassay test, which applies an enzyme solution to a
urine sample and measures change in the density of the
sample. The drawback of the $20 (per applicant) immunoassay test is that it is sensitive to some legal drugs as
well as illegal drugs. Due to this problem, it is recommended that a positive immunoassay test be followed by
a more reliable confirmatory test, such as gas chromatography. The only errors in testing that can occur with the
confirmatory tests are due to two causes: positive results
from passive inhalation, a rare event (caused by involuntarily inhaling marijuana), and laboratory blunders (e.g.,
mixing urine samples). Hair analysis is a more expensive
but also more reliable and less invasive form of drug
testing. Testing for methamphetamine use is more difficult since the ingredients pass through the body quickly.
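The recommended two-step sequence (an inexpensive immunoassay screen, followed by a confirmatory test only when the screen is positive) can be expressed as a small decision rule. The Python sketch below is only an illustration of that logic; the function name and the outcome labels are our own and do not come from any laboratory standard.

```python
def drug_screen_result(immunoassay_positive, confirmatory_positive=None):
    """Outcome of a two-step drug screen.

    immunoassay_positive: result of the initial immunoassay screen
    confirmatory_positive: result of the follow-up test (e.g., gas
        chromatography), or None if it has not been run yet
    """
    if not immunoassay_positive:
        # A negative screen ends the process; no confirmatory test is needed.
        return "negative"
    if confirmatory_positive is None:
        # The screen reacts to some legal drugs, so a positive is not final.
        return "pending confirmation"
    return "positive" if confirmatory_positive else "negative"


print(drug_screen_result(False))
print(drug_screen_result(True))
print(drug_screen_result(True, confirmatory_positive=False))
print(drug_screen_result(True, confirmatory_positive=True))
```

The key design point is that no applicant is classified as positive on the basis of the immunoassay alone.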
Positive test results say little regarding one’s ability
to perform the job, and most testing gives little or no
information about the amount of the drug that was used,
when it was used, how frequently it was used, and
whether the applicant or candidate will be (or is) less
effective on the job.
The legal implications of drug testing may have
changed significantly since this chapter was written.
Currently, drug testing is legal in all 50 states for preemployment screening and on-the-job assessment; however, employees in some states have successfully challenged dismissals based solely on a random drug test.
For those employment situations in which a collective-bargaining agreement has allowed drug testing, the
punitive action based on the results is subject to arbitration. One study found that the majority of dismissals
based on drug tests were overturned by arbitrators.35
Among the arguments against drug testing are that it is
an invasion of privacy, it is an unreasonable search and
seizure, and it violates the rights of due process. Most
experts agree that all three of these arguments may
apply to public employers, such as governments, but do
not apply to private industry. State law is relevant here
since some drug testing programs have been challenged
under privacy provisions of state constitutions. With
regard to public employment, the Supreme Court has
ruled that drug testing is legal if the employer can show
a “special need” (e.g., public safety). We will explore
the matter of drug testing in more detail in Chapter 14.
Is Testing an Invasion of Privacy?
Some have critiqued the widespread use of employment
tests on the grounds that these procedures may be an invasion of an individual’s privacy and produce information
that will affect an individual’s employment opportunities.
Some types of selection methods that seem particularly
prone to these concerns are drug tests and honesty/
integrity tests.36 Questions on tests or interviews that are
political in nature are also illegal in some states. Experts
in the field of employment testing who support the use of
these types of selection procedures have responded to the
challenges in a number of ways. First, various professional standards and guidelines have been devised to protect the confidentiality of test results. Second, almost any
interpersonal interaction, whether it be an interview or an
informal discussion with an employer over lunch, involves the exchange of information. Thus, advocates of
employment testing contend that every selection procedure involves some invasion of the applicant’s privacy.
Finally, in the interests of high productivity and staying
within the law, organizations may need to violate an individual’s privacy to a certain degree. Companies with
government contracts are among those that are obliged to
maintain a safe work environment and may need to
require drug testing and extensive background checks of
employees.
There are those who will continue to voice concern
over the confidentiality and ethics of employment testing,
particularly as computer-based databases expand in scope
and availability to organizations. It also is likely that there
will be increasing calls for more legislation at federal,
state, and local levels to restrict company access to and
use of employment-related information.
PERFORMANCE TESTING/WORK SAMPLES 37
Despite making valuable contributions to employee selection, paper-and-pencil tests have their problems and
limitations. The validity of cognitive ability tests is clear.
Unfortunately, the potential legal implications that stem
from their use are considerable. As we discussed, the
validity of paper-and-pencil measures of applicant motivation or personality is not nearly as impressive. Many
experts suggest that the prediction of job performance can
be enhanced through performance testing or work samples that involve samples of actual or simulated job tasks
and/or behaviors. There is also evidence that the use of
such tests can result in less adverse impact than cognitive
ability tests (see Figure 6-2).
Performance testing is usually more complex than
paper-and-pencil testing in that test takers must produce behavioral responses
similar to those required on the job. A work sample consists of tasks representing the type, complexity, and difficulty level of the
activities that are required on the job. Applicants must
demonstrate that they possess the competencies or skills needed for successful job performance. The
most obvious example of a work sample is a word processing test for clerical personnel. More complex examples attempt to simulate what managers must do on the
job. Assessment centers, for example, often entail several work samples or simulations of on-the-job behaviors
typically exhibited by managers.
The objective of performance testing is to assess
candidates’ ability to do the job. Thus, applicants for
clerical positions may be required to take word processing (typing) tests or demonstrate proficiency in shorthand or filing. These exercises are work samples because
word processing, shorthand, and filing are representative
of t...