PSYCHOLOGICAL SCIENCE
Research Article
AUTOMATIC STEREOTYPING
Mahzarin R. Banaji¹ and Curtis D. Hardin²
¹Yale University and ²University of California, Los Angeles
Abstract: Two experiments tested a form of automatic stereotyping. Subjects saw primes related to gender (e.g., mother,
father, nurse, doctor) or neutral with respect to gender (e.g.,
parent, student, person) followed by target pronouns (stimulus
onset asynchrony = 300 ms) that were gender related (e.g.,
she, he) or neutral (it, me) or followed by nonpronouns (do, all;
Experiment 2 only). In Experiment 1, subjects judged whether
each pronoun was male or female. Automatic gender beliefs
(stereotypes) were observed in faster responses to pronouns
consistent than inconsistent with the gender component of
the prime regardless of subjects' awareness of the prime-target
relation, and independently of subjects' explicit beliefs about
gender stereotypes and language reform. In Experiment 2, automatic stereotyping was obtained even though a gender-irrelevant judgment task (pronoun/not pronoun) was used. Together, these experiments demonstrate that gender information
imparted by words can automatically influence judgment, although the strength of such effects may be moderated by judgment task and prime type.
Based on recent theory and research on the role of unconscious processes in beliefs about social groups (Banaji & Greenwald, 1994; Bargh, 1994; Greenwald & Banaji, 1995), we report
two experiments that provide stricter tests than previously conducted of a form of automatic stereotyping. Several recent experiments have demonstrated that stereotyping can occur implicitly, without subjects' conscious awareness of the source or
use of stereotypic information in judgment (Banaji & Greenwald, 1995; Banaji, Hardin, & Rothman, 1993; Devine, 1989).
In this article, we focus on a particular brand of stereotyping
that can occur even when the perceiver retains awareness of
the source of influence on judgment, but is unable to readily
control the stereotypic response. Stereotyping, like other cognitive processes, consists of both automatic and controlled
components, and the particular form of automaticity that is involved (e.g., awareness, intentionality, efficiency, and controllability) has been of recent interest (see Bargh, 1994). In the
present experiments, we demonstrate that gender, as lexically
coded in English, can operate automatically in judgment, even
when the primary (denotative) meaning is not about gender.¹
Address correspondence to Mahzarin Banaji, Department of Psychology, Yale University, P.O. Box 208205, New Haven, CT 06520-8205, e-mail: mbanaji@minerva.cis.yale.edu, or Curtis Hardin, Department of Psychology, UCLA, 1282A Franz Hall, Box 951563, Los
Angeles, CA 90024-1563, e-mail: hardin@psych.ucla.edu. All materials
may be obtained from the authors.
1. These demonstrations, although showing evidence for the automatic use of gender information, should not be taken to imply that
seemingly automatic responses can never be controlled. The effects of
automatically activated information are controllable under theoretically
specified conditions (Bargh, 1994; Blair & Banaji, 1995).
Copyright © 1996 American Psychological Society
The semantic priming procedure is commonly used to examine automatic information processing and, in particular, to reveal the strength of association between two concepts that exists independently of conscious thought. Developed more than
20 years ago, this procedure has led to important discoveries
about attention, signal processing, and semantic memory
(Meyer & Schvaneveldt, 1971; Neely, 1977; Posner & Snyder,
1975). The first of these tests showed the now well-known effect
that response latency to a target word is facilitated to the extent
to which a prime word that appears prior to the target word is
semantically related to it. In addition, the technique has recently been successfully adapted to demonstrate the operation
of automatically activated attitudes or evaluations (Bargh,
Chaiken, Govender, & Pratto, 1992; Fazio, Sanbonmatsu, Powell, & Kardes, 1986; Perdue & Gurtman, 1990).
Our primary interest lies in beliefs, and for the present research, we adapted the semantic priming procedure to provide
a strict test of the extent to which beliefs about gender (i.e.,
gender stereotypes) operate automatically. Two words were
presented in close succession, and the relationship between
them was captured by reaction time (RT) to judge the second
(target) word. In both experiments, the central empirical question of interest was, what is the influence of the gender code of
a prime on speeded judgments of gender-consistent or gender-inconsistent targets? Faster judgments on targets that follow
gender-congruent primes than on targets that follow gender-incongruent primes (i.e., gender-based priming) are taken as
evidence for the automatic use of gender stereotypes.² We also
examine related questions, such as (a) the relationship between
automatic stereotyping and traditionally used explicit stereotyping measures, (b) the role of awareness of gender as a potential source of influence on performance, (c) the gender relevance of the judgment task, and (d) the gender strength of the
primes.
Although variations of the semantic priming procedure have
been used in previous research on stereotypes, the experimental procedures of these studies did not adhere to conventional
standards for revealing automatic information use.³ For example, Dovidio, Evans, and Tyler (1986) presented a prime (black
or white) followed by a target (intelligent or lazy) and asked
subjects if the target "could ever be true" of the prime category. They found that white subjects were reliably faster to
respond to stereotype-related traits than stereotype-unrelated
traits following the primes white and black. Such experiments
2. The term stereotype has been a construct of changing meaning in
social psychology, and our use of it is in keeping with recent definitions
that reduce it to refer to beliefs about the attributes of social groups
(Ashmore & Del Boca, 1981; Greenwald & Banaji, 1995).
3. In experiments in which the currently specified conditions of
automaticity were met, the findings addressed the role of automatic
evaluation rather than automatic belief (Gaertner & McLaughlin, 1983;
Perdue & Gurtman, 1990, Experiment 2).
VOL. 7, NO. 3, MAY 1996
were critical in setting the stage for the present study, but did
not strictly test automatic stereotyping. Most notably, the time
between the onset of the prime and target (stimulus onset asynchrony, or SOA) in the previous studies was long enough to
allow strategic processing, casting doubt on the automaticity
of the process being measured. In the present experiments, we
used a 300-ms SOA, a condition known to capture relatively
automatic processes (Neely, 1977, 1991).
Further, Dovidio et al. (1986) required subjects to deliberately link the prime and target by asking whether the target
"could ever be true" of the prime category, and Neely (1977)
demonstrated that such explicit expectations do affect RT under long SOAs such as those used by Dovidio et al. (1986).
Although these judgment tasks demonstrate important differences in judgment latencies for stereotyped traits following the
prime black or white, these tasks do not index automatic
processes that may occur outside conscious deliberation of
prime-target relationships. In the present experiments, we used
judgment tasks that required no attention to the relationship
between prime and target. Indeed, subjects were instructed to
ignore the prime word and classify the target word as a male or
female pronoun (Experiment 1) or a pronoun or not a pronoun
(Experiment 2).
In addition, the experiments we report differ from previous
research in the number and type of stimuli that were used.
Instead of the repeated presentation of two-category labels as
primes (black or white, young or old), we used 150 primes signifying gender in a variety of ways, including those associated
to gender by definition (e.g., mother, father, waiter, waitress)
or by normative base rates (e.g., doctor, nurse, mechanic,
secretary), or neutral with respect to gender (e.g., humanity,
citizen, people, cousin). The larger set of primes more fully
represents the social category, allows use of primes other than
category labels alone, and permits a comparison of the strength
of primes that denote gender and of primes that connote gender.
Primes also included so-called generic masculine terms (e.g.,
mankind) to allow a test of whether such words automatically
connote maleness or perform the more inclusive function that
critics of nonsexist language assert is the case.
In the choice of target words, we departed from the almost
exclusive reliance of past research on trait adjectives. In both
experiments, we used pronouns because they inescapably mark
gender (e.g., she, he). However, the judgment task itself differed across the two experiments in whether the decision focused on gender (male or female; Experiment 1) or grammatical
form (pronoun or not pronoun; Experiment 2). To date, no
studies of stereotyping have used a task that does not focus
attention on the category of interest (e.g., gender, race). A finding that automatic stereotyping occurs even when a gender-irrelevant task is used would attest to the potency of automatic
gender stereotypes.
EXPERIMENT 1
Experiment 1 tested whether gender information in words is automatically used in judgment as assessed by faster response times when the genders of the prime and target words match (e.g., doctor-he, nurse-she) than mismatch (e.g., doctor-she, nurse-he). In addition, measures of beliefs about explicit gender stereotypes, language reform, and the influence of gender in everyday life were included to test the relationship between automatic stereotyping and more traditional explicit measures of gender stereotyping.
Method
Subjects
Sixty-eight subjects (32 female, 36 male) from the introductory psychology pool at Yale University participated in partial fulfillment of a course requirement.
Materials and apparatus
The experimental task was administered on IBM-PS2 microcomputers running Micro-Experimental Laboratory software (Schneider, 1990). Subjects entered judgments on protruding keys, marked "M" and "F," affixed to the f and j keys. Key position was reversed for half the subjects.
Two hundred primes were divided evenly among four categories: male related, female related, neutral with respect to gender, and nonword letter string (ZZZZZ). Within each of the first three prime categories, words were chosen to appear virtually equally from two subcategories. The first subcategory contained words associated to gender by normative base rates. These words were chosen on the basis of 1990 census data indicating occupations that were heavily skewed (over 90%) toward the participation of either females (e.g., nurse, secretary) or males (e.g., doctor, mechanic) or that had equal participation (e.g., reporter, postal clerk). In addition, several other words having strong stereotypical associations to one gender or the other were included (e.g., feminist, god). The second subcategory contained words associated to gender by definition, that is, words that expressly refer to gender (e.g., woman, man), kinship terms (e.g., mother, father), and titles (e.g., mr, mrs, king, queen). Within this subcategory, words containing male morphemes (e.g., salesman), female morphemes (e.g., salesgirl), or neutral morphemes (e.g., chairperson) were also included. Targets were the six most common pronouns in English, half male (he, him, his) and half female (she, her, hers).
Three measures were designed to assess explicit beliefs regarding gender stereotypes, language reform, and the influence of gender in people's lives.⁴
4. For a description of these measures, see Hardin and Banaji (in press).
Design and procedure
For each trial, events occurred in the following order: First, an orientation symbol (+) appeared for 500 ms. Then the prime word appeared for 200 ms, followed by a blank screen for 100 ms. Finally, the target pronoun appeared and remained on the screen until a response was entered. Subjects made 432 judgments (not including practice and buffer trials) divided equally among the eight prime-target categories (prime: male, female, neutral, nonword; target: male pronoun, female pronoun) within each of three blocks of trials that were counterbalanced across subjects. Prime and target stimuli were paired randomly for each subject.
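The timing and trial counts in this procedure can be checked with a short sketch. The durations and counts come from the text; the `TrialEvent` type, the event labels, and the helper names are illustrative assumptions and do not describe the Micro-Experimental Laboratory program that actually ran the task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrialEvent:
    name: str                    # hypothetical label for a display event
    duration_ms: Optional[int]   # None means "until the subject responds"


# Durations are taken from the procedure; the names are illustrative only.
TRIAL = [
    TrialEvent("orientation_symbol", 500),
    TrialEvent("prime", 200),
    TrialEvent("blank", 100),
    TrialEvent("target", None),
]


def soa_ms(events):
    """Stimulus onset asynchrony: time from prime onset to target onset."""
    names = [e.name for e in events]
    start, end = names.index("prime"), names.index("target")
    return sum(e.duration_ms for e in events[start:end])


def trials_per_cell(total_trials, n_prime_types, n_target_types, n_blocks):
    """Trials per prime-target cell overall and per block, assuming an even split."""
    cells = n_prime_types * n_target_types
    per_cell, remainder = divmod(total_trials, cells)
    per_block, block_remainder = divmod(per_cell, n_blocks)
    if remainder or block_remainder:
        raise ValueError("trials do not divide evenly across cells and blocks")
    return per_cell, per_block


print(soa_ms(TRIAL))                  # 300
print(trials_per_cell(432, 4, 2, 3))  # (54, 18)
```

Under these assumptions, the 200-ms prime plus the 100-ms blank give the 300-ms SOA, and each of the eight prime-target cells receives 54 trials, or 18 per block.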
The design was a 4 (prime gender: female, male, neutral,
nonword) x 2 (target gender: female, male) x 2 (subject gender:
female, male) mixed factorial, with subject gender the between-subjects factor. Subjects judged each pronoun as either male or
female. They were instructed to ignore the primes and judge the
targets as quickly and accurately as possible. Subjects then
completed the three explicit measures of gender beliefs. Finally, they were probed for their awareness of the hypotheses
and debriefed.
[Figure 2 plot: mean reaction time by prime type (definition, normative base rate) and prime gender (female, male), with separate bars for male and female target gender.]
Results and Discussion
Reported results are based on correct judgments, excluding
responses that were extreme outliers. Consistent with other
studies employing this procedure, the error rate was low (1,117
of 29,502 judgments, or 3.8%). RTs greater than 3 SD above the
mean (> 1,300 ms) were identified as outliers and excluded (208
trials, or 0.7%). In sum, 95.6% (28,193 judgments) were retained in the reported analyses. The pattern of results is unchanged when these data are included. To achieve a better
approximation to the normal distribution, analyses were performed on a log transformation of the raw RT latencies. Thirty-seven of 68 subjects were aware of the gender relationship
between primes and targets. However, consistent with the assumption that this procedure reflects relatively automatic processing, the pattern of results was identical for both aware and
unaware subjects, all Fs < 1.
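The exclusion and transformation steps described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code. The `(rt_ms, correct)` trial layout and the function names are assumptions, the 1,300-ms value is the cutoff the text gives for 3 SD above the mean, and base-10 logarithms are assumed because the reported log means (about 2.70) correspond to latencies near 500 ms.

```python
import math


def prepare_log_rts(trials, cutoff_ms=1300.0):
    """Keep correct, non-outlier trials and return their log10 RTs.

    `trials` is a list of (rt_ms, correct) pairs; this layout is an
    assumption made for the example.
    """
    kept = [rt for rt, correct in trials if correct and rt <= cutoff_ms]
    return [math.log10(rt) for rt in kept]


def back_transform_ms(mean_log_rt):
    """Convert a mean of log10 RTs back to milliseconds (a geometric mean)."""
    return 10 ** mean_log_rt


# Made-up trials: one error and one slow outlier are dropped.
trials = [(480.0, True), (520.0, True), (450.0, False), (1500.0, True)]
log_rts = prepare_log_rts(trials)
mean_log = sum(log_rts) / len(log_rts)

print(len(log_rts))                          # number of retained trials
print(round(back_transform_ms(mean_log), 1))
print(round(back_transform_ms(2.702), 1))    # a log mean of the size reported
```

Note that back-transforming a mean of log RTs yields a geometric mean, which is typically somewhat smaller than the arithmetic mean of the same latencies.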
As shown in Figure 1, the predicted gender priming effect was obtained, indicating that judgment was faster when target gender matched than mismatched prime gender. The omnibus Prime Gender (female, male, neutral, nonword) x Target Gender (female, male) x Subject Gender (female, male) three-way analysis of variance yielded the predicted Prime Gender x Target Gender interaction, F(3, 198) = 12.15, p < .0001. The specific Prime Gender x Target Gender interaction (excluding the neutral conditions) was also reliable, F(1, 66) = 117.56, p < .0001. Subjects were faster to judge male pronouns after male than female primes, t(67) = 11.59, p = .0001, but faster to judge female pronouns after female than male primes, t(67) = 6.90, p = .0001. In addition, subjects were faster to respond to targets preceded by male (M = 2.702) than female primes (M = 2.708), F(1, 66) = 7.52, p < .01. No other reliable main effects or interactions were obtained as a function of either subject gender or target gender (Fs < 1).
Fig. 2. Mean reaction time to judge words as male or female as a function of prime type, prime gender, and target gender (Experiment 1, n = 68). Bars with shared subscripts are not significantly different from each other (p > .05).
The automatic gender priming effect was obtained for primes
related to gender both by definition (e.g., mother, father, man,
woman), F(1, 67) = 103.97, p < .0001, and by normative base
rate, F(1, 67) = 18.61, p < .0001. However, as shown in Figure
2, the gender priming effect was significantly larger for primes
related to gender by definition, as revealed by the three-way
Prime Type (definition, normative base rate) x Prime Gender
(female, male) x Target Gender (female, male) interaction, F(1,
67) = 13.67, p < .0005.
Generic masculine terms contributed to the automatic gender priming effect. After primes containing the morpheme man (e.g., fireman, mankind, human, man), judgments were faster for male pronouns (M = 2.687) than female pronouns (M = 2.709), t(67) = 4.06, p < .0001. The relationship also held under the most conservative analysis, in which terms that are sometimes used to refer only to men (e.g., man, fireman) were excluded. Primes considered to be generic masculine terms in virtually all contexts (e.g., mankind, layman) produced faster judgments for male pronouns (M = 2.689) than female pronouns (M = 2.712), t(67) = 2.18, p < .05.
Finally, we examined terms that differed in no way except for the gender of their suffix (e.g., chairman, chairwoman, chairperson). As expected, the gender of the suffix did influence response latencies as indicated by a Prime Gender (male, female, neutral) x Target Gender (male, female) interaction, F(2, 112) = 11.59, p < .0001. Judgments were faster for male pronouns after words with male (M = 2.693) than female (M = 2.722) suffixes, t(67) = 3.29, p < .01, whereas judgments were marginally faster for female pronouns after primes with female (M = 2.694) than male (M = 2.716) suffixes, t(67) = 1.81, p = .07. In addition, judgments were faster when primes with female suffixes were followed by female pronouns (M = 2.694) than male pronouns (M = 2.722), t(67) = 2.82, p < .01. Interestingly, neutral -person suffixes after the identical words did not produce equivalent responses to female and male pronouns. Instead, after these primes, subjects were still faster to judge male targets (M = 2.704) than female targets (M = 2.722), t(67) = 2.64, p < .05.⁵
Fig. 1. Mean reaction time to judge words as male or female as a function of prime gender and target gender (Experiment 1, n = 68). Bars with shared subscripts are not significantly different from each other (p > .05).
Relations between explicit beliefs and automatic gender stereotyping were examined by computing a correlation between
each of the three explicit belief measures and a gender priming
score, which was calculated by subtracting log RT for gender-congruent trials from log RT for gender-incongruent trials.
None of the three correlations of explicit measures with the
priming score was significant (language reform: r[67] = -.003,
p = .978; role of gender in everyday life: r[68] = -.050, p =
.686; explicit gender stereotypes: r[66] = .037, p = .767). This
result is consistent with other research demonstrating a lack of
correspondence between explicit and implicit measures of stereotyping (Banaji & Greenwald, 1995).
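The priming score and its correlation with an explicit measure can be illustrated with a brief sketch. It assumes that each subject's score is the difference between mean log RTs in the two conditions, which the text does not spell out, and every data value and name below is made up for illustration.

```python
import math
from statistics import mean


def priming_score(congruent_log_rts, incongruent_log_rts):
    """Mean incongruent log RT minus mean congruent log RT.

    Positive values indicate faster responses on gender-congruent trials.
    """
    return mean(incongruent_log_rts) - mean(congruent_log_rts)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical subjects: log RTs per condition plus one explicit-belief score.
subjects = [
    {"congruent": [2.68, 2.70], "incongruent": [2.71, 2.73], "explicit": 3.0},
    {"congruent": [2.69, 2.71], "incongruent": [2.70, 2.74], "explicit": 5.0},
    {"congruent": [2.66, 2.70], "incongruent": [2.72, 2.72], "explicit": 4.0},
    {"congruent": [2.70, 2.72], "incongruent": [2.71, 2.73], "explicit": 2.0},
]

scores = [priming_score(s["congruent"], s["incongruent"]) for s in subjects]
explicit = [s["explicit"] for s in subjects]
print([round(score, 3) for score in scores])
print(round(pearson_r(scores, explicit), 3))
```

A correlation near zero on real data, as reported here, would indicate that subjects with stronger explicit beliefs do not show systematically larger priming scores.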
In sum, Experiment 1 provided evidence for automatic gender stereotyping using a broad range of primes and using time
and task parameters that reflect automatic information use. The
effect occurred regardless of subjects' awareness of the prime-target relation, and independently of explicit beliefs about gender stereotypes. The effect was also obtained for both primes
related to gender by definition and primes related to gender by
normative base rate, although not surprisingly the effect was
larger for primes related to gender by definition.
EXPERIMENT 2
Participants in Experiment 1 judged whether each target was
male or female, thereby focusing attention on the gender of the
target. This form of the judgment task is quite conventional. For
example, when theoretical interest has focused on the semantic
link between prime and target, the commonly used judgment
task is a lexical decision (word/nonword; Neely, 1991). Likewise, when the interest is in the evaluative component of the
prime and target, the task is typically a good/bad judgment
(Bargh et al., 1992; Fazio et al., 1986; Greenwald, Klinger, &
Liu, 1989; Perdue & Gurtman, 1990). However, stronger evidence for automaticity may be obtained if the effect is observed
when the judgment task is unrelated to the dimension of the
prime-target relationship. For example, Bargh, Chaiken, Raymond, and Hymes (in press) showed that the automatic evaluative effect is obtained even when the judgment involves mere
pronunciation, a task unrelated to evaluation. Hence, in Experiment 2, the judgment task was a pronoun/not pronoun decision, unrelated to gender.
Method
Subjects
Sixty subjects (29 female, 31 male) from Yale University
participated in exchange for $5 or in partial fulfillment of a
course requirement.
Materials, design, and procedure
For this experiment, 120 of the primes used in Experiment 1,
representing male (40 primes), female (40 primes), and neutral
(40 primes) categories, were selected. Of the four target pronouns used, she and he allowed the comparisons of primary
interest. The pronoun it was included because it is the most
frequently occurring gender-neutral pronoun, and me was included for exploratory purposes to examine a possible relationship between prime gender and subject gender (cf. Markus,
1977). The four nonpronouns (is, do, as, all) were chosen to
match the critical targets in length, number of syllables, and
frequency (Kučera & Francis, 1967).
In all, each subject made 720 experimental judgments divided into five blocks of trials, counterbalanced across subjects. For 480 of these judgments, the correct response to the
question "Is this a pronoun?" was "yes," and for 240, the
correct answer was "no." Each prime was paired with (a) both
critical "yes"-response targets (i.e., she, he), (b) both noncritical "yes"-response targets (i.e., it, me), and (c) two of the four
"no"-response targets (i.e., do, all, is, as). For each subject
within each block, prime and target items were randomly associated. After completing the priming task, subjects were probed
for awareness regarding the hypotheses and debriefed.
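The trial counts above follow from the pairing rule, as the sketch below shows. The prime labels are placeholders, and choosing the two nonpronoun targets at random for each prime is an assumption, since the text does not say how they were selected. The division into five blocks is not modeled.

```python
import random

CRITICAL = ["she", "he"]                 # critical "yes"-response targets
NONCRITICAL = ["it", "me"]               # noncritical "yes"-response targets
NONPRONOUNS = ["do", "all", "is", "as"]  # "no"-response targets


def build_trials(primes, rng):
    """Pair every prime with she, he, it, me, and two of the four nonpronouns."""
    trials = []
    for prime in primes:
        # Assumption: the two nonpronouns are drawn at random for each prime.
        targets = CRITICAL + NONCRITICAL + rng.sample(NONPRONOUNS, 2)
        for target in targets:
            is_pronoun = target not in NONPRONOUNS
            trials.append((prime, target, is_pronoun))
    rng.shuffle(trials)
    return trials


primes = [f"prime_{i:03d}" for i in range(120)]  # placeholder labels
trials = build_trials(primes, random.Random(0))
n_yes = sum(1 for _, _, is_pronoun in trials if is_pronoun)
print(len(trials), n_yes, len(trials) - n_yes)   # 720 480 240
```

With 120 primes and six targets per prime, the rule yields 720 judgments, of which 480 require a "yes" response and 240 a "no" response, matching the totals reported.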
Results and Discussion
As before, results are based on a log transformation of the
raw RT latencies for correct judgments, excluding outliers (RT
> 1,300 ms or > 3 SD above the mean; 1.4% of the total). Also
as in Experiment 1, the error rate was low (370 of 28,800 "yes"
judgments, or 1.3%; 928 of 43,200 total judgments, or 2.1%);
97.7% (28,134) of the "yes" judgments were retained in the
reported analyses. Seven of the 60 subjects revealed some
knowledge of a possible gender relation between the prime and
target words. Again, however, the pattern of results was identical for both aware and unaware subjects, but no statistical
significance tests were conducted because of the small number
of aware subjects.
As Figure 3 shows, the predicted gender priming effect was obtained, indicating that judgment was faster when target gender matched than mismatched prime gender. The omnibus 3 (prime gender: male, female, neutral) x 4 (target gender: she, he, it, me) x 2 (subject gender: male, female) analysis of variance yielded the predicted Prime Gender x Target Gender interaction, F(6, 336) = 3.66, p < .01. In addition, a Subject Gender x Target Gender interaction indicated that subjects were faster to respond to targets that matched rather than mismatched their own gender, F(3, 168) = 3.58, p < .02.⁶ No difference in responding to the male and female targets was observed for dubiously neutral primes such as layman and mankind. There was also no main effect of subject's sex, F(1, 56) = 1.49.
Fig. 3. Mean reaction time to judge words as pronouns or not pronouns as a function of prime gender and target gender (Experiment 2, n = 58). Bars with shared subscripts are not significantly different from each other (p > .05).
5. However, we found (Hardin & Banaji, in press) no bias favoring males in a similar experiment using first names as targets.
6. A similar finding was reported by Zarate and Smith (1990). In addition, a main effect of prime gender indicated that subjects' responses were fastest following male primes and slowest following female primes, F(2, 112) = 3.89, p < .03. A main effect of target gender indicated that subjects were slower to respond to the target it than to she, he, and me, F(3, 168) = 154.33, p < .0001.
The more specific interaction of prime gender by target gender (excluding neutral primes) was also significant, indicating
that subjects were faster to judge targets in gender-congruent
prime-target pairs than in gender-incongruent pairs, F(1, 56) =
4.63, p < .04. Again, the Subject Gender x Target Gender
interaction was reliable, indicating that subjects were faster to
respond to the target pronouns that were consistent than inconsistent with their own social category, F(1, 56) = 17.95, p <
.0001.
However, these 2 two-way interactions were qualified by a
three-way Subject Gender x Prime Gender x Target Gender
interaction, F(1, 56) = 4.15, p < .05. For purposes of clarity,
we describe results separately for primes related to gender by
definition and primes related to gender by normative base rates.
Analyses of primes related to gender by definition (e.g.,
mother, father, waitress, waiter) yielded the gender priming
effect unmoderated by subject gender (Fig. 4, left panel). RT
was smaller when prime gender and target gender were congruent than incongruent, as indicated by a reliable two-way interaction, F(1, 56) = 8.70, p < .005. Subjects were faster to identify he when primes were male than female, t(59) = 2.44, p <
.02, but faster to identify she than he when the primes were
female, t(59) = 2.53, p < .02. In addition, subjects were faster
to identify targets that matched their own gender, as indicated
by the reliable interaction between subject gender and target
gender, F(1, 56) = 7.43, p < .01.
Analyses of primes related to gender by normative base rates
(e.g., secretary, mechanic, doctor, nurse) suggest limitations to
the generality of automatic gender priming under conditions in
which the task does not require subjects to focus on the dimension of gender (see Fig. 4, right panel). Although reliable effects
were obtained for the Subject Gender x Target Gender interaction, F(1, 56) = 14.02, p < .0001, and there was a main effect
of prime gender, F(1, 56) = 6.49, p = .01, both were qualified
by a marginal three-way Subject Gender x Prime Gender x
Target Gender interaction, F(1, 56) = 3.49, p < .07. Male subjects were faster to identify he than she regardless of prime
gender, F(1, 29) = 7.44, p = .01, and faster to identify targets
after male than female primes, F(1, 29) = 9.21, p < .01. In
contrast, female subjects were faster to identify she than he
after female primes, t(28) = 2.81, p < .01, and faster to identify
he after male than female primes, t(28) = 2.18, p < .05. Females were also faster, in general, to identify she than he, F(1,
27) = 6.70, p < .05.
GENERAL DISCUSSION
These two experiments provide the first strict tests of a form
of automatic stereotyping. Using a large number and wide range
of stimuli, we demonstrated that judgments of targets that follow gender-congruent primes are made faster than judgments of
targets that follow gender-incongruent primes. This effect was
obtained despite subjects' deliberate attempt to ignore the
prime, regardless of whether subjects were aware or unaware of
the gender relation of prime-target pairings, independently of
subjects' explicit beliefs about gender, regardless of whether
the judgment was gender relevant or irrelevant, and on both
words that are gender related by definition and words that are
gender related by normative base rates.
The results, however, also show two moderators of the gender priming effect. First, the effect was stronger when the judgment was gender relevant (e.g., male or female pronoun?) than
gender irrelevant (e.g., pronoun or not pronoun?). Further research will investigate whether this difference also obtains on
other forms of gender-irrelevant tasks, such as pronunciation.
Second, the gender priming effect was stronger for primes related to gender by definition (e.g., mother, father) than by normative base rate (e.g., doctor, nurse). In Experiment 1, for
example, the effect size for definition primes was large (Cohen's d = .78), whereas for normative-base-rate primes, the
effect size was moderate (d = .47). This difference reflects the
differential strength of the two types of primes in evoking gender. Words that are exclusively reserved to denote gender will
produce stronger priming than words that connote gender (for a
replication with names as targets, see Hardin & Banaji, in
press).
[Figure 4 plot: mean reaction time by prime type (definition, normative base rate) and prime gender (male, female), with separate bars for the targets he and she.]
Fig. 4. Mean reaction time to judge words as pronouns or not
pronouns as a function of prime type, prime gender, and target
gender (Experiment 2, n = 58). Bars with shared subscripts are
not significantly different from each other (p > .05).
Sapir (1963) commented that one of the important functions
of language is to repeatedly declare to society the psychological
status of its members. These experiments show the automatic
effects of such repeated linguistic declarations, in particular,
those that convey the social psychological positions that
are occupied through gender. A noteworthy aspect of the gender priming effect observed in Experiment 1 is that the effect
can obtain not only when the primes denote gender (man,
woman), but also when they more tacitly connote gender (mechanic, nurse). That gender-signifying information permeates
thought sufficiently to influence judgment points to the fundamental nature of gender as a category in verbally communicated
thought. This article is not the place to catalogue the various
ways in which gender is coded in most languages, but we note
that English stands out as one language that has received a
quite extensive analysis of what might be called the "genitalia
of language" (Baron, 1986): gender-signifying words, gender-specific pronouns, and the covert presence of gender in
grammatical structure. We expect that such automatic gender
priming effects are best observed in languages that provide extensive and deep coding of gender in grammar and semantics
(Hardin & Banaji, 1993). Further evidence for the generality of
automatic gender-stereotyping effects might be obtained by
demonstrating such effects independently of language (i.e.,
through the use of nonverbal, pictorial stimuli that denote and
connote gender). Such effects would be especially important in
revealing the degree to which the present effect is a function of
gendered language per se or gender stereotypes more generally.
Although research on beliefs and attitudes has usually depended on direct, verbal measures of stereotypes (see Greenwald & Banaji, 1995), response latencies may provide a more
indirect measure of stereotype strength. A case for RT as a
measure of attitude or evaluation has already been effectively
made (see Bargh et al., 1992; Fazio et al., 1986; Perdue & Gurtman, 1990), and other investigators have used RT as an indicator of stereotypes (Dovidio et al., 1986). However, these experiments, in conjunction with others (Blair & Banaji, 1995; Hardin
& Banaji, in press), demonstrate the operation of beliefs under
conditions that meet currently accepted standards for measuring automatic processes. Such measures are likely to increasingly complement the more traditional measures of evaluation
and belief, especially as their validity and feasibility are further
established.
Acknowledgments: This research was supported in part by Grants
DBC 9120987 and SBR 9422291 from the National Science Foundation. We are grateful to Lisa Driscoll and John Beauvais for assistance with data collection; Kimberly Hinds and Irene Blair for assistance with programming; and R. Bhaskar, Irene Blair, Richard
Hackman, and Eliot Smith for comments on a previous draft.
REFERENCES
Ashmore, R.D., & Del Boca, F.K. (1981). Conceptual approaches to stereotypes
and stereotyping. In D.L. Hamilton (Ed.), Cognitive processes in stereotyping and intergroup behavior (pp. 1-36). Hillsdale, NJ: Erlbaum.
VOL. 7, NO. 3, MAY 1996
Banaji, M.R., & Greenwald, A.G. (1994). Implicit stereotypes and prejudice. In
M.P. Zanna & J.M. Olson (Eds.), The psychology of prejudice: The Ontario
Symposium (Vol. 7, pp. 55-76). Hillsdale, NJ: Erlbaum.
Banaji, M.R., & Greenwald, A.G. (1995). Implicit stereotyping in false fame
judgments. Journal of Personality and Social Psychology, 68, 181-198.
Banaji, M.R., Hardin, C., & Rothman, A.J. (1993). Implicit stereotyping in person
judgment. Journal of Personality and Social Psychology, 65, 272-281.
Bargh, J.A. (1994). The four horsemen of automaticity: Awareness, intention,
efficiency, and control in social cognition. In R.J. Wyer & T. Srull (Eds.),
Handbook of social cognition (2nd ed., pp. 1-40). Hillsdale, NJ: Erlbaum.
Bargh, J.A., Chaiken, S., Govender, R., & Pratto, F. (1992). The generality of the
automatic activation effect. Journal of Personality and Social Psychology,
62, 893-912.
Bargh, J.A., Chaiken, S., Raymond, P., & Hymes, C. (in press). The automatic
evaluation effect: Unconditional automatic attitude activation with a pronunciation task. Journal of Experimental Social Psychology.
Baron, D. (1986). Grammar and gender. New Haven, CT: Yale University Press.
Blair, I.V., & Banaji, M.R. (1995). Automatic and controlled processes in gender
stereotyping. Unpublished manuscript, Yale University, New Haven, CT.
Devine, P. (1989). Stereotypes and prejudice: Their automatic and controlled
components. Journal of Personality and Social Psychology, 56, 5-18.
Dovidio, J.F., Evans, N., & Tyler, R.B. (1986). Racial stereotypes: The contents
of their cognitive representations. Journal of Experimental Social Psychology, 22, 22-37.
Fazio, R.H., Sanbonmatsu, D.M., Powell, M.C., & Kardes, F.R. (1986). On the
automatic activation of attitudes. Journal of Personality and Social Psychology, 50, 229-238.
Gaertner, S.L., & McLaughlin, J.P. (1983). Racial stereotypes: Associations and
ascriptions of positive and negative characteristics. Social Psychology
Quarterly, 46, 23-30.
Greenwald, A.G., & Banaji, M.R. (1995). Implicit social cognition: Attitudes,
self-esteem, and stereotypes. Psychological Review, 102, 4-27.
Greenwald, A.G., Klinger, M.R., & Liu, T.J. (1989). Unconscious processing of
dichoptically masked words. Memory and Cognition, 17, 35-47.
Hardin, C., & Banaji, M.R. (1993). The influence of language on thought. Social
Cognition, 11, 277-308.
Hardin, C., & Banaji, M.R. (in press). Gender in language and thought. Social
Cognition.
Kučera, H., & Francis, W.N. (1967). Computational analysis of present day
American English. Providence, RI: Brown University Press.
Markus, H. (1977). Self-schemata and processing information about the self. Journal of Personality and Social Psychology, 35, 63-78.
Meyer, D., & Schvaneveldt, R. (1971). Facilitation in recognizing pairs of words:
Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227-234.
Neely, J.H. (1977). Semantic priming and retrieval from lexical memory: Roles of
inhibitionless spreading activation and limited-capacity attention. Journal of
Experimental Psychology: General, 106, 226-254.
Neely, J.H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. Humphreys
(Eds.), Basic processes in reading: Visual word recognition (pp. 264-336).
Hillsdale, NJ: Erlbaum.
Perdue, C., & Gurtman, M. (1990). Evidence for the automaticity of ageism.
Journal of Experimental Social Psychology, 26, 199-216.
Posner, M.I., & Snyder, C.R. (1975). Attention and cognitive control. In R.L.
Solso (Ed.), Information processing in cognition: The Loyola Symposium
(pp. 55-85). Hillsdale, NJ: Erlbaum.
Sapir, E. (1963). Selected writings of Edward Sapir in language, culture, and
personality (D. Mandelbaum, Ed.). Berkeley: University of California
Press.
Schneider, W. (1990). MEL users guide: Computer techniques for real time psychological experimentation. Pittsburgh: Psychology Software Tools.
Zarate, M.A., & Smith, E.R. (1990). Person categorization and stereotyping.
Social Cognition, 8, 161-185.
(RECEIVED 12/8/94; ACCEPTED 3/26/95)
Journal of Experimental Social Psychology 44 (2008) 445–452
www.elsevier.com/locate/jesp
Evidence that blatant versus subtle stereotype threat cues
impact performance through dual processes
Jeff Stone a,*, Chad McWhinnie b
a Psychology Department, University of Arizona, Tucson, AZ 95721, USA
b Psychology Department, McGill University, Montreal, Canada QC H3A 1B1
Received 16 August 2005; revised 10 September 2006
Available online 27 February 2007
Communicated by Spencer
Abstract
An experiment tested three competing hypotheses for how blatant and subtle stereotype threat cues influence the performance of
female sports participants on a golf-putting task. A ‘‘predominant’’ model predicts that blatant threat cues have a more negative effect
on performance than subtle threat cues, whereas an ‘‘additive’’ model predicts that both cues combine to have a greater negative effect
than either threat cue alone. However, a ‘‘dual process’’ model predicts that each threat cue has an independent negative influence
through separate mechanisms. To test these predictions, we varied the presence of blatant (e.g., the task frame) and subtle cues (e.g.,
the gender of the experimenter) for negative stereotypes about female athletes, and then measured both the number of strokes required
to finish the course and accuracy on the last putt of each hole. The results supported the dual process model prediction: females required
more strokes to finish the golf task when it was framed as measuring gender differences compared to racial differences in athletic ability,
and females performed less accurately on the last putt of each hole in the presence of a male versus a female experimenter. The discussion
focuses on how the presence of multiple stereotype threat cues can induce independent mechanisms that may have separate but simultaneously deleterious effects on performance.
© 2007 Elsevier Inc. All rights reserved.
Keywords: Stereotype threat; Female; Athlete; Sports; Dual process
The theory of stereotype threat proposes that for individual members of a stigmatized group, the salience of a
negative stereotype in a performance context causes concern about confirming the validity of the negative characterization (Steele, 1997; Steele, Spencer, & Aronson,
2002). Numerous studies now show that the salience of
negative stereotypes in a performance context can impair
the performance of African-American students on standardized tests of verbal ability (Steele & Aronson, 1995),
women on tests of math ability (Schmader & Johns,
2003; Spencer, Steele, & Quinn, 1999), and the performance of other groups (e.g., White men) on other tasks
(golf-putting, see Beilock, Jellison, Rydell, McConnell, &
Carr, 2006).
* Corresponding author. Fax: +1 520 631 9306.
E-mail address: jeffs@u.arizona.edu (J. Stone).
0022-1031/$ - see front matter © 2007 Elsevier Inc. All rights reserved.
doi:10.1016/j.jesp.2007.02.006
The purpose of the current study was to examine how
stereotype threat processes unfold when multiple cues for
threat are present in a performance situation. Steele et al.
(2002) proposed that when stigmatized targets are in a stereotype relevant situation, their assumptions about the
existence and application of negative stereotypes to their
group—their ‘‘theory of context’’—causes targets to
become vigilant about detecting the presence of bias. To
accomplish this goal, targets ‘‘evaluate a broad set of cues
in the setting’’ to assess the potential for a negative characterization. Little is known, however, about how targets
detect and react to the presence of multiple stereotype
threat cues in a performance context.
We speculate that the way multiple threat cues impact
performance depends in part on the nature of the cues
themselves. Some cues for threat are relatively blatant,
such as when a task is explicitly framed as measuring attributes that relate to a negative ingroup stereotype (e.g.,
Steele & Aronson, 1995; Stone, Lynch, Sjomeling, & Darley, 1999), or when targets are directly told that their group
tends to perform more poorly on the task in comparison to
some other group (e.g., Spencer et al., 1999). Other cues for
threat, however, present the potential for bias in a subtler
manner. For example, research indicates that the performance of stigmatized targets can be adversely affected
when they hold minority status in the performance context
(Inzlicht & Ben-Zeev, 2000; Sekaquaptewa & Thompson,
2002), or when the task is administered by an outgroup
member (e.g., Danso & Esses, 2001; Marx & Goff, 2005).
This suggests that the theory of context held by targets
includes not only knowledge about the content of the negative ingroup stereotypes, but also information about the
conditions under which the stereotypes are likely to bias
perceptions of their behavior.
It is also possible that different threat cues negatively
impact performance through different mechanisms. For
example, the subtle nature of cues like the outgroup identity of the test administrator may cause targets to focus
part of their cognitive and emotional resources on reducing
uncertainty about the presence of bias. The cognitive load
that results from the attention paid to subtle cues impacts
working memory capacity, which then reduces performance on the task. Thus, when the threat is induced
through subtle cues in the situation, performance is more
likely to be mediated by the negative impact on working
memory (Croizet et al., 2004; Schmader & Johns, 2003).
In contrast, when negative stereotypes are blatantly tied
to performance, targets do not have to expend cognitive
resources assessing the potential for bias because it is clearly present. Instead, blatant cues cause targets to adopt a prevention focus orientation designed to minimize mistakes and
avoid the failure that would confirm the negative stereotype (Keller & Dauenheimer, 2003; Seibt & Forster,
2004). However, their attempt to avoid failure can backfire
if it causes them to adopt strategies that disrupt the effective execution of task-relevant skills. Thus, blatant cues
for threat may reduce performance by inducing prevention
focus processes that negatively impact their approach to
the task.
If blatant and subtle stereotype threat cues impact performance through their effects on separate processes, then
there are at least three predictions that can be made for
how the presence of both cues influences performance.
One possibility is that when both cues are present in a performance context, one cue has more influence on performance than the other. Targets may perceive both cues as
a source of bias, but one may be perceived as a more likely
threat than the other. An obvious prediction is that blatantly framing a task as measuring a negative stereotype
makes the threat more concrete, and as a result, prevention
focus processes consume more cognitive and emotional
resources than does the cognitive load induced by a subtle threat cue.
This ‘‘predominant cue’’ model predicts that when both types
of cues are present during performance, blatant cues have
a greater negative impact on performance than subtle cues.
A second possibility is that blatant and subtle threat
cues operate together to influence task performance. The
assumption in an ‘‘additive cues’’ model is that all cues
are perceived as a potential source of threat and that both
cognitive load and prevention focus strategies work in tandem to impact the processes that determine performance
on the task. When an outgroup member frames the task
in terms of a negative ingroup stereotype, targets are simultaneously overloaded by assessing the meaning of the subtle cue and motivated to avoid being negatively
characterized as per the blatant cue information (Seibt &
Forster, 2004). Thus, an additive cues model would predict
that two threat cues have a greater negative impact on performance than when either cue is presented alone.
A third possibility is that blatant and subtle threat cues
operate independently of each other to influence task performance (e.g., Strack & Deutsch, 2004). Here the assumption is that each cue induces processes that influence
different aspects of performance. Those aspects of performance that depend on effortful processing skills for successful execution may be more influenced by reductions
in working memory, whereas those aspects of performance
that require fluent, automatic execution are influenced
more by prevention focus processes (Beilock et al., 2006).
Whereas some tasks may depend more heavily on one set
of skills than another, other tasks may require both skills
for a successful performance. However, because most studies to date have focused on manipulating a single cue to
measure its mediational effect on one performance measure, previous research has not addressed a ‘‘dual process’’
explanation for the effect of multiple threat cues on
performance.
The current study tested the three competing predictions
for how multiple threat cues impact performance by using
the golf-putting task from previous research on stereotype
threat in sports (e.g., Stone et al., 1999). Performance on the
putting task can be measured in two ways: as the total
number of putts needed to complete the course, which is
the standard measure of performance in golf, and as the
accuracy with which participants putt
the ball into the hole on their last shot (Beilock et al.,
2006). Accordingly, cognitive load or prevention focus processes may influence each of these outcomes independently.
For example, if blatant cues such as framing the task in
terms of a negative ingroup stereotype induce a prevention
focus orientation, targets should become motivated to try
to avoid failure during the task. A ‘‘try not to miss’’ strategy on each putt would focus them on the micro elements
of their putting stroke, but if this response disrupts the
automatic and fluent elements of execution, targets might
‘‘choke’’ under the pressure. As a result, the blatant task
frame would increase the number of strokes they would
need to complete the overall course (e.g., Stone et al.,
1999).
In contrast, if subtle cues such as the outgroup identity
of the test administrator induce cognitive load, this could
influence the accuracy of their final putt into the hole. After
getting close to the hole on the previous putt, accuracy on
the last putt takes considerable concentration, which is
likely to be influenced by working memory capacity. Note
that in addition to providing a task that can measure the
potential independent effect of dual processes on performance, the putting task permits a test of the competing
models as well. That is, if one cue is predominant, the
results will reveal a main effect for the predominant cue,
or if both cues add up to create more threat than either
alone, then the results will show an additive effect for both
cues on both performance measures.
A second purpose of the present study was to examine
the influence of negative stereotypes on the performance of
female athletes. Research indicates that whereas ‘‘poor athletic ability’’ is a negative stereotype about North American White athletes (Sailes, 1996; Stone, Perry, & Darley,
1997), it is perhaps more widely held as a negative stereotype about female athletes (Biernat & Vescio, 2002; Knight
& Guiliano, 2001). We believe that in the domain of sports,
women have considerably more experience being negatively
compared to men than they have being compared to
females from other ethnic or racial groups. For example,
in the United States, women’s sports received less funding
and institutional support than men’s sports until Title IX
legislation was passed in 1972. Whereas Title IX has significantly increased the number of girls and women who participate in organized sports, women still occupy fewer
administrative, management, and training positions than
men, implying that off the field, men are more qualified
than women to run the show (Roper, 2002). The belief that
women are less athletic than men is also conveyed in the
media, as women’s sports at the high school, college and
professional level receive less media attention than men’s
sports at the same level (Tuggle & Owen, 1999). A third
source for negative stereotypes about female athleticism
is transmitted interpersonally when people participate in
sports, such as when a coach admonishes a young player
to ‘‘stop throwing like a girl’’ (Fredrickson & Harrison,
2005). These various sources instantiate and support the
culturally held belief that females possess less athletic ability than males. Racial differences between female athletes,
in comparison, do not have the same history of discourse
or institutionalized segregation and discrimination, and
therefore, do not carry the same burden that gender does
for female athletes.
It was hypothesized in the present research that blatantly framing the golf task as a measure of gender differences in athletic ability would be perceived as more
threatening to White female participants than would framing the task as a measure of racial differences in athletic
ability. In addition to manipulating the type of blatant
cue present, the presence of a subtle threat cue was manipulated by having either a male or female experimenter conduct the putting task. Thus, the procedure was designed to
test three competing hypotheses for how multiple threat
cues influence the performance of female sports participants on a task that was capable of revealing more than
one mediational process.
Method
Participants
Participants were 110 female undergraduates at the University of Arizona who participated in the study for partial
course credit. All were recruited after they identified their
ethnicity as ‘‘Caucasian American’’ during a mass pre-testing of the introductory psychology courses. Participants
also rated their athleticism as above average but reported
they played golf no more than one day per week (see Stone
et al., 1999). Thus, the sample consisted of women who perceived themselves to be athletic but novice golfers.
Procedure
Participants completed the procedures individually.
When they arrived at the laboratory, they were greeted
by one of two male or two female experimenters (who were
blind to the experimental hypothesis¹), which served as the
subtle cue manipulation. The experimenter explained that
they would complete brief questionnaires and perform a
sports test based on the game of golf.
The athletic test was based on the golf task described in
Stone et al. (1999). Participants first read a handout that
described the athletic task as a standardized measure of
sports aptitude. Ostensibly, performance on the test had
been shown to correlate with actual performance on many
of the physical and mental activities relevant to most college varsity sports, such as basketball, hockey and golf.
At this point, the instructions changed course according
to condition.
Blatant cue manipulation
Participants were randomly assigned to one of three task
frame conditions. Participants in one of the two athletic
ability conditions read that the test was designed to measure ‘‘personal factors correlated with natural athletic
ability’’. Natural athletic ability was defined as ‘‘one’s natural ability to perform complex tasks that require hand-eye
coordination, such as shooting, throwing, or hitting a ball
or other moving object’’. It was explained that as test difficulty increased, so would the demand on their natural athletic ability or hand-eye coordination.
¹ The experimenters were kept blind to the primary hypothesis by telling
them that the purpose of the study was to investigate personality
differences in reactions to how the task was framed. Thus, the
experimenters were led to believe that they could not guess how a specific
participant would react to the task frame manipulation and they were
carefully trained to treat every participant in the same manner.
Within the two athletic ability frame conditions, those
assigned to the Gender-differences frame were told the following: ‘‘Now you are probably aware that there are gender differences in sports performance. Previous studies
using this test of natural athletic ability have reported differences in the performance of men and women. So even
though there may be gender differences on this test, we
ask that you give 100% effort on the task so we can accurately measure your natural skills. Do you have any
questions?’’
Those in the athletic ability condition who were assigned
to the Racial-differences frame were told the following:
‘‘Now you are probably aware that there are racial differences in sports performance. Previous studies using this test
of natural athletic ability have reported differences in the
performance of Blacks and Whites. So even though there
may be racial differences on this test, we ask that you give
100% effort on the task so we can accurately measure your
natural skills. Do you have any questions?’’
Participants randomly assigned to the Sport psychology
control condition read that the test was designed to measure ‘‘psychological factors correlated with general sports
performance’’. The handout explained that as test difficulty
increased, so would the demand on the psychological factors that correlate with general sports performance.
After they read the handout, the experimenter reiterated
the instructions verbally and answered questions. They
were then led into an adjoining room to complete the
golf-putting task.
The golf-putting task
Based on Stone et al. (1999), the task was designed to
resemble a miniature golf course on which participants
used a putter to hit a golf ball down a 3 ft × 10 ft stretch
of carpet into a hole apparatus—an inclined felt mat with
a hole 5 in. in diameter, a hole 4 in. in diameter, and a hole
3 in. in diameter. To complete each ‘‘course layout’’ in the
test, participants were told the ball had to roll up the
incline and stop in one of the holes.
Participants were told they would complete eight different holes that would be created by placing 2 × 4s either on
or under the carpet and by moving the hole apparatus.
Once the test began, the experimenter said he or she would
change the putting surface according to a pre-tested pattern
of increasing difficulty.
Participants were told that their goal on each course layout was to putt the ball into the smallest hole using the fewest strokes possible. In addition to the number of strokes,
they were told that the hole that received the ball would
be recorded, and that both strokes and the hole would be
summed to yield an overall performance score for each
layout.
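The scoring rule described to participants can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the 1/2/3 coding of the small, medium, and large holes is taken from the Results section, while the function name `layout_score` and the example values are hypothetical.

```python
# Hypothetical sketch of the per-layout score described to participants:
# number of strokes plus a code for the hole that received the ball.
# Hole coding (small = 1, medium = 2, large = 3) follows the Results section;
# lower totals indicate better performance.
HOLE_CODES = {"small": 1, "medium": 2, "large": 3}


def layout_score(strokes: int, hole: str) -> int:
    """Return strokes + hole code for one course layout."""
    if strokes < 1:
        raise ValueError("at least one stroke is needed to finish a layout")
    return strokes + HOLE_CODES[hole]


# Illustrative values only, not data from the study.
example = layout_score(strokes=3, hole="medium")
print(example)
```

Under this coding, fewer strokes and a smaller hole both lower the score, so lower totals reflect better performance on a layout.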
Participants were then allowed to ‘‘warm up’’ by practicing on the first course layout three times. When finished
practicing, the experimenter directed participants to the
wall where the diagram of each course layout was displayed. Before they played each layout, participants were
instructed to examine the diagram and estimate how many
strokes they would need to complete it. They were also
instructed to predict which hole the ball would stop in. Participants were instructed to make their predictions on a
sheet while the experimenter set up each new layout.
After participants made their prediction for the first layout, the task proceeded with participants making a prediction for each new layout, putting until their ball stopped in
a hole, and then making their predictions for the next layout, until all eight layouts had been finished. After the last
putt, the experimenter announced that the study was complete, and provided participants with a full debriefing and
course credit as compensation for their time.
Results
The data were initially analyzed to examine if variability
due to the two different male and female experimenters
influenced the results. All of the performance and self-report data were analyzed using a 3 (Blatant Cue) × 4
(Experimenter) between-subjects analysis of variance
(ANOVA). No main or interactive effects were found for
the experimenter variable. Thus, we collapsed this variable
to reflect the gender of the experimenter in order to test the
influence of the subtle cue on performance. Unless otherwise noted, all of the data were analyzed using a 3 (Blatant
Cue) × 2 (Subtle Cue) ANOVA.
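As a rough illustration of the design just described, the sketch below collapses four experimenters into two gender levels and derives the degrees of freedom of the resulting 3 × 2 between-subjects ANOVA. The experimenter labels and the function name `anova_df` are hypothetical; the sample size of 110 comes from the Participants section.

```python
# Sketch under stated assumptions: experimenter IDs ("M1", "M2", "F1", "F2")
# are hypothetical labels; N = 110 is taken from the Participants section.
EXPERIMENTER_GENDER = {"M1": "male", "M2": "male", "F1": "female", "F2": "female"}

BLATANT_LEVELS = ("gender", "race", "control")
# Collapsing the four experimenters leaves two levels of the subtle cue.
SUBTLE_LEVELS = tuple(sorted(set(EXPERIMENTER_GENDER.values())))


def anova_df(n_total: int, a_levels: int, b_levels: int) -> dict:
    """Degrees of freedom for a two-way between-subjects ANOVA."""
    return {
        "A": a_levels - 1,
        "B": b_levels - 1,
        "AxB": (a_levels - 1) * (b_levels - 1),
        "error": n_total - a_levels * b_levels,
    }


df = anova_df(110, len(BLATANT_LEVELS), len(SUBTLE_LEVELS))
print(df)
```

With these inputs the error term has 104 degrees of freedom, which agrees with the denominators of the F ratios reported in the analyses that follow.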
Achievement: strokes
The number of strokes needed to complete each of the
eight holes of the golf course was summed to create one
overall performance score. The ANOVA revealed only a
significant main effect for the Blatant Cue manipulation,
F(2, 104) = 3.34, p < .04. As seen in Table 1, a planned contrast of the mean differences between each group showed
that when the task was framed as a measure of gender differences in athletic ability, female participants performed
significantly worse (M = 27.38, SD = 8.33) compared to
when the task was linked to racial differences in athletic
ability (M = 23.58, SD = 4.62) or to sports psychology
(M = 25.03, SD = 5.09), F(1, 104) = 5.67, p < .02. The difference between the racial-differences task frame and the
sport psychology control condition did not approach significance, F < 1. The subtle cue manipulation did not moderate the effect of blatant cue on achieved strokes, Blatant
Cue × Subtle Cue interaction F < 1. This latter finding does
not support the additive model of how blatant and subtle
cues impact performance. The main effect for the blatant
cue, however, supports both the ‘‘predominant’’ and ‘‘dual
process’’ predictions. Sorting these out requires analyzing
the effects of the subtle cue manipulation on achieved
accuracy.
Table 1
Average number of achieved and expected strokes required to complete
the course for the blatant threat cue conditions
Strokes     Blatant cue
            Gender    Race      Control
Achieved    27.38a    23.58b    25.03b    p < .05
Expected    25.90     23.58     24.15     p < .10
Higher numbers indicate a poorer performance. Different superscripts
indicate which means are significantly different from each other at p < .05.
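The planned contrast reported above compares the gender-differences frame with the other two frames. A minimal sketch of that comparison follows, assuming the conventional weights (2, -1, -1), which the article does not state; the means are the achieved-stroke means reported in the text.

```python
# Assumed contrast weights; the article reports the contrast but not the weights.
means = {"gender": 27.38, "race": 23.58, "control": 25.03}  # achieved strokes
weights = {"gender": 2, "race": -1, "control": -1}

# Weights of a contrast must sum to zero.
assert sum(weights.values()) == 0

# Weighted sum of the condition means.
contrast = sum(weights[k] * means[k] for k in means)
# Difference between the gender frame and the average of the other two frames.
mean_difference = contrast / 2

print(round(contrast, 2), round(mean_difference, 3))
```

The resulting difference of roughly three strokes is only descriptive; testing it requires the error term from the ANOVA, which cannot be recomputed from the reported means alone.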
Achievement: accuracy
The hole in which participants stopped the ball on the final putt was
scored by assigning the small, medium and large holes
a score of 1, 2 and 3, respectively. To create an overall measure of accuracy, the scores received on each final putt were
summed and averaged across the eight course layouts. The
ANOVA revealed a main effect for the Subtle Cue manipulation, F(1, 104) = 3.93, p < .05. As seen in Table 2,
female participants were on average less
accurate (i.e., stopped the ball in a larger hole) when the
experimenter was male (M = 2.02) compared to when the
experimenter was female (M = 1.88). Neither the main
effect of the Blatant Cue manipulation nor its interaction with the Subtle Cue manipulation reached significance, all ps > .12. When put together
with the data on achieved strokes, the results for achieved
accuracy provide support for the dual process model over
the ‘‘predominant’’ model of how multiple threat cues
impact performance.
Performance expectancies
Participants’ predictions for the number of strokes they
would need to complete each hole were summed and subjected to the ANOVA. The analysis revealed only a marginal main effect for the Blatant cue manipulation,
F(2, 104) = 2.26, p < .10. The data shown in Table 1 mirrored the number of strokes required to finish the course,
with participants told the task measured gender differences
in athletic ability making somewhat higher predictions
(M = 25.90) than those told the task measured racial differences in athletic ability (M = 23.18) or sports psychology
(M = 24.15). No other effects approached significance.
A similar analysis of which hole participants predicted
that the ball would stop in revealed only a significant main
effect for the Subtle Cue manipulation, F(1, 104) = 7.22,
p < .008. As shown in Table 2, female participants predicted they would be less accurate when the experimenter
was male (M = 2.22) compared to female (M = 2.20). Neither the main effect of the blatant cue
manipulation nor its interaction with the subtle cue manipulation reached significance, all ps > .12. As with the
achievement scores, the two sources of threat exerted independent influences on expectancies for strokes and accuracy on the overall course.
Table 2
Average achieved and expected accuracy on the last putt of each course
layout for the subtle threat cue conditions
Accuracy    Subtle cue
            Male experimenter    Female experimenter
Achieved    2.05                 1.91
Expected    2.23                 2.02
Lower numbers indicate higher accuracy (smaller hole).
Correlational analyses
Examination of the correlations within the stroke and
accuracy performance measures revealed that in general,
expectancies and achievement were moderately related.
For example, as seen in Table 3, the correlation between
expected and achieved strokes was moderate and significant, r(110) = .44, p < .0001, as was the correlation
between expected and achieved accuracy, r(110) = .60,
p < .0001. However, the performance measures were relatively less related to each other. For example, the correlation between achieved strokes and achieved accuracy was
significant but small, r(110) = .28, p < .003, while
expected strokes and expected accuracy were not related
to each other, r(110) = .03, p > .72. As predicted by a dual
process model, these patterns indicate that the two measures of performance were relatively independent of each
other at the within-subjects level.
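The significance of the correlations above can be checked with the usual t transformation of Pearson's r. The sketch below assumes df = N - 2 = 108 for the 110 participants (the text writes r(110), so this is an assumption) and uses only the reported coefficients; the function name `r_to_t` is hypothetical.

```python
import math


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))


N = 110  # sample size from the Participants section
reported = {
    "expected vs. achieved strokes": 0.44,
    "expected vs. achieved accuracy": 0.60,
    "achieved strokes vs. achieved accuracy": 0.28,
    "expected strokes vs. expected accuracy": 0.03,
}

for label, r in reported.items():
    print(f"{label}: r = {r:.2f}, t({N - 2}) = {r_to_t(r, N):.2f}")
```

At 108 degrees of freedom, values of |t| above roughly 1.98 correspond to p < .05 (two-tailed), so the first three correlations clear that threshold and the last does not, in line with the pattern described in the text.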
Discussion
The overall results provide support for a dual process
model of how multiple stereotype threat cues impact performance. When both blatant and subtle cues signal the
potential for a negative ingroup stereotype to characterize
the meaning of a poor performance, each source can induce
relatively independent processes that impact different
aspects of performance. The data suggest that blatant cues,
such as framing the task as a measure of a negative ingroup
stereotype, induced a prevention focus orientation whereby
targets became more conservative in their approach to the
task. However, their prevention strategies tended to interrupt the fluid processes that facilitate successful performance on this aspect of the task, and they performed
more poorly, even when an ingroup member (i.e., a female
experimenter) blatantly made the negative stereotype
salient.
Table 3
Bivariate correlations between the achieved and expected performance
measures
                        1         2       3         4
1. Achieved strokes     —
2. Expected strokes     .44***    —
3. Achieved accuracy    .28**     .10     —
4. Expected accuracy    .21*      .03     .60***    —
* p < .01.
** p < .005.
*** p < .0001.
The effect of the blatant cue on performance also supports the hypothesized link between athletic ability and
negative gender stereotypes about female athletes. Specifically, White female participants required more strokes to
finish the course, and therefore performed more poorly,
when the task was framed as measuring gender differences
in athletic ability, compared to when the task was framed
as measuring racial differences in athletic ability or a
non-stereotype relevant attribute (i.e., sports psychology).
This is consistent with the hypothesis that negative group
comparisons to men represent a more prominent concern
for White females who play sports relative to negative
group comparisons to females from other racial or ethnic
groups. Given the long history of gender differences and
inequities at all levels of international sports competition,
it is likely that poor athletic ability operates as a “universal” negative stereotype about female athletes, which,
when made salient by blatant cues in the performance context,
can reduce their performance in sports.
Also as predicted by the dual process model, the outgroup
gender of the experimenter caused participants to perform
less accurately on the final putt, and this occurred regardless
of how the task was framed. This supports the hypothesis
that when subtle threat cues are simultaneously present in
the situation, they are capable of influencing performance
through a separate mechanism. Subtle cues appear to operate primarily as distractions that create cognitive load
demands, which, in turn, influence those aspects of task performance that depend upon working memory capacity. Trying to stop the ball in the smallest hole on the last putt
requires substantial concentration, but if the cognitive processes are disrupted by thoughts about how one is being evaluated by an outgroup member, accuracy can suffer. Thus,
when more than one source of stereotype threat is present
in a performance situation, each source can impact different
aspects of performance through different processes (e.g.,
Strack & Deutsch, 2004).
The pattern of correlations between the two achievement and expectancy measures provided further evidence
for a dual process interpretation of the multiple cue effects.
First, the small correlation observed between achieved
strokes and achieved accuracy, and the near-zero correlation between expected strokes and expected accuracy, indicate that these two aspects of performance were processed
somewhat independently of each other. However, the moderate correlations between expected and achieved strokes,
and between expected and achieved accuracy on the final
putt, suggest that participants were consciously and deliberately processing their progress toward these goals. This is
not surprising given the sequential nature of the task; for
each hole on the course, participants started by making a
prediction for the number of strokes they would use and
also for how accurate they would be on the last putt. They
then played the hole, and subsequently used their achieved
strokes and accuracy from the previous hole to generate
predictions for these outcomes on the next hole. Nevertheless, the reciprocation between achievement and expectancy
on strokes and accuracy was negatively impacted
by different threat cues, suggesting that these conscious
strategic processes were operating through separate channels. The threat induced by the blatant cue caused lower
expectations and performance on one set of task-relevant
skills, while the threat induced by the subtle cue lowered
expectations and performance on a different set of task-relevant skills.
If so, then the observed relationship between expectancy
and performance also suggests that the dual process effect
of multiple threat cues may be limited to situations in
which each cue is processed consciously and deliberately.
An intriguing possibility is that under some conditions, targets may process one or more multiple threat cues in a relatively heuristic or implicit manner. Such may be the case,
for example, on tasks like a math test or other cognitive
performance measure, during which targets are not asked
to predict their performance on each item before they
attempt it. When one or more cues are processed through
less deliberate mechanisms, the impact of multiple cues
may operate as predicted by a predominant or additive
model. Thus, an important direction for future research
is to explore how the presence of multiple threat cues influences performance on other types of tasks while varying the
type of cue and how it is processed.
Another potential limitation to the current research is
that it focused on novice golfers. Beilock and colleagues
(2006) have argued that stereotype threat can impact the
performance of experts and novices on tasks that require
sensorimotor skills through different mechanisms. Specifically, because the lack of experience of novice golfers
requires them to concentrate more attention on their execution of the task, stereotype threat cues are most likely to
reduce their performance through distraction processes.
In contrast, because the performance of experts is more
proceduralized and automatic, the salience of a negative
stereotype is more likely to reduce performance by increasing their attention to the task via explicit monitoring or
“choking under pressure” processes. Indeed, studies have
shown that when a single blatant threat cue is made salient
before a golf-putting task, the performance of experts is
reduced, unless they simultaneously perform a second task
that distracts them from monitoring their execution of the
primary task (Beilock et al., 2006). This might predict that
when both blatant and subtle threat cues are salient, the
distracting presence of the subtle threat cue could improve
the performance of experts on a sensorimotor task like golf
putting. However, we believe subtle threat cues distract
because they represent a source of ambiguity about threat;
they attract attention because they represent a separate
source of evaluation apprehension. Thus, the distraction
processes induced by subtle threat cues are different from
those induced by backwards counting or other memory-intensive
tasks. We might then expect multiple threat cues
to cause experts to perform as poorly as novices, assuming
that both groups are highly engaged in doing well on the
task (Stone, 2002; Stone et al., 1999).
Finally, if multiple threat cues can operate as independent
sources of concern, then our findings have important implications for reducing the effect of negative stereotypes on the
performance of targets. For example, the data suggest that
the presence of a positive source cue, like an ingroup role
model, may not overcome the effect of a blatant threat cue
(Marx & Roman, 2002; Steele et al., 2002). Whereas it is possible that our female experimenters were not perceived to be
the type of “athletically gifted” role models that may imbue a
sense of confidence in novice female athletes (Marx &
Roman, 2002), another determining factor may be the attribute under investigation. For example, Li and colleagues
(2004) reported that females are more likely than men to view athleticism as a fixed entity. As the research by Aronson
and colleagues (2002) showed, viewing performance in a
domain as a fixed entity induces higher susceptibility to stereotype threat than perceiving the domain as malleable. Thus, if athletic ability is perceived by women to be
immutable to effort, blatant cues may cause them to suffer
stereotype threat and perform more poorly even when in
the presence of a female role model. Overcoming this problem likely necessitates framing gender differences in athletic
ability as amendable through effort or as irrelevant to how
performance in sports is evaluated.
In conclusion, sports, like the academic domains of
math, computer science and engineering, have a long history of conveying the message that women are less capable
than men. Consequently, the domain of sports is replete
with negative stereotypes about the athletic ability of
females that place them at risk for stereotype threat when
they perform a sports task. The current data show that this
can occur in two different ways: by the presence of a male
who is in a position to evaluate their performance, and by
explicit statements about the poor athletic ability of
females. Importantly, each source of threat appeared to
operate independently of the other to simultaneously impact
different aspects of performance, and potentially through
different mechanisms. These dual processes suggest that
in some performance situations, stigmatized targets may
be forced to cope with more than one social identity threat
at a time while they attempt to show their potential.
Acknowledgments
The authors thank Stephanie Claudio, Kate Waliszewski, Ross Parnell and Scott Shanks for serving as experimenters on the project. We are also indebted to Anna
Chalabaev, Anna Woodcock, Toni Schmader, Joel Cooper, and Mark Zanna for their insightful comments on this
work.
References
Aronson, J., Fried, C. B., & Good, C. (2002). Reducing the effects of
stereotype threat on African-American college students by reshaping
theories of intelligence. Journal of Experimental Social Psychology, 38,
113–125.
Beilock, S. R., Jellison, W. A., Rydell, R. J., McConnell, A. R., & Carr, T.
H. (2006). On the causal mechanism of stereotype threat: Can skills
that don’t rely heavily on working memory still be threatened?
Personality and Social Psychology Bulletin, 32, 1059–1071.
Biernat, M., & Vescio, T. K. (2002). She swings, she hits, she’s great, she’s
benched: implications for gender based shifting standards for judgment
and behavior. Personality and Social Psychology Bulletin, 28, 66–77.
Croizet, J. C., Despres, G., Gauzins, M., Huguet, P., Leyens, J., & Meot,
A. (2004). Stereotype threat undermines intellectual performance by
triggering a disruptive mental load. Personality and Social Psychology
Bulletin, 30, 721–731.
Danso, H. A., & Esses, V. M. (2001). Black experimenters and the
intellectual test performance of White participants: the tables are
turned. Journal of Experimental Social Psychology, 37, 158–165.
Fredrickson, B. L., & Harrison, K. (2005). Throwing like a girl: self-objectification predicts adolescent girls’ motor performance. Journal of
Sport & Social Issues, 29, 79–101.
Inzlicht, M., & Ben-Zeev, T. (2000). A threatening intellectual
environment: why females are susceptible to experiencing problem-solving deficits in the presence of males. Psychological Science, 11,
365–371.
Keller, J., & Dauenheimer, D. (2003). Stereotype threat in the
classroom: Dejection mediates the disrupting threat effect on
women’s math performance. Personality and Social Psychology
Bulletin, 29, 371–381.
Knight, J. L., & Giuliano, T. A. (2001). He’s a Laker; she’s a looker: the
consequences of gender-stereotyped portrayals of male and female
athletes by the print media. Sex Roles, 45, 217–229.
Li, W., Harrison, L., & Solmon, M. (2004). College students’ implicit
theories of ability in sports: race and gender differences. Journal of
Sport Behavior, 27, 291–304.
Marx, D. M., & Goff, P. A. (2005). Clearing the air: the effect of
experimenter race on target’s test performance and subjective experience. British Journal of Social Psychology, 44, 645–657.
Marx, D. M., & Roman, J. S. (2002). Female role models: protecting
women’s math test performance. Personality and Social Psychology
Bulletin, 28, 1183–1193.
Roper, E. A. (2002). Women working in the applied domain: examining
the gender bias in applied sport psychology. Journal of Applied Sport
Psychology, 14, 53–66.
Sailes, G. A. (1996). An investigation of campus stereotypes: the myth of
Black athletic superiority and the dumb jock stereotype. In R. E.
Lapchick (Ed.), Sport in society: Equal opportunity or business as usual?
(pp. 193–202). Thousand Oaks, CA: Sage.
Schmader, T., & Johns, M. (2003). Converging evidence that stereotype
threat reduces working memory capacity. Journal of Personality and
Social Psychology, 85, 440–452.
Seibt, B., & Forster, J. (2004). Stereotype threat and performance: how
self-stereotypes influence processing by inducing regulatory foci.
Journal of Personality and Social Psychology, 87, 38–56.
Sekaquaptewa, D., & Thompson, M. (2002). Solo status, stereotype
threat, and performance expectancies: their effects on women’s
performance. Psychological Science, 39, 68–74.
Spencer, S. J., Steele, C. M., & Quinn, D. M. (1999). Stereotype threat and
women’s math performance. Journal of Experimental Social Psychology, 35, 4–28.
Steele, C. M. (1997). A threat in the air: how stereotypes shape intellectual
identity and performance. American Psychologist, 52, 613–629.
Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual
test performance of African Americans. Journal of Personality and
Social Psychology, 69, 797–811.
Steele, C. M., Spencer, S. J., & Aronson, J. (2002). Contending with group
image: the psychology of stereotype and social identity threat. In M. P.
Zanna (Ed.), Advances in experimental social psychology (Vol. 34,
pp. 379–440). San Diego: Erlbaum.
Stone, J. (2002). Battling doubt by avoiding practice: the effects of
stereotype threat on self-handicapping in White athletes. Personality
and Social Psychology Bulletin, 28, 1667–1678.
Stone, J., Lynch, C. I., Sjomeling, M., & Darley, J. M. (1999). Stereotype
threat effects on Black and White athletic performance. Journal of
Personality and Social Psychology, 77, 1213–1227.
Stone, J., Perry, Z. W., & Darley, J. M. (1997). “White men can’t jump”:
evidence for the perceptual confirmation of racial stereotypes following
a basketball game. Basic and Applied Social Psychology, 19, 291–306.
Strack, F., & Deutsch, R. (2004). Reflective and impulsive determinants of social behavior. Personality and Social Psychology Review,
8, 220–247.
Tuggle, C. A., & Owen, A. (1999). A descriptive analysis of NBC’s
coverage of the centennial Olympics. Journal of Sport and Social Issues,
23, 171–182.
Eagly, A. H., & Carli, L. L. (2007). Through the labyrinth:
The truth about how women become leaders. Boston: Harvard Business School Press.
Eagly, A. H., & Chaiken, S. (1993). The psychology of
attitudes. Fort Worth, TX: Harcourt Brace Jovanovich.
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Eagly, A. H., & Chaiken, S. (1998). Attitude structure and
function. In D. T. Gilbert, S. T. Fiske, & G. Lindzey
(Eds.), The handbook of social psychology (4th ed., Vol. 1,
pp. 269–322). New York: McGraw-Hill.
Eagly, A. H., Chen, S., Chaiken, S., & Shaw-Barnes, K.
(1999). The impact of attitudes on memory: An affair to
remember. Psychological Bulletin, 125, 64–89.
Eagly, A. H., & Crowley, M. (1986). Gender and helping
behavior: A meta-analytic review of the social psychological literature. Psychological Bulletin, 100, 283–308.
Eagly, A. H., Diekman, A. B., Johannesen-Schmidt, M. C.,
& Koenig, A. M. (2004). Gender gaps in sociopolitical
attitudes: A social psychological analysis. Journal of Personality and Social Psychology, 87, 796–816.
Eagly, A. H., Johannesen-Schmidt, M. C., & van Engen,
M. (2003). Transformational, transactional, and laissez-faire
leadership styles: A meta-analysis comparing women and
men. Psychological Bulletin, 129, 569–591.
Eagly, A. H., & Johnson, B. T. (1990). Gender and leadership style: A meta-analysis. Psychological Bulletin, 108,
233–256.
Eagly, A. H., & Karau, S. (1991). Gender and the emergence of leaders: A meta-analysis. Journal of Personality
and Social Psychology, 60, 685–710.
Eagly, A. H., & Karau, S. J. (2002). Role congruity theory
of prejudice toward female leaders. Psychological Review,
109, 573–598.
Eagly, A. H., Karau, S. J., & Makhijani, M. G. (1995).
Gender and the effectiveness of leaders: A meta-analysis.
Psychological Bulletin, 117, 125–145.
Eagly, A. H., & Kite, M. E. (1987). Are stereotypes of
nationalities applied to both women and men? Journal of
Personality and Social Psychology, 53, 451–462.
Eagly, A. H., & Steffen, V. J. (1986). Gender and aggressive
behavior: A meta-analytic review of the social psychological
literature. Psychological Bulletin, 100, 309–330.
Eagly, A. H., Wood, W., & Chaiken, S. (1978). Causal
inferences about communicators and their effect on opinion
change. Journal of Personality and Social Psychology, 36,
424–435.
Wood, W., & Eagly, A. H. (2002). A cross-cultural analysis of the behavior of women and men: Implications for the
origins of sex differences. Psychological Bulletin, 128,
699–727.
Wood, W., & Eagly, A. H. (in press). Gender. In S. T.
Fiske, D. T. Gilbert, & G. Lindzey (Eds.), Handbook of
social psychology (5th ed.). New York: Wiley.
The His and Hers of Prosocial Behavior: An
Examination of the Social Psychology of
Gender
Alice H. Eagly
Northwestern University
Prosocial behavior consists of behaviors regarded as
beneficial to others, including helping, sharing, comforting,
guiding, rescuing, and defending others. Although women
and men are similar in engaging in extensive prosocial
behavior, they are different in their emphasis on particular
classes of these behaviors. The specialty of women is
prosocial behaviors that are more communal and
relational, and that of men is behaviors that are more
agentic and collectively oriented as well as strength
intensive. These sex differences, which appear in research
in various settings, match widely shared gender role
beliefs. The origins of these beliefs lie in the division of
labor, which reflects a biosocial interaction between male
and female physical attributes and the social structure. The
effects of gender roles on behavior are mediated by
hormonal processes, social expectations, and individual
dispositions.
Editor’s Note
Alice H. Eagly received the Award for Distinguished Scientific Contributions. Award winners are invited to deliver an
award address at the APA’s annual convention. A version
of this award address was delivered at the 117th annual
meeting, held August 6–9, 2009, in Toronto, Ontario, Canada. Articles based on award addresses are reviewed, but
they differ from unsolicited articles in that they are expressions of the winners’ reflections on their work and their
views of the field.
November 2009 ● American Psychologist
Keywords: prosocial behavior, gender, sex differences, altruism, helping
Gender fascinates the public and scientists alike, inspiring
continuing debate about how nature and nurture intertwine
in influencing female and male behavior. The fact that the
keyword gender garnered 24,169 hits in 2000–2008 in the
PsycINFO database shows the thriving state of scholarship
on gender. These publications contain an abundance of information about male–female similarities and differences.
Although the aggregation of large amounts of such information in meta-analyses or other summaries is useful, such
approaches can also be limiting. If the puzzles of gender
are to be solved, the integration of male–female comparisons must be coordinated with effective theory. In its absence, variation in the direction and magnitude of these
differences and similarities can appear to be random and
can even give the impression that gender has little or no
effect on behavior. Yet, the experiences and observations
of everyday life suggest that gender remains a multifaceted
system of influences on personal choices, social interaction,
and societal institutions. In this article, I examine how
these influences operate in one domain of human behavior.
This domain is prosocial behavior, which consists of
behaviors consensually regarded as beneficial to others.
It includes actions such as helping, sharing, comforting,
guiding, rescuing, and defending (Batson, 1998;
Dovidio, Piliavin, Schroeder, & Penner, 2006). Much
prosocial behavior is directed to helping individuals, but
it can be directed as well to supporting a collective,
such as a group, organization, or nation. Although such
actions are not necessarily altruistic in the sense of being devoid of self-oriented motivation, they deliver help
to others.
A simple first question might be whether there is a more
helpful sex. If armchair analysis answers this question,
one’s first thoughts, be they implicit or explicit, might well
reflect gender stereotypes that ascribe kindness and concern
with others more to women than to men (e.g., Diekman &
Goodfriend, 2006; Williams & Best, 1990). Yet, probing
for second thoughts should bring to mind examples of
helpful men. What about heroic men who take enormous
risks for others and warriors who protect their tribe or nation from external assault? Given these disparate images, a
first step toward understanding the prosocial behavior of
women and men involves an examination of gender roles.
Subsequent steps involve explaining the origins of gender
roles and the processes by which they affect behavior.
Gender Roles as a Tool for Understanding Prosocial
Behavior
Elementary insights about social behavior follow from
scrutiny of a society’s gender roles, which are the shared
beliefs that apply to individuals on the basis of their socially
identified sex (Eagly, 1987). Gender role beliefs are
both descriptive and prescriptive in that they indicate what
men and women usually do and what they should do. The
descriptive aspect of gender roles, or stereotypes, tells people what is typical for their sex. Especially if a situation is
ambiguous or confusing, people tend to enact sex-typical
behaviors. The prescriptive aspect of gender roles tells people what is considered admirable for their sex in their cultural context. People may enact these desirable behaviors
to gain social approval or bolster their own esteem. To
varying extents, gender role beliefs are embedded both in
others’ expectations, thereby acting as social norms, and in
individuals’ internalized gender identities, thereby acting as
personal dispositions (Wood & Eagly, 2009, in press).
These culturally shared beliefs provide a general framework for understanding why male and female behavior can
be different or similar, depending on the behavior and its
circumstances.
Gender role beliefs imply different prosocial behaviors
for women and men. Following concepts introduced by
Bakan (1966), most beliefs about men and women can be
summarized in two dimensions, which are most often labeled communion, or connection with others, and agency,
or self-assertion. Women, more than men, are thought to
be communal—that is, friendly, unselfish, concerned with
others, and emotionally expressive. Men, more than
women, are thought to be agentic—that is, masterful, assertive, competitive, and dominant (e.g., Newport, 2001;
Spence & Buckner, 2000). Studies of gender stereotypes
have consistently found that their content is heavily saturated with communion and agency, with more minor
themes pertaining to other qualities (e.g., Kite, Deaux, &
Haines, 2007). This predominance of communion and
agency is widespread in world cultures (Williams & Best,
1990). To understand the relevance of these beliefs for
prosocial behavior, it is helpful to consider their implications for the types of social bonds that people form.
Social bonds can take a relational form by linking people to particular others in close relationships or a collective
form by linking people to groups and organizations
(Brewer & Gardner, 1996). This distinction between relational and collective interdependence corresponds to the
communal and agentic dimensions of gender stereotypes
(Gardner & Gabriel, 2004). By ascribing warm, sympathetic, and kind qualities to women, gender role beliefs
imply that women have a propensity for bonding with others in close, dyadic relationships. Expressive, affectionate
qualities facilitate friendships, romantic relationships, and
family relationships and convey cooperative interdependence with others (Fiske, Cuddy, Glick, & Xu, 2002).
In contrast, by ascribing assertive, ambitious, and competitive qualities to men, gender role beliefs imply a social
context in which people differ in status and men strive to
improve their hierarchical position (Baumeister & Sommer,
1997; Gardner & Gabriel, 2004). Such qualities are consistent with men’s directing of much of their prosocial behavior to collectives (Gilmore, 1990). Although independence
is also one of the agentic qualities commonly ascribed to
men, demonstrating a degree of independence in a group
setting can produce influence (Moscovici & Nemeth, 1974;
Shackelford, Wood, & Worchel, 1996) and potentially provide an opportunity for leadership (Eagly, Wood, & Fishbaugh, 1981). In general, superior social status is conveyed
by the agentic attributes ascribed to men, such as being
dominant and masterful (Ridgeway & Bourg, 2004), even
though these attributes are not as favorably evaluated as
the communal attributes ascribed to women (Eagly & Mladinic, 1994; Langford & MacKinnon, 2000).
In the next section of this article I classify prosocial
behaviors according to their agentic or communal emphasis. A gender role analysis suggests that prosocial behaviors are more common in women to the extent that these
behaviors have primarily a communal focus and more common in men to the extent that they have primarily an agentic focus. A corollary of this prediction is that prosocial
behaviors are more common in women if they have a relational emphasis (e.g., supporting or caring for an individual). A second corollary is that prosocial behaviors are
more common in men if they have a collective emphasis,
facilitate gaining status, or imply higher status. Yet another
consideration is that some differences in male and female
behavior reflect sex differences in physical size and
strength. Women’s lesser physical prowess can act as a
deterrent to their participation in highly strength-intensive
activities, which include some prosocial behaviors (Wood
& Eagly, 2002, in press).
These predictions should be understood as implying not
dichotomous male–female differences but general trends
(or main effects of participant sex) that emerge across situational and other individual factors that also affect prosocial behavior and that can moderate or compete with the
effects of gender roles. The logic of prediction for gender
effects is thus similar to that for other personal characteristics (see Leary & Hoyle, 2009). In particular, gender roles
influence behavior in conjunction with many other roles,
including those associated with other group memberships
(e.g., ethnicity, religion) and specific obligations (e.g., family, occupation).
Despite the myriad of influences on social behaviors,
gender roles are important, acting in part through others’
expectations and broader social norms. These external pressures range from subtle (e.g., stereotype threat) to obvious
(e.g., laws or norms forbidding one sex access to certain
roles or opportunities). Gender roles also act through individuals’ personal identification with their gender and are
intertwined with hormonal processes that facilitate masculine and feminine behavior (Wood & Eagly, 2009). In addition, all behaviors are contextually situated, and this context
can influence the salience of gender norms and the
accessibility of gender identities (e.g., Deaux & Major,
1987; Piliavin & Unger, 1985).
A convenient organization of trends in agentic and communal prosocial behavior classifies findings by their social
context: interactions with strangers, interactions in close
relationships, interactions in workplaces, and interactions in
other social settings. Meta-analyses are informative, as are
archival data and individual field and laboratory studies.
Invoking these rich sources of data, in the next section I
report male–female differences and similarities, organized
by gender role beliefs and social context. In a subsequent
section (The Origin and Consequences of Gender Roles), I
consider the causal relations in which these beliefs and
behaviors are embedded.
Male–female comparisons from meta-analyses appear in
this article as averaged findings in the d metric, defined as the
difference between the male and female mean values divided
by the pooled standard deviation (see Borenstein, Hedges,
Higgins, & Rothstein, 2009). Effect sizes from single studies,
which are less reliable, are omitted. In contemplating the effect sizes, readers should keep in mind that the cumulative
impact of small effects can be considerable. This insight was
compellingly explained by Abelson (1985, p. 133), who concluded that “small variance contributions of independent variables in single-shot studies grossly understate the variance
contribution in the long run” (see also Epstein, 1980;
Rosenthal, 1990). If studies’ measures are not “single-shot”
but are appropriately aggregated across multiple observations
of behaviors, effect magnitudes are generally larger.
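As a rough illustration of the d metric described above, the following Python sketch computes the standardized mean difference for two hypothetical groups of scores. The function name `cohens_d` and the sample scores are assumptions introduced here for illustration only; they do not come from any of the studies cited. The sign convention follows the definition in the text (male mean minus female mean), so a positive d indicates higher male scores. Published meta-analyses typically also apply small-sample corrections and study weighting (see Borenstein et al., 2009), which this sketch omits.

```python
import math
from statistics import mean, variance


def cohens_d(male_scores, female_scores):
    """Standardized mean difference: (male mean - female mean) / pooled SD.

    Illustrative helper (hypothetical name); assumes each group has at
    least two scores and that the pooled variance is nonzero.
    """
    n_m, n_f = len(male_scores), len(female_scores)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_m - 1) * variance(male_scores) + (n_f - 1) * variance(female_scores)
    ) / (n_m + n_f - 2)
    return (mean(male_scores) - mean(female_scores)) / math.sqrt(pooled_var)


# Hypothetical helping scores, invented for this example.
male = [3, 4, 5, 6, 7]
female = [2, 3, 4, 5, 6]
d = cohens_d(male, female)
print(round(d, 2))  # about 0.63: the male mean is higher by roughly 0.63 pooled SDs
```

Swapping the argument order flips the sign of the result, which is how a negative value such as those reported for some comparisons would indicate higher female scores.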
Given these considerations, the most relevant baseline
for interpreting effect magnitudes for prosocial behavior
incorporates the methodological characteristics of its typical research paradigms. In this domain, single-shot studies
are common, depressing effect magnitudes. It is therefore
not surprising that averaging the effects from all available
meta-analyses of prosocial behaviors in social psychology,
regardless of hypothesis, yielded a d of only 0.37 (Richard,
Bond, & Stokes-Zoota, 2003).
Research Comparing Female and Male Prosocial
Behavior
Interactions With Strangers
Helping strangers, a domain that includes many agentic
behaviors, became a focus of social psychological research
in the wake of Darley and Latané’s (1968) research addressing the failure of bystanders to intervene in the infamous Kitty Genovese murder. Social psychologists then
carried out numerous field and laboratory experiments on
helping behavior (see Batson, 1998; Dovidio et al., 2006).
Many of these researchers, like Darley and Latané, studied
bystander interventions in emergency situations in which
another person appeared to be distressed or endangered
(e.g., helping a man who fell in the subway). Other types
of helping that attracted experimentation included assistance in response to requests (e.g., giving someone money
for the subway) as well as polite behaviors (e.g., helping
someone pick up dropped packages).
A meta-analysis of these experiments revealed that in
general men helped more than women (d = 0.34, Eagly &
Crowley, 1986; see Johnson et al., 1989, for cross-cultural
replication with a self-report questionnaire). Although all
of the behaviors assessed in these experiments required
some attentiveness to the needs of others, only a portion
required taking the initiative, thus calling on the assertive
qualities central to the male gender role. Therefore, the
studies were classified by whether a need merely presented
itself to bystanders (e.g., through observation that someone
was ill or distressed) or an explicit request to help was directed to them (e.g., an appeal for a charity donation).
When a need is merely present, helpers assert themselves
to deliver aid, whereas when a request is made, helpers
acquiesce to someone else’s wishes. A finding consistent
with the agentic theme of the male gender role was that
the tendency for men to help more than women was considerably larger when helpers had to take the initiative (d = 0.55) than when helpers
had to acquiesce to a request (d = 0.07).
Many of these helping behaviors drew on agency’s implications for status—that is, the common, albeit eroding,
expectation that men are dominant over women. In a
prosocial context, male dominance implies directing benevolent protectiveness and politeness toward women. Men are
expected not only to protect women from dangers but to
deliver acts of courtesy such as helping them put on their
coats. With cultural roots in medieval codes of chivalry,
such norms have survived in common paternalistic beliefs
and behaviors (Glick & Fiske, 2001).
Aspects of the helping behavior findings suggest male
chivalry. Specifically, in experiments that had divided data
by the sex of the person receiving aid, men helped more
than women for female recipients of help (d = 0.27); this
effect slightly reversed for male recipients (d = −0.08,
Eagly & Crowley, 1986). In a finding consistent with the
idea that men’s helping is driven in part by social norms
that can be made salient by others’ presence, another analysis showed that the tendency for men to help more than
women was substantial when the potential helpers were in
the presence of onlookers (d = 0.74) but not when they
were the only bystander (d = −0.02).
Some prosocial behaviors, often labeled heroic, require
that the helper take considerable personal risk to aid another person (Becker & Eagly, 2004). Heroic acts of rescuing others in emergencies are consistent with the male gender role in that they are highly agentic in their requirement
for quick and decisive intervention that often places the
rescuer’s own life at risk. Many such actions are also aided by men’s greater size and strength, as suggested by the
larger physical size of interveners than of noninterveners in
crimes and emergencies (Huston, Ruggiero, Conner, &
Geis, 1981).
Relevant archival data come from the Carnegie Hero
Fund Commission (2009), which recognizes individuals
who voluntarily risk their own lives while saving or attempting to save the life of another person. People whose
job roles or parental responsibilities require acts of rescuing are ineligible for this recognition. Men have received
the great majority of these heroism awards (91% in 1904–2008),
and there is no evidence of systematic change in
this distribution over the years (e.g., 92% men in 2004–2008;
W. F. Rutkowsky, Executive Director of the Carnegie Hero Fund Commission, personal communication, May
27, 2009). This disproportion is very unlikely to reflect a
bias against honoring eligible women (see Becker & Eagly,
2004). Replication of this pattern has emerged from the
Canadian government’s awarding of a similar Medal of
Bravery; 87% of these awards in 2004–2008 have honored
men (Governor General of Canada, 2009). In addition, men
have strongly predominated in contemporary newspaper
accounts of heroic interventions (Lyons, 2005) and among
people recognized for intervening in dangerous criminal
events such as muggings and bank holdups (e.g., Huston et
al., 1981). Also, in the social psychological helpin...