6
Analysis of Variance
Chapter Learning Objectives
After reading this chapter, you should be able to do the following:
1. Explain why it is a mistake to analyze the differences between more than two groups with
multiple t tests.
2. Relate sum of squares to other measures of data variability.
3. Compare and contrast t test with analysis of variance (ANOVA).
4. Demonstrate how to determine significant differences among groups in an ANOVA with more
than two groups.
5. Explain the use of eta squared in ANOVA.
© 2016 Bridgepoint Education, Inc. All rights reserved. Not for resale or redistribution.
Introduction
From one point of view at least, R. A. Fisher was present at the creation of modern statistical
analysis. During the early part of the 20th century, Fisher worked at an agricultural research station in rural southern England. Analyzing the effect of pesticides and fertilizers on crop yields,
he was stymied by independent t tests that allowed him to compare only two samples at a time.
In the effort to accommodate more comparisons, Fisher created analysis of variance (ANOVA).
Like William Gosset, Fisher felt that his work was important enough to publish, and like Gosset, he met opposition. Fisher’s came in the form of a fellow statistician, Karl Pearson. Pearson
founded the first department of statistical analysis in the world at University College, London.
He also began publication of what is—for statisticians at least—perhaps the most influential
journal in the field, Biometrika. The crux of the initial conflict between Fisher and Pearson was
the latter’s commitment to making one comparison at a time, with the largest groups possible.
When Fisher submitted his work to Pearson’s journal, suggesting that samples can be small
and many comparisons can be made in the same analysis, Pearson rejected the manuscript. So
began a long and increasingly acrimonious relationship between two men who became giants
in the field of statistical analysis and who nonetheless ended up in the same department at
University College. Gosset also gravitated to the department but managed to get along with
both of them. Joined a little later by Charles Spearman, these men collectively made enormous
contributions to quantitative research and laid the foundation for modern statistical analysis.
6.1 One-Way Analysis of Variance
In an experiment, measurements can vary for a variety of reasons. Consider a study to determine whether children will emulate adult behavior observed in a video recording: it attributes differences in behavior to whether or not the children were exposed to the recording.
The independent variable (IV) is whether the children have seen the video. Although changes in
behavior (the DV) show the IV's effect, they can also reflect a variety of other factors. Perhaps
differences in age among the children prompt behavioral differences, or maybe variety in their
background experiences prompts them to interpret what they see differently. Changes in the
subjects’ behavior not stemming from the IV constitute what is called error variance.
When researchers work with human subjects, some level of error variance is inescapable.
Even under tightly controlled conditions where all members of a sample receive exactly the
same treatment, the subjects are unlikely to respond identically because subjects are complex
enough that factors besides the IV are involved. Fisher’s approach was to measure all the variability in a problem and then analyze it, thus the name analysis of variance.
Any number of IVs can be included in an ANOVA. Initially, we are interested in the simplest form of the test, one-way ANOVA. The "one" in one-way ANOVA refers to the number of independent variables, and in that regard, one-way ANOVA is similar to the independent t test. Both employ just one IV. The difference is that in the independent t test the IV has just two groups, or levels, whereas ANOVA can accommodate any number of groups greater than one.

Try It!: #1
To what does the "one" in one-way ANOVA refer?
ANOVA Advantage
The ANOVA and the t test both answer the same question: Are there significant differences between groups?
When one sample is compared to a population (in the
study of whether social science students study significantly different numbers of hours than do all university students), we used the one-sample t test. When
two groups are involved (in the study of whether
problem-solving measures differ for married people
than for divorced people), we used the independent
t test. If the study involves more than two groups (for
example, whether working rural, semirural, suburban,
and urban adults completed significantly different
numbers of years of post-secondary education), why
not just conduct multiple t tests?
[Photo: Joanna Zielska/Hemera/Thinkstock. If a researcher is analyzing how children's behavior changes as a result of watching a video, the independent variable (IV) is whether the children have viewed the video. A change in behavior is the dependent variable (DV), but any behavior changes other than those stemming from the IV reflect the presence of error variance.]
Suppose someone develops a group-therapy program
for people with anger management problems. The
research question is Are there significant differences in
the behavior of clients who spend (a) 8, (b) 16, and (c)
24 hours in therapy over a period of weeks? In theory,
we could answer the question by performing three t tests as follows:
1. Compare the 8-hour group to the 16-hour group.
2. Compare the 16-hour group to the 24-hour group.
3. Compare the 8-hour group to the 24-hour group.
The Problem of Multiple Comparisons
The three tests enumerated above represent all possible comparisons, but this approach presents two problems. First, all possible comparisons are a good deal more manageable with three
groups than, say, five groups. With five groups (labeled a through e) the number of comparisons
needed to cover all possible comparisons increases to 10, as Figure 6.1 shows. As the number of
comparisons to make increases, the number of tests required quickly becomes unwieldy.
Figure 6.1 Comparisons needed for five groups
Comparing Group A to Group B is comparison 1. Comparing Group D to Group E would be the tenth
comparison necessary to make all possible comparisons.
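The count of required pairwise tests grows quickly with the number of groups: for k groups it is k(k − 1)/2, the number of ways to choose two groups from k. A minimal sketch of that arithmetic (the function name is ours):

```python
from math import comb

def pairwise_comparisons(k):
    """Number of two-group t tests needed to compare every pair among k groups."""
    return comb(k, 2)  # equivalently k * (k - 1) // 2

print(pairwise_comparisons(3))  # 3
print(pairwise_comparisons(5))  # 10, as in Figure 6.1
```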
The second problem with using t tests to make all possible comparisons is more subtle. Recall
that the potential for type I error (α) is determined by the level at which the test is conducted.
At p = 0.05, any significant finding will be a type I error an average of 5% of the time.
However, that error probability assumes each test is entirely independent, meaning each
analysis is based on data collected from new subjects. If statistical testing is performed
repeatedly with the same data, the potential for type I error does not remain fixed at 0.05 (or
whatever level was selected), but grows. In fact, if 10 tests are conducted in succession with
the same data, as with the groups labeled a through e above, and each finding is significant, by
the time the 10th test is completed, the potential for alpha error grows to 0.40 (see Sprinthall,
2011, for how to perform the calculation). Using multiple t tests is therefore not a good option.
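The inflation of type I error across repeated tests can be illustrated with the formula 1 − (1 − α)^m, the probability of at least one type I error across m independent tests at level α; applied to the 10-test case it reproduces the 0.40 figure cited above. A sketch (the function name is ours):

```python
def familywise_alpha(alpha, m):
    """Chance of at least one type I error across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

print(round(familywise_alpha(0.05, 1), 2))   # 0.05 for a single test
print(round(familywise_alpha(0.05, 10), 2))  # 0.4, the inflated rate for ten tests
```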
Variance in Analysis of Variance
When scores in a study vary, there are two potential explanations: the effect of the independent variable (the “treatment”) and the influence of factors not controlled by the researcher.
This latter source of variability is the error variance mentioned earlier.
The test statistic in ANOVA is called the F ratio (named for Fisher). The F ratio is treatment
variance divided by error variance. As was the case with the t ratio, a large F ratio indicates
that the difference among groups in the analysis is not random. When the F ratio is small and
not significant, it means the IV has not had enough impact to overcome error variability.
Variance Among and Within Groups
If three groups of the same size are all selected from one population, they could be represented
by the three distributions in Figure 6.2. They do not have exactly the same mean, but that is
because even when they are selected from the same population, samples are rarely identical.
Those initial differences among sample means indicate some degree of sampling error.
The reason that each of the three distributions has width is that differences exist within each
of the groups. Even if the sample means were the same, individuals selected for the same
sample will rarely manifest precisely the same level of whatever is measured. If a population
is identified—for example, a population of the academically gifted—and a sample is drawn
from that population, the individuals in the sample will not all have the same level of ability
despite the fact that all are gifted students. The subjects within the sample will still differ in
academic ability. These differences within groups are the evidence of error variance.
The treatment effect is represented in how the IV affects what is measured, the DV. For example, three groups of subjects are administered different levels of a mild stimulant (the IV) to
see the effect on level of attentiveness. The subsequent analysis will indicate whether the
samples still represent populations with the same mean, or whether, as is suggested by the
distributions in Figure 6.3, they represent unique populations.
The within-groups variability in these three distributions is the same as it was in the distributions in Figure 6.2. It is the among-groups variability that makes Figure 6.3 different. More
specifically, the difference between the group means is what has changed. Although some of
the difference remains from the initial sampling variability, differences between the sample
means after the treatment are much greater. F allows us to determine whether those differences are statistically significant.
Figure 6.2: Three groups drawn from the same population
A sample of three groups from the same population will have similar—but not identical—
distributions, where differences among sample means are a result of sampling error.
Figure 6.3: Three groups after the treatment
Once a treatment has been applied to sample groups from the same population, differences between
sample means greatly increase.
The Statistical Hypotheses in One-Way ANOVA
The statistical hypotheses are very much like they were for the independent t test, except that
they accommodate more groups. For the t test, the null hypothesis is written
H0: µ1 = µ2
It indicates that the two samples involved were drawn from populations with the same mean.
For a one-way ANOVA with three groups, the null hypothesis has this form:
H0: µ1 = µ2 = µ3
It indicates that the three samples were drawn from populations with the same mean.
Things have to change for the alternate hypothesis, however, because three groups do not
have just one possible alternative. Note that each of the following is possible:
a. HA: µ1 ≠ µ2 = µ3
Sample 1 represents a population with a mean value different from the mean of
the population represented by Samples 2 and 3.
b. HA: µ1 = µ2 ≠ µ3
Samples 1 and 2 represent a population with a mean value different from the
mean of the population represented by Sample 3.
c. HA: µ1 = µ3 ≠ µ2
Samples 1 and 3 represent a population with a mean value different from the
population represented by Sample 2.
d. HA: µ1 ≠ µ2 ≠ µ3
All three samples represent populations with different means.
Because the several possible alternative outcomes multiply rapidly when the number of groups increases, a more general alternate hypothesis is given. Either all the groups involved come from populations with the same means, or at least one of them does not. So the form of the alternate hypothesis for an ANOVA with any number of groups is simply HA: not so.

Try It!: #2
How many t tests would it take to make all possible pairs of comparisons in a procedure with six groups?
Measuring Data Variability in the One-Way ANOVA
We have discussed several different measures of data variability to this point, including the
standard deviation (s), the variance (s2), the standard error of the mean (SEM), the standard
error of the difference (SEd), and the range (R). Analysis of variance presents a new measure
of data variability called the sum of squares (SS). As the name suggests, it is the sum of the
squared values. In the ANOVA, SS is the sum of the squares of the differences between scores
and means.
• One sum-of-squares value involves the differences between individual scores and
the mean of all the scores in all the groups. This is called the sum of squares
total (SStot) because it measures all variability from all sources.
• A second sum-of-squares value indicates the difference between the means of the
individual groups and the mean of all the data. This is the sum of squares between
(SSbet). It measures the effect of the IV, the treatment effect, as well as any
differences between the groups that existed before the study.
• A third sum-of-squares value measures the difference between scores in the samples
and the means of those samples. These sum of squares within (SSwith) values
reflect the differences among the subjects in a group, including differences in the
way subjects respond to the same stimulus. Because this measure is entirely error
variance, it is also called the sum of squares error (SSerr).
All Variability from All Sources: Sum of Squares Total (SStot)
An example to follow will explore differences in the levels of social isolation felt by people in
small towns, suburban areas, and urban areas. The SStot will be the amount of variability in
social isolation measures across all three circumstances: small towns, suburban areas, and
urban areas.
There are multiple formulas for SStot. Although they all provide the same answer, some make
the logic easier to see, while others are easier to follow when straightforward calculation is
the goal. The heart of SStot is the difference between each individual score (x) and
the mean of all scores, called the “grand” mean (MG). In the example to come, MG is the mean
of all social isolation measures from people in all three groups. The formula we will use to
calculate SStot follows.
Formula 6.1
SStot = ∑(x − MG)²
where
x = each score in all groups
MG = the mean of all data from all groups, the "grand" mean
To calculate SStot, follow these steps:
1. Sum all scores from all groups and divide by the number of scores to determine the
grand mean, MG.
2. Subtract MG from each score (x) in each group, and then square the difference:
(x − MG)²
3. Sum all the squared differences: ∑(x − MG)²
The Treatment Effect: Sum of Squares Between (SSbet)
In the example we are using, SSbet reflects the differences in social isolation between the small-town, suburban, and urban groups. SSbet contains the variability due to the independent variable, or
what is often called the treatment effect, in spite of the fact that it is not something that
the researcher can manipulate in this instance. It will also contain any initial differences
between the groups, which of course represent error variance. Notice in Formula 6.2 that
SSbet is based on the square of the differences between the individual group means and the
grand mean, times the number in each group. For three groups labeled A, B, and C, the formula is below.
Formula 6.2
SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc
where
Ma = the mean of the scores in the first group (a)
MG = the same grand mean used in SStot
na = the number of scores in the first group (a)
To calculate SSbet, follow these steps:
1. Determine the mean for each group: Ma, Mb, and so on.
2. Subtract MG from each sample mean and square the difference: (Ma − MG)².
3. Multiply the squared difference by the number in each group: (Ma − MG)²na.
4. Repeat for each group.
5. Sum (∑) the results across groups.
The Error Term: Sum of Squares Within
When a group receives the same treatment but individuals within the group respond differently, their differences constitute error—unexplained variability. These differences can
spring from any uncontrolled variable. Since the only thing controlled in one-way ANOVA is
the independent variable, variance from any other source is error variance. In the example,
not all people in any group are likely to manifest precisely the same level of social isolation.
The differences within the groups are measured in the SSwith, the formula for which follows.
Formula 6.3
SSwith = ∑(xa − Ma)² + ∑(xb − Mb)² + ∑(xc − Mc)²
where
SSwith = the sum of squares within
xa = each of the individual scores in Group a
Ma = the mean of the scores in Group a
To calculate SSwith, follow these steps:
1. Retrieve the mean (used for the SSbet earlier) for each of the groups.
2. Subtract the individual group mean (Ma for Group a) from each score in the
group (xa for Group a).
3. Square the difference between each score in each group and its mean.
4. Sum the squared differences for each group.
5. Repeat for each group.
6. Sum the results across the groups.
The SSwith (or the SSerr) measures the fluctuations in subjects' scores that are error variance.
All variability in the data (SStot) is either SSbet or SSwith. As a result, if two of the three are
known, the third can be determined easily. If we calculate SStot and SSbet, the SSwith can be
determined by subtraction:
SStot − SSbet = SSwith
The difficulty with this approach, however, is that any calculation error in SStot or SSbet is
perpetuated in SSwith/SSerr. The other virtue of using Formula 6.3 is that, like the two preceding
formulas, it helps to clarify that what is being determined is how much score variability is
within each group. For the few problems done entirely by hand, we will take the "high road"
and use Formula 6.3.
To minimize the tedium, the data sets here are relatively small. When researchers complete
larger studies by hand, they often shift to the alternate "calculation formulas" for simpler
arithmetic, but in so doing can sacrifice clarity. Happily, ANOVA is one of the procedures that
Excel performs, and after a few simple longhand problems, we can lean on the computer for
help with larger data sets.

Try It!: #3
When will sum-of-squares values be negative?
Calculating the Sums of Squares
Consider the example we have been using:
A researcher is interested in the level of
social isolation people feel in small towns
(a), suburbs (b), and cities (c). Participants
randomly selected from each of those three
settings take the Assessment List of Nonnormal Environments (ALONE), for which
the following scores are available:
a. 3, 4, 4, 3
b. 6, 6, 7, 8
c. 6, 7, 7, 9
We know we will need the mean of all the data (MG) as well as the mean for each group
(Ma, Mb, Mc), so we will start there. Verify that
∑x = 70 and N = 12, so MG = 5.833.
For the small-town subjects,
∑xa = 14 and na = 4, so Ma = 3.50.
For the suburban subjects,
∑xb = 27 and nb = 4, so Mb = 6.750.
For the city subjects,
∑xc = 29 and nc = 4, so Mc = 7.250.
For the sum-of-squares total, the formula is
SStot = ∑(x − MG)² = 41.668
The calculations are listed in Table 6.1.
Table 6.1: Calculating the sum of squares total (SStot)
SStot = ∑(x − MG)², with MG = 5.833

For the town data:
x − MG                  (x − MG)²
3 − 5.833 = −2.833      8.026
4 − 5.833 = −1.833      3.360
4 − 5.833 = −1.833      3.360
3 − 5.833 = −2.833      8.026

For the suburb data:
x − MG                  (x − MG)²
6 − 5.833 = 0.167       0.028
6 − 5.833 = 0.167       0.028
7 − 5.833 = 1.167       1.362
8 − 5.833 = 2.167       4.696

For the city data:
x − MG                  (x − MG)²
6 − 5.833 = 0.167       0.028
7 − 5.833 = 1.167       1.362
7 − 5.833 = 1.167       1.362
9 − 5.833 = 3.167       10.030

SStot = 41.668
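The Table 6.1 arithmetic is easy to verify with a few lines of Python (variable names are ours); computed without intermediate rounding, SStot comes to 41.667, matching the hand result to rounding:

```python
town, suburb, city = [3, 4, 4, 3], [6, 6, 7, 8], [6, 7, 7, 9]
scores = town + suburb + city

grand_mean = sum(scores) / len(scores)                  # MG
ss_total = sum((x - grand_mean) ** 2 for x in scores)   # Formula 6.1

print(round(grand_mean, 3))  # 5.833
print(round(ss_total, 3))    # 41.667
```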
For the sum of squares between, the formula is:
SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc
The SSbet for the three groups is as follows:
SSbet = (3.5 − 5.833)²(4) + (6.75 − 5.833)²(4) + (7.25 − 5.833)²(4)
= 21.772 + 3.364 + 8.032
= 33.168
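The same kind of check works for SSbet (variable names are ours); carried out without intermediate rounding it gives 33.167, matching the hand result to rounding:

```python
town, suburb, city = [3, 4, 4, 3], [6, 6, 7, 8], [6, 7, 7, 9]
groups = [town, suburb, city]
all_scores = town + suburb + city
grand_mean = sum(all_scores) / len(all_scores)

# Formula 6.2: n * (group mean - grand mean)^2, summed across groups
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
print(round(ss_between, 3))  # 33.167
```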
The SSwith indicates the error variance by determining the differences between individual
scores in a group and their means. The formula is
SSwith = ∑(xa − Ma)² + ∑(xb − Mb)² + ∑(xc − Mc)²
SSwith = 8.504
Table 6.2 lists the calculations for SSwith.
Table 6.2: Calculating the sum of squares within (SSwith)
SSwith = ∑(xa − Ma)² + ∑(xb − Mb)² + ∑(xc − Mc)²
Town: 3, 4, 4, 3   Suburb: 6, 6, 7, 8   City: 6, 7, 7, 9
Ma = 3.50, Mb = 6.750, Mc = 7.250

For the town data:
x − M                   (x − M)²
3 − 3.50 = −0.50        0.250
4 − 3.50 = 0.50         0.250
4 − 3.50 = 0.50         0.250
3 − 3.50 = −0.50        0.250

For the suburb data:
x − M                   (x − M)²
6 − 6.750 = −0.750      0.563
6 − 6.750 = −0.750      0.563
7 − 6.750 = 0.250       0.063
8 − 6.750 = 1.250       1.563

For the city data:
x − M                   (x − M)²
6 − 7.250 = −1.250      1.563
7 − 7.250 = −0.250      0.063
7 − 7.250 = −0.250      0.063
9 − 7.250 = 1.750       3.063

SSwith = 8.504
Because we calculated the SSwith directly instead of determining it by subtraction, we can now
check for accuracy by adding its value to the SSbet. If the calculations are correct, SSwith + SSbet
= SStot. For the isolation example, 8.504 + 33.168 = 41.672.
The calculation of SStot earlier found SStot = 41.668. The difference between that value and the
SStot that we determined by adding SSbet to SSwith is just 0.004. That result is due to rounding
and is unimportant.
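Both the direct calculation of SSwith and the additivity check can be confirmed in code (variable names are ours); computed exactly, SSwith is 8.5, and SSbet + SSwith equals SStot with no rounding discrepancy at all:

```python
town, suburb, city = [3, 4, 4, 3], [6, 6, 7, 8], [6, 7, 7, 9]
groups = [town, suburb, city]
all_scores = town + suburb + city
grand_mean = sum(all_scores) / len(all_scores)

def group_ss(g):
    """Sum of squared deviations of a group's scores from the group mean."""
    m = sum(g) / len(g)
    return sum((x - m) ** 2 for x in g)

ss_within = sum(group_ss(g) for g in groups)            # Formula 6.3
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(ss_within)                           # 8.5
print(round(ss_between + ss_within, 3))    # 41.667, equal to ss_total
```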
We calculated equivalent statistics as early as Chapter 1, although we did not term them sums
of squares. At the heart of the standard deviation calculation are those repetitive x − M
differences for each score in the sample. The difference values are then squared and summed,
much as they are when calculating SSwith and SStot. Incidentally, the denominator in the
standard deviation calculation is n − 1, which should look suspiciously like some of the degrees
of freedom values we will discuss in the next section.

Try It!: #4
What will SStot − SSwith yield?
Interpreting the Sums of Squares
The different sums-of-squares values are measures of data variability, which makes them like
the standard deviation, variance measures, the standard error of the mean, and so on. Also
like the other measures of variability, SS values can never be negative. But between SS and the
other statistics is an important difference. In addition to data variability, the magnitude of the
SS value reflects the number of scores involved. Because sums of squares are in fact the sum
of squared values, the more values there are, the larger the value becomes. With statistics like
the standard deviation, if more values are added near the mean of the distribution, s actually
shrinks. This cannot happen with the sum of squares. Additional scores, whatever their value,
will always increase the sum-of-squares value.
The fact that large SS values can result from large amounts of variability or relatively large
numbers of scores makes them difficult to interpret. The SS values become easier to gauge
if they become mean, or average, variability measures. Fisher transformed sums-of-squares
variability measures into mean variability measures by dividing each sum-of-squares value
by its degrees of freedom. The SS ÷ df operation creates what is called the mean square (MS).
In the one-way ANOVA, an MS value is associated with both the SSbet and the SSwith (SSerr).
There is no mean-squares total. Dividing the SStot by its degrees of freedom would provide
a mean level of overall variability, but since the analysis is based on how between-groups
variability compares to within-groups variance, mean total variability would not be helpful.
The degrees of freedom for each of the sums of squares calculated for the one-way ANOVA are
as follows:
• Though we do not calculate a mean measure of total variability, degrees of freedom
total allows us to check the other df values for accuracy later; dftot is N − 1, where N
is the total number of scores.
• Degrees of freedom between (dfbet) is k − 1, where k is the number of groups:
SSbet ÷ dfbet = MSbet
• Degrees of freedom within (dfwith) is N − k, the total number of scores minus the
number of groups: SSwith ÷ dfwith = MSwith
a. The sums of squares between and within should equal the total sum of squares, as
noted earlier: SSbet + SSwith = SStot
b. Likewise, the sum of degrees of freedom between and within should equal degrees
of freedom total: dfbet + dfwith = dftot
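The degrees-of-freedom bookkeeping above can be sketched directly (variable names are ours):

```python
group_sizes = [4, 4, 4]   # the three samples in the social isolation example
N = sum(group_sizes)      # total number of scores, 12
K = len(group_sizes)      # number of groups, 3

df_total = N - 1          # 11
df_between = K - 1        # 2
df_within = N - K         # 9

# Accuracy check from point (b): the parts must sum to the total
assert df_between + df_within == df_total
print(df_total, df_between, df_within)  # 11 2 9
```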
The F Ratio
The mean squares for between and within groups are the components of F, the test statistic
in ANOVA:
Formula 6.4
F = MSbet/MSwith
This formula allows one to determine whether the average treatment effect—MSbet—is substantially greater than the average measure of error variance—MSwith. Figure 6.4 illustrates
the F ratio, which compares the distance from the mean of the first distribution to the mean
of the second distribution, the A variance, to the B and C variances, which indicate the differences within groups.
If the MSbet / MSwith ratio is large—it must be substantially greater than 1.0—the difference
between groups is likely to be significant. When that ratio is small, F is likely to be nonsignificant. How large F must be to be significant depends on the degrees of freedom for the problem, just as it did for the t tests.
Figure 6.4: The F ratio: comparing variance between groups (A) to variance
within groups (B + C)
The A variance is the distance from the mean of the first distribution to the mean of the second
distribution; the B and C variances indicate the differences within groups.
The ANOVA Table
The results of an ANOVA are summarized in a table that indicates
• the source of the variance,
• the sums-of-squares values,
• the degrees of freedom,
• the mean square values, and
• F.
With the total number of scores (N) at 12, degrees of freedom total (dftot) = N − 1;
12 − 1 = 11. The number of groups (k) is 3, and between degrees of freedom (dfbet) = k − 1, so
dfbet = 2. Within degrees of freedom (dfwith) is N − k; 12 − 3 = 9.
Recall that MSbet = SSbet/dfbet and MSwith = SSwith/dfwith. We do not calculate MStot. Table 6.3
shows the ANOVA table for the social isolation problem.
Table 6.3: ANOVA table for social isolation problem

Source     SS       df    MS        F
Between    33.168    2    16.584    17.551
Within      8.504    9     0.945
Total      41.672   11
Verify that SSbet + SSwith = SStot, and dfbet + dfwith = dftot. The smallest value an SS can have is
0, which occurs if all scores have the same value. Otherwise, the SS and MS values will always
be positive.
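The whole table can be reproduced from the raw scores (variable names are ours). Because nothing is rounded until the end, F comes out at 17.559 rather than the 17.551 obtained from rounded sums of squares; the small difference is rounding only:

```python
town, suburb, city = [3, 4, 4, 3], [6, 6, 7, 8], [6, 7, 7, 9]
groups = [town, suburb, city]
all_scores = town + suburb + city
N, K = len(all_scores), len(groups)
grand_mean = sum(all_scores) / N

# Sums of squares (Formulas 6.2 and 6.3)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# Mean squares: each SS divided by its degrees of freedom
ms_between = ss_between / (K - 1)   # df between = k - 1 = 2
ms_within = ss_within / (N - K)     # df within = N - k = 9

f_ratio = ms_between / ms_within    # Formula 6.4
print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
# 16.583 0.944 17.559
```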
Understanding F
The larger F is, the more likely it is to be statistically significant, but how large is large enough?
In the ANOVA table above, F = 17.551.
Because F is determined by dividing MSbet by MSwith, the value of F indicates how many times
larger MSbet is than MSwith. Here, MSbet is 17.551 times MSwith, which seems promising; to be
sure, however, F must be compared to a value from the table of critical values of F (Table 6.4;
Table B.3 in Appendix B).
As with the t test, as degrees of freedom increase, the critical values decline. The difference
between t and F is that F has two df values, one for the MSbet and the other for the MSwith. In
Table 6.4, the critical value is at the intersection of dfbet across the top of the table and dfwith
down the left side. For the social isolation problem, these are 2 (k − 1) across the top and
9 (N − k) down the left side.
The value in regular type at the intersection of 2 and 9 is 4.26; it is the critical value when
testing at p = 0.05. The value in bold type is for testing at p = 0.01.
• The critical value indicates that any ANOVA test with 2 and 9 df that has an F value
equal to or greater than 4.26 is statistically significant.
• The social isolation differences among the three groups are probably not due to
sampling variability.
• The statistical decision is to reject H0.
The relatively large value of F (it is more than four times the critical value) indicates that the
differences in social isolation are affected by where respondents live. The amount of within-group
variability, the error variance, is small relative to the treatment effect.
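The statistical decision itself reduces to a single comparison of the obtained F against the tabled critical value, sketched here with the values from this problem (variable names are ours):

```python
f_ratio = 17.551        # F from Table 6.3
critical_value = 4.26   # tabled critical value for 2 and 9 df at p = .05

decision = "reject H0" if f_ratio >= critical_value else "retain H0"
print(decision)                        # reject H0
print(f_ratio / critical_value > 4.0)  # True: F is more than four times the critical value
```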
Try It!: #5
If the F in an ANOVA is 4.0 and the MSwith = 2.0, what will be the value of MSbet?
Table 6.4 provides the critical values of F for a variety of research scenarios. When computer software completes an ANOVA, the output typically provides the exact probability that the calculated value of F could have occurred by chance. By the most common standard, when that probability is 0.05 or less, the result is statistically significant. Performing
© 2016 Bridgepoint Education, Inc. All rights reserved. Not for resale or redistribution.
tan82773_06_ch06_155-190.indd 168
3/3/16 2:31 PM
Table 6.4: The critical values of F

The table is arranged with the df for the numerator (dfbet) across the top and the df for the denominator (dfwith) down the left side. At each intersection, the value in regular type is the critical value for p = .05 and the value in bold type is the critical value for p = .01. For example, at the intersection of 2 and 9, the critical values are 4.26 (p = .05) and 8.02 (p = .01).

[The full table of critical values is not reproduced here; see Table B.3 in Appendix B.]

Source: Critical values of F. (n.d.). Retrieved from http://faculty.vassar.edu/lowry/apx_d.html
calculations by hand without statistical software, however, requires the additional step of
comparing F to the critical value to determine statistical significance. When the calculated
value is the same as, or larger than, the table value, it is statistically significant.
6.2 Locating the Difference: Post Hoc Tests and Honestly
Significant Difference (HSD)
When a t test is statistically significant, only one explanation of the difference is possible:
the first group probably belongs to a different population than the second group. Things are
not so simple when there are more than two groups. A significant F indicates that at least one group is significantly different from at least one other group in the study, but unless the ANOVA considers only two groups, there are a number of possible sources of the statistical significance, as we noted when we listed all the possible HA outcomes earlier.
The point of a post hoc test, an “after this” test conducted following an ANOVA, is to determine which groups are significantly different from which. When F is significant, a post hoc
test is the next step.
There are many post hoc tests. Each of them has particular strengths, but one of the more
common, and also one of the easier to calculate, is one John Tukey developed called HSD, for
“honestly significant difference.” Formula 6.5 produces a value that is the smallest difference
between the means of any two samples that can be statistically significant:
Formula 6.5

HSD = x √(MSwith / n)

where
x = a table value indexed to the number of groups (k) in the problem and the degrees of freedom within (dfwith) from the ANOVA table
MSwith = the value from the ANOVA table
n = the number in any group when the group sizes are equal
As long as the number in all samples is the same, the value from Formula 6.5 will indicate the minimum difference between the means of any two groups that can be statistically significant. An alternate formula for HSD may be used when group sizes are unequal:

Formula 6.6

HSD = x √[(MSwith / 2)(1/n1 + 1/n2)]
The notation in this formula indicates that the HSD value is for the group-1-to-group-2 comparison (n1, n2). When sample sizes are unequal, a separate HSD value must be computed for each pair of sample means in the problem.
To compute HSD for equal sample sizes, follow these steps:
1. From Table 6.5, locate the value of x by moving across the top of the table to the number of groups/treatments (k = 3), and then down the left side to the within degrees of freedom (dfwith = 9). The intersecting values for 3 and 9 are 3.95 and 5.43. The smaller of the two is the value when p = 0.05. The post hoc test is always conducted at the same probability level as the ANOVA, p = 0.05 in this case.
2. The calculation is 3.95 times the square root of 0.945 (the MSwith) divided by 4 (n):

HSD = 3.95 √(0.945 / 4) = 1.920
This value is the minimum absolute difference between the means of any two samples that can be statistically significant. The means for social isolation in the three groups are as follows:

Ma = 3.50 for small-town respondents
Mb = 6.75 for suburban respondents
Mc = 7.25 for city respondents
To compare small towns to suburbs, the procedure is as follows:

Ma − Mb = 3.50 − 6.75 = −3.25

The absolute value of this difference exceeds 1.92, so it is significant.

To compare small towns to cities, note that

Ma − Mc = 3.50 − 7.25 = −3.75

This difference also exceeds 1.92 and is significant.

To compare suburbs to cities,

Mb − Mc = 6.75 − 7.25 = −0.50

This difference is less than 1.92 and is not significant.
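The three comparisons above can be sketched in Python using the chapter's values (q = 3.95, MSwith = 0.945, n = 4):

```python
import math

# Chapter values: the table value "x" (often called q) for k = 3 and df_with = 9,
# plus MS_with and the common group size n
q_crit = 3.95
ms_with = 0.945
n = 4

# Minimum difference between two means that can be statistically significant
hsd = q_crit * math.sqrt(ms_with / n)
assert round(hsd, 2) == 1.92

means = {"small town": 3.50, "suburb": 6.75, "city": 7.25}
pairs = [("small town", "suburb"), ("small town", "city"), ("suburb", "city")]

# A pair is significantly different when the absolute mean difference reaches HSD
results = {(a, b): abs(means[a] - means[b]) >= hsd for a, b in pairs}

assert results[("small town", "suburb")]       # 3.25 > 1.92
assert results[("small town", "city")]         # 3.75 > 1.92
assert not results[("suburb", "city")]         # 0.50 < 1.92
```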
When several groups are involved, sometimes it is helpful to create a table that presents all
the differences between pairs of means. Table 6.6 repeats the HSD results for the social isolation problem.
Table 6.5: Tukey’s HSD critical values: q (alpha, k, df)

The table is arranged with the number of treatments (k) across the top and the within degrees of freedom (df) down the left side. At each intersection, the value in regular type is the critical value of q for alpha = 0.05 and the value in bold type is for alpha = 0.01. For example, for k = 3 and df = 9, the critical values are 3.95 (alpha = .05) and 5.43 (alpha = .01).

[The full table of critical values is not reproduced here.]

Source: Tukey’s HSD critical values (n.d.). Retrieved from http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Table 6.6: Presenting Tukey’s HSD results in a table

HSD = x √(MSwith / n) = 3.95 √(0.945 / 4) = 1.920

Any difference between pairs of means of 1.920 or greater is a statistically significant difference.

                          Small towns    Suburbs        Cities
                          M = 3.500      M = 6.750      M = 7.250
Small towns  M = 3.500                   Diff = 3.250   Diff = 3.750
Suburbs      M = 6.750                                  Diff = 0.500
Cities       M = 7.250

The mean differences of 3.250 and 3.750 are statistically significant.
The values in the cells in Table 6.6 indicate the results of the post hoc test for differences between
each pair of means in the study. Results indicate that the respondents from small towns
expressed a significantly lower level of social isolation than those in either the suburbs or cities.
Results from the suburban and city groups indicate that social isolation scores are higher in the
city than in the suburbs, but the difference is not large enough to be statistically significant.
6.3 Completing ANOVA with Excel
Completing an ANOVA by longhand involves enough calculated means, subtractions, and squared differences that letting Excel do the work can be very helpful. Consider the following example:
A researcher is comparing the level of optimism
indicated by people in different vocations during
an economic recession. The data are from laborers,
clerical staff in professional offices, and the professionals in those offices. The optimism scores for the
individuals in the three groups are as follows:
Laborers: 33, 35, 38, 39, 42, 44, 44, 47, 50, 52
Clerical staff: 27, 36, 37, 37, 39, 39, 41, 42, 45, 46
Professionals: 22, 24, 25, 27, 28, 28, 29, 31, 33, 34
1. First create the data file in Excel. Enter “Laborers,” “Clerical staff,” and “Professionals”
in cells A1, B1, and C1 respectively.
2. In the columns below those labels, enter the optimism scores, beginning in cell A2
for the laborers, B2 for the clerical workers, and C2 for the professionals. After
entering the data and checking for accuracy, proceed with the following steps.
3. Click the Data tab at the top of the page.
4. On the far right, choose Data Analysis.
5. In the Analysis Tools window, select ANOVA Single Factor and click OK.
6. Indicate where the data are located in the Input Range. In the example here, the
range is A2:C11.
7. Note that the default setting is “Grouped by Columns.” If the data are arrayed along rows
instead of columns, change the setting. Because we designated A2 instead of A1 as the
point where the data begin, there is no need to indicate that labels are in the first row.
8. Select Output Range and enter a cell location where you wish the display of the output to begin. In the example in Figure 6.5, the output results are located in A13.
9. Click OK.
Widen column A to make the output easier to read. The result resembles the screenshot in
Figure 6.5.
Figure 6.5: ANOVA in Excel
Results of ANOVA performed using Excel
Source: Microsoft Excel. Used with permission from Microsoft.
Results appear in two tables. The first provides descriptive statistics. The second table looks like the longhand table we created earlier, except that the column titled “P-value” gives the probability that an F of this magnitude could have occurred by chance.
Note that the P-value is 4.31E-06. The “E-06” is scientific notation, a shorthand way of indicating that the actual value is p = 0.00000431; that is, 4.31 with the decimal point moved 6 places to the left. This probability is far below the p = 0.05 standard, so the result easily qualifies as statistically significant.
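As a cross-check on the Excel output, the sums of squares and F for the optimism data can be computed directly in Python. (The exact p-value requires the F distribution, which Excel supplies; here we only recover F itself.)

```python
# Optimism scores from the example (three vocations, n = 10 each)
laborers      = [33, 35, 38, 39, 42, 44, 44, 47, 50, 52]
clerical      = [27, 36, 37, 37, 39, 39, 41, 42, 45, 46]
professionals = [22, 24, 25, 27, 28, 28, 29, 31, 33, 34]

groups = [laborers, clerical, professionals]
scores = [x for g in groups for x in g]
grand_mean = sum(scores) / len(scores)

# Between-groups and within-groups sums of squares
ss_bet = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_with = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_bet, df_with = len(groups) - 1, len(scores) - len(groups)
f = (ss_bet / df_bet) / (ss_with / df_with)

# F is comfortably above F.05(2, 27), in line with Excel's tiny p-value
assert round(f, 2) == 20.21
```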
Apply It!
Analysis of Variance and Problem-Solving Ability
A psychological services organization is interested in how long a group of randomly selected
university graduates will persist in a series of cognitive tasks they are asked to complete
when the environment is varied. Forty graduate students are recruited from a state university and told that they are to evaluate the effectiveness of a series of spatial relations tasks
that may be included in a test of academic aptitude. The students are asked to complete a
series of tasks, after which they will be asked to evaluate the tasks. What is actually being
measured is how long subjects will persist in these tasks when environmental conditions
vary. Group 1’s treatment is recorded hip-hop in the background. Group 2 performs tasks
with a newscast in the background. Group 3 has classical music in the background, and
Group 4 experiences a no-noise environment. The dependent variable is how many minutes
subjects persist before stopping to take a break. Table 6.7 displays the measured results.
Table 6.7: Results of task persistence under varied background conditions

The table lists the persistence time, in minutes, for each of the ten subjects in the four conditions (1: Hip-hop, 2: Newscast, 3: Classical music, 4: No noise). The group counts, sums, means, and variances for these scores appear in the Excel summary in Table 6.8.
Next, the test results are analyzed in Excel, which produces the information displayed in
Table 6.8.
Table 6.8: Excel analysis of task persistence results

Summary

Group               Count   Sum   Average   Variance
1: Hip-hop          10      594   59.4      73.82
2: Newscast         10      654   65.4      65.60
3: Classical music  10      822   82.2      36.40
4: No noise         10      750   75.0      68.44

ANOVA

Source of variation   SS       df   MS       F       P-value    F crit
Between groups        3063.6    3   1021.1   16.72   5.71E-07   2.87
Within groups         2198.4   36     61.07
Total                 5262.0   39
The research organization first asks: Is there a significant difference? The null hypothesis states that there is no difference in how long respondents persist, that the background differences are unrelated to persistence. The calculated value from the Excel procedure is F = 16.72. That value is larger than the critical value of F0.05(3,36) = 2.87, so the null hypothesis is rejected. Those in at least one of the groups work a significantly different amount of time before stopping than those in the other groups.
The significant F prompts a second question: Which group(s) is/are significantly different from which other(s)? Answering that question requires the post hoc test.

HSD = x √(MSwith / n)

x = 3.81 (based on k = 4, dfwith = 36, and p = 0.05)
MSwith = 61.07, the value from the ANOVA table
n = 10, the number in one group when group sizes are equal

HSD = 3.81 √(61.07 / 10) = 9.42
This value is the minimum difference between the means of two significantly different samples. The differences in means between the groups appear below:

A − B = −6.0
A − C = −22.8
A − D = −15.6
B − C = −16.8
B − D = −9.6
C − D = 7.2
Table 6.9 makes these differences a little easier to interpret. The in-cell values are the absolute differences between the respective pairs of means:

Table 6.9: Mean differences between pairs of groups in task persistence

                                 B: Newscast   C: Classical music   D: No noise
                                 M2 = 65.4     M3 = 82.2            M4 = 75.0
A: Hip-hop          M1 = 59.4    6.0           22.8                 15.6
B: Newscast         M2 = 65.4                  16.8                  9.6
C: Classical music  M3 = 82.2                                        7.2
The differences in the amount of time respondents work before stopping to rest are not significant between environments A and B or between C and D; the absolute values of those differences do not exceed the HSD value of 9.42. The other four comparisons are all statistically significant.
The data indicate that those with hip-hop as background noise tended to work the least amount of time before stopping, and those with the classical music background persisted the longest, but that much would have been evident from the mean scores alone. The one-way ANOVA completed with Excel indicates that at least some of the differences are statistically significant, rather than random; the type of background noise is associated with consistent differences in work time. The post hoc test makes it clear that two comparisons show no significant difference: between classical music and no background sound, and between hip-hop and the newscast.
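The same post hoc comparisons can be sketched in Python, using the means and MSwith from Table 6.8:

```python
import math
from itertools import combinations

means = {"hip-hop": 59.4, "newscast": 65.4, "classical": 82.2, "no noise": 75.0}

# q for k = 4 treatments and df_with = 36, plus MS_with and n from the ANOVA output
q_crit, ms_with, n = 3.81, 61.07, 10
hsd = q_crit * math.sqrt(ms_with / n)
assert round(hsd, 2) == 9.42

# A pair differs significantly when the absolute mean difference reaches HSD
significant = {pair for pair in combinations(means, 2)
               if abs(means[pair[0]] - means[pair[1]]) >= hsd}

# Four of the six comparisons reach significance;
# hip-hop vs. newscast (6.0) and classical vs. no noise (7.2) do not
assert len(significant) == 4
assert ("hip-hop", "newscast") not in significant
assert ("classical", "no noise") not in significant
```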
Apply It! boxes written by Shawn Murphy
6.4 Determining the Practical Importance of Results
Potentially, three central questions could be associated with an analysis of variance. Whether questions 2 and 3 are addressed depends upon the answer to question 1:
1. Are any of the differences statistically significant? The answer depends upon how the calculated F value compares to the critical value from the table.
Try It!: #6
If the F in ANOVA is not significant, should
the post hoc test be completed?
2. If the F is significant, which groups are significantly different from each other? That
question is answered by a post hoc test such as Tukey’s HSD.
3. If F is significant, how important is the result? That question is answered by an effect-size calculation.
If F is not statistically significant, questions 2 and 3 are nonissues.
After addressing the first two questions, we now turn our attention to the third question, effect size. With the t test in Chapter 5, omega-squared answered the question about how important the result was. There are similar measures for analysis of variance; in fact, several effect-size statistics have been used to explain the importance of a significant ANOVA result. Omega-squared (ω2) and partial eta-squared (η2) (where the Greek letter eta [η] is pronounced like “ate a,” as in “ate a grape”) are both quite common in social-science research literature. Both effect-size statistics are demonstrated here: omega-squared to be consistent with Chapter 5, and, because it is easy to calculate and quite common in the literature, eta-squared as well. Both statistics answer the same question: Given that some of the variance in scores is unexplained (in other words, error variance), how much of the score variance can be attributed to the independent variable, which in this recent example is the background environment? The difference between the statistics is that omega-squared answers the question for the population of all such problems, while the eta-squared result is specific to the particular data set.

Try It!:
In a study of social isolation based on where people live (i.e., the respondents’ location, such as a busy city), what is the independent variable (IV)? What is the dependent variable (DV)?

In the social isolation problem, the question was whether residents of small towns, suburban areas, and cities differ in their measures of social isolation. The respondents’ location is the IV. Eta-squared estimates how much of the difference in social isolation is related to where respondents live.
The η2 calculation involves only two values, both retrievable from the ANOVA table. Formula 6.7 shows the eta-squared calculation:

Formula 6.7

η2 = SSbet / SStot
The formula indicates that eta-squared is the ratio of between-groups variability to total variability. If there were no error variance, all variance would be due to the independent variable, and the sums of squares for between-groups variability and for total variability would have the same values; the effect size would be 1.0. With human subjects, this never happens, because scores always fluctuate for reasons other than the IV, but it is important to know that 1.0 is the upper limit for this effect size and for omega-squared as well. The lower limit is 0, of course: none of the variance is explained. But we also never see eta-squared values of 0, because the only time the effect size is calculated is when F is significant, and that can only happen when the effect of the IV is great enough that the ratio of MSbet to MSwith exceeds the critical value; some variance will always be explained.
For the social isolation problem, SSbet = 33.168 and SStot = 41.672, so

η2 = 33.168 / 41.672 = 0.796
According to these data, about 80% of the variance in social isolation scores relates to whether
the respondent lives in a small town, a suburb, or a city. Note that this amount of variance is
unrealistically high, which can happen when numbers are contrived.
Omega-squared takes a slightly more conservative approach to effect sizes and will always have a lower value than eta-squared. The formula for omega-squared is:

Formula 6.8

ω2 = [SSbet − (k − 1)MSwith] / (SStot + MSwith)

Compared to η2, the numerator is reduced by (k − 1) times MSwith, and the denominator is increased by MSwith. The error term plays a more prominent part in this effect size than in η2, thus the more conservative value. Completing the calculations for ω2 yields the following:

ω2 = [33.168 − (2)(0.945)] / (41.672 + 0.945) = 31.278 / 42.617 = 0.734
The omega-squared value indicates that about 73% of the variability in social isolation can be explained by where the subject lives, about 6 percentage points less than the eta-squared value explains. The advantage of using omega-squared is that the researcher can say, “in all situations where social isolation is studied as a function of where the subject lives, the location of the subject’s home will explain about 73% of the variance.” On the other hand, when using eta-squared, the researcher is limited to saying, “in this instance, the location of the subject’s home explained about 80% of the variance in social isolation.” Those statements illustrate the difference between being able to generalize and being restricted to the present situation.
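Both effect sizes can be recomputed in a few lines of Python from the ANOVA table values:

```python
# Values from the social isolation ANOVA table
ss_bet, ss_tot, ms_with, k = 33.168, 41.672, 0.945, 3

# Eta-squared: proportion of total variability attributable to the IV in this sample
eta_sq = ss_bet / ss_tot

# Omega-squared: the more conservative, population-oriented estimate
omega_sq = (ss_bet - (k - 1) * ms_with) / (ss_tot + ms_with)

assert round(eta_sq, 3) == 0.796
assert round(omega_sq, 3) == 0.734
assert omega_sq < eta_sq   # omega-squared is always the smaller of the two
```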
Apply It!
Using ANOVA to Test Effectiveness
A researcher is interested in the relative impact
that tangible reinforcers and verbal reinforcers
have on behavior. The researcher, who describes
the study only as an examination of human
behavior, solicits the help of university students.
The researcher makes a series of presentations
on the growth of the psychological sciences with
an invitation to listeners to ask questions or
make comments whenever they wish. The three
levels of the independent variable are as follows:
Wavebreakmedia Ltd/Wavebreak Media/Thinkstock
1. no response to students’ interjections,
except to answer their questions
2. a tangible reinforcer—a small piece of candy—offered after each comment/question
3. verbal praise offered for each verbal interjection
The volunteers are randomly divided into three groups of eight each and asked to report for
the presentations, to which students are invited to respond. Note that there are three independent groups: Those who participate are members of only one group. The three options
described represent the three levels of a single independent variable, the presenter’s response
to comments or questions by the subjects. The dependent variable is the number of interjections by subjects over the course of the presentations.
The null hypothesis (H0: µ1 = µ2 = µ3) maintains that response rates will not vary from group to group; in terms of verbal comments, the three groups belong to the same population. The alternate hypothesis (HA: not so) maintains that non-random differences will occur between groups: that, as a result of the treatment, at least one group will belong to some other population of responders.
Each subject’s number of responses during the experiment is indicated in Table 6.10.
Table 6.10: Number of responses given three different levels of reinforcer

The table lists the number of interjections for each of the eight subjects in the three conditions (no response, tangible reinforcers, verbal reinforcers). The group counts, sums, means, and variances for these scores appear in the Excel summary in Table 6.11.
Completing the analysis with Excel yields the following summary (Table 6.11), with descriptive
statistics first:
Table 6.11: Summary of Excel analysis for the reinforcer study

Group            Count   Sum   Average   Variance
No Response      8       119   14.875    6.982143
Tangible Reinf.  8       132   16.500    2.125000
Verbal Reinf.    8       119   14.875    3.142857

ANOVA

Source of variation   SS           df   MS            F         P-value    F crit
Between Groups        14.0833333    2   7.041666667   1.72449   0.202565   3.4668
Within Groups         85.75        21   4.083333333
With F = 1.72, the result is not statistically significant, because the value is less than F0.05(2,21) = 3.47. The statistical decision is to “fail to reject” H0. Note that the p value reported in the results is the probability that the particular value of F could have occurred by chance. In this instance, there is a 0.20 probability (1 chance in 5) that an F value this large (1.72) could occur by chance in a population of responders. That p value would need to be p ≤ 0.05 in order for the value of F to be statistically significant. There are differences between the groups, certainly, but those differences are more likely explained by sampling variability than by the effect of the independent variable.
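The F value can be recovered from the summary statistics alone, since SSbet depends only on the group means and SSwith is the sum of each group's (n − 1) times its variance. A Python sketch using the values in Table 6.11:

```python
# Group summaries from the reinforcer study (n = 8 per group)
sums = {"none": 119, "tangible": 132, "verbal": 119}
variances = {"none": 6.982143, "tangible": 2.125, "verbal": 3.142857}
n, k = 8, 3

grand_mean = sum(sums.values()) / (n * k)

# SS between from the group means; SS within from the group variances
ss_bet = sum(n * (s / n - grand_mean) ** 2 for s in sums.values())
ss_with = sum((n - 1) * v for v in variances.values())

f = (ss_bet / (k - 1)) / (ss_with / (n * k - k))

# Below F.05(2, 21) = 3.47, so the decision is to fail to reject H0
assert round(f, 2) == 1.72
```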
Apply It! boxes written by Shawn Murphy
6.5 Conditions for the One-Way ANOVA
As we saw with the t tests, any statistical test requires that certain conditions be met. The
conditions might include characteristics such as the scale of the data, the way the data are
distributed, the relationships between the groups in the analysis, and so on. In the case of the
one-way ANOVA, the name indicates one of the conditions. Conditions for the one-way ANOVA
include the following:
• The one-way ANOVA test can accommodate just one independent variable.
• That one variable can have any number of categories. In the example of rural, suburban, and city isolation, the IV was the location of the respondents’ residence. We might have added more categories, such as rural, semirural, small town, large town, suburbs of small cities, suburbs of large cities, and so on (all of which relate to the respondents’ residence), but, as with the independent t test, we cannot add another variable, such as the respondents’ gender, to a one-way ANOVA.
• The categories of the IV must be independent.
• The groups involved must be independent. Those who are members of one group
cannot also be members of another group involved in the same analysis.
• The IV must be nominal scale. Because the IV must be nominal scale, sometimes
data of some other scale are reduced to categorical data to complete the analysis. If
someone wants to know whether differences in social isolation are related to age,
age must be changed from ratio to nominal data prior to the analysis. Rather than
using each person’s age in years as the independent variable, ages are grouped into
categories such as 20s, 30s, and so on. Grouping by category is not ideal, because by
reducing ratio data to nominal or even ordinal scale, the differences in social isolation between 20- and 29-year-olds, for example, are lost.
• The DV must be interval or ratio scale. Technically, social isolation would need to be
measured with something like the number of verbal exchanges that a subject has
daily with neighbors or co-workers, rather than using a scale of 1–10 to indicate the
level of isolation, which is probably an example of ordinal data.
• The groups in the analysis must be similarly distributed, that is, showing homogeneity of variance, a concept discussed in Chapter 5. It means that the groups should all
have reasonably similar standard deviations, for example.
• Finally, using ANOVA assumes that the samples are drawn from a normally distributed population.
Meeting all these conditions may seem difficult. Keep in mind, however, that normality and homogeneity of variance in particular represent ideals more than practical necessities. As it turns out, Fisher’s procedure can tolerate a certain amount of deviation from these requirements, which is to say that this test is quite robust. In extreme cases, for example, when calculated skewness or kurtosis values reach ±2.0, ANOVA would probably be inappropriate. Absent that, the researcher can probably safely proceed.
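A quick way to screen data against these conditions is to compute each group's standard deviation and skewness before running the ANOVA. The sketch below uses the simple third-moment skewness formula and hypothetical scores (not from the chapter's examples):

```python
import math

def sd(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def skewness(xs):
    """Simple (population) skewness: third central moment over the cubed SD."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

group = [3, 4, 5, 5, 6, 7, 8]   # hypothetical scores for one group

# Within the text's rough tolerance of +/- 2.0
assert abs(skewness(group)) < 2.0
```

In practice one would apply both checks to every group and also compare the groups' standard deviations to one another as an informal look at homogeneity of variance.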
6.6 ANOVA and the Independent t Test
The one-way ANOVA and the independent t test share several assumptions although they
employ distinct statistics—the sums of squares for ANOVA and the standard error of the difference for the t test, for example. When two groups are involved, both tests will produce the
same result, however. This consistency can be illustrated by completing ANOVA and the independent t test for the same data.
Suppose an industrial psychologist is interested in how people from two separate divisions
of a company differ in their work habits. The dependent variable is the amount of work completed after hours at home, per week, for supervisors in marketing versus supervisors in
manufacturing. The data follow:
Marketing: 3, 4, 5, 7, 7, 9, 11, 12
Manufacturing: 0, 1, 3, 3, 4, 5, 7, 7
Calculating some of the basic statistics yields the results listed in Table 6.12.
© 2016 Bridgepoint Education, Inc. All rights reserved. Not for resale or redistribution.
tan82773_06_ch06_155-190.indd 182
3/3/16 2:32 PM
Table 6.12: Statistical results for work habits study

Group            M       s       SEM
Marketing        7.25    3.240   1.146
Manufacturing    3.75    2.550   0.901
First, the t test gives

t = (M1 − M2) / SEd

with

SEd = √(SEM1² + SEM2²) = √(1.146² + 0.901²) = 1.458

so that

t = (7.25 − 3.75) / 1.458 = 2.401; t0.05(14) = 2.145

(The grand mean of all 16 scores, MG = 5.50, will be needed for the ANOVA below.)
The difference is significant. Those in marketing (M1) take significantly more work home than
those in manufacturing (M2).
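The arithmetic can be verified with a short script; this is a sketch using only Python's standard library, with the data from the example:

```python
import math
import statistics

# Hours of work taken home per week (data from the example)
marketing = [3, 4, 5, 7, 7, 9, 11, 12]
manufacturing = [0, 1, 3, 3, 4, 5, 7, 7]

def sem(xs):
    # Standard error of the mean: s / sqrt(n)
    return statistics.stdev(xs) / math.sqrt(len(xs))

# Standard error of the difference for equal-sized groups
se_d = math.sqrt(sem(marketing) ** 2 + sem(manufacturing) ** 2)

t = (statistics.mean(marketing) - statistics.mean(manufacturing)) / se_d
print(round(se_d, 3), round(t, 3))  # 1.458 2.401
```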
The ANOVA test proceeds as follows:
• For all variability from all sources (SStot), verify that the result of subtracting
MG from each score in both groups, squaring the differences, and summing the
squares is 168:

SStot = Σ(x − MG)² = 168
• For the SSbet, verify that subtracting the grand mean from each group mean,
squaring the difference, and multiplying each result by the number in the particular
group gives 49:

SSbet = (Ma − MG)²na + (Mb − MG)²nb = (7.25 − 5.50)²(8) + (3.75 − 5.50)²(8) = 24.5 + 24.5 = 49
• For the SSwith, take each group mean from each score in the group, square the difference, and then sum the squared differences as follows to verify that SSwith = 119:

SSwith = (xa1 − Ma)² + . . . + (xa8 − Ma)² + (xb1 − Mb)² + . . . + (xb8 − Mb)² = 119
Table 6.13 summarizes the results.
Table 6.13: ANOVA results for work habit study

Source     SS     df    MS     F       Fcrit
Total      168    15
Within     119    14    8.5
Between    49     1     49     5.765   F0.05(1,14) = 4.60
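The sums of squares and the F ratio in Table 6.13 can likewise be verified in a few lines; a sketch with the same data:

```python
import statistics

marketing = [3, 4, 5, 7, 7, 9, 11, 12]
manufacturing = [0, 1, 3, 3, 4, 5, 7, 7]
groups = [marketing, manufacturing]

scores = marketing + manufacturing
mg = statistics.mean(scores)                                  # grand mean, 5.50

ss_tot = sum((x - mg) ** 2 for x in scores)                              # 168
ss_bet = sum(len(g) * (statistics.mean(g) - mg) ** 2 for g in groups)    # 49
ss_with = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)  # 119

df_bet = len(groups) - 1               # 1
df_with = len(scores) - len(groups)    # 14
f = (ss_bet / df_bet) / (ss_with / df_with)
print(round(f, 3))  # 5.765
```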
Try It!: #7
What is the relationship between the values of t and F if both are performed for the
same two-group test?
Like the t test, ANOVA indicates that the amount of work
completed at home differs significantly between the two
groups, so at the least both tests draw the same conclusion:
statistical significance. Even so, more is involved than
just the statistical decision to reject H0.
Consider the following:
• Note that the calculated value of t = 2.401 and the calculated value of F = 5.765.
• If the value of t is squared, it equals the value of F: 2.401² = 5.765.
• The same is true for the critical values: t0.05(14) = 2.145, and 2.145² = 4.60 = F0.05(1,14).
Gosset’s and Fisher’s tests draw exactly equivalent conclusions when two groups are tested.
The ANOVA tends to be more work, so people ordinarily use the t test for two groups, but both
tests are entirely consistent.
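That t² = F consistency is easy to confirm numerically; a sketch with the example data:

```python
import math
import statistics

marketing = [3, 4, 5, 7, 7, 9, 11, 12]
manufacturing = [0, 1, 3, 3, 4, 5, 7, 7]

# t for two independent, equal-sized groups
se_d = math.sqrt(statistics.stdev(marketing) ** 2 / 8 +
                 statistics.stdev(manufacturing) ** 2 / 8)
t = (statistics.mean(marketing) - statistics.mean(manufacturing)) / se_d

# F from a one-way ANOVA on the same data
scores = marketing + manufacturing
mg = statistics.mean(scores)
ss_bet = (8 * (statistics.mean(marketing) - mg) ** 2 +
          8 * (statistics.mean(manufacturing) - mg) ** 2)
ss_with = sum((x - statistics.mean(g)) ** 2
              for g in (marketing, manufacturing) for x in g)
f = ss_bet / (ss_with / 14)

print(round(t ** 2, 3), round(f, 3))  # both 5.765
```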
6.7 The Factorial ANOVA
In the language of statistics, a factor is an independent variable, and a factorial ANOVA
is an ANOVA that includes multiple IVs. We noted that fluctuations in the DV scores not
explained by the IV emerge as error variance. In the t-test/ANOVA example above, any differences in the amount of work taken home not related to the division between marketing
and manufacturing—differences in workers’ seniority, for example—become part of SSwith
and then the MSwith error. As long as a t test or a one-way ANOVA is used, the researcher cannot account for any differences in work taken home that are not associated with whether the
subject is from marketing or manufacturing, or whatever IV is selected. There can only be one
independent variable.
The factorial ANOVA contains multiple IVs. Each one can account for its portion of variability
in the DV, thereby reducing what would otherwise become part of the error variance. As long
as the researcher has measures for each variable, the number of IVs has no theoretical limit.
Each one is treated as we treated the SSbet: for each IV, a sum-of-squares value is calculated
and divided by its degrees of freedom to produce a mean square. Each mean square is divided
by the same MSwith value to produce F so that there are separate F values for each IV.
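A sketch of this partitioning for a balanced two-factor design follows. The 2 × 2 data are hypothetical, invented only to show the arithmetic: each factor receives its own sum of squares, the interaction receives another, and each resulting mean square is divided by the same MS-within.

```python
# Hypothetical, balanced 2 x 2 design: factor A is division
# (marketing/manufacturing), factor B is gender, n = 4 per cell.
cells = {
    ("marketing", "female"): [8, 9, 11, 12],
    ("marketing", "male"): [5, 7, 7, 9],
    ("manufacturing", "female"): [3, 3, 4, 5],
    ("manufacturing", "male"): [0, 1, 3, 4],
}

def mean(xs):
    return sum(xs) / len(xs)

scores = [x for xs in cells.values() for x in xs]
grand = mean(scores)

def level_scores(pos, level):
    # All scores at one level of the factor in position pos of the cell key
    return [x for key, xs in cells.items() if key[pos] == level for x in xs]

ss_tot = sum((x - grand) ** 2 for x in scores)
ss_a = sum(len(level_scores(0, lev)) * (mean(level_scores(0, lev)) - grand) ** 2
           for lev in {k[0] for k in cells})             # division
ss_b = sum(len(level_scores(1, lev)) * (mean(level_scores(1, lev)) - grand) ** 2
           for lev in {k[1] for k in cells})             # gender
ss_cells = sum(len(xs) * (mean(xs) - grand) ** 2 for xs in cells.values())
ss_ab = ss_cells - ss_a - ss_b                           # interaction
ss_with = ss_tot - ss_cells                              # error variance

# Every source is tested against the same MS-within
df_with = len(scores) - len(cells)                       # 16 - 4 = 12
ms_with = ss_with / df_with
f_a = (ss_a / 1) / ms_with      # each factor has 2 levels, so df = 1
f_b = (ss_b / 1) / ms_with
f_ab = (ss_ab / 1) / ms_with
print(round(f_a, 2), round(f_b, 2), round(f_ab, 2))
```

Notice that the separate sums of squares add back up to SStot, which is the sense in which each IV "accounts for its portion" of the variability.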
The associated benefit of adding more IVs to the analysis is that the researcher can more
accurately reflect the complexity inherent in human behavior. One variable rarely explains
behavior in any comprehensive way. Including more IVs often provides a more informative view of
why DV scores vary. It also usually contributes to a more powerful test. Recall from Chapter 4
that power refers to the likelihood of detecting significance. Because assigning what would
otherwise be error variance to the appropriate IV reduces the error term, factorial ANOVAs
are often more likely to produce significant F values than one-way ANOVAs; they are often
more powerful tests.
In addition, IVs in combination sometimes affect the DV differently than they do when they
are isolated, a concept called an interaction. The factorial ANOVA also calculates F values for
these interactions. If a researcher wanted to examine the impact that marital status and college graduation have on subjects’ optimism about the economy, data would be gathered on
subjects’ marital status (married or not married) and their college education (graduated or
did not graduate). Then SS values, MS values, and F ratios would be calculated for
• marital status,
• college education, and
• the two IVs in combination, the interaction of the factors.
In the manufacturing versus marketing example, perhaps gender and department interact so
that females in marketing respond differently than females in manufacturing, for example.
The factorial ANOVA has not been included in this text, but it is not difficult to understand.
The procedures involved in calculating a factorial ANOVA are more numerous, but they are
not more complicated than the one-way ANOVA. Excel accommodates ANOVA problems with
up to two independent variables.
6.8 Writing Up Statistics
Any time a researcher has multiple groups or levels of a nominal scale variable (ethnic groups,
occupation type, country of origin, preferred language) and the question is about their differences on some interval or ratio scale variable (income, aptitude, number of days sober, number of parking violations), the question can be analyzed using some form of ANOVA. Because
it is a test that provides tremendous flexibility, it is well represented in research literature.
To examine whether a language is completely forgotten when exposure to that language is
severed in early childhood, Bowers, Mattys, and Gage (2009) compared the performance of
subjects with no memory of exposure to a foreign language in their early childhood to other
subjects with no exposure when the language is encountered in adulthood. They compared
the performance with phonemes of the forgotten language (the DV) by those exposed to
Hindi (one group of the IV) or Zulu (a second group of the IV) to the performance of adults of
the same age who had no exposure to either language (a third group of the IV). They found
that those with the early Hindi or Zulu exposure learned those languages significantly more
quickly as adults.
Butler, Zaromb, Lyle, and Roediger III (2009) used ANOVA to examine the impact that viewing
film clips in connection with text reading has on student recall of facts when some of the film
facts are inconsistent with text material. This experiment was a factorial ANOVA with two
IVs. One independent variable was the mode of presentation: text alone, film alone, or
film and text combined. A second IV was whether students received a
general warning, a specific warning, or no warning that the film might be inconsistent with
some elements of the text. The DV was the proportion of correct responses students made to
questions about the content. Butler et al. found that learner recall improved when film and
text were combined and when subjects received specific warnings about possible misinformation. When the film facts were inconsistent with the text material, receiving a warning
explained 37% of the variance in the proportion of correct responses. The type of presentation explained 23% of the variance.
Summary and Resources
Chapter Summary
This chapter is the natural extension of Chapters 4 and 5. Like the z test and the t test,
analysis of variance is a test of significant differences. Also like the z test and t test, the IV in
ANOVA is nominal, and the DV is interval or ratio. With each procedure—whether z, t, or F—
the test statistic is a ratio of the differences between groups to the differences within groups
(Objective 3).
ANOVA and the earlier procedures do differ, of course. The variance statistics in ANOVA are
sum-of-squares and mean-square values. But perhaps the most important difference is that ANOVA
can accommodate any number of groups (Objectives 2 and 3). Remember that trying to
deal with multiple groups in a t test introduces the problem of increasing type I error when
repeated analyses with the same data indicate statistical significance. One-way ANOVA lifts
the limitation of a one-pair-at-a-time comparison (Objective 1).
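The inflation of Type I error with repeated t tests can be illustrated directly. The sketch below assumes the comparisons are independent, which pairwise tests on the same groups are not quite, so the figures are only approximate:

```python
from math import comb

alpha = 0.05                            # per-comparison significance level
for k in (2, 3, 4, 6):                  # number of groups
    m = comb(k, 2)                      # pairwise t tests required
    familywise = 1 - (1 - alpha) ** m   # P(at least one Type I error)
    print(f"{k} groups: {m} tests, familywise error about {familywise:.3f}")
```

With six groups and 15 comparisons, the chance of at least one spurious "significant" result is already better than one in two.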
The other side of multiple comparisons, however, is the difficulty of determining which comparisons are statistically significant when F is significant. This problem is solved with the
post hoc test. This chapter used Tukey’s HSD (Objective 4). There are other post hoc tests,
each with its strengths and drawbacks, but HSD is one of the more widely used.
Years ago, the emphasis in scholarly literature was on whether a result was statistically significant. Today, the focus is on measuring the effect size of a significant result, a statistic that
in the case of analysis of variance can indicate how much of the variability in the dependent
variable can be attributed to the effect of the independent variable. We answered that question with eta squared (η2). But neither the post hoc test nor eta squared is relevant if the F is
not significant (Objective 5).
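As a quick illustration, eta squared for the two-group example in Section 6.6 takes one line:

```python
# Eta squared: the proportion of DV variability attributable to the IV
ss_bet, ss_tot = 49, 168        # values from Table 6.13
eta_squared = ss_bet / ss_tot
print(round(eta_squared, 3))    # 0.292
```

About 29% of the variability in work taken home is associated with the marketing/manufacturing division.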
The independent t test and the one-way ANOVA both require that groups be independent.
What if they are not? What if we wish to measure one group twice over time, or perhaps
more than twice? Such dependent group procedures are the focus of Chapter 7, which will
provide an elaboration of familiar concepts. For this reason, consider reviewing Chapter 5
and the independent t-test discussion before starting Chapter 7.
The one-way ANOVA dramatically broadens the kinds of questions the researcher can
ask. The procedures in Chapter 7 for non-independent groups represent the next incremental step.
Key Terms
analysis of variance (ANOVA) Name given
to Fisher’s test allowing a research study
to detect significant differences among any
number of groups.
error variance Variability in a measure
stemming from a source other than the variables introduced into the analysis.
eta squared A measure of effect size for
ANOVA. It estimates the amount of variability in the DV explained by the IV.
factor An alternate name for an independent variable, particularly in procedures that
involve more than one.
factorial ANOVA An ANOVA with more than
one IV.
F ratio The test statistic calculated in an
analysis of variance problem. It is the ratio of
the variance between the groups to the variance within the groups.
interaction Occurs when the combined
effect of multiple independent variables
is different than the variables acting
independently.
mean square The sum of squares divided
by the relevant degrees of freedom. This
division allows the mean square to reflect a
mean, or average, amount of variability from
a source.
one-way ANOVA Simplest variance analysis, involving only one independent variable.
Similar to the t test.
post hoc test A test conducted after a
significant ANOVA or some similar test that
identifies which among multiple possibilities
is statistically significant.
sum of squares The variance measure
in analysis of variance. It is the sum of the
squared deviations between a set of scores
and their mean.
sum of squares between The variability
related to the independent variable and any
measurement error that may occur.
sum of squares error Another name for
the sum of squares within because it refers
to the differences after treatment within the
same group, all of which constitute error
variance.
sum of squares total Total variance from
all sources.
sum of squares within Variability stemming from different responses from individuals in the same group. Because all the
individuals in a particular group receive the
same treatment, differences among them
constitute error variance.
Review Questions
Answers to the odd-numbered questions are provided in Appendix A.
1. Several people selected at random are given a story problem to solve. They take
3.5, 3.8, 4.2, 4.5, 4.7, 5.3, 6.0, and 7.5 minutes. What is the total sum of squares for
these data?
2. Identify the following symbols and statistics in a one-way ANOVA:
a. The statistic that indicates the mean amount of difference between groups.
b. The symbol that indicates the total number of participants.
c. The symbol that indicates the number of groups.
d. The mean amount of uncontrolled variability.
3. A study theorizes that manifested aggression differs by gender. A researcher finds
the following data from Measuring Expressed Aggression Numbers (MEAN):
Males: 13, 14, 16, 16, 17, 18, 18, 18
Females: 11, 12, 12, 14, 14, 14, 14, 16
Complete the problem as an ANOVA. Is the difference statistically significant?
4. Complete Question 3 as an independent t test, and demonstrate the relationship
between t² and F.
a. Is there an advantage to completing the problem as an ANOVA?
b. If there were three groups, why not just complete three t tests to answer
questions about significance?
5. Even with a significant F, a two-group ANOVA never needs a post hoc test.
Why not?
6. A researcher completes an ANOVA in which the number of years of education
completed is analyzed by ethnic group. If η² = 0.36, how should that be
interpreted?
7. Three groups of clients involved in a program for substance abuse attend weekly sessions for 8 weeks, 12 weeks, and 16 weeks. The DV is the number of drug-free days.
8 weeks: 0, 5, 7, 8, 8
12 weeks: 3, 5, 12, 16, 17
16 weeks: 11, 15, 16, 19, 22
a. Is F significant?
b. What is the location of the significant difference?
c. What does the effect size indicate?
8. For Question 7, answer the following:
a. What is the IV?
b. What is the scale of the IV?
c. What is the DV?
d. What is the scale of the DV?
9. For an ANOVA problem, k = 4 and n = 8, with SSbet = 24.0 and SSwith = 72.
a. What is F?
b. Is the result significant?
10. Consider this partially completed ANOVA table:

Source     SS    df    MS    F    Fcrit
Between          2
Within     63          3
Total      94

a. What must be the value of N − k?
b. What must be the value of k?
c. What must be the value of N?
d. What must the SSbet be?
e. Determine the MSbet.
f. Determine F.
g. What is Fcrit?
Answers to Try It! Questions
1. The one in one-way ANOVA refers to the fact that this test accommodates just one
independent variable. One-way ANOVA contrasts with factorial ANOVA, which can
include any number of IVs.
2. Testing six groups with t tests would require 15 comparisons. The answer is the number of
groups (6) times the number of groups minus 1 (5), with the product divided by 2:
6 × 5 = 30; 30 ÷ 2 = 15.
3. The only way SS values can be negative is if there has been a calculation error.
Because the values are all squared values, if they have any value other than 0, they
must be positive.
4. The difference between SStot and SSwith is the SSbet.
5. If F = 4 and MSwith = 2, then MSbet must be 8, because F = MSbet ÷ MSwith.
6. The answer is neither. If F is not significant, there is no question of which group is
significantly different from which other group because any variability may be nothing more than sampling variability. By the same token, there is no effect to calculate
because, as far as we know, the IV does not have any effect on the DV.
7. t² = F
7 Repeated Measures Designs for Interval Data
Chapter Learning Objectives
After reading this chapter, you should be able to do the following:
1. Explain how initial between-groups differences affect t test or analysis of variance.
2. Compare the independent t test to the dependent-groups t test.
3. Complete a dependent-groups t test.
4. Explain what “power” means in statistical testing.
5. Compare the one-way ANOVA to the within-subjects F.
6. Complete a within-subjects F.
Introduction
Tests of significant difference, such as the t test and analysis of variance, take two basic forms,
depending upon the independence of the groups. Up to this point, the text has focused only
on independent-groups tests: tests where those in one group cannot also be subjects in other
groups. However, dependent-groups procedures, in which the same group is used multiple
times, offer some advantages.
This chapter focuses on the dependent-groups equivalents of the independent t test and the
one-way ANOVA. Although they answer the same questions as their independent-groups
equivalents (are there significant differences between groups?), under particular circumstances these tests can do so more efficiently and with more statistical power.
7.1 Reconsidering the t and F Ratios
The scores produced in both the independent t and the one-way ANOVA are ratios. In the case
of the t test, the ratio is the result of dividing the difference between the means of the groups
by the standard error of the difference:
t = (M1 − M2) / SEd
With ANOVA, the F ratio is the mean square between (MSbet) divided by the mean square
within (MSwith):
F = MSbet / MSwith
With either t or F, the denominator in the ratio reflects how much scores vary within (rather
than between) the groups of subjects involved in the study. These differences are easy to see
in the way the standard error of the difference is calculated for a t test. When group sizes are
equal, recall that the formula is
SEd = √(SEM1² + SEM2²)

with

SEM = s / √n
and s, of course, a measure of score variation in any group.
So the standard error of the difference is based on the standard error of the mean, which in
turn is based on the standard deviation. Therefore, within-groups score variance in a t test has its root
in the standard deviation for each group of scores. If we reverse the order and work from the
standard deviation back to the standard error of the difference, we note the following:
• When scores vary substantially in a group, the result is a large standard deviation.
• When the standard deviation is relatively large, the standard error of the mean must
likewise be large because the standard deviation is the numerator in the formula for SEM.
• A large standard error of the mean results in a large standard error of the difference because that statistic is the square root of the sum of the squared standard errors of the mean.
• When the standard error of the difference is
large, the difference between the means has
to be correspondingly larger for the result to
be statistically significant. The table of critical
values indicates that no t ratio (the ratio of the
differences between the means and the standard error of the difference) less than 1.96 to 1
is going to be significant, and even that value
requires an infinite sample size.
Try It!: #1
If the size of the group affects the size of
the standard deviation, what then is the
relationship between sample size and
error in a t test?
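The chain from standard deviation to standard error of the difference can be traced in a small sketch; the standard deviations and group size below are hypothetical, chosen only to show the pattern:

```python
import math

def sem(s, n):
    # Standard error of the mean: s / sqrt(n)
    return s / math.sqrt(n)

def se_d(s1, s2, n):
    # Standard error of the difference for equal-sized groups
    return math.sqrt(sem(s1, n) ** 2 + sem(s2, n) ** 2)

# Hypothetical groups of n = 16: doubling the within-group standard
# deviation doubles the standard error of the difference, so the mean
# difference must be twice as large to reach the same t ratio.
for s in (2.0, 4.0, 8.0):
    print(f"s = {s}: SEd = {se_d(s, s, 16):.3f}")
```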
Error Variance
The point of the preceding discussion is that the value of t in the t test—and for F in an
ANOVA—is greatly affected by the amount of variability within the groups involved. Other
factors being equal, when the variability within the groups is extensive, the values of t and F
are diminished and less likely to be statistically significant than when groups have relatively
little variability within them.
These differences within groups stem from differences in the way individuals within the
samples react to whatever treatment is the independent variable; different people respond
differently to the same stimulus. These differences represent error variance—the outcome
whenever scores differ for reasons not related to the IV.
But within-group differences are not the only source of error variance in the calculation of
t and F. Both t test and ANOVA assume that the groups involved are equivalent before the
independent variable is introduced. In a t test where the impact of relaxation therapy on clients' anxiety is the issue, the test assumes that before the therapy is introduced, the treatment group, which receives the therapy, and the control group, which does not, both begin with
equivalent levels of anxiety. That assumption is the key to attributing any differences after the
treatment to the therapy, the IV.
Confounding Variables
In comparisons like the one studying the effects
of relaxation therapy, the initial equivalence of
the groups can be uncertain, however. What if the
groups had differences in anxiety before the therapy
was introduced? The employment circumstances of
each group might differ, and perhaps those threatened with unemployment are more anxious than
the others. What if age-related differences exist
between groups? These other influences that are
not controlled in an experiment are sometimes
called confounding variables.
A psychologist who wants to examine the impact that a substance abuse program has on addicts' behavior might set up a study as follows. Two groups of the same number of addicts are selected, and one group participates in the substance-abuse program. After the program, the psychologist measures the level of substance abuse in both groups to observe any differences.
The problem is that the presence or absence of the program is not the only thing that might
prompt subjects to respond differently. Perhaps subjects’ background experiences are different. Perhaps ethnic-group, age, or social-class differences play a role. If any of those differences affect substance-abuse behavior, the researcher can potentially confuse the influence
of those factors with the impact of the substance-abuse program (the IV). If those other differences are not controlled and affect the dependent variable, they contribute to error variance. Error variance exists any time dependent-variable (DV) scores fluctuate for reasons
unrelated to the IV.
Thus, the variability within groups reflects error variance, and any difference between groups
that is not related to the IV represents error variance. A statistically significant result requires
that the score variance from the independent variable be substantially greater than the error
variance. The factor(s) the researcher controls must contribute more to score values than the
factors that remain uncontrolled.
7.2 Dependent-Groups Designs
Ideally, any before-the-treatment differences between the groups in a study will be minimal.
Recall that random selection entails every member of a population having an equal chance
of being selected. The logic behind random selection dictates that when groups are randomly
drawn from the same population, they will differ only by chance; as sample size increases,
probabilities suggest that they become increasingly similar in characteristic to the population. No sample, however, can represent the population with complete fidelity, and sometimes the chance differences affect the way subjects respond to the IV.

Try It!: #2
How does the use of random selection enable us to control error variance in statistical testing?
One way researchers reduce error variance is to adopt
what are called dependent-groups designs. The independent t test and the one-way ANOVA required independent groups. Members of one group could not also be members of other groups in the
same study. But in the case of the t test, if the same group is measured, exposed to a treatment,
and then measured again, the study controls an important source of error variance. Using the
same group twice makes the initial equivalence of the two groups no longer a concern. Other
aspects being equal, any score difference between the first and second measure should indicate only the impact of the independent variable.
The Dependent-Samples t Tests
One dependent-groups test where the same group is measured twice is called the before/after
t test. An alternative is called the matched-pairs t test, where each participant in the first group
is matched to someone in the second group who has a similar characteristic. The before/after
t test and the matched-pairs t test both have the same objective—to control the error variance
that is due to initial between-groups differences. Following are examples of each test.
• The before/after design: A researcher is interested in the impact that positive
reinforcement has on employees’ sales productivity. Besides the sales commission,
the researcher introduces a rewards program that can result in increased vacation time. The researcher gauges sales productivity for a month, introduces the
rewards program, and gauges sales productivity during the second month for the
same people.
• The matched-pairs design: A school counselor is interested in the impact that verbal
reinforcement has on students’ reading achievement. To eliminate between-groups
differences, the researcher selects 30 people for the treatment group and matches
each person in the treatment group to someone in a control group who has a similar
reading score on a standardized test. The researcher then introduces the verbal
reinforcement program to those in the treatment group for a specified period of time
and then compares the performance of students in the two groups.
Although the two tests are set up differently, both calculate the t statistic the same way. The differences
between the two approaches are conceptual, not mathematical. They have the same purpose—to control
between-groups score variation stemming from nonrelevant factors.
Try It!: #3
How do the before/after t test and the
matched-pairs t test differ?
Calculating t in a Dependent-Groups Design
The dependent-groups t may be calculated using several methods. Each method takes into
account the relationship between the two sets of scores. One approach is to calculate the
correlation between the two sets of scores and then to use the strength of the correlation
as a mechanism for determining between-groups error variance: the higher the correlation
between the two sets of scores, the lower the error variance. Because this text has yet to discuss correlation, for now we will use a t statistic that employs “difference scores.” The different approaches yield the same answer.
The distribution of difference scores came up in Chapter 5 when it introduced the independent t test. Recall that the point of that distribution is to determine the point at which the
difference between a pair of sample means (M1 2 M2) is so great that the most probable
explanation is that the samples came from different populations.
Dependent-groups tests use that same distribution, but rather than the difference between
the means of the two groups (M1 2 M2), the numerator in the t ratio is the mean of the differences between each pair of scores. If that mean is sufficiently different from the mean
of the population of difference scores (which, recall, is 0), the t value is statistically significant; the first set of measures belongs to a different population than the second
set of measures. That may seem odd since in a before/after test, both sets of measures
come from the same subjects, but the explanation is that those subjects’ responses (the
DV) were altered by the impact of the independent variable; their responses are now
different.
The denominator in the t ratio is another standard error of the mean value, but in this case, it
is the standard error of the mean of the difference scores. The researcher checks for significance using the same criteria as for the independent t:
• A critical value from the t table, determined by degrees of freedom, defines the point
at which the calculated t value is statistically significant.
• The degrees of freedom are the number of pairs of scores minus 1 (n 2 1).
The dependent-groups t test statistic uses this formula:
Formula 7.1

t = Md / SEMd

where

Md = the mean of the difference scores
SEMd = the standard error of the mean for the difference scores
The steps for completing the test are as follows:

1. From the two scores for each subject, subtract the second from the first to determine
the difference score, d, for each pair.
2. Determine the mean of the d scores: Md = Σd ÷ number of pairs.
3. Calculate the standard deviation of the d values, sd.
4. Calculate the standard error of the mean for the difference scores, SEMd, by dividing
sd by the square root of the number of pairs of scores: SEMd = sd ÷ √number of pairs.
5. Divide Md by SEMd, the standard error of the mean for the difference scores:
t = Md ÷ SEMd.
Figure 7.1 depicts these steps.
The following is an example of a dependent-measures t test: A psychologist is investigating
the impact that verbal reinforcement has on the number of questions university students
ask in a seminar. Ten upper-level students participate in two seminars where a presentation
is followed by students’ questions. In the first seminar, the instructor provides no feedback
after a student asks the presenter a question. In the second seminar, the instructor offers
feedback—such as “That’s an excellent question” or “Very interesting question” or “Yes, that
had occurred to me as well”—after each question.
Is there a significant difference between the number of questions students ask in the first
seminar compared to the number of questions students ask in the second seminar? Problem
7.1 shows the number of questions asked by each student in both seminars and the solution
to the problem.
Figure 7.1: Steps for calculating the before/after t test

1. Subtract the second score from the first for each pair to determine d.
2. Determine the mean of the d scores, Md.
3. Determine sd by taking the standard deviation of the d scores.
4. Divide sd by the square root of the number of pairs to determine SEMd.
5. Divide Md by SEMd to determine t.
Problem 7.1: Calculating the before/after t test

[Table: the number of questions each of the 10 students asked in Seminar 1 and in Seminar 2, with the difference score d for each pair. The difference scores sum to Σd = −11.]
1. Determine the difference between each pair of scores, d, using subtraction.
2. Determine the mean of the difference scores, Md:

   Md = Σd / 10 = −11 / 10 = −1.1
3. Calculate the standard deviation of the d values, sd. Verify that sd = 1.101.
4. Just as the standard error of the mean in the earlier test was s/√n, determine the standard error of the mean for the difference scores (SEMd) by dividing the result of step 3 by the square root of the number of pairs. Verify that

   SEMd = sd / √np = 1.101 / √10 = 0.348
5. Divide Md by SEMd to determine t.

   t = Md / SEMd = −1.1 / 0.348 = −3.161
6. As noted ea...
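Steps 4 and 5 of the solution can be replayed numerically; the values of Md and sd below are taken from the solution, and the rounding mirrors the text's:

```python
from math import sqrt

md = -1.1       # mean of the difference scores (step 2)
sd = 1.101      # standard deviation of the d values (step 3)
n_pairs = 10

sem_d = round(sd / sqrt(n_pairs), 3)   # step 4: 1.101 / sqrt(10)
t = round(md / sem_d, 3)               # step 5: Md / SEMd
print(sem_d, t)   # 0.348 -3.161
```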