© Jones & Bartlett Learning, LLC. NOT FOR SALE OR DISTRIBUTION.
[Figure: Decision chart for choosing a statistical procedure. Questions: "What do you want to do?" (make inferences, describe); "How many variables?" (univariate, bivariate, multivariate); "What level of data?" (nominal, ordinal, interval/ratio). Measures shown: central tendency (mode, median, mean); dispersion (range, index of dispersion, average absolute deviation, variance, standard deviation); form (skew, kurtosis).]
Chapter 6
The Form of a Distribution
Learning Objectives
■■ Understand the number of modes, skewness, and kurtosis as they relate to
explaining a data set.
■■ Explain the difference between the mode and the number of modes.
■■ Interpret the values of skewness and kurtosis as they relate to univariate
analysis.
■■ Discuss the importance of the normal curve in statistics.
■■ Describe the properties of the normal curve.
The final univariate descriptive statistic, the form of the distribution, ties together the
central tendency and dispersion of the data. Three characteristics make up the form of
the distribution: the number of modes, the symmetry, and the kurtosis. In addressing
the form of a distribution, a polygon can generally be used to represent these characteristics visually.
6-1 Moments of a Distribution
In some statistics books and other places, distributions and the form of distributions are
referred to in terms of the moments of the distribution. There are four moments that are
considered important to a distribution. Moments are calculated as follows:

$$\frac{\sum (X - \bar{X})^{i}}{N}$$

where $(X - \bar{X})$ represents the deviations from the mean (as has been the case in Chapters 4 and 5), N is the total number of cases in the distribution, and i is the moment
being calculated.
Since the sum of the deviations around the mean is always zero, the first moment
is always zero. If the deviations are taken to the second power in the formula above, you can see
that this is the formula for the variance; thus, the second moment is the variance.
The third moment is usually associated with the skew of the distribution, although
the exact formula is to divide the formula for the third moment by the variance to
the power of 1.5. Similarly, the kurtosis of a distribution is associated with the fourth
moment, although the exact formula is to divide the formula for the fourth moment by
the variance squared. The mean and variance were discussed in “Measures of Central
Tendency” and “Measures of Dispersion” (Chapters 4 and 5). The skew and kurtosis
of a distribution are discussed in this chapter, together with the third measure of the
form of a distribution: the number of modes.
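The moment formula can be tried out on a small data set. The following is a minimal sketch in Python; the function name `central_moment` and the six scores are hypothetical and are used only for illustration. It divides by N, as the formula in this section does.

```python
def central_moment(values, i):
    """Return the i-th moment about the mean: sum((X - mean)**i) / N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** i for x in values) / n

# Hypothetical scores, not taken from the chapter.
scores = [2, 4, 4, 5, 7, 9]

m1 = central_moment(scores, 1)  # first moment: zero, apart from rounding error
m2 = central_moment(scores, 2)  # second moment: the variance (dividing by N)
m3 = central_moment(scores, 3)
m4 = central_moment(scores, 4)

skew = m3 / m2 ** 1.5           # third moment divided by variance to the power 1.5
kurtosis = m4 / m2 ** 2         # fourth moment divided by variance squared

print(m1, m2, skew, kurtosis)
```

Note that this ratio for kurtosis is about 3 for a normal curve. Statistical packages such as SPSS commonly subtract 3 so that a normal curve has a kurtosis of 0, which is the scale used later in this chapter.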
6-2 Number of Modes
The first measure of the form of a distribution is the number of modes. The number
of modes is important to higher-order analyses because it is indicative of the normality
of the distribution. To use many bivariate and multivariate statistical procedures, a
unimodal distribution is preferred.
In determining the number of modes, a slight deviation from determining the
mode may be necessary. Recall from the discussion of central tendency that it is common
to count only the highest frequency in a distribution as the mode, even though
some people argue that all peaks of a distribution should be considered. For determining
the number of modes in an analysis of form, it may be more beneficial to look at
peaks rather than to find the one, highest value. Consider, for example, the distribution
in Figure 6-1. Even though there is only one highest value, there are three peaks in
the distribution. These peaks may make the data unsuitable for certain statistical procedures
unless transformations are made. In this distribution, all three modes should
probably be counted in evaluating the form of the distribution even though the mode
is actually only 4.
Figure 6-1 Polymodal Distribution
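The difference between the mode and the number of modes can be sketched in a few lines of Python. The frequencies below are hypothetical (they are not the values plotted in Figure 6-1), and the helper `find_peaks` is an illustrative name. A peak is taken here to be any category whose frequency is higher than that of both neighbors, which is only one of several reasonable definitions.

```python
def find_peaks(freqs):
    """Return indices of categories whose frequency exceeds both neighbors.

    Ties (flat plateaus) are not counted as peaks in this simple sketch.
    """
    padded = [0] + list(freqs) + [0]  # treat the area outside the scale as frequency 0
    return [i - 1 for i in range(1, len(padded) - 1)
            if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]]

# Hypothetical frequency distribution for categories 1 through 10.
categories = list(range(1, 11))
frequencies = [5, 20, 10, 40, 15, 10, 30, 12, 8, 3]

peaks = [categories[i] for i in find_peaks(frequencies)]
mode = categories[frequencies.index(max(frequencies))]

print("mode:", mode)
print("peaks:", peaks, "number of modes:", len(peaks))
```

In this made-up distribution the mode is a single category, but three categories stand out as peaks, so the distribution would be treated as having three modes when its form is evaluated.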
6-3 Skewness
The next characteristic of the form of the distribution is the degree of symmetry
(skewness) of the distribution. This measure of the form of a distribution has three categories: symmetrical, positively skewed, and negatively skewed. A fully symmetrical
distribution has mirror-image sides such that the distribution could be split at the mean
and the sides folded over each other for a perfect match. In Figure 6-2, it is easy to see
the symmetry in the distribution. This is the histogram from Figure 4-7. The frequencies
displayed in this distribution are very balanced: categories 1 and 7 have the same
frequency, as do 2 and 6, and 3 and 5. Category 4 has the highest frequency level. It
is easy to see that this distribution could be folded in half and the two sides would
match perfectly. This distribution is, therefore, a perfectly symmetrical distribution. In
actual research, however, it is not common to see a perfectly symmetrical distribution.
More typically, the distribution will either be only close to symmetrical or not at all
symmetrical.
It should be noted here that the number of modes does not necessarily affect the
skew of the distribution. A distribution that is bimodal could still be cut in half where
the distribution mirrors itself. The only difference in this case is that the mode and
other measures of central tendency would not be the same.
Figure 6-2 Histogram and Normal Curve for a Symmetrical Distribution
Analysis of Skew
If a distribution is such that one side is different from the other, it is said to be skewed.
In skewed distributions, there is no point that can be drawn in the polygon where it could
be divided into two similar parts. If the point of the curve (its tail) is to the left of the graph, it
is said to be negatively skewed (the tail of the graph points to the negative end of the
scale, smaller positive numbers). In Figure 6-3, “children” is an example of a negatively skewed distribution. Here, the point of the curve is toward category 1 or the left of
the graph. If the point of the curve is to the right of the graph, it is said to be positively
skewed (the tail of the graph points toward the positive end of the scale, larger positive
numbers). In Figure 6-3, “gun-wher” is an example of a positively skewed distribution.
Here, the point of the curve points toward category 12.5 or the right of the graph.
SPSS provides measures of skew in frequency output. A value of 0 means there
is no skew to the data. Skew values of zero are almost never obtained, however, and
a distribution is considered symmetrical if the skew value in SPSS is between −1 and
+1.¹ A distribution is generally considered skewed if it has a skew greater than +1.00
or less than (a greater negative number) −1.00. The magnitude of the number will
represent the degree of skew. When conducting research, it is desirable to obtain a
distribution that has a skew as close to zero as possible. If the skew is outside +1 to
−1, the distribution may be too skewed to work with, and efforts should be made to
get the distribution closer to normal. This is done through transformations, which are
addressed in the discussion on regression.
The frequency distribution that has been used with the other univariate measures
is shown in Table 6-1. Here the value of skew is −0.477, which means that the distribution
is not perfectly symmetrical but that it exhibits an acceptable level of skew
(it is within the acceptable range of 0 to −1.00). There is some negative skew to this
distribution, as exhibited by the negative value, but it is not enough to warrant additional
analyses or give cause for concern. If this value had been less than −1.00 (e.g.,
−2.77), it might have been necessary to transform the distribution.
Figure 6-3 Negatively and Positively Skewed Distributions
[Panels: CHILDREN (Std. Dev. = 0.22, Mean = 1.95, N = 324.00); GUN_WHER (Std. Dev. = 1.85, Mean = 2.2, N = 47.00).]
What is your highest level of education?

Value Label              Value   Frequency   Percent   Valid Percent   Cumulative Percent
Less than High School      1         16         4.6         4.8               4.8
GED                        2         59        17.0        17.6              22.3
High School Graduate       3          8         2.3         2.4              24.7
Some College               4        117        33.7        34.8              59.5
College Graduate           5         72        20.7        21.4              81.0
Post Graduate              6         64        18.4        19.0             100.0
Missing                              11         3.2
Total                               347       100.0       100.0

N (Valid)                   336
N (Missing)                  11
Mean                       4.08
Median                     4.00
Mode                          4
Std. Deviation            1.460
Variance                  2.131
Skewness                  −.477
Std. Error of Skewness     .133
Kurtosis                  −.705
Std. Error of Kurtosis     .265

Table 6-1 SPSS Output of Measures of Form
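The measures of form in Table 6-1 can be approximated directly from the frequencies in the table. The sketch below first computes skew and kurtosis as simple moment ratios, as described in Section 6-1, and then applies a widely used small-sample adjustment. That SPSS uses exactly this adjustment is an assumption made here, not something stated in the chapter, although the adjusted results agree with the table to three decimal places.

```python
import math

# Valid cases from Table 6-1: value -> frequency.
freq = {1: 16, 2: 59, 3: 8, 4: 117, 5: 72, 6: 64}

n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n

def moment(i):
    """i-th moment about the mean, dividing by N."""
    return sum(f * (x - mean) ** i for x, f in freq.items()) / n

# Simple moment ratios (kurtosis expressed so that a normal curve is 0).
g1 = moment(3) / moment(2) ** 1.5
g2 = moment(4) / moment(2) ** 2 - 3

# Sample variance (dividing by N - 1), the form statistical output usually reports.
sample_variance = moment(2) * n / (n - 1)

# Assumed small-sample adjustments (not given in the chapter).
adj_skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
adj_kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(n, round(mean, 2), round(sample_variance, 3))
print(round(g1, 3), round(g2, 3))
print(round(adj_skew, 3), round(adj_kurt, 3))
```

With 336 valid cases the adjustment is small, so the simple moment ratios and the adjusted values lead to the same interpretation: a slight negative skew and a mesokurtic distribution.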
Although quantitative measures of skewness and kurtosis will almost always be
available when conducting actual research, an estimate of the skew of a distribution
can be made even without a skew calculation. If the mean and the median are different,
the distribution is at least somewhat skewed, although it is not possible to tell if it is
beyond +1 or −1. Additionally, the skew is in the direction of the mean. For example,
if the distribution is positively skewed, the mean will be larger than the median, but
if the skew is negative, the mean will be smaller than the median. It should also be
noted that the mode is generally on the opposite side of the median from the mean in
skewed distributions. This is not always the case, however, and should not be treated
as a rule. MacGillivray (1981) discusses the conditions under which each of these
examples would fall.
6-4 Kurtosis
The last characteristic of the form of a distribution is the kurtosis. For kurtosis, think
again of stacking blocks (or beer cans) on top of each other to represent the frequency
of the categories in a histogram. The kurtosis is the extent to which cases are piled up
around the measure of central tendency or in the tails of the distribution. If most of the
values in the distribution are very close to the measure of central tendency, the distribution is said to be leptokurtic (as shown on the left in Figure 6-4). If most of the
values in the distribution are out in the tails, the distribution is said to be platykurtic,
as shown on the right in Figure 6-4. If the values in the distribution are such that they
represent a distribution such as that shown in Figure 6-2, the distribution is said to be
mesokurtic, as shown in the center of Figure 6-4. It is desirable to have a mesokurtic
distribution in research; otherwise, the data may have to be transformed.
Figure 6-4 Leptokurtic, Mesokurtic, and Platykurtic Distributions
The shape of these curves also offers an opportunity to talk about variance and
standard deviation. As discussed in “Measures of Dispersion” (Chapter 5), the variance and standard deviation dictate the shape of the distribution. In a leptokurtic distribution, the variance and standard deviation would be smaller than in a mesokurtic
distribution. The variance and standard deviation of a platykurtic distribution would
be larger than those of either a mesokurtic or a leptokurtic distribution. This is one application of
the variance and standard deviation. A more expanded discussion of this application is
presented later in the section on the normal curve.
Analysis of Kurtosis
In SPSS, kurtosis is measured in the same way as skew. A value between +1 and −1
represents a mesokurtic distribution. Positive numbers greater than 1 represent leptokurtic
curves. Negative numbers less than −1 (a greater negative number) represent
platykurtic curves. As with skew, it is desirable to get the kurtosis as close as possible
to zero, using transformations if necessary. Examining the kurtosis value in Table 6-1
shows that the distribution is mesokurtic because the value (−0.705) is between −1.00
and 0. If this value had been −1.705, the distribution would have been platykurtic.
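The rules of thumb for reading skew and kurtosis values can be written as two small functions. This is only a sketch of the plus-or-minus 1 guideline used in this chapter; the function names are hypothetical, and values of exactly +1 or −1 are treated here as within the acceptable range.

```python
def classify_skew(value):
    """Apply the chapter's +1 / -1 guideline to a skew value."""
    if value > 1:
        return "positively skewed"
    if value < -1:
        return "negatively skewed"
    return "approximately symmetrical"

def classify_kurtosis(value):
    """Apply the chapter's +1 / -1 guideline to a kurtosis value."""
    if value > 1:
        return "leptokurtic"
    if value < -1:
        return "platykurtic"
    return "mesokurtic"

# Values reported in Table 6-1 and the hypothetical alternatives from the text.
print(classify_skew(-0.477), "/", classify_kurtosis(-0.705))
print(classify_skew(-2.77), "/", classify_kurtosis(-1.705))
```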
6-5 The Importance of Skew and Kurtosis
It is important to know the skew and kurtosis because some statistical procedures do not
work well with skewed data or data that is not mesokurtic. If data in a research project is
found to be skewed or kurtose, it may be necessary to transform the data. Initially, you
must remember two things about transformation.
First, if the data is not within acceptable tolerances for skew and kurtosis, the
data will need to be transformed prior to using some statistical procedures. Second, after
making transformations, recheck both the skew and kurtosis. Transforming the data may bring one of these within acceptable tolerances
but may make the other unacceptable. If that happens, you will need to choose
another transformation. You should then recheck the skew and kurtosis again. This
process should continue until you reach a point where both the skew and kurtosis are
acceptable. If it is not possible to get both the skew and kurtosis in an acceptable range,
you may need to consider using a different analysis procedure that is not susceptible to
nonnormal curves.
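The transform-and-recheck cycle can be sketched as a simple loop. The chapter does not name specific transformations at this point, so the square root and natural log used below are assumptions chosen only because they are common choices for positively skewed data. The data values are hypothetical, and skew and kurtosis are computed as simple moment ratios rather than with the adjusted formulas a statistical package may use.

```python
import math

def skew_and_kurtosis(values):
    """Return (skew, kurtosis) as moment ratios, with kurtosis centered on 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

def acceptable(skew, kurt, limit=1.0):
    """Both measures must fall within the +1 / -1 guideline."""
    return abs(skew) <= limit and abs(kurt) <= limit

def first_acceptable(values, transforms):
    """Try each transformation in turn, rechecking BOTH measures each time."""
    for name, func in transforms:
        transformed = [func(x) for x in values]
        skew, kurt = skew_and_kurtosis(transformed)
        if acceptable(skew, kurt):
            return name, skew, kurt
    return None  # nothing worked: consider a different analysis procedure

# Assumed candidate transformations (require strictly positive data).
candidates = [("square root", math.sqrt), ("natural log", math.log)]

# Hypothetical, strongly positively skewed data.
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 45]

raw_skew, raw_kurt = skew_and_kurtosis(data)
result = first_acceptable(data, candidates)
print(round(raw_skew, 2), round(raw_kurt, 2))
print(result)
```

Returning `None` when no candidate works mirrors the last step in the text: if no transformation brings both measures into range, a different analysis procedure may be needed.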
6-6 Design of the Normal Curve
Extending the concepts of the frequency distribution, graphical representation of data,
and measures of central tendency, dispersion, and form brings us to the point of discussing
a key concept in statistical analysis, the normal curve. At this point, you
should not be concerned with applications of the normal curve; that is covered in more
detail in the chapters of the book on inferential analysis. The purpose of the present
discussion is to introduce the properties of the normal curve.
An introduction to the normal curve is included in descriptive analyses rather than
inferential analyses for two reasons. First, the normal curve can be used to provide
an interpretation of the variance and standard deviation. Second, the normal curve is
important to a number of statistical procedures that will be discussed before reaching
information on inferential statistical procedures.
An example of relatively normally distributed data can be shown in grades in a
course (see Figure 6-5). Say that most people taking the course score a C on the first
test. This would be the modal grade (the top part of the curve). There are those who
receive high A’s, but there would be only a few of these; they would be at the positive
end of the curve. There are also those who receive very low F’s, but these are also few;
they would be at the negative end of the curve. Most people would be in between these
two extremes, with more people making scores around C’s than other grades, and
more people making B’s and D’s than A’s and F’s. For the sake of argument, though,
say that no one received a 100 and there were a few people who did not take the test.
Therefore, the ends of the tails will never completely touch the baseline.
Figure 6-5 Normal Curve
This type of data represents a special form of distribution called the normal curve.
This type of curve or distribution is very much like those that have been used in this
chapter and at the end of Chapter 4. A normal curve is special because it has certain
characteristics. First, a normal curve is symmetrical in that it can be folded in half,
and both sides would be exactly the same (as in Figure 6-2, where the frequencies of
categories 1 and 3 are the same as those of categories 7 and 5, respectively). Note, though,
that this does not mean this curve cannot be kurtose. Some symmetrical distributions
are leptokurtic or platykurtic. This is shown in Figure 6-4, where each of those distributions
was symmetrical even if it was kurtose. Also, a normal curve is unimodal;
there is one, and only one, peak. This peak is at the maximum frequency of the data
distribution, so that the mean, median, and mode all have the same value. The normal
curve shown in Figure 6-5 has only one mode. From the peak, the tails of a normal
curve fall off on both ends and extend to infinity, always getting closer to the baseline
but never touching it. This is shown in Figure 6-5, where the bottom part of the curve
straightens out and runs relatively parallel to the X axis. You may say this makes no
sense; all distributions have an end, so why would the normal curve not have an end?
The answer lies in the scientific process. Take the example of computers. Less than 20
years ago, scientists and engineers thought they had achieved the ultimate when they
were able to reach 640K of random access memory (RAM) in a computer. They felt
that this was the maximum that could be achieved and all that anyone would ever need.
To them, the distribution limits were set. We now know, of course, that 640K was only
the beginning and that computers are far beyond that now. It would have been foolish,
then, to have the curve touch the line at 640K; it should not touch the line because we
do not know what will come in the future. A final characteristic of the normal curve
that merits discussion is that the area under a normal curve is always the same, regardless of the data set. The area under the normal curve is 1.00, or 100% of all values in
the distribution. This is extremely important in the section of this book concerning
inferential analyses because of its importance in estimating the placement of a sample
distribution within a population or another sample.
The area under the normal curve also offers the opportunity to put the variance
and standard deviation into practice. Say, for example, that a researcher was examining
the time prisoners were out on parole before they committed another crime or
returned to prison on a technical violation. If the time each parolee took before being
reincarcerated was plotted, it might look as in Figure 6-6. There were a few people
who returned to prison right away, most of the parolees who returned to prison did it
within two to four years, and some took longer. Some had not recidivated at the time
of the research, so the end of the distribution is open.
An analysis of the central tendency would put the mean of this distribution at 36
months, which is represented by the vertical line. This is good information: the average
length of time for parolees to be reincarcerated is three years. It is obvious from
this distribution, however, that not all of the parolees were reincarcerated at the same
time. The span of time runs from a couple of months to more than five years. To get a
more accurate picture of the distribution of parolees, we might want to know, on average, how far each of them is from the mean.
Figure 6-6 Distribution of Time to Reincarceration for Parolees (horizontal axis: Months; vertical axis: Frequency)
To calculate this, the procedure would be to take each person (each dot in Figure
6-6) and determine how far from the mean it is. This could be completed by measuring
the distance with a ruler, but because this is a numeric scale, it could also be completed
by subtracting the mean from each value. This operation would be noted as $X - \bar{X}$.
Summing all of these values, represented by $\sum (X - \bar{X})$, would provide us with the
total of the distance from each value to the mean. But remember that the sum of the
deviations from the mean is zero, so that does not help us. A solution to this is
to square each value before summing it. This calculation, $\sum (X - \bar{X})^{2}$, will give a positive value. Knowing the total distance between the mean and each value is good, but a
simpler value would be the average of the distance from each value to the mean. The
calculation for this procedure is:

$$\frac{\sum (X - \bar{X})^{2}}{N}$$

The only problem with this calculation is that the value that would be obtained is not
on the same scale as the original data. To return this value to the original scale, take
the square root of the value. Of course, you recognize that this is the procedure for
calculating the variance and, ultimately, the standard deviation for the distribution.
Although this procedure can be completed for any distribution, it is particularly important
for a normal curve because of what the standard deviation represents.
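The steps just described can be followed one at a time in code. The months to reincarceration below are hypothetical values chosen so that the mean is 36 months; they are not the data behind Figure 6-6. The variance divides by N, as in the formula above.

```python
import math

# Hypothetical months to reincarceration for eight parolees.
months = [6, 24, 30, 36, 36, 42, 48, 66]

n = len(months)
mean = sum(months) / n

# Step 1: deviation of each value from the mean (X - mean).
deviations = [x - mean for x in months]

# Step 2: the deviations always sum to zero, so the raw sum is of no help.
sum_of_deviations = sum(deviations)

# Step 3: square each deviation before summing.
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 4: average the squared deviations (the variance).
variance = sum_of_squares / n

# Step 5: take the square root to return to the original scale (months).
std_dev = math.sqrt(variance)

print(mean, sum_of_deviations, sum_of_squares, variance, round(std_dev, 2))
```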
Because the area under the normal curve is always 100% and because standard
deviations represent standard distances in relation to the mean of the distribution, standard deviations can be used to calculate the area under the normal curve for particular
values. For example, between the mean and 1 standard deviation under a normal curve
lies 34.13% of all the data. Between the mean and 2 standard deviations is 47.72%
of all the data. Between the mean and 3 standard deviations is 49.87% of all the data.
That covers 49.87% of the possible 50% of one half of the curve. Since the normal
curve is symmetrical, the same values can be obtained whether counting from the left
or right of the mean. This also means that the values could be doubled from one side
and the area under the whole normal curve could be determined for values plus or
minus a certain number of standard deviations from the mean. If the figures above are
doubled, the result would be 68.26% of the data between −1 and 1 standard deviation,
95.44% of the data between −2 and 2 standard deviations, and 99.74% of the data
between −3 and 3 standard deviations.
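These percentages can be checked without a printed table. The sketch below uses the error function from Python's standard `math` module to compute the area under a standard normal curve between the mean and a given number of standard deviations; the function name `area_mean_to_z` is hypothetical.

```python
import math

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

for k in (1, 2, 3):
    one_side = area_mean_to_z(k)
    both_sides = 2 * one_side
    print(k, round(one_side * 100, 2), round(both_sides * 100, 2))
```

Doubling the unrounded areas gives roughly 68.27%, 95.45%, and 99.73%. The figures in the text (68.26%, 95.44%, and 99.74%) differ in the last digit only because they double values that had already been rounded to two decimal places.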
This is very useful to know because it is now possible to determine what percentage
of the scores lie between the mean and any standard deviation within the distribution. The problem is that researchers are not always interested in determining values
(or determining the area under a normal curve) only for values directly on a particular
standard deviation. Often, calculations need to be made that allow any value in the
distribution to be converted to standard deviation units such that the area under the
normal curve can be calculated for that value. The procedure for doing this is to calculate a Z score for that value. Z scores convert any value in the distribution to standard
deviation units. The formula for calculating a Z score is

$$Z = \frac{X - \bar{X}}{s}$$

where X is the number to be converted to a Z score, $\bar{X}$ is the mean of the distribution, and
s is the standard deviation of the distribution.
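The Z score formula translates directly into a one-line function. The sketch below is illustrative only: the function name `z_score` and the example values (a mean of 100 and a standard deviation of 15) are hypothetical and do not come from the chapter.

```python
def z_score(x, mean, sd):
    """Convert a raw value to standard deviation units: (X - mean) / s."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
z_above = z_score(130, 100, 15)  # two standard deviations above the mean
z_below = z_score(85, 100, 15)   # one standard deviation below the mean

print(z_above, z_below)
```

A positive result places the value above the mean and a negative result places it below the mean.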
The process of converting values to Z scores is simply a method of deriving standard scores. To be able to make comparisons between things, it is necessary to set a
standard on which both items may be measured. If measuring distance, we could use
feet or meters; weight could be in pounds or kilograms. But what about social data
such as crime? Crime is generally discussed in terms of crime rates. Calculating crime
rates is simply a process of standardizing crime in terms of the population of a city:
taking numbers that may be difficult to compare between different things and making
them comparable. The same can be done with the mean, standard deviation, and normal curve through Z scores. This standardizes scores based on the normal curve.
What this means is that if a researcher knows the mean and standard deviation
of a population, Z scores can be used to calculate the distance between any value and
the mean. Using the area under the normal curve allows inferential analyses (see the
chapters in this book on inferential analyses) to be used to determine the chances that
any number in a distribution will be in a sample drawn from that population.
The process of calculating a Z score, determining the area under the normal curve
between that value and the mean, and examining how that score relates to the mean, is
rather simple. First, subtract the mean from the raw score. This determines whether the
number is above or below the mean: negative numbers are below the mean, positive
numbers above the mean. Then divide the number by the standard deviation to determine how many standard deviation units the number is above or below the mean. The
answer obtained from this calculation is the score’s deviation from the mean in standard units (the Z score in standard deviation units). Note that you may get a negative
number; all this means is that the Z score is to the left of the mean. The sign does not affect
the calculations at all and can be dropped in the remainder of the procedure.
The next step is to take this number and turn to Table B-1 in Appendix B, “Statistical Tables.” The row to use corresponds to this number in column a of the table.
The columns to use within this row depend on what area under the normal curve is
being examined. If you are looking for the area between the score and the mean, look
to column b in the table; if examining the area beyond that score, which is farther into
the tail of the distribution, use column c.
Suppose that the number of complaints against a police department averaged 12
complaints a month with a standard deviation of 1.5. Then suppose that the chief wants
you to determine what percentage of the scores fell between the mean and the current
month’s score of 14. This would require determining the area between the mean and
14. To do this, first calculate a Z score, as shown below:

$$Z = \frac{X - \bar{X}}{s} = \frac{14 - 12}{1.5} = \frac{2}{1.5} = 1.33$$

Then look at column a in Table B-1 in Appendix B, “Statistical Tables,” and find 1.33.
You are looking for the area between the score and the mean, so you would use column b to get the percentage of the area under the normal curve that falls between the
mean and 14. This value is 0.4082, so the area in this distribution between 12 and 14 is
40.82%. What does this mean? It means that in almost 41% of the months, the number of
complaints recorded fell between the average of 12 and the current month’s 14.
But what if the chief wanted to know how many months had more complaints
recorded than the current month? To determine the percentage of scores greater than
14, you would calculate the Z score exactly as before. That number, 1.33, is then found
in column a of Table B-1 in Appendix B, “Statistical Tables.” Since you are trying to
determine the number of scores greater than 14, column c in the table would be used.
The value in column c that corresponds to a Z score of 1.33 is 0.0918, so 9.18% of
the scores in the distribution are greater than 14, which means that almost 10% of the
months had more complaints recorded than the current month. Actually, because the
area between the mean and 14 was known, it was not necessary to complete the second
procedure. Because it was already determined that the percentage of scores between
the mean and 14 was 40.82%, and because the area under the normal curve is always
100% and the area under half of the normal curve is 50%, the 40.82 could have been
subtracted from 50 to arrive at the score of 9.18%, the area beyond that value.
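The complaint example can be reproduced without Table B-1. In the sketch below, the function `areas_for_z` (a hypothetical name) stands in for the table lookup: it returns the area between the mean and the Z score, which corresponds to column b, and the area beyond the Z score, which corresponds to column c. The Z score is rounded to two decimal places first, as it would be before looking it up in a printed table.

```python
import math

def areas_for_z(z):
    """Return (area between mean and z, area beyond z) for a normal curve."""
    between = 0.5 * math.erf(abs(z) / math.sqrt(2))  # like column b
    beyond = 0.5 - between                           # like column c
    return between, beyond

mean, sd, current_month = 12, 1.5, 14

z = round((current_month - mean) / sd, 2)  # rounded as for a table lookup
between, beyond = areas_for_z(z)

print(z, round(between, 4), round(beyond, 4))
```

Because the two areas always sum to 0.5, either one can be found from the other, which is the shortcut described at the end of the paragraph above.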
These procedures work whether using scores above the mean or scores below the
mean. As stated previously, the area between the mean and −1 standard deviation is the same as
that between the mean and 1 standard deviation;
thus, it would be the same percentage if we were looking for the area under the normal
curve representing less than 10. But what if you wanted to know how many scores
were higher than one score or lower than another: for example, how many people
scored higher than 90 or lower than 62 on a test? This case would require determining
the percentage greater than the score of 90 and the percentage less than the score of
62. The final pieces of information needed here are that the mean of the distribution is
76 and the standard deviation is 10.5. The areas beyond these two scores can be determined simply by adding together the percentages in Table B-1 in Appendix B, “Statistical Tables.” The Z score calculations for these two values would be as follows:

$$Z = \frac{X - \bar{X}}{s} = \frac{90 - 76}{10.5} = \frac{14}{10.5} = 1.33$$

$$Z = \frac{X - \bar{X}}{s} = \frac{62 - 76}{10.5} = \frac{-14}{10.5} = -1.33$$
The results of these calculations would be taken to Table B-1 in Appendix B, “Statistical Tables.” Finding 1.33 in column a and wanting to determine the area under
the normal curve beyond this value, we would look in column c. The value found in
column c is 0.0918. This means that 9.18% of the scores are higher than 90. Since 62
has the same value, but negative, it also has a value in column c of 0.0918, so 9.18%
of the scores are lower than 62. Adding these two values together would establish what
percentage of people scored greater than 90 or less than 62:
0.0918 + 0.0918 = 0.1836
So 18.36% of the people in the class scored higher than 90 (making an A) or lower
than 62 (making an F). It is also possible to determine the percentage of scores that fall
between two scores. In the example we have been using, the percentage of scores that
fall between 90 and 62 can be determined by adding together the areas between them
and the mean. Looking at column b in Table B-1 in Appendix B, “Statistical Tables,”
the percentage of scores between the mean of 76 and a score of 90 (the Z score of
1.33) is 40.82%. Since the score of 62 has the same Z score value, the percentage of
scores between 62 and 76 is also 40.82%. Adding those percentages together produces
a percentage of 81.64% as shown below:
0.4082 + 0.4082 = 0.8164
So 81.64% of the class scored a B, C, or D.
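The test score example can be worked the same way in code. The sketch below rounds each Z score to two decimal places and each area to four decimal places before adding, which is an assumption made to imitate reading values from a printed table; the function name `table_areas` is hypothetical.

```python
import math

def table_areas(z):
    """Return (column b, column c) style areas, rounded to four places."""
    between = 0.5 * math.erf(abs(z) / math.sqrt(2))
    return round(between, 4), round(0.5 - between, 4)

mean, sd = 76, 10.5
high, low = 90, 62

z_high = round((high - mean) / sd, 2)
z_low = round((low - mean) / sd, 2)

b_high, c_high = table_areas(z_high)
b_low, c_low = table_areas(z_low)

outside = round(c_high + c_low, 4)  # higher than 90 or lower than 62
inside = round(b_high + b_low, 4)   # between 62 and 90

print(z_high, z_low)
print(outside, inside)
```

The two results account for the whole curve, so they sum to 1.00: every score is either between 62 and 90 or outside that range.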
Points to Remember about the Normal Curve
Two Z scores that will become important later are 1.96 and 2.58. These correspond to
95% and 99% of the area under the normal curve, respectively. These are important
because researchers often want to know where 95% or 99% of the values in the distribution fall, or they may want to compare two values and see if they fall within 95% or
99% of the values in the distribution.
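The link between these two Z scores and the 95% and 99% figures can be verified with a short calculation. The sketch below computes the area under a standard normal curve between −z and +z; the function name `central_area` is hypothetical.

```python
import math

def central_area(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(abs(z) / math.sqrt(2))

for z in (1.96, 2.58):
    print(z, round(central_area(z) * 100, 2))
```

The results are very close to, but not exactly, 95% and 99%, because 1.96 and 2.58 are themselves rounded values.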
Another point you should remember is that the normal curve is a theoretical ideal.
Normal distributions do not really exist, and the best we can do is to gather data that
is close to a normal curve but not exact. The conclusions drawn when using the theory
of the normal curve, then, will only be estimates, not hard facts.
Finally, you should realize that not all data sets even come close to a normal
curve. As seen in the discussion on measures of central tendency (see Chapter 4), some distributions
are multimodal and almost jagged looking. There are many distributions that
are so skewed they are J shaped and do not conform at all to a normal distribution.
There are even a few distributions that are absolutely flat. Each of these distributions
requires special treatment, which is discussed in the chapters on bivariate and multivariate analysis.
6-7 Conclusion
In this chapter we completed the description of one variable at a time from a distribution
(univariate descriptive analysis), and you have learned how to describe a variable so
that it can be relayed to another person accurately and in a form that allows the person
to get a mental image of the distribution. Describing single variables can take many
forms: frequency distributions and graphs, measures of central tendency, measures
of dispersion, and the form of the distribution. These all have the goal of describing
the attributes of the distribution and determining if a variable is suitable for further
analyses.
In the next chapters, we put together two variables (bivariate descriptive analyses)
and describe what relationship might exist between them. The success of those
analyses depends on the successful univariate description of data.
6-8 Key Terms
form
kurtosis
leptokurtic
mesokurtic
moments
negatively skewed
normal curve
number of modes
platykurtic
positively skewed
skew
symmetry
Z score
6-9 Summary of Equations
Moments of a Distribution
$$\frac{\sum (X - \bar{X})^{i}}{N}$$
Z Score
$$Z = \frac{X - \bar{X}}{s}$$
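The two equations can be sketched in a few lines of code. This is an illustration under stated assumptions rather than the book's own procedure: the function names moment and z_score and the sample scores are invented for the example, and the moment divides by N exactly as the formula does.

```python
def moment(scores, i):
    """i-th moment about the mean: sum of (X - mean)**i divided by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** i for x in scores) / n

def z_score(x, mean, s):
    """Distance of a score from the mean in standard deviation units."""
    return (x - mean) / s

# Hypothetical scores, chosen only to make the arithmetic easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(moment(scores, 1))  # first moment: deviations from the mean sum to zero
print(moment(scores, 2))  # second moment: the variance (dividing by N)
print(moment(scores, 3))  # third moment: the basis of skew
print(moment(scores, 4))  # fourth moment: the basis of kurtosis

# Z score for a score of 40 in a distribution with mean 50 and s = 10.
print(z_score(40, 50, 10))
```

The first moment is zero for any data set, and the second moment is the variance computed with N in the denominator, so only the third and fourth moments add new information about form.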
6-10 Exercises
1. Use the following frequency tables (from the gang database):
a. To discuss the number of modes.
b. To determine the skew and kurtosis and discuss whether the distribution
is positively skewed or negatively skewed and whether it is leptokurtic,
mesokurtic, or platykurtic.
HOME: What type of house do you live in?
Value Label   Value   Frequency   Percent   Valid Percent   Cumulative Percent
House         1       280         81.6      82.4            82.4
Duplex        2       3           .9        .9              83.2
Trailer       3       34          9.9       10.0            93.2
Apartment     4       21          6.1       6.2             99.4
Other         5       2           .6        .6              100.0
Missing               3           .9
Total                 343         100.0     100.0

N Valid                  340
N Missing                3
Mean                     1.41
Std. Error of Mean       .051
Median                   1
Mode                     1
Std. Deviation           0.945
Variance                 .892
Skewness                 2.001
Std. Error of Skewness   .132
Kurtosis                 2.613
Std. Error of Kurtosis   .264
Range                    5
ARREST: How many times have you been arrested?
Value     Frequency   Percent   Valid Percent   Cumulative Percent
0         243         70.8      86.2            86.2
1         23          6.7       8.2             94.3
2         10          2.9       3.5             97.9
3         3           .9        1.1             98.9
5         2           .6        .7              99.6
24        1           .3        .4              100.0
Missing   61          17.8
Total     343         100.0     100.0

N Valid                  282
N Missing                61
Mean                     .30
Std. Error of Mean       .093
Median                   0
Mode                     0
Std. Deviation           1.567
Variance                 2.455
Skewness                 12.692
Std. Error of Skewness   .145
Kurtosis                 187.898
Std. Error of Kurtosis   .289
Range                    24
TENURE: How long have you lived at your current address (months)?
Value     Frequency   Percent   Valid Percent   Cumulative Percent
1         14          4.1       4.3             4.3
2         6           1.7       1.8             6.1
3         4           1.2       1.2             7.3
4         4           1.2       1.2             8.6
5         6           1.7       1.8             10.4
6         6           1.7       1.8             12.2
7         1           .3        .3              12.5
8         3           .9        .9              13.5
9         2           .6        .6              14.1
10        1           .3        .3              14.4
11        1           .3        .3              14.7
12        11          3.2       3.4             18.0
14        1           .3        .3              18.3
18        5           1.5       1.5             19.9
21        1           .3        .3              20.2
24        30          8.7       9.2             29.4
30        1           .3        .3              29.7
31        1           .3        .3              30.0
32        1           .3        .3              30.3
36        22          6.4       6.7             37.0
42        1           .3        .3              37.3
48        12          3.5       3.7             41.0
60        24          7.0       7.3             48.3
72        14          4.1       4.3             52.6
76        1           .3        .3              52.9
84        8           2.3       2.4             55.4
96        18          5.2       5.5             60.9
108       4           1.2       1.2             62.1
120       9           2.6       2.8             64.8
132       11          3.2       3.4             68.2
144       21          6.1       6.4             74.6
156       13          3.8       4.0             78.6
168       11          3.2       3.4             82.0
170       5           1.5       1.5             83.5
180       7           2.0       2.1             85.6
182       2           .6        .6              86.2
186       1           .3        .3              86.5
192       14          4.1       4.3             90.8
198       1           .3        .3              91.1
204       24          7.0       7.3             98.5
216       3           .9        .9              99.4
240       2           .6        .6              100.0
Missing   16          4.7
Total     343         100.0     100.0

N Valid                  327
N Missing                16
Mean                     88.77
Std. Error of Mean       3.880
Median                   72
Mode                     24
Std. Deviation           70.164
Variance                 4923.055
Skewness                 .365
Std. Error of Skewness   .135
Kurtosis                 -1.284
Std. Error of Kurtosis   .269
Range                    239
SIBS: How many brothers and sisters do you have?
Value     Frequency   Percent   Valid Percent   Cumulative Percent
0         39          11.4      11.5            11.5
1         137         39.9      40.5            52.1
2         79          23.0      23.4            75.4
3         39          11.4      11.5            87.0
4         17          5.0       5.0             92.0
5         13          3.8       3.8             95.9
6         6           1.7       1.8             97.6
7         4           1.2       1.2             98.8
9         1           .3        .3              99.1
10        1           .3        .3              99.4
12        1           .3        .3              99.7
15        1           .3        .3              100.0
Missing   5           1.5
Total     343         100.0     100.0

N Valid                  338
N Missing                5
Mean                     1.94
Std. Error of Mean       .098
Median                   1
Mode                     1
Std. Deviation           1.801
Variance                 3.245
Skewness                 2.664
Std. Error of Skewness   .133
Kurtosis                 12.027
Std. Error of Kurtosis   .265
Range                    15
2. For a distribution with a mean of 50 and a standard deviation of 10:
a. Calculate the Z score for a score of 40.
b. Determine the area under the normal curve between the mean and this value.
c. Determine the area under the normal curve beyond this value.
d. Calculate a Z score for a score of 65.
e. Determine the area under the normal curve between the mean and this value.
f. Determine the area under the normal curve beyond this value.
g. Determine the area under the normal curve between 40 and 60.
h. Determine the area under the normal curve outside 40 and 60.
3. Say a parole board has a policy that it will only release prisoners who meet
a minimum amount of time served, have a minimum number of good-time
points, and have made an acceptable score on tests in their drug awareness
class. If the mean of this distribution is 90, the minimum acceptable score on
these criteria is 70, and the standard deviation of scores is 15:
a. Calculate the Z score for a score of 90.
b. Calculate the Z score for a score of 50.
c. Determine the area under the normal curve between the mean and the values
in parts a and b.
d. Determine the area under the normal curve beyond these values.
e. Calculate a Z score for a score of 70.
f. Determine the area under the normal curve between the mean and this value.
g. Determine the area under the normal curve beyond this value.
6-11 References
MacGillivray, H. L. (1981). The mean, median, mode inequality and skew for a class
of densities. Australian Journal of Statistics, 23, 247.
6-12 For Further Reading
Pearson, K. (1894). On the dissection of asymmetrical frequency-curves: General
theory. Philosophical Transactions of the Royal Society of London (Series A, Vol.
185). London: Cambridge University Press.
Pearson, K. (1895). Classification of asymmetrical frequency curves in general: Types
actually occurring. Philosophical Transactions of the Royal Society of London
(Series A, Vol. 186). London: Cambridge University Press.
6-13 Note
1. SPSS uses the convention of +1 to -1 as the measure of acceptable skew.
This is based on a particular formula that “standardizes” skew and kurtosis
scores. There is a similar (and popular) formula that calculates skew and kurtosis
where the acceptable range is +3 to -3. Programs such as SAS use this
formula. It is important, then, that you understand which formula is being used
(and therefore what the acceptable range is) for these values before making a
judgment about them.
Criminal Justice on the Web
Visit http://criminaljustice.jbpub.com/Stats4e to make full use of today’s teaching and
technology! Our interactive Companion Website has been designed to specifically complement
Statistics in Criminology and Criminal Justice: Analysis and Interpretation, 4th Edition. The
resources available include a Glossary, Flashcards, Crossword Puzzles, Practice Quizzes,
Weblinks, and Student Data Sets. Test yourself today!
Appendix B
Statistical Tables
Table B-1 Area Under the Normal Curve
Column a: Z. Column b: area between X̄ and Z. Column c: area beyond Z.
a (Z)    b         c
0.00     0.0000    0.5000
0.01     0.0040    0.4960
0.02     0.0080    0.4920
0.03     0.0120    0.4880
0.04     0.0160    0.4840
0.05     0.0199    0.4801
0.06     0.0239    0.4761
0.07     0.0279    0.4721
0.08     0.0319    0.4681
0.09     0.0359    0.4641
0.10     0.0398    0.4602
0.11     0.0438    0.4562
0.12     0.0478    0.4522
0.13     0.0517    0.4483
0.14     0.0557    0.4443
0.15     0.0596    0.4404
0.16     0.0636    0.4364
0.17     0.0675    0.4325
0.18     0.0714    0.4286
0.19     0.0753    0.4247
0.20     0.0793    0.4207
0.21     0.0832    0.4168
0.22     0.0871    0.4129
0.23     0.0910    0.4090
0.24     0.0948    0.4052
0.25     0.0987    0.4013
0.26     0.1026    0.3974
0.27     0.1064    0.3936
0.28     0.1103    0.3897
0.29     0.1141    0.3859
0.30     0.1179    0.3821
0.31     0.1217    0.3783
0.32     0.1255    0.3745
0.33     0.1293    0.3707
0.34     0.1331    0.3669
0.35     0.1368    0.3632
0.36     0.1406    0.3594
0.37     0.1443    0.3557
0.38     0.1480    0.3520
0.39     0.1517    0.3483
0.40     0.1554    0.3446
0.41     0.1591    0.3409
0.42     0.1628    0.3372
0.43     0.1664    0.3336
0.44     0.1700    0.3300
0.45     0.1736    0.3264
0.46     0.1772    0.3228
0.47     0.1808    0.3192
0.48     0.1844    0.3156
0.49     0.1879    0.3121
0.50     0.1915    0.3085
0.51     0.1950    0.3050
0.52     0.1985    0.3015
0.53     0.2019    0.2981
0.54     0.2054    0.2946
0.55     0.2088    0.2912
0.56     0.2123    0.2877
0.57     0.2157    0.2843
0.58     0.2190    0.2810
0.59     0.2224    0.2776
0.60     0.2257    0.2743
0.61     0.2291    0.2709
0.62     0.2324    0.2676
0.63     0.2357    0.2643
0.64     0.2389    0.2611
0.65     0.2422    0.2578
0.66     0.2454    0.2546
0.67     0.2486    0.2514
0.68     0.2517    0.2483
0.69     0.2549    0.2451
0.70     0.2580    0.2420
0.71     0.2611    0.2389
0.72     0.2642    0.2358
0.73     0.2673    0.2327
0.74     0.2704    0.2296
0.75     0.2734    0.2266
0.76     0.2764    0.2236
0.77     0.2794    0.2206
0.78     0.2823    0.2177
0.79     0.2852    0.2148
0.80     0.2881    0.2119
0.81     0.2910    0.2090
0.82     0.2939    0.2061
0.83     0.2967    0.2033
0.84     0.2995    0.2005
0.85     0.3023    0.1977
0.86     0.3051    0.1949
0.87     0.3078    0.1922
0.88     0.3106    0.1894
0.89     0.3133    0.1867
0.90     0.3159    0.1841
0.91     0.3186    0.1814
0.92     0.3212    0.1788
0.93     0.3238    0.1762
0.94     0.3264    0.1736
0.95     0.3289    0.1711
0.96     0.3315    0.1685
0.97     0.3340    0.1660
0.98     0.3365    0.1635
0.99     0.3389    0.1611
Table B-1 Area Under the Normal Curve (continued)
Column a: Z. Column b: area between X̄ and Z. Column c: area beyond Z.
a (Z)    b         c
1.00     0.3413    0.1587
1.01     0.3438    0.1562
1.02     0.3461    0.1539
1.03     0.3485    0.1515
1.04     0.3508    0.1492
1.05     0.3531    0.1469
1.06     0.3554    0.1446
1.07     0.3577    0.1423
1.08     0.3599    0.1401
1.09     0.3621    0.1379
1.10     0.3643    0.1357
1.11     0.3665    0.1335
1.12     0.3686    0.1314
1.13     0.3708    0.1292
1.14     0.3729    0.1271
1.15     0.3749    0.1251
1.16     0.3770    0.1230
1.17     0.3790    0.1210
1.18     0.3810    0.1190
1.19     0.3830    0.1170
1.20     0.3849    0.1151
1.21     0.3869    0.1131
1.22     0.3888    0.1112
1.23     0.3907    0.1093
1.24     0.3925    0.1075
1.25     0.3944    0.1056
1.26     0.3962    0.1038
1.27     0.3980    0.1020
1.28     0.3997    0.1003
1.29     0.4015    0.0985
1.30     0.4032    0.0968
1.31     0.4049    0.0951
1.32     0.4066    0.0934
1.33     0.4082    0.0918
1.34     0.4099    0.0901
1.35     0.4115    0.0885
1.36     0.4131    0.0869
1.37     0.4147    0.0853
1.38     0.4162    0.0838
1.39     0.4177    0.0823
1.40     0.4192    0.0808
1.41     0.4207    0.0793
1.42     0.4222    0.0778
1.43     0.4236    0.0764
1.44     0.4251    0.0749
1.45     0.4265    0.0735
1.46     0.4279    0.0721
1.47     0.4292    0.0708
1.48     0.4306    0.0694
1.49     0.4319    0.0681
1.50     0.4332    0.0668
1.51     0.4345    0.0655
1.52     0.4357    0.0643
1.53     0.4370    0.0630
1.54     0.4382    0.0618
1.55     0.4394    0.0606
1.56     0.4406    0.0594
1.57     0.4418    0.0582
1.58     0.4429    0.0571
1.59     0.4441    0.0559
1.60     0.4452    0.0548
1.61     0.4463    0.0537
1.62     0.4474    0.0526
1.63     0.4484    0.0516
1.64     0.4495    0.0505
1.65     0.4505    0.0495
1.66     0.4515    0.0485
1.67     0.4525    0.0475
1.68     0.4535    0.0465
1.69     0.4545    0.0455
1.70     0.4554    0.0446
1.71     0.4564    0.0436
1.72     0.4573    0.0427
1.73     0.4582    0.0418
1.74     0.4591    0.0409
1.75     0.4599    0.0401
1.76     0.4608    0.0392
1.77     0.4616    0.0384
1.78     0.4625    0.0375
1.79     0.4633    0.0367
1.80     0.4641    0.0359
1.81     0.4649    0.0351
1.82     0.4656    0.0344
1.83     0.4664    0.0336
1.84     0.4671    0.0329
1.85     0.4678    0.0322
1.86     0.4686    0.0314
1.87     0.4693    0.0307
1.88     0.4699    0.0301
1.89     0.4706    0.0294
1.90     0.4713    0.0287
1.91     0.4719    0.0281
1.92     0.4726    0.0274
1.93     0.4732    0.0268
1.94     0.4738    0.0262
1.95     0.4744    0.0256
1.96     0.4750    0.0250
1.97     0.4756    0.0244
1.98     0.4761    0.0239
1.99     0.4767    0.0233
2.00     0.4772    0.0228
2.01     0.4778    0.0222
2.02     0.4783    0.0217
2.03     0.4788    0.0212
2.04     0.4793    0.0207
2.05     0.4798    0.0202
2.06     0.4803    0.0197
2.07     0.4808    0.0192
2.08     0.4812    0.0188
2.09     0.4817    0.0183
2.10     0.4821    0.0179
2.11     0.4826    0.0174
2.12     0.4830    0.0170
2.13     0.4834    0.0166
2.14     0.4838    0.0162
2.15     0.4842    0.0158
2.16     0.4846    0.0154
2.17     0.4850    0.0150
2.18     0.4854    0.0146
2.19     0.4857    0.0143
2.20     0.4861    0.0139
2.21     0.4864    0.0136
2.22     0.4868    0.0132
2.23     0.4871    0.0129
2.24     0.4875    0.0125
2.25     0.4878    0.0122
2.26     0.4881    0.0119
2.27     0.4884    0.0116
2.28     0.4887    0.0113
2.29     0.4890    0.0110
2.30     0.4893    0.0107
2.31     0.4896    0.0104
2.32     0.4898    0.0102
2.33     0.4901    0.0099
2.34     0.4904    0.0096
2.35     0.4906    0.0094
2.36     0.4909    0.0091
2.37     0.4911    0.0089
2.38     0.4913    0.0087
2.39     0.4916    0.0084
2.40     0.4918    0.0082
2.41     0.4920    0.0080
2.42     0.4922    0.0078
2.43     0.4925    0.0075
2.44     0.4927    0.0073
2.45     0.4929    0.0071
2.46     0.4931    0.0069
2.47     0.4932    0.0068
2.48     0.4934    0.0066
2.49     0.4936    0.0064
2.50     0.4938    0.0062
2.51     0.4940    0.0060
2.52     0.4941    0.0059
2.53     0.4943    0.0057
2.54     0.4945    0.0055
2.55     0.4946    0.0054
2.56     0.4948    0.0052
2.57     0.4949    0.0051
2.58     0.4951    0.0049
2.59     0.4952    0.0048
2.60     0.4953    0.0047
2.61     0.4955    0.0045
2.62     0.4956    0.0044
2.63     0.4957    0.0043
2.64     0.4959    0.0041
2.65     0.4960    0.0040
2.66     0.4961    0.0039
2.67     0.4962    0.0038
2.68     0.4963    0.0037
2.69     0.4964    0.0036
2.70     0.4965    0.0035
2.71     0.4966    0.0034
2.72     0.4967    0.0033
2.73     0.4968    0.0032
2.74     0.4969    0.0031
2.75     0.4970    0.0030
2.76     0.4971    0.0029
2.77     0.4972    0.0028
2.78     0.4973    0.0027
2.79     0.4974    0.0026
Table B-1 Area Under the Normal Curve (continued)
Column a: Z. Column b: area between X̄ and Z. Column c: area beyond Z.
a (Z)    b            c
2.80     0.4974       0.0026
2.81     0.4975       0.0025
2.82     0.4976       0.0024
2.83     0.4977       0.0023
2.84     0.4977       0.0023
2.85     0.4978       0.0022
2.86     0.4979       0.0021
2.87     0.4979       0.0021
2.88     0.4980       0.0020
2.89     0.4981       0.0019
2.90     0.4981       0.0019
2.91     0.4982       0.0018
2.92     0.4982       0.0018
2.93     0.4983       0.0017
2.94     0.4984       0.0016
2.95     0.4984       0.0016
2.96     0.4985       0.0015
2.97     0.4985       0.0015
2.98     0.4986       0.0014
2.99     0.4986       0.0014
3.00     0.4987       0.0013
3.01     0.4987       0.0013
3.02     0.4987       0.0013
3.03     0.4988       0.0012
3.04     0.4988       0.0012
3.05     0.4989       0.0011
3.06     0.4989       0.0011
3.07     0.4989       0.0011
3.08     0.4990       0.0010
3.09     0.4990       0.0010
3.10     0.4990       0.0010
3.11     0.4991       0.0009
3.12     0.4991       0.0009
3.13     0.4991       0.0009
3.14     0.4992       0.0008
3.15     0.4992       0.0008
3.16     0.4992       0.0008
3.17     0.4992       0.0008
3.18     0.4993       0.0007
3.19     0.4993       0.0007
3.20     0.4993       0.0007
3.21     0.4993       0.0007
3.22     0.4994       0.0006
3.23     0.4994       0.0006
3.24     0.4994       0.0006
3.25     0.4994       0.0006
3.30     0.4995       0.0005
3.35     0.4996       0.0004
3.40     0.4997       0.0003
3.45     0.4997       0.0003
3.50     0.4998       0.0002
3.60     0.4998       0.0002
3.70     0.4999       0.0001
3.80     0.4999       0.0001
3.90     0.49995      0.00005
4.00     0.49997      0.00003
4.50     0.4999966    0.0000034
5.00     0.4999997    0.0000003
5.50     0.4999999    0.0000001
Table B-2 Values of Chi-Square
Probability (Top Row) and Significance (Bottom Row)
df    0.999     0.99      0.95      0.90      0.80      0.70
      0.001     0.01      0.05      0.10      0.20      0.30
1     10.827    6.635     3.841     2.706     1.642     1.074
2     13.815    9.210     5.991     4.605     3.219     2.408
3     16.268    11.345    7.815     6.251     4.624     3.665
4     18.465    13.277    9.488     7.779     5.989     4.878
5     20.517    15.086    11.070    9.236     7.289     6.064
6     22.457    16.812    12.592    10.645    8.558     7.231
7     24.322    18.475    14.067    12.017    9.803     8.383
8     26.125    20.090    15.507    13.362    11.030    9.524
9     27.877    21.666    16.919    14.684    12.242    10.656
10    29.588    23.209    18.307    15.987    13.442    11.781
11    31.264    24.725    19.675    17.275    14.631    12.899
12    32.909    26.217    21.026    18.549    15.812    14.011
13    34.528    27.688    22.362    19.812    16.985    15.119
14    36.123    29.141    23.685    21.064    18.151    16.222
15    37.697    30.578    24.996    22.307    19.311    17.322
16    39.252    32.000    26.296    23.542    20.465    18.418
17    40.790    33.409    27.587    24.769    21.615    19.511
18    42.312    34.805    28.869    25.989    22.760    20.601
19    43.820    36.191    30.144    27.204    23.900    21.689
20    45.315    37.566    31.410    28.412    25.038    22.775
21    46.797    38.932    32.671    29.615    26.171    23.858
22    48.268    40.289    33.924    30.813    27.301    24.939
23    49.728    41.638    35.172    32.007    28.429    26.018
24    51.179    42.980    36.415    33.196    29.553    27.096
25    52.620    44.314    37.652    34.382    30.675    28.172
26    54.052    45.642    38.885    35.563    31.795    29.246
27    55.476    46.963    40.113    36.741    32.912    30.319
28    56.893    48.278    41.337    37.916    34.027    31.391
29    58.302    49.588    42.557    39.087    35.139    32.461
30    59.703    50.892    43.773    40.256    36.250    33.530
Table B-3 Student’s t Distribution
First header row: level of significance for a one-tailed test.
Second header row: level of significance for a two-tailed test.
df     0.10     0.05     0.025     0.01      0.005     0.0005
       0.20     0.10     0.05      0.02      0.01      0.001
1      3.078    6.314    12.706    31.821    63.657    636.620
2      1.886    2.920    4.303     6.965     9.925     31.598
3      1.638    2.353    3.182     4.541     5.841     12.941
4      1.533    2.132    2.776     3.747     4.604     8.610
5      1.476    2.015    2.571     3.365     4.032     6.859
6      1.440    1.943    2.447     3.143     3.707     5.959
7      1.415    1.895    2.365     2.998     3.499     5.405
8      1.397    1.860    2.306     2.896     3.355     5.041
9      1.383    1.833    2.262     2.821     3.250     4.781
10     1.372    1.812    2.228     2.764     3.169     4.587
11     1.363    1.796    2.201     2.718     3.106     4.437
12     1.356    1.782    2.179     2.681     3.055     4.318
13     1.350    1.771    2.160     2.650     3.012     4.221
14     1.345    1.761    2.145     2.624     2.977     4.140
15     1.341    1.753    2.131     2.602     2.947     4.073
16     1.337    1.746    2.120     2.583     2.921     4.015
17     1.333    1.740    2.110     2.567     2.898     3.965
18     1.330    1.734    2.101     2.552     2.878     3.922
19     1.328    1.729    2.093     2.539     2.861     3.883
20     1.325    1.725    2.086     2.528     2.845     3.850
21     1.323    1.721    2.080     2.518     2.831     3.819
22     1.321    1.717    2.074     2.508     2.819     3.792
23     1.319    1.714    2.069     2.500     2.807     3.767
24     1.318    1.711    2.064     2.492     2.797     3.745
25     1.316    1.708    2.060     2.485     2.787     3.725
26     1.315    1.706    2.056     2.479     2.779     3.707
27     1.314    1.703    2.052     2.473     2.771     3.690
28     1.313    1.701    2.048     2.467     2.763     3.674
29     1.311    1.699    2.045     2.462     2.756     3.659
30     1.310    1.697    2.042     2.457     2.750     3.646
40     1.303    1.684    2.021     2.423     2.704     3.551
60     1.296    1.671    2.000     2.390     2.660     3.460
120    1.289    1.658    1.980     2.358     2.617     3.373
∞      1.282    1.645    1.960     2.326     2.576     3.291
5.9874
5.5914
5.3177
5.1174
4.9646
4.8443
4.7472
4.6672
4.6001
4.5431
4.494
4.4513
4.4139
4.3807
4.3512
4.3248
4.3009
4.2793
4.2597
4.2417
4.2252
4.21
4.196
4.183
4.1709
4.0847
4.0012
3.9201
3.8415
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
40
60
120
∞
df2=1
2
3
4
5
Distribution of f; p = .05
2.8524
2.81
2.7729
2.7401
2.7109
3.0069
2.9647
2.9277
2.8951
2.8661
3.2317
3.1504
3.0718
2.9957
3.369
3.3541
3.3404
3.3277
3.3158
2.8387
2.7581
2.6802
2.6049
2.9752
2.9604
2.9467
2.934
2.9223
2.606
2.5252
2.4472
2.3719
2.7426
2.7278
2.7141
2.7014
2.6896
2.8401
2.8167
2.7955
2.7763
2.7587
2.4495
2.3683
2.2899
2.2141
2.5868
2.5719
2.5581
2.5454
2.5336
2.6848
2.6613
2.64
2.6207
2.603
2.3359
2.2541
2.175
2.0986
2.4741
2.4591
2.4453
2.4324
2.4205
2.5727
2.5491
2.5277
2.5082
2.4904
2.7413
2.6987
2.6613
2.6283
2.599
3.0946
2.9961
2.9153
2.8477
2.7905
2.249
2.1665
2.0868
2.0096
2.3883
2.3732
2.3593
2.3463
2.3343
2.4876
2.4638
2.4422
2.4226
2.4047
2.6572
2.6143
2.5767
2.5435
2.514
3.0123
2.9134
2.8321
2.7642
2.7066
4.2067
3.787
3.5005
3.2927
3.1355
2.1802
2.097
2.0164
1.9384
2.3205
2.3053
2.2913
2.2783
2.2662
2.4205
2.3965
2.3748
2.3551
2.3371
2.5911
2.548
2.5102
2.4768
2.4471
2.948
2.8486
2.7669
2.6987
2.6408
4.1468
3.7257
3.4381
3.2296
3.0717
2.124
2.0401
1.9588
1.8799
2.2655
2.2501
2.236
2.2229
2.2107
2.366
2.3419
2.3201
2.3002
2.2821
2.5377
2.4943
2.4563
2.4227
2.3928
2.8962
2.7964
2.7144
2.6458
2.5876
4.099
3.6767
3.3881
3.1789
3.0204
3.9999
3.5747
3.2839
3.0729
2.913
2.7876
2.6866
2.6037
2.5342
2.4753
2.4247
2.3807
2.3421
2.308
2.2776
4.06
3.6365
3.3472
3.1373
2.9782
2.8536
2.7534
2.671
2.6022
2.5437
2.4935
2.4499
2.4117
2.3779
2.3479
2.0772
1.9926
1.9105
1.8307
2.2197
2.2043
2.19
2.1768
2.1646
2.321
2.2967
2.2747
2.2547
2.2365
3.0725
3.0491
3.028
3.0088
2.9912
3.2389
3.1968
3.1599
3.1274
3.0984
3.2039
3.1059
3.0254
2.9582
2.9013
3.3567
3.2592
3.1791
3.1122
3.0556
4.2839
3.866
3.5806
3.3738
3.2172
2.0035
1.9174
1.8337
1.7522
2.1479
2.1323
2.1179
2.1045
2.0921
2.2504
2.2258
2.2036
2.1834
2.1649
1.9245
1.8364
1.7505
1.6664
2.0716
2.0558
2.0411
2.0275
2.0148
2.1757
2.1508
2.1282
2.1077
2.0889
2.3522
2.3077
2.2686
2.2341
2.2033
2.7186
2.6169
2.5331
2.463
2.4034
3.9381
3.5107
3.2184
3.0061
2.845
1.8389
1.748
1.6587
1.5705
1.9898
1.9736
1.9586
1.9446
1.9317
2.096
2.0707
2.0476
2.0267
2.0075
2.2756
2.2304
2.1906
2.1555
2.1242
2.6464
2.5436
2.4589
2.3879
2.3275
3.8742
3.4445
3.1503
2.9365
2.774
1.7929
1.7001
1.6084
1.5173
1.9464
1.9299
1.9147
1.9005
1.8874
2.054
2.0283
2.005
1.9838
1.9643
2.2354
2.1898
2.1497
2.1141
2.0825
2.609
2.5055
2.4202
2.3487
2.2878
3.8415
3.4105
3.1152
2.9005
2.7372
1.7444
1.6491
1.5543
1.4591
1.901
1.8842
1.8687
1.8543
1.8409
2.0102
1.9842
1.9605
1.939
1.9192
2.1938
2.1477
2.1071
2.0712
2.0391
2.5705
2.4663
2.3803
2.3082
2.2468
3.8082
3.3758
3.0794
2.8637
2.6996
2.4901
2.3842
2.2966
2.2229
2.1601
2.1058
2.0584
2.0166
1.9795
1.9464
2.5309
2.4259
2.3392
2.2664
2.2043
2.1507
2.104
2.0629
2.0264
1.9938
1.6928
1.5943
1.4952
1.394
1.8533
1.8361
1.8203
1.8055
1.7918
1.9645
1.938
1.9139
1.892
1.8718
1.6373
1.5343
1.429
1.318
1.8027
1.7851
1.7689
1.7537
1.7396
1.9165
1.8894
1.8648
1.8424
1.8217
3.7398
3.3043
3.0053
2.7872
2.6211
3.7743
3.3404
3.0428
2.8259
2.6609
3.4668
3.4434
3.4221
3.4028
3.3852
3.6337
3.5915
3.5546
3.5219
3.4928
4.3874
3.9715
3.6875
3.4817
3.3258
4.5337
4.1203
3.8379
3.6331
3.478
1.5766
1.4673
1.3519
1.2214
1.7488
1.7306
1.7138
1.6981
1.6835
1.8657
1.838
1.8128
1.7896
1.7684
2.0589
2.0107
1.9681
1.9302
1.8963
2.448
2.341
2.2524
2.1778
2.1141
3.7047
3.2674
2.9669
2.7475
2.5801
1.5089
1.3893
1.2539
1
1.6906
1.6717
1.6541
1.6376
1.6223
1.8117
1.7831
1.757
1.733
1.711
2.0096
1.9604
1.9168
1.878
1.8432
2.4045
2.2962
2.2064
2.1307
2.0658
3.6689
3.2298
2.9276
2.7067
2.5379
3.5874
3.4903
3.4105
3.3439
3.2874
4.7571
4.3468
4.0662
3.8625
3.7083
3.9823
3.8853
3.8056
3.7389
3.6823
5.1433
4.7374
4.459
4.2565
4.1028
df1=1
2
3
4
5
6
7
8
9
10
12
15
20
24
30
40
60
120
∞
161.4476 199.5
215.7073 224.5832 230.1619 233.986 236.7684 238.8827 240.5433 241.8817 243.906 245.9499 248.0131 249.0518 250.0951 251.1432 252.1957 253.2529 254.3144
18.5128 19
19.1643 19.2468 19.2964 19.3295 19.3532 19.371
19.3848 19.3959 19.4125 19.4291 19.4458 19.4541 19.4624 19.4707 19.4791 19.4874 19.4957
10.128
9.5521
9.2766
9.1172
9.0135
8.9406
8.8867
8.8452
8.8123
8.7855
8.7446
8.7029
8.6602
8.6385
8.6166
8.5944
8.572
8.5494
8.5264
7.7086
6.9443
6.5914
6.3882
6.2561
6.1631
6.0942
6.041
5.9988
5.9644
5.9117
5.8578
5.8025
5.7744
5.7459
5.717
5.6877
5.6581
5.6281
6.6079
5.7861
5.4095
5.1922
5.0503
4.9503
4.8759
4.8183
4.7725
4.7351
4.6777
4.6188
4.5581
4.5272
4.4957
4.4638
4.4314
4.3985
4.365
Table B-4