The Basic Practice
of Statistics
SIXTH EDITION
DAVID S. MOORE
Purdue University
WILLIAM I. NOTZ
The Ohio State University
MICHAEL A. FLIGNER
The Ohio State University
W. H. Freeman and Company
New York
Publisher: Ruth Baruth
Acquisitions Editor: Karen Carson
Executive Marketing Manager: Jennifer Somerville
Developmental Editors: Andrew Sylvester and Leslie Lahr
Senior Media Acquisitions Editor: Roland Cheyney
Senior Media Editor: Laura Capuano
Associate Editor: Katrina Wilhelm
Assistant Media Editor: Catriona Kaplan
Editorial Assistant: Tyler Holzer
Photo Editor: Cecilia Varas
Photo Researcher: Elyse Rieder
Cover and Text Designer: Blake Logan
Senior Project Editor: Mary Louise Byrd
Illustrations: Macmillan Solutions
Production Coordinator: Susan Wein
Composition: Aptara®, Inc.
Printing and Binding: Quad Graphics
Library of Congress Control Number: 2011934674
Student Edition (Hardcover w/cd): ISBN-13: 978-1-4641-0254-7; ISBN-10: 1-4641-0254-6
Student Edition (Paperback w/cd): ISBN-13: 978-1-4641-0434-3; ISBN-10: 1-4641-0434-4
Student Edition (Looseleaf w/cd): ISBN-13: 978-1-4641-0433-6; ISBN-10: 1-4641-0433-6
© 2013, 2010, 2007, 2004 by W. H. Freeman and Company
All rights reserved
Printed in the United States of America
First printing
W. H. Freeman and Company
41 Madison Avenue
New York, NY 10010
Houndmills, Basingstoke RG21 6XS, England
www.whfreeman.com
Brief Contents
Part I Exploring Data
Exploring Data: Variables and Distributions
CHAPTER 1 Picturing Distributions with Graphs
CHAPTER 2 Describing Distributions with Numbers
CHAPTER 3 The Normal Distributions
Exploring Data: Relationships
CHAPTER 4 Scatterplots and Correlation
CHAPTER 5 Regression
CHAPTER 6 Two-Way Tables*
CHAPTER 7 Exploring Data: Part I Review
Part II From Exploration to Inference
Producing Data
CHAPTER 8 Producing Data: Sampling
CHAPTER 9 Producing Data: Experiments
Commentary: Data Ethics*
Probability and Sampling Distributions
CHAPTER 10 Introducing Probability
CHAPTER 11 Sampling Distributions
CHAPTER 12 General Rules of Probability*
CHAPTER 13 Binomial Distributions*
Foundations of Inference
CHAPTER 14 Confidence Intervals: The Basics
CHAPTER 15 Tests of Significance: The Basics
CHAPTER 16 Inference in Practice
CHAPTER 17 From Exploration to Inference: Part II Review
Part III Inference about Variables
Quantitative Response Variable
CHAPTER 18 Inference about a Population Mean
CHAPTER 19 Two-Sample Problems
Categorical Response Variable
CHAPTER 20 Inference about a Population Proportion
CHAPTER 21 Comparing Two Proportions
CHAPTER 22 Inference about Variables: Part III Review
Part IV Inference about Relationships
CHAPTER 23 Two Categorical Variables: The Chi-Square Test
CHAPTER 24 Inference for Regression
CHAPTER 25 One-Way Analysis of Variance: Comparing Several Means
Part V Optional Companion Chapters (available on the BPS CD and online)
CHAPTER 26 Nonparametric Tests
CHAPTER 27 Statistical Process Control
CHAPTER 28 Multiple Regression*
CHAPTER 29 More about Analysis of Variance
*Starred material is not required for later parts of the text.
Detailed Table of Contents
To the Instructor
Media and Supplements
About the Authors
To the Student
Part I Exploring Data
CHAPTER 1 Picturing Distributions with Graphs
  Individuals and variables
  Categorical variables: pie charts and bar graphs
  Quantitative variables: histograms
  Interpreting histograms
  Quantitative variables: stemplots
  Time plots
CHAPTER 2 Describing Distributions with Numbers
  Measuring center: the mean
  Measuring center: the median
  Comparing the mean and the median
  Measuring spread: the quartiles
  The five-number summary and boxplots
  Spotting suspected outliers*
  Measuring spread: the standard deviation
  Choosing measures of center and spread
  Using technology
  Organizing a statistical problem
CHAPTER 3 The Normal Distributions
  Density curves
  Describing density curves
  Normal distributions
  The 68–95–99.7 rule
  The standard Normal distribution
  Finding Normal proportions
  Using the standard Normal table
  Finding a value given a proportion
CHAPTER 4 Scatterplots and Correlation
  Explanatory and response variables
  Displaying relationships: scatterplots
  Interpreting scatterplots
  Adding categorical variables to scatterplots
  Measuring linear association: correlation
  Facts about correlation
CHAPTER 5 Regression
  Regression lines
  The least-squares regression line
  Using technology
  Facts about least-squares regression
  Residuals
  Influential observations
  Cautions about correlation and regression
  Association does not imply causation
CHAPTER 6 Two-Way Tables*
  Marginal distributions
  Conditional distributions
  Simpson’s paradox
CHAPTER 7 Exploring Data: Part I Review
  Part I summary
  Test yourself
  Supplementary exercises
Part II From Exploration to Inference
CHAPTER 8 Producing Data: Sampling
  Population versus sample
  How to sample badly
  Simple random samples
  Inference about the population
  Other sampling designs
  Cautions about sample surveys
  The impact of technology
CHAPTER 9 Producing Data: Experiments
  Observation versus experiment
  Subjects, factors, treatments
  How to experiment badly
  Randomized comparative experiments
  The logic of randomized comparative experiments
  Cautions about experimentation
  Matched pairs and other block designs
Commentary: Data Ethics*
  Institutional review boards
  Informed consent
  Confidentiality
  Clinical trials
  Behavioral and social science experiments
CHAPTER 10 Introducing Probability
  The idea of probability
  The search for randomness*
  Probability models
  Probability rules
  Finite and discrete probability models
  Continuous probability models
  Random variables
  Personal probability*
CHAPTER 11 Sampling Distributions
  Parameters and statistics
  Statistical estimation and the law of large numbers
  Sampling distributions
  The sampling distribution of x̄
  The central limit theorem
CHAPTER 12 General Rules of Probability*
  Independence and the multiplication rule
  The general addition rule
  Conditional probability
  The general multiplication rule
  Independence again
  Tree diagrams
CHAPTER 13 Binomial Distributions*
  The binomial setting and binomial distributions
  Binomial distributions in statistical sampling
  Binomial probabilities
  Using technology
  Binomial mean and standard deviation
  The Normal approximation to binomial distributions
CHAPTER 14 Confidence Intervals: The Basics
  The reasoning of statistical estimation
  Margin of error and confidence level
  Confidence intervals for a population mean
  How confidence intervals behave
CHAPTER 15 Tests of Significance: The Basics
  The reasoning of tests of significance
  Stating hypotheses
  P-value and statistical significance
  Tests for a population mean
  Significance from a table*
CHAPTER 16 Inference in Practice
  Conditions for inference in practice
  Cautions about confidence intervals
  Cautions about significance tests
  Planning studies: sample size for confidence intervals
  Planning studies: the power of a statistical test*
CHAPTER 17 From Exploration to Inference: Part II Review
  Part II summary
  Test yourself
  Supplementary exercises
Part III Inference about Variables
CHAPTER 18 Inference about a Population Mean
  Conditions for inference about a mean
  The t distributions
  The one-sample t confidence interval
  The one-sample t test
  Using technology
  Matched pairs t procedures
  Robustness of t procedures
CHAPTER 19 Two-Sample Problems
  Two-sample problems
  Comparing two population means
  Two-sample t procedures
  Using technology
  Robustness again
  Details of the t approximation*
  Avoid the pooled two-sample t procedures*
  Avoid inference about standard deviations*
CHAPTER 20 Inference about a Population Proportion
  The sample proportion p̂
  Large-sample confidence intervals for a proportion
  Accurate confidence intervals for a proportion
  Choosing the sample size
  Significance tests for a proportion
CHAPTER 21 Comparing Two Proportions
  Two-sample problems: proportions
  The sampling distribution of a difference between proportions
  Large-sample confidence intervals for comparing proportions
  Using technology
  Accurate confidence intervals for comparing proportions
  Significance tests for comparing proportions
CHAPTER 22 Inference about Variables: Part III Review
  Part III summary
  Test yourself
  Supplementary exercises
Part IV Inference about Relationships
CHAPTER 23 Two Categorical Variables: The Chi-Square Test
  Two-way tables
  The problem of multiple comparisons
  Expected counts in two-way tables
  The chi-square test statistic
  Cell counts required for the chi-square test
  Using technology
  Uses of the chi-square test
  The chi-square distributions
  The chi-square test for goodness of fit*
CHAPTER 24 Inference for Regression
  Conditions for regression inference
  Estimating the parameters
  Using technology
  Testing the hypothesis of no linear relationship
  Testing lack of correlation
  Confidence intervals for the regression slope
  Inference about prediction
  Checking the conditions for inference
CHAPTER 25 One-Way Analysis of Variance: Comparing Several Means
  Comparing several means
  The analysis of variance F test
  Using technology
  The idea of analysis of variance
  Conditions for ANOVA
  F distributions and degrees of freedom
  Some details of ANOVA*
Notes and Data Sources
Tables
  TABLE A Standard Normal probabilities
  TABLE B Random digits
  TABLE C t distribution critical values
  TABLE D Chi-square distribution critical values
  TABLE E Critical values of the correlation r
Answers to Selected Exercises
Index
Part V Optional Companion Chapters (available on the BPS CD and online)
CHAPTER 26 Nonparametric Tests
  Comparing two samples: the Wilcoxon rank sum test
  The Normal approximation for W
  Using technology
  What hypotheses does Wilcoxon test?
  Dealing with ties in rank tests
  Matched pairs: the Wilcoxon signed rank test
  The Normal approximation for W+
  Dealing with ties in the signed rank test
  Comparing several samples: the Kruskal-Wallis test
  Hypotheses and conditions for the Kruskal-Wallis test
  The Kruskal-Wallis test statistic
CHAPTER 27 Statistical Process Control
  Processes
  Describing processes
  The idea of statistical process control
  x̄ charts for process monitoring
  s charts for process monitoring
  Using control charts
  Setting up control charts
  Comments on statistical control
  Don’t confuse control with capability!
  Control charts for sample proportions
  Control limits for p charts
CHAPTER 28 Multiple Regression*
  Parallel regression lines
  Estimating parameters
  Using technology
  Inference for multiple regression
  Interaction
  The multiple linear regression model
  The woes of regression coefficients
  A case study for multiple regression
  Inference for regression parameters
  Checking the conditions for inference
CHAPTER 29 More about Analysis of Variance
  Beyond one-way ANOVA
  Follow-up analysis: Tukey pairwise multiple comparisons
  Follow-up analysis: contrasts*
  Two-way ANOVA: conditions, main effects, and interaction
  Inference for two-way ANOVA
  Some details of two-way ANOVA*
To the Instructor: About this Book
Welcome to the sixth edition of The Basic Practice of Statistics. This book
is the cumulation of 40 years of teaching undergraduates and 20 years of
writing texts. Previous editions have been very successful, and we think
that this new edition is the best yet. In this preface we describe for instructors the
nature and features of the book and the changes in this sixth edition.
BPS is designed to be accessible to college and university students with limited
quantitative background—“just algebra” in the sense of being able to read and use
simple equations. It is usable with almost any level of technology for calculating
and graphing—from a $15 “two-variable statistics” calculator through a graphing
calculator or spreadsheet program through full statistical software. Of course, graphs
and calculations are less tedious with good technology, so we recommend making
available to students the most effective technology that circumstances permit.
Despite its rather low mathematical level, BPS is a “serious” text in the
sense that it wants students to do more than master the mechanics of statistical calculations and graphs. Even quite basic statistics is very useful in many
fields of study and in everyday life, but only if the student has learned to move
from a real-world setting to choose and carry out statistical methods and then
carry conclusions back to the original setting. These translations require some
conceptual understanding of such issues as the distinction between data analysis and inference, the critical role of where the data come from, the reasoning
of inference, and the conditions under which we can trust the conclusions of
inference. BPS tries to teach both the mechanics and the concepts needed for
practical statistical work, at a level appropriate for beginners.
BPS is designed to reflect the actual practice of statistics, where data analysis
and design of data production join with probability-based inference to form a
coherent science of data. There are good pedagogical reasons for beginning with
data analysis (Chapters 1 to 7), then moving to data production (Chapters 8
and 9), and then to probability (Chapters 10 to 13) and inference (Chapters 14 to
29). In studying data analysis, students learn useful skills immediately and get over
some of their fear of statistics. Data analysis is a necessary preliminary to inference
in practice, because inference requires clean data. Designed data production is the
surest foundation for inference, and the deliberate use of chance in random sampling and randomized comparative experiments motivates the study of probability
in a course that emphasizes data-oriented statistics. BPS gives a full presentation
of basic probability and inference (20 of the 29 chapters) but places it in the
context of statistics as a whole.
GUIDING PRINCIPLES AND THE GAISE GUIDELINES
David Moore has based BPS on three principles: balanced content, experience with
data, and the importance of ideas. These principles are widely accepted by statisticians concerned about teaching and are directly connected to and reflected by the
themes of the College Report of the Guidelines for Assessment and Instruction in
Statistics Education (GAISE) Project.
The GAISE Guidelines include six recommendations for the introductory
statistics course. The content, coverage, and features of BPS are closely aligned to
these recommendations:
1. Emphasize statistical literacy and develop statistical thinking.
The intent of BPS is to be modern and accessible. The exposition is straightforward and concentrates on major ideas and skills. One principle of writing
for beginners is not to try to tell students everything you know. Another
principle is to offer frequent stopping points, marking off digestible bites
of material. Statistical literacy is promoted throughout BPS in the many
examples and exercises drawn from the popular press and from many fields
of study. Statistical thinking is promoted in examples and exercises that give
enough background to allow students to consider the meaning of their calculations. Exercises often ask for conclusions that are more than a number (or
“reject H0”). Some exercises require judgment in addition to right-or-wrong
calculations and conclusions. Statistics, more than mathematics, depends on
judgment for effective use. BPS begins to develop students’ judgment about
statistical studies.
2. Use real data. The study of statistics is supposed to help students work
with data in their varied academic disciplines and in their unpredictable later
employment. Students learn to work with data by working with data. BPS is full
of data from many fields of study and from everyday life. Data are more than mere
numbers—they are numbers with a context that should play a role in making sense of the numbers and in stating conclusions. Examples and exercises
in BPS, though intended for beginners, use real data and give enough background to allow students to consider the meaning of their calculations.
3. Stress conceptual understanding rather than mere knowledge
of procedures. A first course in statistics introduces many skills, from
making a stemplot and calculating a correlation to choosing and carrying out
a significance test. In practice (even if not always in the course), calculations
and graphs are automated. Moreover, anyone who makes serious use of statistics will need some specific procedures not taught in her college stat course.
BPS therefore tries to make clear the larger patterns and big ideas of statistics,
not in the abstract, but in the context of learning specific skills and working
with specific data. Many of the big ideas are summarized in graphical outlines.
Three of the most useful appear inside the front cover. Formulas without guiding principles do students little good once the final exam is past, so it is worth
the time to slow down a bit and explain the ideas.
4. Foster active learning in the classroom. Fostering active learning is
the business of the teacher, though an emphasis on working with data helps.
To this end, we have created interactive applets to our specifications and made
them available online and on the text CD. The applets are designed primarily to help in learning statistics rather than in doing statistics. An icon calls
attention to comments and exercises based on the applets. We suggest using
selected applets for classroom demonstrations even if you do not ask students
to work with them. The Correlation and Regression, Confidence Interval, and
P-value applets, for example, convey core ideas more clearly than any amount
of chalk and talk.
We also provide Web exercises at the end of each chapter. Our intent
is to take advantage of the fact that most undergraduates are “Web savvy.”
These exercise ...