CHAPTER 13
FURTHER APPLICATIONS
13.1 ABOUT THIS CHAPTER
Copyright © 2008. John Wiley & Sons, Incorporated. All rights reserved.
There are many areas of application of statistics in
business and industry beyond those discussed in earlier
chapters. We briefly consider some of these here. We
begin with three industries that have traditionally used
statistics extensively: the food, beverage, and related
industries; the semiconductor industry; and the
communications industry. Next, we discuss statistical
image analysis – a relatively new field with important
medical, security, and other applications. We then provide
short descriptions of the role of statistics in various other
areas and conclude with a glance into the future.
We do not cover all the bases; but the discussion should
demonstrate the use of statistics in addressing diverse
problems in business and industry (and beyond). Many of
our comments are based upon inputs from those working
in the areas discussed.
13.2 FOOD, BEVERAGE, AND RELATED INDUSTRIES 1
13.2.1 The Setting: From Soup to Soap
Modern conveniences that “enrich” our lives – such as
microwave popcorn, frozen gourmet dinners, and instant
chicken soup – would not have been possible without
Hahn, G. J., & Doganaksoy, N. (2008). The role of statistics in business and industry. ProQuest Ebook Central http://ebo
Created from usf on 2021-05-19 19:26:46.
scientific advances in chemistry, biochemistry,
microbiology, and physics. The food and beverage
industry has become a high-technology business. It
provides challenging opportunities for the application of
statistics in areas ranging from market studies to product
development to manufacturing. Similar issues also arise in
applications involving many other consumer products,
especially the ones that are purchased repeatedly – such as
suntan lotion, detergents, light bulbs, bug killer sprays, and
paper tissues.
13.2.2 Controlling Product Variability
Reducing product variability is a major goal in the food,
beverage, and related industries. These industries, in fact,
provide an ultimate example of the need for robust design.
There are differences in the types of applications even
within the food and beverage industry, especially between
raw foods, such as fresh fruits, vegetables, eggs, and raw
meats; and processed foods, such as cereals and soft
drinks.
Raw Foods Raw foods are often subject to variability in
appearance and taste. These differences are often due to
variations in nature, especially fluctuations in temperature
and rainfall and differences in soil fertility. Such
environmental (or noise) variables are difficult or
impossible to control. Consumers recognize this and
expect – and sometimes even enjoy – some product
variability, such as between early season, mid-season and
late season fruits. In addition, for most raw foods, the
purchaser can examine, and sometimes even sample, the
product prior to purchase.
There are, moreover, various control variables that can be
set to make raw foods more homogeneous. These range
from the type and amount of fertilizer to apply to when to
harvest and market. In addition, the product itself might be
controlled by, for example, spraying with varying amounts
of preservative.
The challenges of growing foods more scientifically led to
the development of the statistical design of experiments in
agriculture during the early part of the last century (Section
2.2.7). These applications tended to focus on improving
product yield (e.g., growing the most vegetables or the
biggest tomatoes), rather than on minimizing variability.
Processed Products Consumers are less willing to accept
variability in processed products. In buying chocolate chip
cookies, canned soups, bottled beer, and so on, they expect
a product to look and taste the same from one purchase to
the next. We want cheeses from different packages of a
particular brand to taste the same, despite the fact that they
come from milk from different cows on different farms at
different times. Also, cereal manufacturers need to combat
the tendency of ingredients with different densities (e.g.,
nuts, raisins, and grains) to separate during packaging.
Similar requirements hold for other repetitively purchased
products. Tissues, for example, need to have the same soft
feel from one package to the next.
Manufacturers strive to achieve product homogeneity for
processed products by:
• Working with suppliers to minimize variability.
• Understanding the factors that induce variability during
manufacture and addressing these.
• Varying processing conditions to adjust for the variability in
raw products. For example, to achieve homogeneity in
roasted peanuts, measurements on such factors as sugar
content and moisture are obtained on crop samples during
harvesting. This information is then used to set processing
(e.g., roasting) conditions.
• Blending product grown at different times or in different
places. The processing of flour, for example, involves
mixing varieties of soft and hard grains, whose properties
vary over time. Homogeneity in the final product is attained
by adjusting the mixing proportions, based upon product or
process measurements.
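The blending adjustment described in the last item can be sketched with simple arithmetic. The example below is only an illustration: it assumes that the measured property blends linearly by weight (a simplification that may not hold for a real flour or crop property), and the function name, the property, and the numbers are hypothetical.

```python
def blend_fraction(value_a, value_b, target):
    """Fraction of lot A so that a linear blend of lots A and B hits the target value.

    Assumes the property blends linearly: target = f * value_a + (1 - f) * value_b.
    """
    if value_a == value_b:
        raise ValueError("lots have identical values; blending cannot change the result")
    fraction = (target - value_b) / (value_a - value_b)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("target lies outside the range spanned by the two lots")
    return fraction


# Hypothetical measurements: protein content (%) of a hard-grain and a soft-grain lot.
hard_protein, soft_protein, target_protein = 14.0, 9.0, 11.0
share_hard = blend_fraction(hard_protein, soft_protein, target_protein)
print(f"Use {share_hard:.0%} hard grain and {1 - share_hard:.0%} soft grain")
```

As the measured properties of incoming lots change over time, the same calculation would be repeated with the new measurements to keep the blend on target.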
As for other products, producers also need to be concerned
with robustness to usage conditions. A dishwasher
detergent, for example, needs to be effective in cleaning
dishes regardless of water hardness.
13.2.3 Data Issues
In the food, beverage, and related industries, as in other
businesses, data are obtained to help determine what
features a product needs to have (and what must be done)
to capture the highest possible consumer interest. The
ultimate judgment for food and beverage products often
involves an assessment of taste and appearance; health
considerations are also becoming increasingly significant.
Other characteristics may be important for nonfood
products; the users of a shampoo might, for example, be
concerned with aroma.
Sources of Data Foods, beverages, and related products
are typically assessed by:
• Consumers (or consumer panels): These generally consist of
(potential) purchasers who provide their personal
preferences about product characteristics. These are solicited
during design and marketing (Section 3.3.1), or they might
be on a (possibly prototype) product that has already been
made. Consumers describe what they like and dislike about
the product and how it compares with competitive products, and
they gauge their ultimate satisfaction. The results may be used to
make improvements and, possibly, to formulate advertising
claims (Section 9.8.3).
• Instruments: These involve measurements on such product
properties as color, texture, and composition to quantify
chemical or physical characteristics.
• Experts (or panels of experts): These typically are trained
employees who make sensory assessments of product
properties that cannot be measured by instruments (e.g.,
juiciness, puffiness, and aroma), generally without making
value judgments. A typical (and highly envied) example is
that of wine tasters. Such evaluations are most often made
on final product as a quality check.
A common (simplified) sequence of activities is displayed
in Figure 13.1.
Types of Data Measurements obtained by instruments are,
by and large, similar to those for other products. In
contrast, evaluations by consumers tend to be more
subjective and are often expressed on an ordinal (intensity
or acceptance) scale, such as from 1 to 5. Such numbers
may express preferences running from completely
unimportant to highly important or assessments of the
product as poor, fair, good, very good, and outstanding.
Relative product rankings may be used for assessing
competitive products. Sensory assessments by experts,
such as an assessment of the degree of dryness of a wine or
softness of a tissue, may also be on such a scale.
Figure 13.1 Data sources in life cycle of food products.
Data Gathering As in other applications, the
data-gathering process is highly important. We generally
strive to obtain a random sample of product, and to come
as close as possible to a random sample of consumer
evaluators.2
When many samples are to be evaluated in the same
session, protocols that minimize the impact of the
sequence of evaluation need to be developed; this is
especially important in taste tests.3
The resulting sensory and preference data are inherently
variable; there are differences in how individuals perceive
product attributes, how consumers and testers use scoring
scales, and their stamina over multiple samples.
Benchmarks for consistently assigning scores need to be
established. GRR studies (Section 3.4.2) might be
conducted to assess measurement consistency.
Data Analysis As we have seen, the data resulting from
consumer preference and expert sensory studies are often
ordinal (e.g., intensity scale from 1 to 5) or categorical
(e.g., fruity, acidic, and bitter taste). Statistical analysis of
such data requires special methods (see Agresti 2002,
2007; Hildebrand, Laing, and Rosenthal 1977). (Also see
Lea, Naes, and Rodbotten 1997 for a discussion of
specialized methods for the analysis of sensory data.)
13.2.4 Product Formulation: Mixture Problems
As a Southern general wrote, mixing mint juleps “is a rite
that must not be entrusted to a novice, a statistician, nor a
Yankee.”
—W. Faulkner
Statistically planned experiments are frequently used to
help develop the best possible food, beverage or related
product, based upon consumer preferences, manufacturing
efficiency, cost considerations – or, most likely, a
combination (see Hare 2006). These typically call for
determining an optimum product formulation, as well as
optimum processing conditions.
Processed foods generally involve mixtures of ingredients.
For a baked product, for example, we may wish to
determine the best blend or mix of ingredients (e.g., yeast,
flour, water, shortening, eggs, salt, and spices). This
suggests the use of a specialized type of designed
experiment, known as a mixture experiment – so-called
because it involves a mixture of product ingredients that is
constrained to add to 100%. Cornell (2002) describes a
mixture experiment to identify the optimal blend of three
fish species (mullet, sheephead, and croaker) and the best
settings of the process factors (duration of deep fat frying,
cooking temperature, and time) to make fish patties with
consistent texture.
See Cornell (2002) and Smith (2005) for detailed
discussions of the planning and analysis of mixture
experiments.
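To make the 100% constraint concrete, the sketch below enumerates the candidate blends of a simplex-lattice design, one common layout for mixture experiments. It is a minimal illustration rather than the design used in the fish patty study; the function name and the choice of a lattice with proportions in steps of one half are assumptions.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(n_components, m):
    """All blends whose proportions are multiples of 1/m and sum to exactly 1."""
    points = []
    for combo in product(range(m + 1), repeat=n_components):
        if sum(combo) == m:  # the mixture constraint: proportions add to 100%
            points.append(tuple(Fraction(k, m) for k in combo))
    return points


# Hypothetical three-component example using the fish species named in the text.
components = ("mullet", "sheephead", "croaker")
design = simplex_lattice(len(components), m=2)

for blend in design:
    print(", ".join(f"{name}={float(p):.0%}" for name, p in zip(components, blend)))
```

With three components and steps of one half, the lattice contains the three pure blends and the three 50/50 binary blends. A real study would typically add interior points and cross the blends with the process factors.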
13.2.5 Manufacturing
The manufacture of processed foods involves such
operations as mixing, storing, drying, condensation, and
heat treatment. These operations are similar to those
encountered for many continuous processes, and
considerations similar to those discussed in earlier chapters
apply.
13.2.6 Ensuring Food Safety
The Basic Concern Ensuring a safe product is a key
concern to the food and beverage industry. Potential
hazards can be biological (e.g., a microbe), chemical (e.g.,
a toxin), or physical, such as glass or metal fragments
implanted in the food.4
The Traditional Approach to Ensuring Safety Industry
and regulators have traditionally depended on spot-checks
of manufacturing and random sampling of final product to
help ensure food safety. This approach, though necessary,
tends to be reactive and often fails to identify problems
and their causes speedily.
A Proactive Approach The U.S. Food and Drug
Administration and Department of Agriculture are
promoting a preventive approach to food safety, known as
Hazard Analysis and Critical Control Point (HACCP),
similar to the proactive approach for product design
described in Chapter 3, “for use in all segments of the food
industry from growing, harvesting, processing,
manufacturing, distributing, and merchandising to
preparing food for consumption.”5 Issues addressed range
from analyzing “potential safety and health hazards” to
establishing “recordkeeping that ensures good data are
consistently obtained and maintained on potential hazards
and their causes.” (See Surak 2007 for further
information.)
13.2.7 Food Spoilage
Spoilage of a product over time is a further concern of the
food and beverage industry, especially for such products as
milk, cheese, and meats. There are again strict guidelines
and standards to which the industry needs to adhere. Many
foods are labeled with a date by which the product should
be sold or used. The determination of such dates raises
issues similar to those for pharmaceuticals (Section
10.4.4).
13.3 SEMICONDUCTOR INDUSTRY 6
13.3.1 The Setting
Semiconductors, by controlling the flow of electrical
signals, are the engine behind much of today’s advanced
electronic technologies (e.g., computers, cell phones,
digital televisions, and airplane control systems). They
may be in the form of discrete devices, such as individual
transistors, resistors, capacitors and diodes. Or they may be
connected together in large numbers on a continuous
substrate as an integrated circuit (IC).
The basic element is a chip, built in a tiny rectangle on a
thin wafer sawed from a cylindrical ingot of extremely
pure, crystalline silicon. A chip may have millions of
transistors. The dimensions are, moreover, getting
continuously smaller – even as the specifications on
properties are getting tighter. A single wafer can yield
hundreds, thousands, or even tens of thousands of chips.
13.3.2 Business Features
The Manufacturing Process Semiconductor
manufacturing takes place in clean rooms and involves
complex processes, starting with the fabrication of a seed
crystal. This is sliced into wafers that may range in size
from a small chocolate chip cookie to a large pizza. The
processing of wafers typically requires many steps; several
wafers are often worked together in lots. The wafers are
eventually diced and fabricated into ICs or discrete
devices.
Process Measurements Automated measurements are
taken during and after each step, or after several steps, of
the fabrication process on all or selected wafers and chips
and on the generally expensive tools doing the processing.
More intensive measurements might be taken when the
process is deemed “unhealthy.” Measurements may be
used for early-stage rejection (to avoid the cost of further
processing of inferior material) or rework or for adaptive
control (see Sidebar 13.1). Further measurements, often
based on machine sensor trace data, are used for fault
detection, and still others for control charting (Section 6.7)
and for providing information for improving the process.
SIDEBAR 13.1: INTEGRATING ENGINEERING
PROCESS CONTROL (EPC) AND SPC IN THE
SEMICONDUCTOR INDUSTRY
The integration of EPC and SPC (Section 8.2) in
semiconductor fabrication may involve using the
results from one stage of wafer processing to
automatically make changes in the process settings
in subsequent steps. This can get a drifting process
back on target and/or reduce variability, while at
the same time allowing the variability to be
monitored by control charts. Sachs, Hu, and
Ingolfsson (1995) describe an application aimed at
keeping the thickness of epitaxy layers on wafers
on target.
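The idea of using results from one run to adjust the settings for the next can be sketched as follows. This is an illustrative assumption, not the method of the cited paper: it uses a simple linear process model, an exponentially weighted moving average (EWMA) of the estimated process intercept, and a hypothetical noise-free drift so that the effect is easy to see. All names and numbers are made up.

```python
def simulate(runs, target=10.0, gain=1.0, weight=0.5, control=True):
    """Simulate a drifting process y = intercept + gain * setting.

    With control=True, an EWMA estimate of the intercept is updated after each
    run and the next setting is chosen to bring the output back to target.
    """
    intercept_estimate = 2.0  # initial guess of the process intercept
    setting = (target - intercept_estimate) / gain
    outputs = []
    for t in range(runs):
        true_intercept = 2.0 + 0.1 * t  # hypothetical slow drift, no noise
        y = true_intercept + gain * setting
        outputs.append(y)
        if control:
            observed_intercept = y - gain * setting
            intercept_estimate = (
                weight * observed_intercept + (1 - weight) * intercept_estimate
            )
            setting = (target - intercept_estimate) / gain
    return outputs


uncontrolled = simulate(20, control=False)
controlled = simulate(20)
print(f"Last run without adjustment: {uncontrolled[-1]:.2f}")
print(f"Last run with EWMA adjustment: {controlled[-1]:.2f}")
```

In this sketch the adjusted process stays close to target while the unadjusted one keeps drifting. The adjusted outputs could still be plotted on a control chart to monitor the remaining variability.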
End-of-Line Measurements A series of automated
measurements on electrical properties, such as on current
and threshold voltage, are typically taken on each chip at
the end of the process. The chip is accepted if all
measurements meet specifications (see Sidebar 13.2).
SIDEBAR 13.2: END-OF-LINE YIELD
End-of-line yield is the percentage of chips that
meets final product specifications on all
measurements. End-of-line yield is often low for
new products; yields of less than 50% are not
uncommon and may be initially acceptable. Due to
the high volume of product and the low cost of
inspection, economic analysis may suggest 100%
end-of-line inspection to control quality of early
product – while striving to improve the process
upstream.
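The definition of end-of-line yield can be expressed in a few lines of code. The sketch below is illustrative only; the measurement names, specification limits, and chip data are hypothetical.

```python
def end_of_line_yield(chips, specs):
    """Percentage of chips whose every measurement lies within its spec limits."""

    def passes(chip):
        return all(low <= chip[name] <= high for name, (low, high) in specs.items())

    accepted = sum(1 for chip in chips if passes(chip))
    return 100.0 * accepted / len(chips)


# Hypothetical specification limits and end-of-line measurements.
specs = {"current_mA": (0.8, 1.2), "threshold_V": (0.35, 0.45)}
chips = [
    {"current_mA": 1.00, "threshold_V": 0.40},  # meets both specifications
    {"current_mA": 1.25, "threshold_V": 0.41},  # current too high
    {"current_mA": 0.95, "threshold_V": 0.33},  # threshold voltage too low
    {"current_mA": 1.10, "threshold_V": 0.44},  # meets both specifications
]
print(f"End-of-line yield: {end_of_line_yield(chips, specs):.1f}%")
```

Note that a chip failing any single measurement counts against yield, which is one reason yield can be low when many properties are specified.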
Over time, end-of-line yield needs to improve
dramatically. The ability of an operation to raise its
yield rapidly is a hallmark of its success and
involves all stages of production. Mature processes
typically and consistently yield over 90%
acceptable product.
The accepted chips may be sorted into groups,
based upon their gigahertz rating and possibly
other properties. It is important that the process
yields a high percentage of “fast-running” chips;
these command the highest price. If a chip’s
electrical characteristics do not fully meet prime
product requirements, it may, however, still be
useable as a less-demanding product.
Complex statistical yield models have been
developed. Articles on this subject appear in the
IBM Journal of Research and Development and in
Semiconductor International, including a series of
papers by Atchison and Ross (1999).
A Data-Intensive Business Quoting Czitrom and Spagon
(1997), “the chemistry and physics of these
(semiconductor) processes are not well understood....
Semiconductor manufacturing is thus a measurement
intensive business.”
Statistical methods harness the data to improve each of the
many process steps and tie them together. They have also
helped shift industry thinking from the compilation of
yield rates to understanding underlying causes and how to
address them. Data-mining methods are used to find
patterns of defects traceable to a particular process step;
statistically designed experiments and simulation are also
increasingly employed. These are integrated with whatever
physical models are available.
13.3.3 Understanding the Impact of Process Variables
upon Performance
A key industry goal is to improve productivity and yield
across the board, and to establish optimum process
specifications. It is, therefore, critical to gain an
understanding of how the many factors associated with
processing (e.g., photolithography, etching, stripping,
diffusion, ion implantation, and deposition) impact
electrical performance and yield.
Such understanding is sometimes gained, especially during
early stages of development, through laboratory-scale
investigations. Frequently, however, studies are conducted
directly on a (pilot) production line with good units sold to
customers. A manufacturer might, for example, make a
variety of design masks to determine which provides the
best results. Experiments may focus on a specific process
step or several such steps. Such studies raise various
technical challenges (see Sidebar 13.3).
SIDEBAR 13.3: SOME TECHNICAL CHALLENGES
Challenges in studies to understand the impact of
process variables upon semiconductor performance
include:
• The frequent need to consider a large number of
potential impacting variables, especially in
experiments involving multiple processing stages.
This typically results in the initial use of
screening experiments (such as fractional factorial
plans) to efficiently identify a smaller set of
critical factors.
• Interactions between variables in different steps
of the process – for example, between implant
and anneal.
• The multivariate nature of such studies – due to
multiple, and often correlated, response variables.
(A response variable at one stage might be a
process variable in a subsequent stage.)
• The relatively long elapsed time to conduct even
a single production run. The fabrication process
often spans weeks and even months.
• The constant pressure to upgrade, calibrate, and
maintain state-of-the-art metrology instruments
capable of measuring nanometer features with
accuracy and precision.
(See Baron, Takken, Yashchin, and Lanzerotti
2004 for further discussion and references.)
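As an illustration of the screening experiments mentioned in the first item, the sketch below builds a half fraction of a two-level factorial plan for four factors in eight runs instead of sixteen. The factor labels A to D are placeholders for process variables, and the choice of generator (D set equal to the product ABC) is one standard option assumed here for illustration.

```python
from itertools import product


def half_fraction(base_factors, extra_factor):
    """Two-level full factorial in base_factors, with extra_factor set to the
    product of the base factor levels (coded -1 for low and +1 for high)."""
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        run = dict(zip(base_factors, levels))
        generated = 1
        for level in levels:
            generated *= level
        run[extra_factor] = generated
        runs.append(run)
    return runs


design = half_fraction(["A", "B", "C"], "D")
for run in design:
    print(run)
```

In this plan each factor is tested at its low and high level equally often, and main effects are not confused with two-factor interactions, although pairs of two-factor interactions are confused with each other. Follow-up runs would be needed to separate them.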
13.3.4 Monitoring and Controlling Critical Process
Variables
Control charts, especially X-bar and range charts, are often
used to monitor critical parameters and to detect changes
throughout the manufacturing process. These help decide
when corrective action needs to be taken to ensure the
process stays on target and to reduce variability.
Understanding the structure of the process is critical to
establishing the most effective groupings for this purpose;
wafers or sometimes locations on a wafer are often used.
Due to the numerous operations involved in semiconductor
fabrication and the many measurements taken, the number
of such control charts can be quite large. Therefore, it is
useful to develop an automated approach for assessing the
results and providing appropriate notifications for
out-of-control situations, while striving to minimize the
number of “false alarms.”7
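A minimal sketch of automated chart checking is given below for mean (X-bar) and range charts. It assumes subgroups of five measurements (for example, five sites on a wafer) and uses the standard tabulated chart constants for that subgroup size; the data and function names are hypothetical, and a production system would apply additional rules to limit false alarms.

```python
from statistics import mean

# Standard Shewhart chart constants for subgroups of size 5.
SUBGROUP_SIZE = 5
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for the X-bar and range charts."""
    if any(len(g) != SUBGROUP_SIZE for g in subgroups):
        raise ValueError("constants above apply only to subgroups of size 5")
    grand_mean = mean(mean(g) for g in subgroups)
    rbar = mean(max(g) - min(g) for g in subgroups)
    return {
        "xbar": (grand_mean - A2 * rbar, grand_mean, grand_mean + A2 * rbar),
        "range": (D3 * rbar, rbar, D4 * rbar),
    }


def flag_subgroup(subgroup, limits):
    """List the charts on which a new subgroup falls outside the control limits."""
    flags = []
    lcl, _, ucl = limits["xbar"]
    if not lcl <= mean(subgroup) <= ucl:
        flags.append("xbar")
    lcl, _, ucl = limits["range"]
    if not lcl <= max(subgroup) - min(subgroup) <= ucl:
        flags.append("range")
    return flags


# Hypothetical in-control history of a film thickness measurement.
history = [
    [10.0, 10.2, 9.8, 10.1, 9.9],
    [10.1, 10.3, 9.9, 10.0, 10.2],
    [9.9, 10.0, 10.1, 9.8, 10.2],
    [10.0, 9.9, 10.1, 10.2, 9.8],
]
limits = xbar_r_limits(history)
new_subgroup = [10.5, 10.6, 10.4, 10.7, 10.3]
print(flag_subgroup(new_subgroup, limits))
```

Here the new subgroup has a typical spread but a shifted mean, so only the X-bar chart signals.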
13.3.5 Assessing Product Life
Accelerated life tests (Section 5.7.8), especially at
high-voltage stress, are frequently used to assess the life of
ICs, as well as that of printed circuit boards, and larger
assemblies. Observed failure mechanisms need to be
understood and modeled both physically and statistically to
ensure they do not have a deleterious impact on usable
product life (see Li, Christiansen, Gill, Sullivan, Yashchin,
and Filippi 2006; Li, Yashchin, Christiansen, Gill, Filippi,
and Sullivan 2006).
13.3.6 Product Burn-In
Newly designed ICs are frequently subjected to burn-in
(Section 5.11) prior to shipment to minimize infant
mortalities in the field. Such burn-in might be discontinued
as the product matures. Statistical concepts help determine
how long units should be burned in, based upon cost
considerations (see Kim and Kuo 1998).
13.3.7 Sources of Further Information
Czitrom (2003) provides a detailed discussion of statistical
approaches and methods dealing with process
improvement for the semiconductor industry (see Sidebar
13.4) and Czitrom and Spagon (1997) describe 24
statistical case studies.
SIDEBAR 13.4: STATISTICAL METHODS
USED IN SEMICONDUCTOR INDUSTRY
Many of the statistical methods Czitrom (2003)
describes have already been mentioned in this and
earlier chapters. These include:
• GRR studies to quantify measurement variability
and act thereon.
• Multi-vari charts to compare characteristics over
different positions on a wafer, different wafers
and different manufacturing lots.
• Variance component analysis to quantify the
relative contribution to total variability of, say,
lots, wafers, positions within a wafer, individual
units, and measurement error.
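The variance component idea in the last item can be sketched for the simplest case of two sources of variation, lot-to-lot and within-lot. The example assumes balanced data (the same number of wafers measured in each lot) and uses the usual analysis-of-variance estimates; a real study with positions, units, and measurement error would need a fully nested analysis. The data and names are hypothetical.

```python
from statistics import mean


def variance_components(lots):
    """Estimate lot-to-lot and within-lot variance from balanced one-way data."""
    k = len(lots)     # number of lots
    n = len(lots[0])  # measurements per lot
    if any(len(lot) != n for lot in lots):
        raise ValueError("balanced data required")
    lot_means = [mean(lot) for lot in lots]
    grand_mean = mean(lot_means)
    ms_between = n * sum((m - grand_mean) ** 2 for m in lot_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for lot, m in zip(lots, lot_means) for x in lot
    ) / (k * (n - 1))
    # A negative estimate is conventionally truncated at zero.
    lot_to_lot = max(0.0, (ms_between - ms_within) / n)
    return {"lot_to_lot": lot_to_lot, "within_lot": ms_within}


# Hypothetical measurements: three lots, two wafers per lot.
lots = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
components = variance_components(lots)
total = sum(components.values())
for source, value in components.items():
    print(f"{source}: {value:.1f} ({value / total:.0%} of total)")
```

In this made-up example most of the variability comes from differences between lots, which would direct improvement efforts toward lot-level causes.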
Other methods tend to be unique to the
semiconductor industry. For example, wafer maps
provide a graphical comparison of a measured
performance characteristic by wafer site.
Also, the (NIST/Sematech) online Engineering Statistics
Handbook contains many case studies, several from the
semiconductor industry.
13.4 COMMUNICATIONS INDUSTRY 8
13.4.1 Setting the Stage
The communications industry is concerned with the
transmission of voice, data, text, sound, and video. This
industry has, almost since the advent of the telephone in
the early twentieth century, played an important role in the
development of statistical applications and concepts. As
indicated earlier, the study of processes through statistical
monitoring can be traced back to Shewhart’s work at Bell
Labs. Since then, the industry has benefited from the
involvement of many key figures in statistics.
This attention is not surprising in light of the massive
amounts of data generated every minute of every day.
Each communication, such as a telephone call, can
produce data on who initiated the communication, who
received it, where and when it was placed, how long it
lasted, how it was paid for, and so on. In addition, there is
extensive systems information, such as on switching
performance.
All of this creates formidable challenges in recording and
maintaining data and using such data to the advantage of
the business.9
13.4.2 Usage Assessments
Telephone usage data are explored (or mined) by network
engineers and statisticians to identify meaningful patterns
and relationships, both short-term and long-term. Such
studies help address practical questions such as
determining:
• The capabilities needed to allow calls to be successfully
placed and completed 99.99% of the time.
• Peak usage times (e.g., 10:00 a.m., U.S. Eastern Standard
Time, on Mother’s Day) in the United States and the
probability of successfully making a connection on the first
try at such times.
• A company’s penetration into different geographical
markets.
• The impact of new technologies, such as Voice Over Internet
Protocol (VOIP), on customers’ calling habits.
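The first question in the list, the capacity needed for a given completion rate, can be illustrated with the classical Erlang B formula. This model is not named in the text and is offered only as one simplified way to frame the problem: it assumes calls arrive at random, blocked calls are lost rather than retried, and a 99.99% completion rate is treated as a blocking probability of at most 0.0001. The function names and the traffic figure are hypothetical.

```python
def erlang_b(offered_load, circuits):
    """Blocking probability for an offered load (in erlangs) on a number of circuits,
    computed with the standard Erlang B recursion."""
    blocking = 1.0  # with zero circuits every call is blocked
    for n in range(1, circuits + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def circuits_needed(offered_load, max_blocking=1e-4):
    """Smallest number of circuits whose blocking probability is at most max_blocking."""
    circuits = 0
    while erlang_b(offered_load, circuits) > max_blocking:
        circuits += 1
    return circuits


# Hypothetical peak-hour traffic of 1 erlang (one call in progress on average).
print(circuits_needed(1.0))
```

Usage data would supply the offered load at peak times, and the same calculation could then be repeated for each part of the network.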
See Raftery, Tanner and Wells (2001) for further
discussion and added applications. Also, Lambert and
Pinheiro (2001) studied wireless calling records for 96,000
customers, making about 18 million calls in three months.
Their work characterized statistically the calling patterns
of individual customers with the ultimate goal of
identifying unusual events that might signal fraudulent
calls – a topic considered directly by Cahill, Chen,
Lambert, Pinheiro, and Sun (2002).
13.4.3 Reliability and Service Improvement
The communications industry, like other businesses,
strives for high reliability and service. The complexity of
communications systems in general, and phone networks
in particular, makes them subject to hardware failures or
possible performance degradation over time resulting in
deteriorating message quality. The reasons for failures
during operations need to be quickly identified and
addressed.
Software issues can, in addition, lead to service
breakdowns, potentially involving large geographic areas;
software reliability (Section 5.12) is, therefore, of
particular interest.
Storms are a further major reason for disruption in service
of communications products and are best addressed
proactively by making systems more robust to adverse
weather. Service providers also strive to predict
emergencies and to develop and evaluate the effectiveness
of alternative contingency plans.
13.4.4 Business Assessments
Pricing The communications industry is highly
competitive. The cost of making calls is very important in
gaining market share. Service providers need to know the
impact of alternative pricing strategies and may test these
out in selected markets. This leads to such statistical issues
as deciding which market areas to select, the duration of
tests, and the selection, and relevance, of the prior (and
possibly subsequent) time periods and market areas with
which comparisons are to be made.
Competitive Evaluations Communications service
providers – and especially those of wireless products –
often wish to make claims about the superiority of their
product’s performance. Becker, Clark, and Lambert
(1998) describe a study that involved hundreds of millions
of test calls to quantify reliability for different
circuit-based service providers. The study required placing
calls on carefully selected test phones between designated
cities, say every 15 minutes, using a company’s own
network and those of competitors and recording the fraction of
calls blocked.
Logistics Statistical and related concepts are applied to
address various logistical issues, such as optimizing cell
tower placement.
13.4.5 Internet Opportunities
The Internet has had a major impact on the
communications industry. Many companies have expanded
their services to become Internet providers to private and
commercial customers. Statistical methods are being used
to gain a better understanding of Internet traffic patterns to
help improve performance.
Also, companies wish to leverage data from Internet
business transactions to learn about customer behavior.
Online retailers, for example, may personalize their
offerings, or their advertising, by tailoring them to
individual customer groups, using, perhaps, multivariate
statistical analysis methods, such as cluster analyses, to
establish such groups.
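One of the simplest cluster analysis methods is k-means. The sketch below is a minimal pure-Python version applied to invented customer data; the two variables, the starting centers, and the function names are assumptions made for illustration only.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    dists = [(point[0] - cx) ** 2 + (point[1] - cy) ** 2 for cx, cy in centers]
    return dists.index(min(dists))

def kmeans(points, centers, iterations=20):
    """Plain k-means on 2-D points, starting from the given initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]

# Hypothetical customers: (orders per year, average order value in dollars).
customers = [(2, 20), (3, 25), (1, 30), (20, 200), (22, 180), (25, 210)]
centers, labels = kmeans(customers, centers=[customers[0], customers[3]])
print(labels)   # cluster index for each customer
```

In practice the variables would usually be standardized first, since otherwise the one with the largest numeric range dominates the distance.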
13.5 STATISTICAL IMAGE ANALYSIS FOR
MEDICAL, SECURITY, AND OTHER APPLICATIONS
10
13.5.1 Introduction
The digital image – simply a matrix of numeric pixels
(picture elements) represented as brightness on a computer
screen or as a printout – is the natural data output format
for a wide variety of applications. From digital cameras
and flatbed scanners to medical imaging devices, digital
images are being produced ubiquitously in staggering
volumes. In fact, in many imaging domains so much data
are being produced that it has become nearly impossible
for humans to review it all. State-of-the-art computed
(axial) tomography (CAT) scanners can, for example,
generate a thousand images of a person’s lungs in just a
few seconds. This amount of information can be
overwhelming for radiologists looking for lung cancer,
especially when they must review scans from dozens of
patients each day.
Automatic image analysis holds the promise of sifting
through large amounts of data to identify, quantify, and
summarize the most important characteristics. Systems are
already in use that automatically read addresses on
envelopes, detect breast cancer in mammograms, and
match crime scene fingerprints to databases of known
felons.
Many of these applications require modeling uncertainty in
the data and therefore rely, to a significant degree, on
statistical concepts and analysis. This section briefly
describes two of the most important and challenging areas
of statistical image analysis: image restoration and pattern
recognition.
13.5.2 Image Restoration
Goal The goal of image restoration is to take an image that
demonstrates some defect, such as uneven lighting or
being out-of-focus, and alter it to remove the defect. The
statistical element of this problem arises from the need to
model how an image with a defect can arise when using a
particular imaging device such as a digital camera.
Although we might be able to model the mechanics of the
camera almost deterministically, we are usually uncertain
as to exactly how it was used and in what environment.
Statistical models can help quantify and address such
uncertainty.
Application to Digital Photography One of the advantages
of digital photography over conventional film is its ability
to correct small “errors” by the photographer. This is
accomplished by using one of various easy-to-use and
powerful photo-editing software packages.
Many of the image-editing tools in these packages can be
considered to be implementations of statistical image
analysis. For example, one of the most widely used tools,
the UnSharp Mask (USM) filter, addresses the familiar
blur seen in out-of-focus photographs caused by the optics
of the camera when the focus distance is slightly off the
subject of interest. The USM filter recovers sharpness in
such pictures.
This filter is based on the notion that an out-of-focus
picture can be modeled as a Gaussian blur of the desired
(but unobserved) in-focus picture. A Gaussian blur is
simply a weighted moving average over all the pixels in
the image, where the weights are defined by a Gaussian
distribution.11 The model assumes that at each pixel of the
in-focus image, the blurred image is made up of the
weighted average of nearby pixels, weighted more heavily
for close-by pixels. To recover sharpness from the
out-of-focus image, the USM filter estimates the parts of
the image lost due to the camera being out of focus by
subtracting a slightly Gaussian blurred version of the
out-of-focus image from the original. This difference
image, which accentuates the sharp portions of the image
(those areas with adjacent structures of large differential
brightness), is then added to the original out-of-focus
image to increase sharpness and to produce an estimate of
the in-focus image.
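The blur-and-subtract steps just described can be sketched in a few lines of Python. To keep the example short it works on a single row of pixels crossing a sharp edge rather than on a full image; the same operations would be applied along rows and columns of a two-dimensional image. The kernel radius, the sigma values, and the `amount` strength parameter are assumptions chosen for illustration, not settings of any particular software package.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for pixel offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_blur(row, sigma=1.0, radius=3):
    """Weighted moving average of a row of pixels; edge pixels are repeated."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(row)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(range(-radius, radius + 1), kernel):
            j = min(max(i + k, 0), n - 1)   # clamp the index at the borders
            acc += w * row[j]
        blurred.append(acc)
    return blurred

def unsharp_mask(row, sigma=1.0, radius=3, amount=1.0):
    """Add the difference image (row minus its blur) back to the row."""
    blurred = gaussian_blur(row, sigma, radius)
    return [p + amount * (p - b) for p, b in zip(row, blurred)]

def steepest_step(row):
    """Largest brightness change between adjacent pixels."""
    return max(abs(b - a) for a, b in zip(row, row[1:]))

# A sharp black-to-white edge (0 = black, 255 = white) along one scanline.
in_focus = [0.0] * 8 + [255.0] * 8
out_of_focus = gaussian_blur(in_focus, sigma=1.5)   # the modeled camera blur
sharpened = unsharp_mask(out_of_focus, sigma=1.0)   # the USM estimate

print(round(steepest_step(out_of_focus), 1), round(steepest_step(sharpened), 1))
```

The sharpened row has a steeper edge than the blurred one but does not reproduce the original step exactly. The sketch also does not clip values to the 0 to 255 range, which a real implementation would need to do because sharpening overshoots on either side of an edge.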
Simple Example Figure 13.2 gives a stylized example of
the Gaussian blur model. In the left panel we have drawn a
simple happy face in black on a white background. The
middle panel is a Gaussian blur of this image. The goal in
image sharpening is to estimate the original in-focus image
from this out-of-focus image. The right panel is the result
of this estimation process using the USM filter. Like all
statistical estimation procedures, it is not a perfect
reconstruction. For example, the separation between the
left eye and the left side of the face, which is clear in the
original image, is less defined in the estimated image. And,
all the lines in the estimated image are thicker than in the
original. Such artifacts are unavoidable because of the
limited information contained in the middle image, which
was the input to the filter. However, the contrast in the
original image has been largely recovered and, overall, the
sharpened estimated image tends to be more pleasing to
the eye than the out-of-focus image.
Photography Example Figures 13.3 and 13.4 illustrate this
process in a photography setting. The original slightly
out-of-focus image, shown in the left panel of Figure 13.3,
was taken using a digital camera. The sharpened image is
shown in the right panel. It provides greater contrast
between adjacent structures with differential brightness as
compared to the original image. For example, in the
sharpened image, the reflected light in the boy’s eyes is
more noticeable, and the patch on his coat is better
defined. Figure 13.4 highlights these differences by
zooming in on the boy’s eyes and the patch on his coat.
Figure 13.2 A simple example of a Gaussian blur model
and the reconstruction of the original image, using a USM
filter, showing (a) an in-focus image, (b) its Gaussian blur,
and (c) the restored image.
Figure 13.3 Original slightly out-of-focus image (left) and
sharpened image (right).
Figure 13.4 Two portions of the image in Figure 13.3,
showing the original slightly out-of-focus image (left) and
the sharpened image (right).
The USM filter gives the appearance of sharpness to blurry
images by increasing contrast between light and dark
adjacent structures. It cannot recover detail lost due to a
camera being completely out of focus. This is because the
only input to the filter is the out-of-focus image; any fine
structure not actually present in an image cannot be
recovered.
13.5.3 Pattern Recognition
Goal Image pattern recognition is used to determine if a
pattern of features is present in an image, and if so, to
quantify the location, size, and shape of the pattern. This
has numerous important medical applications, such as in
detecting tumors. We will, however, discuss an application
dealing with photographic face recognition.
There are various components of uncertainty in such
problems, including whether or not a particular feature is
present in the image, biological variability in the locations
of features, and noise in the image data itself.
Face Recognition Automatic face recognition has become
an important tool in criminal investigations and in various
security settings. Photographs of a suspected criminal are
matched with an often large database of pictures of known
persons. The resulting system takes a digital image of a
suspect’s face and analyzes it to identify and characterize
various “biometric features” of that person. Typical
features are the locations of salient points on the face
involving a person’s eyes, mouth, nose and chin. The
system then scans a database of known persons searching
for matches with individuals whose images display similar
features.
Figure 13.5 illustrates several biometric features of an
image for a particular person: the centers and corners
of his eyes, the corners of the mouth, and the tip and
underside of the nose, overlaid as white squares.
At least two parts of this problem rely on statistical
analysis. First, we must automatically locate the positions
of the biometric features in the given image. Second, we
must determine whether the arrangement of these features
appears to be consistent with those of any specific
image(s) in an available database.
Figure 13.5 Photograph with several biometric feature
estimates overlaid.
Locating Biometric Features One method of locating
biometric features in an image is to apply a template of
features (the white squares in Figure 13.5) to the image
and adjust it (via scaling, rotating, and warping) until the
best template location is found. The best location is
typically defined as the one which deforms the template
the least while maximizing some image metric on the
pixels underneath each of the template points. This metric
could, for instance, be the correlation of the pixels in a
small neighborhood around the feature position with those
of a gold-standard feature image. Such a gold-standard
image could be defined a priori to be an average image
estimated from the database or possibly an external image
whose biometric features had been defined by extensive
physical analysis.
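The correlation metric mentioned above can be sketched as follows for a single feature. The example scans a small invented image for the position whose neighborhood correlates best with an invented gold-standard patch. It covers only the image-metric part of the problem: the template deformation penalty is omitted, and all pixel values and function names are hypothetical.

```python
import math

def correlation(a, b):
    """Pearson correlation between two equally sized lists of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0   # a flat patch carries no pattern to correlate
    return cov / math.sqrt(var_a * var_b)

def patch(image, top, left, size):
    """Flatten the size-by-size block whose top-left corner is (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]

def best_feature_position(image, gold_patch, size):
    """Scan every position; return (correlation, top, left) of the best one."""
    best = (-2.0, None, None)
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            score = correlation(patch(image, top, left, size), gold_patch)
            if score > best[0]:
                best = (score, top, left)
    return best

# Hypothetical 3x3 gold-standard feature: dark center, bright surround.
gold = [200, 200, 200,
        200,  20, 200,
        200, 200, 200]

# Hypothetical 6x6 image with a dimmer, noisier copy of the feature whose
# top-left corner is at row 2, column 3.
image = [[150, 152, 149, 151, 150, 148],
         [151, 150, 152, 149, 151, 150],
         [149, 151, 150, 160, 158, 161],
         [150, 149, 151, 159,  40, 160],
         [152, 150, 149, 161, 160, 159],
         [150, 151, 150, 149, 152, 151]]

score, top, left = best_feature_position(image, gold, size=3)
print(top, left, round(score, 3))
```

Because correlation ignores overall brightness and contrast, the feature is found even though the copy in the image is darker than the gold-standard patch.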
Matching with Database Constructing an effective face
recognition system calls for matching the biometric
features of an unidentified person, such as the one shown
in Figure 13.5, with those of an identified person (or
persons) in the database. To do this, natural variability in
biometric features of different individuals must be properly
modeled. For example, the biometric features representing
the inner corners of the left and right eye are almost
horizontal to each other for most persons (in properly
posed photos), but the distance between them varies
appreciably from one person to the next.
To capture such variability, a multivariate statistical model
to represent feature locations can be fitted, using the data
in the image database. This model is then used to derive
metrics of template variability (sometimes called distance)
of the biometric features for the person under scrutiny
from those for each person in the database. Given a new
image to be matched, we first locate its biometric features,
as outlined above. Then we decide which photos, if any, in
the database match this new image. This is done by using
the statistical model to calculate the distance from the
features detected in the image under scrutiny to each of the
image feature sets in the image database. The persons in
the database whose images have the smallest distances are
determined to be the best matches to the person under
scrutiny.
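One common choice for such a distance is the Mahalanobis distance, which scales differences in feature locations by their variability and correlation across the database. The text does not say which distance is used in practice, so the sketch below should be read as one possible instance. It uses only two invented features per person so that the covariance matrix can be inverted in closed form; all names and values are hypothetical.

```python
import math

def covariance_2d(rows):
    """Sample covariance (sxx, sxy, syy) of a list of (x, y) feature pairs."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return sxx, sxy, syy

def mahalanobis(u, v, cov):
    """Mahalanobis distance between two 2-D feature vectors."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = u[0] - v[0], u[1] - v[1]
    # Quadratic form d' S^{-1} d, using the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# Hypothetical database: person -> (inner-eye distance, nose-to-mouth
# distance), both in pixels.
database = {
    "person_a": (30.0, 22.0),
    "person_b": (34.0, 25.0),
    "person_c": (28.0, 27.0),
    "person_d": (36.0, 21.0),
    "person_e": (32.0, 24.0),
}
cov = covariance_2d(list(database.values()))

probe = (33.5, 24.6)   # features located in the image under scrutiny
ranked = sorted(database, key=lambda name: mahalanobis(probe, database[name], cov))
print(ranked[:2])      # closest candidates first
```

A real system would use many more features and would also need to model how much one person's features vary from photo to photo, which this sketch ignores.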
We could also establish a system that flags those matches
whose distances are smaller than a specified decision
threshold value. This requires trading sensitivity against
specificity in the matching algorithm. High specificity
means that if a true match does not exist, then the
algorithm will correctly fail to come up with one. High
sensitivity means that if there is a true match in the
database, the algorithm will likely find it. These competing
requirements resemble the balancing act between Type I
and Type II errors for a simple statistical hypothesis test.
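The trade-off can be made concrete with a small Python sketch. It assumes a validation set of image pairs with known identities and computes, for several candidate thresholds, the fraction of true matches flagged (sensitivity) and the fraction of non-matches correctly left unflagged (specificity). This pairwise calculation is a simplification of the database-level definitions above, and the distances are invented.

```python
def sensitivity_specificity(match_distances, nonmatch_distances, threshold):
    """Flag a match when distance < threshold; return (sensitivity, specificity)."""
    true_pos = sum(d < threshold for d in match_distances)
    true_neg = sum(d >= threshold for d in nonmatch_distances)
    return true_pos / len(match_distances), true_neg / len(nonmatch_distances)

# Hypothetical distances from a validation set with known identities.
match_distances = [0.3, 0.5, 0.6, 0.9, 1.1, 1.4, 1.8, 2.2]      # same person
nonmatch_distances = [1.2, 1.7, 2.0, 2.4, 2.9, 3.1, 3.5, 4.0]   # different persons

for threshold in (1.0, 1.5, 2.0, 2.5):
    sens, spec = sensitivity_specificity(match_distances, nonmatch_distances, threshold)
    print(f"threshold {threshold:.1f}: sensitivity {sens:.3f}, specificity {spec:.3f}")
```

Raising the threshold flags more true matches but also more false ones, so the choice depends on the relative costs of the two kinds of error.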
The preceding method might be enhanced by using a
Bayesian approach, as described briefly in Sidebar 13.5.
SIDEBAR 13.5: BAYESIAN METHODS FOR
PATTERN RECOGNITION
In using a Bayesian approach for pattern
recognition we specify, for each image in the
database, a prior probability of a match to the
image under scrutiny – before considering any
information about the image, as gleaned from the
database. Thus, if we had added information on
subjects in the database (beyond their images) – for
instance on age, gender, or date of birth – we
would use such information to assign higher prior
probabilities to individuals whose characteristics
seem more likely to match those of the person
under scrutiny. If, on the other hand, we have no
such prior information, we might say each image in
the database is equally likely to match the image
under scrutiny and make each of the prior
probabilities equal.
Using a statistical model we would then, for each
image in the database, calculate the probability (or
likelihood) of observing the biometric features of
the subject under scrutiny under the assumption
that the two subjects are the same. Using Bayes’
Rule, the posterior probability that the two images
actually match is proportional to the product of the
likelihood and the prior probability. This posterior
probability is then calculated for each image in the
database, and those with the highest posterior
probabilities are identified as the most likely
matches to the person to be identified.
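A minimal Python sketch of this calculation follows. The likelihoods and prior probabilities are invented, and normalizing the posterior probabilities to sum to one assumes that exactly one person in the database is the true match, which may not hold in practice.

```python
def posterior_probabilities(priors, likelihoods):
    """Bayes' Rule: posterior is proportional to likelihood times prior."""
    unnormalized = {name: priors[name] * likelihoods[name] for name in priors}
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}

# Hypothetical likelihoods of the observed biometric features, assuming the
# person under scrutiny is the same as each database subject.
likelihoods = {"person_a": 0.02, "person_b": 0.30, "person_c": 0.10}

# Equal priors: no added information about the database subjects.
equal_priors = {"person_a": 1 / 3, "person_b": 1 / 3, "person_c": 1 / 3}
# Informative priors: added information (say, age and gender) favors person_c.
informative_priors = {"person_a": 0.1, "person_b": 0.2, "person_c": 0.7}

post_equal = posterior_probabilities(equal_priors, likelihoods)
post_informed = posterior_probabilities(informative_priors, likelihoods)
print(max(post_equal, key=post_equal.get), max(post_informed, key=post_informed.get))
```

With equal priors the ranking is driven by the likelihoods alone; with informative priors a subject with a lower likelihood can still become the most probable match.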
13.5.4 Further Information
McCulloch, Laading, Wilson, and Johnson (1996) give a
detailed face recognition example. See Duda, Hart, and
Stork (2000) for a more general discussion of pattern
recognition, and Winkler (2006) for other examples of
image analysis discussed from a statistical perspective.
13.6 OTHER APPLICATION AREAS
13.6.1 Health Care
Patient Care There is a continuing need worldwide to
improve patient care. Statistics plays an important role in
such areas as the:
• Comparison of the effectiveness of alternative treatments of
a disease and of different approaches to patient care, such as
the length of hospital stays after childbirth.
• Assessment of waiting times in doctors’ offices or in
hospitals and identification of their likely causes.
• Evaluation and reduction of errors in filling prescriptions,
administering medications, performing surgeries, or in
diagnoses.
Occasionally, such applications call for conducting a
designed experiment. Most of the time, however,
experimentation is inappropriate and we must, instead, rely
on data from observational studies. As always, planning to
get good data from such studies is essential.
Process Improvement Health care organizations, due to
their highly process-oriented nature, present important
opportunities for improvement similar to those for
businesses.
The scheduling of doctors in a hospital emergency room
provides a typical example. How many doctors are likely
to be needed at any given time is predicted from past data
on incoming patient flow. Doctors are then scheduled,
taking into account their preferences and the need to avoid
fatigue. Scheduling outpatient procedures and physical
resource planning (e.g., assignment of scanning
equipment) are some additional applications.
Further Reading Health care practitioners have come to
use quality improvement methodologies, such as PDSA, as
well as statistical and operations research methods.
Relevant articles, for example, Van Den Heuvel, Does, and
Bisgaard (2005), appear in Six Sigma Forum Magazine and
also Quality Progress. At a more technical level, Woodall
(2006) presents a detailed review of issues that arise in the
application of control charts to health care.
13.6.2 Epidemiological Studies
Epidemiology is the branch of medicine that deals with the
study of the causes, distribution, and control of disease in
populations. Examples of important issues that are
addressed using a statistical approach are:
• Identification of an impending epidemic by, for example,
studying occurrence rates in different regions of a country or
continent, or the entire world, over time.
• Developing hypotheses concerning likely causes of diseases
by relating these to health outcomes. Such evaluations may
assess the impact of environmental conditions by
geographical region, for example, correlating air quality to
cancer incidence in different localities. Or they may study
personal habits and be conducted on individuals, such as
relating the incidence of cancer to smoking.
• Assessing the impact of different preventative strategies,
such as publicity about methods for prevention, on
occurrence of a particular disease.
Evaluations such as the preceding rely heavily on
observational studies. They also provide important
opportunities for using a proactive approach for disease
prevention.
(See Kahn and Sempos 1989 for a discussion of statistical
methods in epidemiology, and Ahrens and Pigeot 2007 and
Kleinbaum, Sullivan, and Barker 2007 for more general
discussions.)
13.6.3 National Security
Banks (2005) considers the application of statistics to
national defense and homeland security. He describes eight
opportunity areas such as:
• Probabilistic risk assessment: This balances cost of
investment against risk reduction. Statistics has long been
used in this area, dating back to studies of the safety of
nuclear reactors.
• Cybersecurity: This aims for early detection of sabotage
directed at defense networks, financial services, public
utilities, etc. Automated statistical algorithms are used to
detect system anomalies quickly and adopt appropriate
countermeasures.
• Privacy protection: This includes determining how much
noise to add to data to protect personal identity, and other
ways of balancing the needs for security and confidentiality.
Banks’ main point, however, is that, at the time of writing,
the statistics profession in the United States was not yet
sufficiently involved in such activities.
13.6.4 Space Exploration
Space exploration presents special challenges since again
human life may be at stake. Ensuring high reliability prior
to launch, especially for “manned” space vehicles, is
critical. Improper analysis of the available data on the
susceptibility to failure of O-rings in cold weather led to
failure to postpone the fatal 1986 Challenger space shuttle
launch (see Dalal, Fowlkes, and Hoadley 1989).
Statistics is also used to assess the risk caused by external
factors, such as ice and insulation foam striking a shuttle
during launch, for addressing the uncertainty in the
physical models that predict where a launched space
vehicle will be at a particular point in time, and for many
other applications. Also, Rhew and Parker (2007) describe
a study in which the design of experiments was used to
“efficiently identify and rank the primary contributors to
the integrated drag over the vehicle’s ascent trajectory” for
the launch abort system of the NASA Crew Exploration
Vehicle.
13.6.5 Oil Exploration
Exploration for oil (and for valuable minerals) may
provide the ultimate application of decision-making that
balances statistical Type I errors (claiming that oil is
present when it is not) and Type II errors (failing to detect
oil if it is there). Moreover, if it seems likely that there is
oil, we need to estimate the quantity and exact location.
These assessments typically involve the use of
relationships established from past data between physically
relevant predictor variables and oil discovery. Such data
are of varying relevance and precision; thus, the
evaluations might involve using an appropriate weighting
scheme and/or Bayesian methods.
(See Agterberg and Bonham-Carter 1999 and Harbaugh,
Doveton and Davis 1977 for further discussion.)
13.6.6 Environmental Studies
Environmental concerns are becoming increasingly
important to business and industry. Issues may arise as a
result of the manufacturing process itself, such as the
discharge of undesirable liquids into the ground or a river.
Or they may be the consequence of the use of products,
such as automobiles or locomotives that emit pollutants.
Traditionally, statistics was used to address questions such
as “does the discharge exceed the ‘allowable’ limits
established by the company, the industry or a regulating
authority?” Sometimes, it was also desired to determine
whether “hot spots” or highly contaminated areas were
present and, if so, to identify their locations. Over time, the
focus has shifted to a more proactive approach; that is, to
minimize discharges and to find effective pollutant control
methods.
Statistical methods are typically used in environmental
studies to:
• Model and monitor (using control charts) pollution at a
particular location over time.12
• Evaluate seasonal patterns and long-term trends in pollution
levels using time series analysis.
• Assess the precision of the instruments used to quantify
pollution using GRR studies.
• Characterize the geographical distribution of environmental
pollution using spatial methods (see Cressie 1993).
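To illustrate the first item in this list, the sketch below computes limits for an individuals control chart from a baseline period and checks new readings against them. The readings are invented, and the chart type is one simple choice among several; pollution data that are autocorrelated or seasonal would need a more careful model.

```python
def individuals_chart_limits(baseline):
    """Center line and 3-sigma limits for an individuals control chart."""
    center = sum(baseline) / len(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    # The average moving range divided by d2 = 1.128 estimates short-term sigma.
    sigma = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

# Hypothetical daily pollutant readings (ppm) from a stable baseline period.
baseline = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.1, 4.0, 4.2, 3.9]
lower, center, upper = individuals_chart_limits(baseline)

# Hypothetical new readings to be monitored against the baseline limits.
new_readings = [4.0, 4.3, 5.6, 4.1]
signals = [x for x in new_readings if x < lower or x > upper]
print(round(lower, 2), round(center, 2), round(upper, 2), signals)
```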
Environmental studies often involve analysis of censored
data due to the pollutant level being below (left-censored)
or sometimes above (right-censored) the capability of the
measurement instrument.
(See Gibbons and Coleman 2001, and Gilbert 1987 for
further discussion.)
13.6.7 Chemometrics
Chemometrics is a specialized area that deals with the
analysis and modeling of chemical data. A common
problem is identifying the composition of an unknown
material based on the substance’s spectra (e.g., reflectance
at various wavelengths). Modern instrumentation often
generates large amounts of spectral data – a million or
more data points per sample are not atypical. Such data are
then used to compare the spectra of the unknown material
with those of known materials in order to determine its
composition.
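A very simple form of such a comparison scores the unknown spectrum against each spectrum in a reference library and picks the closest. The sketch below uses cosine similarity, which ignores overall intensity. Both the similarity measure and the six-point spectra are assumptions for illustration; real chemometric work typically relies on multivariate methods applied to far larger spectra.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two spectra sampled at the same wavelengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical reflectance at six wavelengths for three known materials.
library = {
    "material_x": [0.10, 0.15, 0.40, 0.80, 0.60, 0.20],
    "material_y": [0.70, 0.65, 0.50, 0.30, 0.20, 0.15],
    "material_z": [0.20, 0.50, 0.55, 0.50, 0.45, 0.40],
}

# Hypothetical unknown sample: shaped like material_x at lower intensity,
# with some measurement noise.
unknown = [0.06, 0.08, 0.21, 0.39, 0.31, 0.09]

scores = {name: cosine_similarity(unknown, spectrum)
          for name, spectrum in library.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```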
Readily available software packages allow chemometric
studies to be carried out directly by a laboratory chemist.
These typically include multivariate statistical analysis and
modeling methods. (See Beebe, Pell, and Seasholtz 1998
for more information.)
13.7 EMERGING AREAS: A GLANCE INTO THE
FUTURE
We always overestimate the change that will occur in
the next two years and underestimate the change that
will occur in the next ten.
—Bill Gates
What are some of the new areas in business and industry in
which statistics will be important in the future? We briefly
consider this question, carrying forward our discussion in
Section 2.4 and elsewhere. (For more detailed assessments,
see Lindsay, Kettenring, and Siegmund 2003, 2004;
Raftery, Tanner, and Wells 2001; Steinberg 2008; Straf
2003).
13.7.1 Advances in Information Science
Computers in the future may weigh no more than 1.5
tons.
—Popular Mechanics, 1949
The interplay between information science and statistics
was discussed briefly in Section 2.4.1 (see also Wegman,
Said, Scott, and Solka 2009). Advances in data gathering
and in computing continue to lead to new areas of
application, as well as new statistical methods.
Data Gathering The new opportunities presented by the
continuing advances in data-capturing capabilities and
increased access to databases have been highlighted
throughout this book. We have stressed the importance of
careful early planning of the data-gathering process to get
good data for statistical analysis. This will continue to be
critical as added capabilities for using data emerge.
Methods currently in use for tracking tagged units, ranging
from product inventory to cattle, are, for example, being
extended to such diverse applications as product usage
assessment, customer behavior evaluation, fraud detection,
field monitoring of products and evaluation of Internet
traffic. The advent of radio frequency identification
devices (RFID), in particular, calls for “an essential data
management infrastructure” to effectively utilize the
resulting “data avalanche” (quote from RFID Study Group
at Pennsylvania State University 2006).
Computations Advances in computer capabilities and in
the speed and ease of calculations continue to
revolutionize the way statistics is applied in addressing
problems in business and industry. The following are just a
few examples of such computer-intensive methods:
• Modern Bayesian methods (see Gelman, Carlin, Stern, and
Rubin 2003).
• Markov Chain Monte Carlo simulation methods (see Kass,
Carlin, Gelman, and Neal 1998).
• Experimentation on large-scale computational models (e.g.,
CAD/CAM models involving thousands of parameters; see
Lindsay, Kettenring, and Siegmund 2003; 2004).
• New clustering and classification methods for data-mining
applications (see Hastie, Tibshirani, and Friedman 2003).
In addition, the increasing availability of user-friendly
statistical software and readily accessible information
about statistics continues to make statistical methods
increasingly accessible to nonstatisticians. This will further
challenge statisticians to develop tools that are as simple as
possible to use – and maximally robust to potential misuse.
13.7.2 Advances in Application Areas
Bioinformatics and Statistical Genetics The rapidly
developing field of bioinformatics has been loosely
referred to as any use of computers to handle biological
information (see Tramontano 2006). Gene mapping, a key
application area, has “the ultimate goal... to identify the
genes that play important roles in the inheritance of
particular traits and to explain the role of these genes in
relation to one another and... the environment” (see
Siegmund and Yakir 2007). This typically requires the
assembly and statistical analysis of large sets of DNA and
other information. Genetic research experiments, for
example, now routinely collect over 300,000
measurements per subject.
Nanotechnology The ability to manipulate material
characteristics at the individual molecule level is leading to
important changes in the way products are designed,
manufactured, and used. The routine clinical use of
nanotechnology to deliver drugs to different locations
within the body is an active research area. The need to
build high-quality and high-reliability products on a
nanoscale creates statistical challenges that seem to
resemble those faced, early on, by the semiconductor
industry in raising yield. (See Dasgupta, Ma, Joseph,
Wang, and Wu 2007; Nembhard 2007; Rue 2006 for
further discussion.)
Automated Tracking Implanted health-monitoring devices
to provide early warning of, say, an impending heart attack
hold much promise. As in other applications of remote
monitoring, a major contribution of statistics is to help
establish action thresholds that provide the desired balance
between maximizing detection sensitivity and minimizing
the false alarm rate.
Other Areas People all over the world have come to
depend on increasingly complex systems (e.g., economy,
health, and transportation) that utilize statistics. For
example, dynamic travel routing might be extended from
using information on current congestion to using
predictions of future conditions. These might employ a
statistical algorithm that uses data on past traffic (by time
of day and day of week), the impact of (forecasted)
weather and of special local events (e.g., a football game)
to predict traffic volume.
Statistical methods also help address such “hot topics” as
climate change and the assessment of world energy
reserves.
In addition, the continuing flow of government and
industry regulations calls for new programs of disciplined
data gathering and statistical analysis. The Sarbanes–Oxley
Act, a U.S. federal law enacted in response to various
corporate and accounting scandals, provides an example.
Thus, Faltin and Faltin (2003) illustrate how a Six Sigma
approach, employing publicly available data and statistical
tools, could have predicted WorldCom’s ultimate
bankruptcy.
Statistics will, undoubtedly, continue to make important
contributions to addressing new technical and societal
challenges involving business and industry as these
emerge; in turn, new applications will stimulate further
powerful developments in statistics.
13.8 MAJOR TAKEAWAYS
• Applications of statistics in the food, beverage, and related
industries are characterized by:
   • The need (especially for processed foods) to
   minimize the variability that customers experience
   from one purchase to the next.
   • Product assessment data that are obtained from
   consumers, instruments, and experts and are often
   ordinal or categorical.
   • The applicability of mixture designs in planned
   experiments.
   • Manufacturing issues similar to those encountered
   for many continuous processes.
   • High concern with food safety and spoilage.
• The semiconductor business is characterized by its many
process steps and efforts to improve yield. It is heavily
data-intensive and provides numerous opportunities for the
use of statistics. Applications include:
   • Studies to understand the impact of process
   variables upon performance
   • Monitoring and controlling critical process
   variables
   • Product life assessment
   • Product burn-in
• The communications industry, another highly data-intensive
business, has traditionally been a heavy user of statistical
methods. Typical problems involve usage assessments,
reliability and service improvement, and business
assessments. The emergence of the Internet has led to new
applications.
• Statistical methods play a prominent role in image analysis
for medical, security, and other applications, including:
   • Restoration of images, such as those obtained in
   digital photography, to remove defects.
   • Pattern recognition in, for example, diagnosing
   diseases by medical scanning or identifying
   criminals by matching their pictures with those in
   an available database.
• Statistics also makes major contributions in many other
application areas that often involve business and industry,
including health care, epidemiological studies, national
security, space exploration, oil exploration, environmental
studies and chemometrics (the analysis and modeling of
chemical data).
• Technological advances in both information sciences and
application areas continue to lead to new applications in
bioinformatics and statistical genetics, nanotechnology,
automated tracking, and numerous other areas, and to
provide the impetus for the development of powerful new
statistical methods.
DISCUSSION QUESTIONS
General
1. Why do the food, beverage and related industries
“provide an ultimate example of the need for robust
design,” as stated in Section 13.2.2?
2. Assume that you are the manufacturer of a popular
brand of chocolate chip cookies. Propose a program to
help ensure product consistency.
3. Assume that you are the manufacturer of a canned fish
product. Propose a program to help ensure product safety.
4. How does the “relatively long elapsed time to conduct
even a single production run” (Sidebar 13.3) in
semiconductor manufacture impact the statistical
evaluations?
5. Practices in use in the semiconductor industry, such as
end-of-line inspection, accepting relatively low product
yield, and burn-in, tend to be contrary to the goal of
ensuring high quality and reliability up front. Discuss this
apparent disconnect and why and when such practices might be
appropriate for the semiconductor industry.
6. A telephone company has asked you to develop a usage
assessment system to provide information to improve
service. What would you propose?
7. Suggest some applications of statistical image analysis
beyond those presented in this chapter.
8. What do you think are the most important contributors
to uncertainty in a face recognition system?
9. Suggest some measurable characteristics associated with
the correct administration of prescription drugs in a
hospital.
10. Applications such as pattern recognition (Section
13.5.3) and oil exploration (Section 13.6.5) call for
developing a scheme that strikes the best possible balance
between incorrectly identifying a signal when it is not
present (Type I error) versus failing to identify a signal
when one exists (Type II error). Explain this comment and
suggest what might be done to increase the chances of
making the right call.
11. Suggest and briefly describe some likely future
applications of statistics in business and industry beyond
those discussed in this chapter.
Technical
1. We state in Section 13.2.2 that to achieve homogeneity
in processing roasted peanuts, “measurements on such
factors as sugar content and moisture are obtained on crop
samples during harvesting. This information is then used to
set processing (e.g., roasting) conditions.” Explain this
concept and propose a plan for implementation.
2. In the discussion of the need for reducing the variability
in the processing of flour in Section 13.2.2, we state that
“Homogeneity in the final product is attained by adjusting
the mixing proportions, based upon product or process
measurements.” Propose a system for doing this.
3. Propose a GRR study to help ensure consistency in
judgment over time and between tasters for end-of-line
wine tasting.
4. Obtain further details about the HACCP system for
proactively ensuring food safety (Section 13.2.6) and
discuss how statistics fits in.
5. In Section 13.3.6, we state that “Statistical concepts help
determine how long units should be burned in, based upon
cost considerations.” How would you do this?
6. Variance components analysis was suggested in Sidebar
13.4 “to quantify the relative contribution to total
variability of, say, lots, wafers, positions within a wafer,
individual units and measurement error.” This suggests
that each of these is a random, rather than fixed, effect
(analysis of variance) variable. Discuss this assumption,
especially with regard to wafer position, and how the
analysis is changed if the assumption is not satisfied.
7. Consider a Bayesian interpretation of the face
recognition example (Section 13.5.3). Given a defined set
of N prior probabilities and N likelihood values from a
database of N images, define the posterior probability of a
true match. How can you adjust the prior probabilities to
change system sensitivity and specificity?
8. What kind of statistical models might you use to model
different contributors to uncertainty in a face recognition
system?
9. Pattern recognition (Section 13.5.3) and chemometrics
(Section 13.6.7) both address problems that involve
comparison of an unknown sample against known ones to
find a match. Discuss the similarities and differences in the
two problems and the statistical methods used to address
them.
10. We state in a footnote to Section 13.6.6 that in
modeling pollution, “a lognormal distribution is often used
to represent discharge amounts.” Why might a lognormal
distribution be appropriate in this situation?
11. Select one or more of the “Other Applications”
described in Section 13.6. Describe statistical evaluations
that would likely be performed.
1. This discussion has benefited greatly from inputs
provided by Lynne Hare and Mark Vandeven.
2. In contrast, expert evaluators are generally selected
based on their proficiencies.
3. Due to fatigue, etc. it is sometimes inadvisable to have
each evaluator assess each product. This leads to the use of
randomized incomplete block experimental designs to
obtain data for making comparisons.
4. The U.S. media recently reported an E. coli outbreak,
attributed to raw spinach, and peanut butter contaminated
with salmonella.
5. Per the HACCP website http://www.cfsan.fda.gov/~comm/haccpov.html.
6. This discussion has benefited greatly from inputs
provided by Mike Clayton, Veronica Czitrom, Mary
Lanzerotti, Paul Tobias, and Emmanuel Yashchin. Various
definitions were adapted from the Semiconductor Industry
Association’s online glossary.
7. A more advanced approach involves monitoring
individual estimated variance components (see Yashchin
1994).
8. This discussion has benefited greatly from inputs
provided by Diane Lambert and Scott Vander Wiel.
9. References to articles that discuss specific application
areas, beyond those cited here, are to be posted on the
book’s ftp site.
10. This section was written by Colin McCulloch.
11. The term “Gaussian distribution” is widely used in the
imaging literature for the normal distribution.
12. A lognormal distribution is often used to represent
discharge amounts.