Probability Background
1
2
3
4
Random variables
A random variable (r.v.) is a numerical
outcome whose value depends on chance. We
denote random variables with upper case
letters, and their realized values with lower
case letters.
Random variables
For example, suppose demand for sweaters
next winter cannot be predicted with certainty,
and we think it could be anywhere between 0
and 1000. Then we may denote the uncertain
demand by random variable X, which takes on
values in the set {0,1,…,1000}. If at the end
of the season, the actual demand turns out to
be 734 units, we say the value of X is x = 734.
Probability distribution
The expression {X ≤ x} is the uncertain event
that the random variable X takes on a value
less than or equal to x. The event is uncertain
because whether it occurs or not depends on
the value of X.
Probability distribution
The probability that the event occurs is
denoted as P[X ≤ x]. As x varies, this
probability defines a function:
F(x) = P[X ≤ x]
which is called the cumulative distribution
function (c.d.f) of the random variable X.
Probability distribution
Sometimes, we write it as FX(x) to highlight
that it is the distribution function of X. The
cumulative distribution function of a random
variable contains all the probabilistic
information about it.
Probability distribution
A random variable X is called discrete if it can
take on only a finite or countable number of
values x1, x2,… with probabilities
pi = P[X = xi] for i = 1, 2, … and
Σi pi = 1.
Probability distribution
The function
f(xi) = P[X = xi] for i = 1, 2, …,
is called the probability mass function (p.m.f)
of X. It is related to the cumulative
distribution function as
\[ F(x) = \sum_{x_i \le x} f(x_i) \]
Example of Discrete Random Variables
L.L. BEAN EXAMPLE
L.L. Bean Example
• L.L.Bean is a large mail-order company that sells
apparel.
• One of the products L.L.Bean sells is ski jackets,
for which the selling season is from November to
February.
• The buyer at L.L.Bean currently purchases the
entire season’s supply of ski jackets from the
manufacturer before the start of the selling
season.
What is the probability that demand does not
exceed 1,000 jackets?
Copyright © 2016 Pearson Education, Inc.
L.L. Bean Example
Demand Di   Probability pi   Cumulative Probability of Demand Being Di or Less (Pi)
400         0.01             0.01
500         0.02             0.03
600         0.04             0.07
700         0.08             0.15
800         0.09             0.24
900         0.11             0.35
1000        0.16             0.51
1100        0.20             0.71
1200        0.11             0.82
1300        0.10             0.92
1400        0.04             0.96
1500        0.02             0.98
1600        0.01             0.99
1700        0.01             1.00
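The cumulative column can be reproduced directly from the p.m.f. The following is a minimal sketch, assuming the demand values and probabilities are typed in from the table; the names `demand`, `prob` and `cdf` are illustrative and not part of the original notes.

```python
# Demand values and probabilities copied from the L.L. Bean table.
demand = [400, 500, 600, 700, 800, 900, 1000,
          1100, 1200, 1300, 1400, 1500, 1600, 1700]
prob = [0.01, 0.02, 0.04, 0.08, 0.09, 0.11, 0.16,
        0.20, 0.11, 0.10, 0.04, 0.02, 0.01, 0.01]


def cdf(x):
    """Return F(x) = P[D <= x] by summing the p.m.f. over all values d <= x."""
    return sum(p for d, p in zip(demand, prob) if d <= x)


# Probability that demand does not exceed 1,000 jackets.
print(round(cdf(1000), 2))
```

Summing the probabilities of all demand values up to and including 1000 gives the entry in the cumulative column for that row.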
Probability distribution
A random variable X is called continuous if it
takes on a continuum of values x. In that case,
the probability that it takes on any specific
value x is zero, i.e.,
P[X = x] = 0 for every x.
Probability distribution
Example: Suppose that you randomly ask
this question: “Is it now exactly 12:30 PM?”
What is the probability that you are answered
“Yes”?
Answer: 0
Comment: You are never exactly on time in
your life.
Probability distribution
Then its cumulative distribution function is
continuous in x. Often there exists a
probability density function f(x) such that
\[ F(x) = \int_{-\infty}^{x} f(u)\,du \]
which is the area under the probability
density function to the left of x.
Probability density function of a triangular distribution
Example of Continuous Random Variables
TRIANGULAR DISTRIBUTION
Triangular Distribution
• Acknowledgement : The following notes
are provided by Dr Nicola Ward Petty and
Dr Shane Dye of Statistics Learning Centre:
StatsLC.com
Triangular Distribution
• A triangular distribution is a continuous
probability distribution with a probability
density function shaped like a triangle. It is
defined by three values: the minimum value
a, the maximum value b, and the peak value
c.
Triangular Distribution
• This is really handy as in a real-life
situation we can often estimate the
maximum and minimum values, and the
most likely outcome, even if we don't know
the mean and the standard deviation.
Triangular Distribution
• The triangular distribution has definite
upper and lower limits, so we avoid
unwanted extreme values. In addition the
triangular distribution is a good model for
skewed distributions.
Triangular Distribution
The probability density function of a
triangular distribution is zero for values below
a and values above b. It is piecewise linear,
rising from 0 at a to 2/(b − a) at c, then dropping
down to 0 at b. The graph shows the
probability density function of a triangular
distribution with a = 1, b = 9 and c = 6. The
peak is at c = 6 with a function value of
2/(9 − 1) = 0.25.
Triangular Distribution
Triangular Distribution
• Example 1: A burger franchise planning a
new outlet in Auckland uses a triangular
distribution to model the future weekly
sales. They estimate that the minimum
weekly sales is $1000 and the maximum is
$6000. They also estimate that the most
likely outcome is around $3000.
Triangular Distribution
• Example 1 (Continued): The graph of the
probability density function reaches its
maximum of 0.0004 at c = $3000. The
graph of this probability density function is
shown below.
Triangular Distribution
• The probability density function is used to
determine the probability that the random
variable falls in some range. We want to
determine the probability that the random
variable is above a given value, below a
given value, or between a pair of values. It
is simply a matter of finding the area under
the curve for the required interval.
Triangular Distribution
• Example 1 (Continued): A burger
franchise planning a new outlet in Auckland
wants to determine the probability the new
outlet will have weekly sales of less than
$2000. If the weekly sales are less than this,
the outlet is unlikely to cover its costs.
Triangular Distribution
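• Example 1 (Continued): So, they wish to
calculate P(X < 2000). They use a triangular
distribution to model the future weekly
sales with a minimum value of a = $1000,
a maximum value of b = $6000 and a
peak value of c = $3000. Since 2000 lies
between a and c, the area to the left of 2000
is a triangle, and
P(X < 2000) = (2000 − 1000)²/((6000 − 1000)(3000 − 1000)) = 0.1.
• Answer: P(X < 2000) = 0.1 = 10%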
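The area calculation above can be sketched as a small function. This is only an illustration, assuming a < c < b; the function name `triangular_cdf` is hypothetical, and the two branches are the standard areas to the left of x on the rising and falling sides of the triangle.

```python
def triangular_cdf(x, a, b, c):
    """Return P[X <= x] for a triangular distribution with minimum a,
    maximum b and peak c (assumes a < c < b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        # Area of the small triangle between a and x on the rising side.
        return (x - a) ** 2 / ((b - a) * (c - a))
    # One minus the area of the small triangle between x and b.
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))


# Example 1: weekly sales with a = 1000, b = 6000, c = 3000.
print(triangular_cdf(2000, a=1000, b=6000, c=3000))
```

Probabilities of the form P(X > x) or P(x1 < X < x2) follow as 1 − F(x) and F(x2) − F(x1).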
Expected Value or Mean
The expected value or the mean of a random
variable X is the weighted average of all of its
possible values, using probabilities as weights:
\[ E[X] = \begin{cases} \sum_{i} x_i f(x_i) & \text{if } X \text{ is a discrete r.v.} \\ \int_{-\infty}^{\infty} u f(u)\,du & \text{if } X \text{ is a continuous r.v.} \end{cases} \]
We will also denote the mean of X by µX.
Variance and Standard Deviation
The variance of a random variable X is a measure of
its variability from the mean. It is computed as the
expected squared deviation of X from its mean µX
and is denoted by
\[ V[X] = E[(X - \mu_X)^2]. \]
The square root of the variance of X is called its
standard deviation and is denoted by
\[ \sigma_X = \sqrt{V[X]} = \sqrt{E[(X - \mu_X)^2]}. \]
L.L. Bean Example (Discrete)
a) What is the expected demand?
b)What is the standard deviation?
L.L. Bean Example (Discrete)
Demand   Prob   D*Prob   Prob*(D − Mean)^2
400      0.01   4        3918.76
500      0.02   10       5533.52
600      0.04   24       7259.04
700      0.08   56       8502.08
800      0.09   72       4596.84
900      0.11   99       1746.36
1000     0.16   160      108.16
1100     0.2    220      1095.2
1200     0.11   132      3330.36
1300     0.1    130      7507.6
1400     0.04   56       5595.04
1500     0.02   30       4493.52
1600     0.01   16       3294.76
1700     0.01   17       4542.76
Total           1026     61524
MEAN = 1026
VAR = 61524
STDEV = 248.040319
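The column sums in the table can be checked with a few lines of code. This is a sketch under the assumption that the demand values and probabilities are entered by hand from the table; the variable names are illustrative only.

```python
import math

# Demand values and probabilities copied from the L.L. Bean table.
demand = [400, 500, 600, 700, 800, 900, 1000,
          1100, 1200, 1300, 1400, 1500, 1600, 1700]
prob = [0.01, 0.02, 0.04, 0.08, 0.09, 0.11, 0.16,
        0.20, 0.11, 0.10, 0.04, 0.02, 0.01, 0.01]

# Mean: probability-weighted average of the demand values.
mean = sum(d * p for d, p in zip(demand, prob))

# Variance: probability-weighted average of the squared deviations from the mean.
variance = sum(p * (d - mean) ** 2 for d, p in zip(demand, prob))

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 4))
```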
Triangular Distribution
(Continuous)
Discussion Problem W01-01-a)
Variance = Risk in Finance
• V[(value of) Stock 1] = 0.0449
• V[Stock 2] = 0.0069
• V[Stock 3] = 0.0011
assuming E[Stock 1] = E[Stock 2] = E[Stock 3]
Which stock has the lowest risk (smallest variance)?
Covariance and Correlation Coefficient
Suppose X1 and X2 are two random variables with
means µ1 and µ2 and standard deviations σ1 and σ2,
respectively. The covariance of X1 and X2 is defined
as the expected value of the product of their
deviations from their respective means and is
denoted by
\[ Cov(X_1, X_2) = E[(X_1 - \mu_1)(X_2 - \mu_2)]. \]
The correlation coefficient is then defined as
\[ \rho = \frac{Cov(X_1, X_2)}{\sigma_1 \sigma_2}. \]
Covariance and Correlation Coefficient
The value of the correlation coefficient is always
between –1 and +1. A positive covariance or
correlation coefficient implies that the two random
variables tend to vary in the same direction (up or
down). Similarly, a negative covariance or correlation
coefficient implies that on average they tend to move
in opposite directions. If X1 and X2 are
independent, then the two are uncorrelated, i.e.,
Cov(X1, X2) = 0.
Covariance and Correlation Coefficient
Discussion Problem W01-01-b)
Financial Hedging: to select a portfolio of common
stocks giving precise mathematical meaning to the
adage “Don't put all of your eggs in one basket.”
Correlations of the values of Stocks 1, 2 and 3 are
• ρ12 = 0.0032
• ρ13 = –0.0005
• ρ23 = 0.0006
where ρij = correlation of the values of Stocks i & j.
Which pair of stocks is best for hedging?
Sums of Random Variables
Consider two random variables X1 and X2. Then, it
turns out that
\[ E[X_1 + X_2] = E[X_1] + E[X_2] \]
\[ V[X_1 + X_2] = V[X_1] + V[X_2] + 2\,Cov(X_1, X_2) \]
Recall that if X1 and X2 are independent, then
Cov(X1, X2) = 0.
Sums of Random Variables
It then follows that the expected value and variance
of a sum of independent random variables are equal to
the sum of their expectations and variances,
respectively; i.e.,
\[ E[X_1 + X_2] = E[X_1] + E[X_2] \]
\[ V[X_1 + X_2] = V[X_1] + V[X_2] \]
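For readers who want to see where the covariance term comes from, the variance of the sum can be expanded from the definition of variance. This is a short sketch of the standard argument, using µ1 and µ2 for the means of X1 and X2:

```latex
\begin{align*}
V[X_1 + X_2]
  &= E\big[\big((X_1 - \mu_1) + (X_2 - \mu_2)\big)^2\big] \\
  &= E[(X_1 - \mu_1)^2] + E[(X_2 - \mu_2)^2]
     + 2\,E[(X_1 - \mu_1)(X_2 - \mu_2)] \\
  &= V[X_1] + V[X_2] + 2\,Cov(X_1, X_2).
\end{align*}
```

When X1 and X2 are independent the last term vanishes, which leaves the sum of the two variances.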
Example of Discrete Random Variables
DISCUSSION PROBLEM
W01-02
Discussion Problem W01-02
L.L. Bean Example:
a) What is the expected demand?
b)What is the standard deviation?
c) What is the probability that demand does
not exceed 1,000 jackets?
Example of Continuous Random Variables (Triangular Distribution)
DISCUSSION PROBLEM
W01-03 ~ W01-06
Triangular Distribution
• W01-03-a): A burger franchise planning a
new outlet in Auckland wants to determine
the probability the new outlet will have
weekly sales of less than $2000. If the
weekly sales are less than this, the outlet is
unlikely to cover its costs.
Triangular Distribution
• W01-03-a) (Continued): So, they wish to
calculate P(X < 2000). They use a triangular
distribution to model the future weekly
sales with a minimum value of a = $1000,
a maximum value of b = $6000 and a
peak value of c = $3000.
• Answer: P(X < 2000) = 0.1 = 10%
Triangular Distribution
• W01-03-b) Assume that weekly sales are
independent across weeks. What are the
expected sales over 3 weeks?
• W01-03-c) Assume that weekly sales are
independent across weeks. What is the
standard deviation of the sales over 3 weeks?
Triangular Distribution
a = 1000   mean = 3333.333   mean*3 = 10000
c = 3000   var = 1055556     var*3 = 3166667
b = 6000
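The worksheet values above can be reproduced as follows. This sketch relies on the standard closed-form mean and variance of a triangular distribution, which are not stated in these notes and are an assumption brought in here, together with the rule that means and variances add for independent weeks. The function names are hypothetical.

```python
import math


def triangular_mean(a, b, c):
    """Mean of a triangular distribution with minimum a, maximum b, peak c."""
    return (a + b + c) / 3


def triangular_variance(a, b, c):
    """Variance of a triangular distribution with minimum a, maximum b, peak c."""
    return (a * a + b * b + c * c - a * b - a * c - b * c) / 18


a, b, c = 1000, 6000, 3000
weeks = 3

mean_one_week = triangular_mean(a, b, c)
var_one_week = triangular_variance(a, b, c)

# For independent weeks, both the means and the variances add up.
mean_total = weeks * mean_one_week
var_total = weeks * var_one_week
sd_total = math.sqrt(var_total)

print(round(mean_one_week, 3), round(var_one_week), round(mean_total), round(var_total))
print(round(sd_total, 2))
```

Note that the standard deviation of the 3-week total is the square root of the summed variance, not 3 times the weekly standard deviation.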
Triangular Distribution
• Example 2: Voting for the student
representative on a school’s Board of
Trustees has closed but the votes have not
been counted. Simon Pegg (a candidate)
estimates how many votes he will get.
He thinks the most likely value is
around 550, but he could get as many as
900 or as few as 200.
Triangular Distribution
• Example 2 (Continued): Simon models the
number of votes he may have received as a
triangular distribution with minimum value
a = 200, maximum value b = 900 and peak
value c = 550. The graph of the probability
density function reaches its maximum of
0.002857 at c = 550 and is shown below.
Triangular Distribution
• W01-04 Voting for the student
representative on a school’s Board of
Trustees has closed but the votes have not
been counted. Candidate Simon Pegg wants
to determine the probability that he received
more than 450 votes. This means Simon
wants to determine P(X > 450).
Triangular Distribution
• W01-04 (Continued):
a = 200   450 − a = 250   f(450) = 0.002041   P(X < 450) = 0.255102
c = 550   c − a = 350     f(c) = 0.002857     P(X > 450) = 0.744898
b = 900
Triangular Distribution
• Example 3:
• W01-05 Find P(6.5 < X < 8) for a triangular
distribution with minimum value a = 1,
maximum value b = 9 and peak value c = 6.
• Answer: P(6.5 < X < 8) = ?
a = 1   b − 8 = 1       f(8) = 0.083333     P(X > 8) = 0.041667
c = 6   b − 6.5 = 2.5   f(6.5) = 0.208333   P(X > 6.5) = 0.260417
b = 9   b − c = 3       f(c) = 0.25
P(6.5 < X < 8) = 0.260417 − 0.041667 = 0.21875
Triangular Distribution
• W01-06 Find P(1.2 < X < 2.6) for a
triangular distribution with minimum value
a = 0.8, maximum value b = 2.8 and peak
value c = 2.0.
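• Answer: P(1.2 < X < 2.6) = ?
Worksheet values: a = 0.8, c = 2, b = 2.8;
widths 0.8, 1.2, 0.8, 0.2; densities 0.666667, 1, 1, 0.25;
areas 0.266667, 0.708333333, 0.025.
Note: the left-tail area 0.266667 in this worksheet
corresponds to a lower limit of 1.6 (x − a = 0.8),
which gives a middle area of 0.708333. For the stated
lower limit of 1.2, the left-tail area is
(1.2 − 0.8)²/((2.8 − 0.8)(2.0 − 0.8)) = 0.066667, so
P(1.2 < X < 2.6) = 1 − 0.066667 − 0.025 = 0.908333.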
TIME SERIES
Tahoe Salt
Consider the demand for rock salt used
primarily to melt snow. This salt is produced by a
firm called Tahoe Salt, which sells its salt
through a variety of independent retailers
around the Lake Tahoe area of the Sierra Nevada
Mountains. In the past, Tahoe Salt has relied on
estimates of demand from a sample of its
retailers, but the company has noticed that
these retailers always overestimate their
purchases, leaving Tahoe (and even some
retailers) stuck with excess inventory.
Tahoe Salt
After meeting with its retailers, Tahoe has
decided to produce a collaborative forecast.
Tahoe Salt wants to work with the retailers to
create a more accurate forecast based on the
actual retail sales of their salt. Quarterly retail
demand data for the past three years are shown
in Table 7-1 and charted in Figure 7-1.
Tahoe Salt
Year   Quarter   Period, t   Demand, Dt
1      2         1           8,000
1      3         2           13,000
1      4         3           23,000
2      1         4           34,000
2      2         5           10,000
2      3         6           18,000
2      4         7           23,000
3      1         8           38,000
3      2         9           12,000
3      3         10          13,000
3      4         11          32,000
4      1         12          41,000
TABLE 7-1
Tahoe Salt
FIGURE 7-1
In Figure 7-1, observe that demand for salt is
seasonal, increasing from the second quarter of
a given year to the first quarter of the following
year. The second quarter of each year has the
lowest demand. Each cycle lasts four quarters,
and the demand pattern repeats every year.
Tahoe Salt
FIGURE 7-1
There is also a growth trend in the demand, with
sales growing over the past three years. The
company estimates that growth will continue in
the coming year at historical rates.
Time Series Plots
• A time series or time sequence is a data set in
which the observations are recorded in the order in
which they occur.
• A time series plot is a graph in which the vertical
axis denotes the observed value of the variable and
the horizontal axis denotes the time.
• When measurements are plotted as a time series, we
often see
•Trends (Moving Average)
•Cycles = seasonality
Time Series Plots
•A moving average (MA) is a widely used indicator
in technical analysis that helps smooth out demand
data by filtering out the “noise” from random
demand fluctuations. It is a trend-following, or
lagging, indicator because it is based on past demand.
4-period moving average
Period t   Demand Dt   Moving Average   Forecast Ft   Error Et
1          8,000       -                -             -
2          13,000      -                -             -
3          23,000      -                -             -
4          34,000      19,500           -             -
5          10,000      20,000           19,500        9,500
6          18,000      21,250           20,000        2,000
7          23,000      21,250           21,250        -1,750
8          38,000      22,250           21,250        -16,750
9          12,000      22,750           22,250        10,250
10         13,000      21,500           22,750        9,750
11         32,000      23,750           21,500        -10,500
12         41,000      24,500           23,750        -17,250
13         -           -                24,500        -
14         -           -                24,500        -
15         -           -                24,500        -
16         -           -                24,500        -
Forecast by 4-period moving average
[Chart: vertical axis from 0 to 45,000, horizontal axis (period) from 0 to 14]
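The moving-average table can be recomputed with a short script. This is a minimal sketch, assuming that the forecast for period t + 1 is the moving average at period t and that the error is forecast minus demand, which is what the tabulated values imply. The dictionary names are illustrative.

```python
# Quarterly demand for periods 1..12 from Table 7-1.
demand = [8000, 13000, 23000, 34000, 10000, 18000,
          23000, 38000, 12000, 13000, 32000, 41000]
N = 4  # number of periods in the moving average

# moving_average[t]: mean of the N most recent demands up to period t (1-indexed).
moving_average = {t: sum(demand[t - N:t]) / N
                  for t in range(N, len(demand) + 1)}

# The forecast for period t + 1 is the moving average at period t.
forecast = {t + 1: ma for t, ma in moving_average.items()}

# Beyond the data, the forecast stays flat at the last moving average.
for t in range(len(demand) + 2, 17):
    forecast[t] = moving_average[len(demand)]

# Error = forecast minus actual demand, where demand is known.
error = {t: forecast[t] - demand[t - 1]
         for t in forecast if t <= len(demand)}

for t in sorted(forecast):
    print(t, forecast[t], error.get(t))
```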
Regression
(Simple linear) regression is a statistical
method that allows us to summarize and study
relationships between two continuous
(quantitative) variables:
• One variable, denoted x, is regarded as
the predictor variable.
• The other variable, denoted y, is regarded as
the response variable.
Regression
Since we are interested in summarizing the
trend between two quantitative variables, the
natural question arises — "what is the best
fitting line?" You were probably shown a
scatter plot of (x, y) data and were asked to
draw the "most appropriate" line through the
data.
Regression
You can try it now on a set of heights (x) and
weights (y) of 10 students. Look at the plot
below. The red line best summarizes the trend
between height and weight.
X (inch)   Y (lb)
63         127
64         121
66         142
69         157
69         162
71         156
71         169
72         165
73         181
75         208
Regression
               Coefficients
Intercept      -266.5
X Variable 1   6.138 (Trend)
Discussion Problem W01-07
Perform Regression on the moving averages
to see the trend in Tahoe Salt Example.
(Use Excel | Data Analysis | Regression)
• Answer: Demand increases by about 554 every
quarter
               Coefficients
Intercept      17427.78
X Variable 1   554.1667
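As an alternative to the Excel tool, the same coefficients can be obtained with ordinary least squares written out by hand. This is a sketch, assuming the nine moving averages are regressed on their periods t = 4, …, 12; the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Periods 4..12 and the corresponding 4-period moving averages.
periods = list(range(4, 13))
moving_avg = [19500, 20000, 21250, 21250, 22250, 22750, 21500, 23750, 24500]

intercept, slope = fit_line(periods, moving_avg)
print(round(intercept, 2), round(slope, 4))
```

The slope is the estimated increase in smoothed demand per quarter.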
PERT Application of beta distribution
Program Evaluation and Review Technique
(www.se.cuhk.edu.hk/~seem3530/files/ProjMgt-IA-PERT.ppt)
SHOPPING MALL RENOVATION
Estimation of the activity duration
Example: An activity was performed 40 times
in the past, requiring a time between 10 to 70
hours. The figure below shows the frequency
distribution.
PERT
SEEM 3530
Estimation of the activity duration
The probability distribution of the
activity duration is approximated by the
frequency distribution of the past durations.
Estimation of the activity duration
In project scheduling, we usually use a
beta distribution to represent the time
needed for each activity.
Estimation of the activity duration
• Three key values we use in the time estimate
for each activity:
a = optimistic time, which means that there is little
chance that the activity can be completed before
this time;
c = most likely time, which will be required if the
execution is normal;
b = pessimistic time, which means that there is little
chance that the activity will take longer.
Estimation of Mean and SD
• The expected or mean time is given by:
µ = (a + 4c + b)/6
• The variance is:
σ² = (b − a)²/36
• The standard deviation is σ = (b − a)/6
Estimation of Mean and SD
For our example (Figure 7-3), we have a
= 10, b = 70, c = 35.
Therefore µ = (10 + 4(35) + 70)/6 ≈ 36.7, and
σ² = (70 − 10)²/36 = 100.
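The two PERT formulas can be wrapped in a small helper. This is a sketch only; the function name `pert_estimate` is hypothetical, and the inputs are the optimistic, most likely and pessimistic times a, c and b defined earlier.

```python
def pert_estimate(a, c, b):
    """Return (mean, variance) of an activity time from the PERT formulas.

    a = optimistic time, c = most likely time, b = pessimistic time.
    """
    mean = (a + 4 * c + b) / 6
    variance = ((b - a) / 6) ** 2
    return mean, variance


# The example with a = 10, c = 35, b = 70 hours.
mean, variance = pert_estimate(10, 35, 70)
print(round(mean, 2), variance)
```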
Estimation of Mean and SD
[Figure: beta distribution with a, c and b marked on the time axis]
Expected task time: µ = (a + 4c + b)/6
Standard deviation: σ = (b − a)/6, V = σ² = ((b − a)/6)²
Example: Shopping Mall Renovation
Activity                            IP        a    c    b
A: Prepare initial design           -         1    3    5
B: Identify new potential clients   -         4    5    12
C: Develop prospectus for tenants   A         2    3    10
D: Prepare final design             A         1    8    9
E: Obtain planning permission       D         1    2    3
F: Obtain finance from bank         E         1    3    5
G: Select contractor                D         2    4    6
H: Construction                     G, F      10   17   18
I: Finalize tenant contracts        B, C, E   6    13   14
J: Tenants move in                  I, H      1    2    3
Expected Activity Time and SD
Act   a    c    b    µ    σ²
A     1    3    5    3    0.44
B     4    5    12   6    1.78
C     2    3    10   4    1.78
D     1    8    9    7    1.78
E     1    2    3    2    0.11
F     1    3    5    3    0.44
G     2    4    6    4    0.44
H     10   17   18   16   1.78
I     6    13   14   12   1.78
J     1    2    3    2    0.11
For activity A: t = (1 + 4(3) + 5)/6 = 3
For activity B: σ² = ((12 − 4)/6)² = 1.78
Discussion Problem W02-01:
Expected Activity Time and SD?
Using the values of a, c and b for activities A to J
in the Shopping Mall Renovation example, find the
expected time µ and the variance V = σ² of each activity.
For activity A: µ = (1 + 4(3) + 5)/6 = 3
For activity B: V = ((12 − 4)/6)² = 1.78
The PERT Approach
▪ Now assume that the activity times are
independent random variables.
▪ Further, assume that there are n activities in
the project, k of which form the longest path
in terms of mean time. Denote the activity times
of the activities on this path by the random
variables di with means E(di) and variances
V(di), for i = 1, 2, …, k.
The PERT Approach (cont’d)
▪ Then, the total length of the path is a random
variable X = d1 + d2 + … + dk
▪ The mean length, E(X), and its variance, V(X):
E(X) = E(d1) + E(d2) + … + E(dk)
V(X) = V(d1) + V(d2) + … + V(dk)
▪ The longest path is called the critical path. (See
the flow chart on the next page, which
illustrates the precedence relations between
activities (IP in the table of activities).)
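The path formulas above can be applied to any sequence of activities. The sketch below assumes the (a, c, b) estimates from the Shopping Mall Renovation table and evaluates the path A-D-E-F-H-J; the names `activities`, `pert_mean` and `pert_variance` are illustrative.

```python
import math

# (a, c, b) = (optimistic, most likely, pessimistic) times from the activity table.
activities = {
    "A": (1, 3, 5), "B": (4, 5, 12), "C": (2, 3, 10), "D": (1, 8, 9),
    "E": (1, 2, 3), "F": (1, 3, 5), "G": (2, 4, 6), "H": (10, 17, 18),
    "I": (6, 13, 14), "J": (1, 2, 3),
}


def pert_mean(a, c, b):
    return (a + 4 * c + b) / 6


def pert_variance(a, c, b):
    return ((b - a) / 6) ** 2


path = ["A", "D", "E", "F", "H", "J"]

# With independent activity times, means and variances add along the path.
path_mean = sum(pert_mean(*activities[k]) for k in path)
path_variance = sum(pert_variance(*activities[k]) for k in path)
path_sd = math.sqrt(path_variance)

print(path_mean, round(path_variance, 2), round(path_sd, 2))
```

The unrounded variance is 42/9 ≈ 4.67; summing the two-decimal values from the table gives 4.66 instead.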
Discussion Problem W02-02:
a) The mean time of path A-D-E-F-H-J?
b) The standard deviation of the time of
the path?
[Figure: activity network from node 1 to End, labeled with expected activity times: A,3; B,6; C,4; D,7; E,2; F,3; G,4; H,16; I,12; J,2]
Important Continuous Distributions
Normal Distribution
Undoubtedly, the most widely used model for the
distribution of a random variable is a normal
distribution.
• Central limit theorem
• Gaussian distribution
PERT Application of normal distribution
Program Evaluation and Review Technique
(www.se.cuhk.edu.hk/~seem3530/files/ProjMgt-IA-PERT.ppt)
SHOPPING MALL RENOVATION
(REVISITED)
The PERT Approach
▪ The mean project length, E(X), and its variance,
V(X):
E(X) = E(d1) + E(d2) + … + E(dk)
V(X) = V(d1) + V(d2) + … + V(dk)
▪ Assumptions:
▪ Activity times are independent random variables.
▪ The project duration (= sum of the times of the
activities on the critical path) is normally distributed.
▪ This is based on the Central Limit Theorem, which
states that the distribution of the sum of independent
random variables is approximately normal when the
number of terms in the sum is sufficiently large.
The PERT Approach (cont’d)
▪ Using a normal distribution, the probability of completing
the project in not more than some given time T:
\[ P(X \le T) = P\left( \frac{X - E(X)}{V(X)^{1/2}} \le \frac{T - E(X)}{V(X)^{1/2}} \right) = P\left( Z \le \frac{T - E(X)}{V(X)^{1/2}} \right) \]
where Z is the standard normal deviate with mean 0 and
variance 1.
• The probability P(Z ≤ z), for any given z, can be found
using normal distribution tables.
Example: Issues to Address
1. Schedule the project.
2. What is the probability of completing
the project in 36 weeks?
CPM with Expected Activity Times
[Figure: activity network from node 1 to End, labeled with expected activity times: A,3; B,6; C,4; D,7; E,2; F,3; G,4; H,16; I,12; J,2]
Critical Path and Expected Time
1. Critical path: A-D-E-F-H-J.
2. Expected completion time: 33 weeks
3. What is the probability of completing the
project within 36 weeks?
-- Use the critical path to assess the
probability
Discussion Problem W02-03
The longest path in terms of mean time is
A-D-E-F-H-J. Assume that the length of the longest
path is the whole project duration.
What is the probability of completing the
project within 36 weeks? Use the longest path
to assess the probability.
Probability Assessment
Expected project completion time:
Sum of the expected activity times
along the critical path.
µ = 3 + 7 + 2 + 3 + 16 + 2 = 33
Variance of project-completion time:
Sum of the variances along
the critical path.
σ² = 0.44 + 1.78 + 0.11 + 0.44 + 1.78 + 0.11 = 4.66
σ = √4.66 ≈ 2.16
Both values are used to obtain the probability of
project completion.
Assessment by Normal Distribution
P(X ≤ 36) = ?
Assume X ~ N(33, 2.16²)
z = (T − µ)/σ = (36 − 33)/2.16 ≈ 1.4
P(Z ≤ 1.4) = ?
[Figure: normal distribution of X with µ = 33, σ = 2.16 and T = 36 marked, next to the standardized normal distribution of Z with µz = 0, σz = 1 and z = 1.4 marked]
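Instead of a printed table, the standard normal probability can be evaluated numerically. This is a sketch that uses the error function from Python's standard library and takes the mean 33 and variance 4.66 from the probability assessment above; the function names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Return P(Z <= z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_complete_within(T, mean, variance):
    """Return P(X <= T) when X is assumed normal with the given mean and variance."""
    z = (T - mean) / math.sqrt(variance)
    return standard_normal_cdf(z)


# Critical path A-D-E-F-H-J: mean 33 weeks, variance 4.66, target T = 36 weeks.
p = prob_complete_within(36, 33, 4.66)
print(round(p, 3))
```

The result is slightly below the table value for z = 1.4 because the code does not round z before looking up the probability.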
Obtain the Probability
Standardized Normal Probability Table (Portion)
Z
.00
.01
.02
P(Z