Measures of Central Tendency
Measures of Central Tendency (MCT) are numerical values that locate the middle of a set of
data. The term “average” is often associated with these measures; however, there are
several other values related to MCT. Each of these can also be called the “average value.”
These include mean, median, mode, and midrange.
The mean value, $\bar{x}$, is the sum of all the values divided by the number, n, of the values.
Thus,
$$\bar{x} = \frac{\sum x}{n}$$
For example, a sample consists of 5 values: 6, 3, 8, 5, and 3. The mean of these values is
calculated in the following manner.
$$\bar{x} = \frac{\sum x}{n} = \frac{6 + 3 + 8 + 5 + 3}{5} = \frac{25}{5} = 5$$
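As a minimal sketch of the calculation above, the following Python snippet computes the mean of the same five values. The function name `mean` and the variable `sample` are illustrative choices, not part of the course material.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


sample = [6, 3, 8, 5, 3]
print(mean(sample))  # 25 / 5 = 5.0
```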
The median value, $\tilde{x}$, is the middle value when the data is ranked in order of size. To
calculate $\tilde{x}$, the middle position, or Position of Median, must first be found. To find the
middle position, i, use the following formula:
$$i = \frac{n + 1}{2}$$
Where:
n is the number of data points
i is the position that the median occupies in the ranked data
For the previous data, the ranked sample data is 3, 3, 5, 6, and 8.
$$i = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3$$
QSO 530 Module 2
The value in the third position of the ranked data is 5; thus $\tilde{x} = 5$.
It is important to note that the median value is the same regardless of which end of the
ranked data, high or low, you count from.
While the previous example had an odd number of data points, there are instances when there
is an even number. The following example finds the median of a data set with an even number
of data points.
Data (ranked in order): 6, 7, 8, 9, 9, 10
$$i = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$$
where n is the number of data. The median is between the third and fourth pieces of data.
Thus, the mean value between the two pieces of data must be found.
Thus, $$\tilde{x} = \frac{8 + 9}{2} = \frac{17}{2} = 8.5$$
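The two median cases above (odd and even n) can be sketched in Python as follows. This is a minimal illustration that follows the position formula i = (n + 1)/2; the function name `median` is an assumption made for the example.

```python
def median(values):
    """Median found through the 1-based middle position i = (n + 1) / 2."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    i = (n + 1) / 2
    if i.is_integer():
        # Odd n: the median is the value sitting exactly at position i.
        return ranked[int(i) - 1]
    # Even n: average the two values on either side of position i.
    below = ranked[int(i) - 1]
    above = ranked[int(i)]
    return (below + above) / 2


print(median([6, 3, 8, 5, 3]))       # position 3 of 3, 3, 5, 6, 8 -> 5
print(median([6, 7, 8, 9, 9, 10]))   # position 3.5 -> (8 + 9) / 2 = 8.5
```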
The mode is the value within the data set that occurs most frequently. For example, the
mode for the following data set is 3: 6, 3, 8, 5, 3. This is because the value that occurs most
often is 3. Be aware that, under the convention used here, there is no mode if two or more
values tie for the highest frequency. For example, in the data set 3, 3, 4, 5, 5, 7, both 3 and
5 occur twice, so there is no mode for this data.
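The mode rule described above, including the "no mode on a tie" convention, might be sketched as follows. The function name `mode` and the use of `None` to signal "no mode" are assumptions made for this illustration.

```python
from collections import Counter


def mode(values):
    """Return the single most frequent value, or None if the top frequency is tied."""
    counts = Counter(values)
    if not counts:
        return None
    top = max(counts.values())
    winners = [value for value, count in counts.items() if count == top]
    # Exactly one winner means a mode exists; a tie means there is no mode.
    return winners[0] if len(winners) == 1 else None


print(mode([6, 3, 8, 5, 3]))      # 3
print(mode([3, 3, 4, 5, 5, 7]))   # None, because 3 and 5 both occur twice
```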
The midrange is the number which is exactly midway between the highest number (H) and
the lowest number (L). Thus,
MR = (L + H)/2
For example, the midrange value for the data set 3, 3, 4, 5, 5, 7 is calculated as follows:
$$MR = \frac{3 + 7}{2} = 5$$
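A short Python sketch of the midrange formula MR = (L + H)/2, using the same data set; the function name `midrange` is an illustrative choice.

```python
def midrange(values):
    """Midrange: the point midway between the lowest (L) and highest (H) values."""
    lowest = min(values)
    highest = max(values)
    return (lowest + highest) / 2


print(midrange([3, 3, 4, 5, 5, 7]))  # (3 + 7) / 2 = 5.0
```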
Measures of Dispersion-Spread
There are three numerical values that show the amount of spread, or variability, found within
data. These three values are Range, Variance, and Standard Deviation. Each of these
values provides different information about the spread or variability within the data.
The measures of dispersion follow two rules:
1. Closely grouped data will have relatively small values of spread
2. More widely spread data will have larger values for the measures of dispersion
Range is the simplest measure of dispersion. It is the difference between the largest (H =
highest) and the smallest (L = lowest) values within the data. Thus,
Range = H – L
For example, the data set 3, 3, 5, 6, 8 has a range of 5: Range = 8 – 3 = 5.
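The range calculation can be sketched in one small Python function. The name `data_range` is an assumption, chosen to avoid shadowing Python's built-in `range`.

```python
def data_range(values):
    """Range: the highest value (H) minus the lowest value (L)."""
    return max(values) - min(values)


print(data_range([3, 3, 5, 6, 8]))  # 8 - 3 = 5
```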
The next two values are Variance and Standard Deviation (SD). These two measures are
actually measures of dispersion about the mean (mean was covered earlier). To understand
the concepts of Variance and Standard Deviation, deviation from the mean must first be
defined. Deviation from the mean is $x - \bar{x}$, where x is the actual value and $\bar{x}$ is
the mean of the values in the data set.
Remember from the preceding section that the mean is $\bar{x} = \frac{\sum x}{n}$.
Consider the following example data set, where the mean is calculated as
$\bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5$.

n        x        $x - \bar{x}$       value
1        6        (6-5)                1
2        3        (3-5)               -2
3        8        (8-5)                3
4        5        (5-5)                0
5        3        (3-5)               -2

Totals: n = 5, $\sum x = 25$, $\sum (x - \bar{x}) = 0$
Notice that for each value of x, its deviation from the mean is calculated, but the sum of
those deviations is 0. The sum of the deviations from the mean is always 0, so it does not
describe anything specific about this data. However, a mean deviation can be calculated, and
this will provide a more meaningful value.
This is done by making all the deviations positive through the use of absolute values. Thus,
$x - \bar{x}$ becomes $|x - \bar{x}|$. The value chart can be changed to the following:
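n     x     $x^2$     $x - \bar{x}$     $|x - \bar{x}|$     $(x - \bar{x})^2$
1     6     36        (6-5) = 1         1                   1
2     3     9         (3-5) = -2        2                   4
3     8     64        (8-5) = 3         3                   9
4     5     25        (5-5) = 0         0                   0
5     3     9         (3-5) = -2        2                   4

Totals: n = 5, $\sum x = 25$, $\sum x^2 = 143$, $\sum (x - \bar{x}) = 0$,
$\sum |x - \bar{x}| = 8$, $\sum (x - \bar{x})^2 = 18$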
Notice that the sum of the absolute values of the deviations is 8. Using this number, the mean
deviation is calculated.
$$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$$
For the data above, this is calculated as $\frac{\sum |x - \bar{x}|}{n} = \frac{8}{5} = 1.6$.
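The following Python sketch reproduces the deviation columns for the sample 6, 3, 8, 5, 3: it shows that the signed deviations sum to zero and then computes the mean deviation from their absolute values. The function name `mean_deviation` is an illustrative assumption.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the mean: sum(|x - x_bar|) / n."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n


sample = [6, 3, 8, 5, 3]
x_bar = sum(sample) / len(sample)

# Signed deviations cancel out, which is why their sum says nothing about spread.
deviations = [x - x_bar for x in sample]
print(deviations)       # [1.0, -2.0, 3.0, 0.0, -2.0]
print(sum(deviations))  # 0.0

print(mean_deviation(sample))  # 8 / 5 = 1.6
```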
The measure of mean deviation is not used frequently because there are other ways of
canceling the zeroing effect seen in summing the deviations. This is usually done by squaring
each deviation and summing the values. Remember from earlier math courses that
squaring a value always results in a non-negative value.
The sum of these squares of the deviation from the mean is $\sum (x - \bar{x})^2$.
Thus, $\sum (x - \bar{x})^2 = 18$ in the above chart.
The squares of the deviation from the mean are used in calculating the variance.
Variance, $s^2$, of a sample is defined as the measure of the spread of the data about the
mean.
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
Where n is the sample size
Using the above data,
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{18}{4} = 4.5$$
Standard Deviation is the positive square root of the variance. Thus,
$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
From the example above, $s = \sqrt{s^2} = \sqrt{4.5} \approx 2.1$.
An alternative formula for calculating the square of the standard deviation is
$$s^2 = \frac{n(\sum x^2) - (\sum x)^2}{n(n - 1)}$$
Based on this, the above example can be calculated
$$s^2 = \frac{5(143) - (625)}{5(5 - 1)} = \frac{90}{20} = 4.50$$
$$\text{S.D.} = s = \sqrt{4.50} \approx 2.1$$
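As a check on the two variance formulas above, this Python sketch computes the sample variance both ways for the data 6, 3, 8, 5, 3 and then takes the square root. The function names are assumptions made for the illustration.

```python
import math


def sample_variance(values):
    """Definition form: sum((x - x_bar)^2) / (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Alternative form: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


sample = [6, 3, 8, 5, 3]
print(sample_variance(sample))           # 18 / 4 = 4.5
print(sample_variance_shortcut(sample))  # (5 * 143 - 625) / 20 = 4.5
print(round(math.sqrt(sample_variance(sample)), 1))  # 2.1
```

Both forms give the same result; the shortcut form only needs the running totals of x and x squared, which is convenient for hand calculation.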
Coding and Measures of Position
Coding Data
When working with large numbers, it is often difficult to calculate using those large values. It is
better to adjust the numbers to make calculations easier. Consider the following data values:
497, 499, 500, 502, 503. To adjust the values, the median value, in this case, 500, is reset to
0. This value is called X0. Once X0 is identified, U is calculated for each value in the data set.
U = X – X0
Assuming X0 = 500, the coded values for the example are as follows:

x        U
497      -3
499      -1
500       0
502      +2
503      +3
Based on this data, the following values are easier to calculate:

U        $U^2$
-3       9
-1       1
 0       0
+2       4
+3       9

Totals: $\sum U = 1$, $\sum U^2 = 23$

Just as the mean value is $\bar{X} = \frac{\sum X}{n}$, the mean can now be expressed in terms of U,
such that
$$\bar{U} = \frac{\sum U}{n}$$
For the example above, $\bar{U} = \frac{\sum U}{n} = \frac{1}{5} = 0.2$. Now $\bar{X}$ can be expressed
in terms of U such that
$$\bar{X} = \bar{U} + X_0$$
So that,
$$\bar{X} = \bar{U} + 500 = \frac{\sum U}{n} + 500 = \frac{1}{5} + 500 = 500.2$$
Also, standard deviation can be expressed in terms of U, such that
$$s_x^2 = s_u^2 = \frac{n(\sum U^2) - (\sum U)^2}{n(n - 1)}$$
For the example data above,
$$s_u^2 = \frac{n(\sum U^2) - (\sum U)^2}{n(n - 1)} = \frac{5(23) - (1)^2}{5(4)} = \frac{115 - 1}{20} = 5.7$$
$$s = \sqrt{5.7} \approx 2.4$$
$$s_x^2 = s_u^2$$
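The standard deviation of the sample is therefore approximately 2.4. Because subtracting the constant X0 = 500 shifts every value by the same amount, the coded values have the same spread as the original values.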
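The coding procedure above can be sketched in Python as follows. The function name `coded_stats` and its return layout are assumptions for the illustration; the arithmetic follows the formulas in this section.

```python
import math


def coded_stats(values, origin):
    """Mean, sample variance, and standard deviation computed from U = X - X0."""
    u = [x - origin for x in values]
    n = len(u)
    sum_u = sum(u)
    sum_u2 = sum(v * v for v in u)
    # The mean is shifted back by the origin; the variance needs no correction.
    mean_x = sum_u / n + origin
    variance = (n * sum_u2 - sum_u ** 2) / (n * (n - 1))
    return mean_x, variance, math.sqrt(variance)


mean_x, variance, sd = coded_stats([497, 499, 500, 502, 503], 500)
print(round(mean_x, 1))    # 500.2
print(round(variance, 1))  # 5.7
print(round(sd, 1))        # 2.4
```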
Measures of Position (MP)
Measures of Position allow a data value to be located in relation to the other data points
within the dataset. The three most common measures of position are Quartiles, Percentiles,
and Standard Score, or Z-score.
Quartiles
Quartiles divide ranked data into quarters. There are 3 quartile values, Q1, Q2, and Q3. Q1 is
the value below which one-fourth (25%) of the data set falls. Q2 is the median value of the
ranked values in the data set. Q3 is the value below which three-fourths (75%) of the data set
falls. See the following diagram.
L ----25%---- Q1 ----25%---- Q2 ----25%---- Q3 ----25%---- H
To find the quartile values, the dataset values must first be ranked from lowest to highest. To
find Qi,
i = (nk)/100%
where
n is the number of data points
k is the percentile position.
Consider the following data which consists of 50 test scores (or data points) from a QSO530
final exam:
Class Raw Test Scores (ranked, reading down each column)
27    49    68    73    79    84    91     94    107    120
43    50    71    74    80    84    91     97    108    122
43    54    71    75    81    86    93     97    108    123
44    58    71    76    83    88    94    103    116    127
47    65    73    77    84    88    94    106    120    128
To find Q1, first find i. For the first quartile for the above data, i = (50 × 25%)/100% =
1250%/100% = 12.5. So Q1 is the value halfway between the 12th and 13th data points.
Therefore, Q1 = (71 + 71)/2 = 71.
Once the 1st quartile data point is found, it is easy to find the 3rd. In this case, i = (50 ×
75%)/100% = 3750%/100% = 37.5. Thus, Q3 is halfway between the 37th and 38th data
points and Q3 = (97 + 97)/2 = 97.
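The quartile lookup above can be sketched in Python using the position formula i = (nk)/100. The helper name `value_at_percent` is hypothetical, and its handling of fractional positions (averaging the two neighbouring values) is an assumption that matches the 12.5 and 37.5 cases worked above; the text does not define other fractions.

```python
import math

# The 50 ranked class test scores from the table above.
scores = [
    27, 43, 43, 44, 47, 49, 50, 54, 58, 65,
    68, 71, 71, 71, 73, 73, 74, 75, 76, 77,
    79, 80, 81, 83, 84, 84, 84, 86, 88, 88,
    91, 91, 93, 94, 94, 94, 97, 97, 103, 106,
    107, 108, 108, 116, 120, 120, 122, 123, 127, 128,
]


def value_at_percent(ranked, k):
    """Value at 1-based position i = n * k / 100 in ranked data.

    Assumption: a fractional position averages the values on either side.
    """
    n = len(ranked)
    i = n * k / 100
    lower = max(math.floor(i), 1)  # clamp to the first position
    upper = min(math.ceil(i), n)   # clamp to the last position
    return (ranked[lower - 1] + ranked[upper - 1]) / 2


ranked = sorted(scores)
q1 = value_at_percent(ranked, 25)  # i = 12.5 -> (71 + 71) / 2
q3 = value_at_percent(ranked, 75)  # i = 37.5 -> (97 + 97) / 2
print(q1, q3)  # 71.0 97.0
```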
Percentiles
A percentile value assumes the data is divided into 100 equal parts. Just as with calculating
Quartiles, percentiles require that the data be ranked in increasing order. The kth percentile is
the value found at position i = (nk)/100% in the ranked data.
If the above data is a set of exam scores and a student knows that she scored at the 85th
percentile, she knows that she did better than 85% of the other students and 15% of the
students did better than she did on the exam.
To find the 56th Percentile, we must first find its position in the ranked data:
i = (nk)/100% = (50 × 56%)/100% = 28
The 56th Percentile is the value in the 28th position of the ranked data set. In the data above,
the value in the 28th position is 86; thus the 56th Percentile = 86.
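A brief Python sketch of the 56th percentile lookup, using the same 50 ranked scores. The function name `percentile_position` is an assumption, and the direct lookup shown here only covers the whole-number position that occurs in this example.

```python
# The 50 ranked class test scores from the table above.
ranked = [
    27, 43, 43, 44, 47, 49, 50, 54, 58, 65,
    68, 71, 71, 71, 73, 73, 74, 75, 76, 77,
    79, 80, 81, 83, 84, 84, 84, 86, 88, 88,
    91, 91, 93, 94, 94, 94, 97, 97, 103, 106,
    107, 108, 108, 116, 120, 120, 122, 123, 127, 128,
]


def percentile_position(n, k):
    """1-based position of the kth percentile among n ranked values."""
    return n * k / 100


position = percentile_position(len(ranked), 56)
print(position)  # 28.0

# The position is a whole number here, so the percentile is read off directly.
p56 = ranked[int(position) - 1]
print(p56)  # 86
```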
Mid-quartile (MQ)
The Mid-quartile is one more measure of Central Tendency. It is the midpoint of the 1st and
3rd quartile values. Thus, MQ = (Q1 + Q3)/2. For the test scores above, MQ = (71 + 97)/2 = 84.
Z-Score
The z-score is a measure of relative position with respect to the mean. z is the position of a
value in terms of the number of Standard Deviations the data point is from the mean.
Remember that Standard Deviation, s, was discussed above as the positive square root of
the variance. Z is the difference of the value and the mean, divided by the Standard
Deviation. Thus,
$$z = \frac{x - \bar{x}}{s}$$
Based on the class test score data above, we know that the mean is 83.7 and the standard
deviation is 24.3. If a student’s raw score was 120, we can calculate how many standard
deviations the score is from the mean.
$$z = \frac{120 - 83.7}{24.3} = \frac{36.3}{24.3} \approx 1.49$$
so the score is approximately 1.5 Standard Deviations above the mean.
It is important to note that very seldom will a value be more than 3 standard deviations from
the mean. This means very rarely will you see |z| > 3.
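The z-score formula can be sketched in Python as below, using the class mean of 83.7 and standard deviation of 24.3 quoted above. The function name `z_score` is an illustrative assumption.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


z = z_score(120, 83.7, 24.3)
print(round(z, 2))  # 1.49
```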
John and his friend, Jake, took exams in two different classes. John received a raw score of
45, while Jake received a 72 in his class. Whose grade is better?
At first glance, you might say that Jake did better, but let’s look at the z-score of each grade.
The mean of the exam in John’s class is 38, with an S.D. of 8 points. In Jake’s class, the
mean was 65 with an S.D. of 14 points. Notice that both grades are 7 points above the mean.
John’s z-score: $$z = \frac{45 - 38}{8} = \frac{7}{8}$$
Jake’s z-score: $$z = \frac{72 - 65}{14} = \frac{7}{14} = \frac{1}{2}$$
John did better than Jake because, in this case, a z-score of 7/8 is higher than a z-score of
1/2. This means that John’s score was farther above his class mean, measured in standard
deviations, than Jake’s score was above his.
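The comparison of the two grades can be reproduced with a short Python sketch. The function `z_score` and the variable names are assumptions made for the illustration; the scores, means, and standard deviations are the ones given above.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


john = z_score(45, 38, 8)    # 7 / 8
jake = z_score(72, 65, 14)   # 7 / 14
print(john, jake)  # 0.875 0.5

# Both raw scores are 7 points above their class mean, but the smaller
# spread in John's class makes his 7 points count for more.
better = "John" if john > jake else "Jake"
print(better)  # John
```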