Political Science 221
Problem Set #1
Spring 2020
Please answer all of the following problems on a separate sheet of paper. Your answers may be
handwritten. Be sure to show all steps in your calculations. (30 points total)
1. (1 point each) Given that k = {11.2, -4, 17, 11, 0.6, -4.5, 0, -8, -31, 13, 15.9, -3.5}, find the
following:
a) k_3
b) k_2 − k_6
c) ∑_{u=2}^{11} k_u
d) ∑_{u=1}^{5} k_u
e) ∑ k_u
f)
g)
h) Mode(k)
i) MD(k)
j) An outlier.
2. (2 points each) Given that l = {4, 2, 1, 6, 8, 7}; solve for m in each of the following. Show
all of your work at each step:
a) m = ∑_{i=3}^{6} l_i
b)
3. Suppose that v is the data of an entire population and w is a sample taken from v. Given that:
v = {7, 5, 4, 4, 9, -2, 11, 5, 2, 0, -2, 5, -4, 6, 9}
w = {4, 5, -2, 2, 6, 5, -4, 9}
Find each of the following (4 points each):
a) σ_v²
b) σ_v
c) s_w²
d) s_w
POL 221: Political Analysis
Scott Granberg-Rademacker
Handout #1
Measures of Central Tendency
Measures of central tendency are mathematical operations which supply information about the “typical” observation in a set or variable. There are
several measures of central tendency, each with different pros and cons: expected values (sometimes called expectations, means or averages), medians,
and modes. Expected values (usually denoted as E (X) or x̄) are most commonly used in practice, but there are applications where medians (denoted
x̃) or modes may prove to be a better indicator of what the “typical” observation is like.
Most of the time, the expected value is identical to the simple average,
which is nothing more than the arithmetic mean of a set or variable. Simple
averages, however, assume that the probability of each observation is equal: P(x_1) = P(x_2) = · · · = P(x_k). If X is a discrete stochastic
variable, the simple average can be found as follows:
E(X) = x̄ = (∑_{i=1}^{n} x_i) / n        (1)
However, such an assertion may or may not be true. If the probabilities
associated with each observation are different, then the expected value is a
weighted average. Consider the expected value of a variable, x, where the
probability of each possible observation is different. In a case like this, the
expected value would simply be the sum of each observation times its probability:
E(X) = x̄ = ∑_{i=1}^{n} x_i f(x_i)        (2)
The problem with weighted averages in practice is that we often do
not know the exact probabilities that make up f(x) (remember that f(x)
is the probability density function of x). When these probabilities are not
known, the most common approach is simply to assume that the probabilities are all the same and use the simple average formula.
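Equations 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch only; the function names and example numbers are mine, not part of the handout:

```python
# Sketch of Equations 1 and 2; function names are illustrative.

def simple_average(xs):
    """Equation 1: every observation is assumed equally probable."""
    return sum(xs) / len(xs)

def weighted_average(xs, probs):
    """Equation 2: each observation x_i is weighted by its probability f(x_i)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(xs, probs))

x = [1, 2, 3, 4]
print(simple_average(x))                                   # 2.5
print(round(weighted_average(x, [0.1, 0.2, 0.3, 0.4]), 2)) # 3.0
```

Note that shifting more probability onto the larger observations pulls the expected value above the simple average, exactly as the weighted formula predicts.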
One of the main problems with using expected values is that the influence of
outliers is poorly mitigated. Basically, extreme values which are not “typical” of other observations may heavily skew the expected value. Consider
two variables:
a = {3, 4, −2, 4, 5, 3}
b = {3, 4, −2, 4, 5, 3, 170}
The only difference between the two is that b has one more observation than
a, but that single observation is clearly much different from the rest of the
observations. Such abnormal observations are outliers, and they can badly skew
the expected value:
ā = (∑_{i=1}^{n} a_i) / n = (3 + 4 + (−2) + 4 + 5 + 3)/6 = 17/6 = 2.83

b̄ = (∑_{i=1}^{n} b_i) / n = (3 + 4 + (−2) + 4 + 5 + 3 + 170)/7 = 187/7 = 26.71
So, how can one account for extreme outliers while still getting a good idea
about the "typical" observation? One possibility is to use the median.
The median of a set or variable is the value that has just as many values
greater than it as are less than it. When the set or variable has an even
number of observations, the median is the average of the two middle values.
When the set or variable has an odd number of observations, the median is
simply the middle value.
It is important to note for discrete variables that the median will always
satisfy the following condition:
P(X ≤ x̃) ≥ 0.5 and P(X ≥ x̃) ≥ 0.5        (3)
Finding the median is quite simple. The first step is to arrange the values in
the variable(s) from least to greatest. Let us denote the arranged variables
as a∗ and b∗ .
a∗ = {−2, 3, 3, 4, 4, 5}
b∗ = {−2, 3, 3, 4, 4, 5, 170}
When the total number of observations is odd, the median can be found
using the following formula:

x̃ = x*_{(n+1)/2}        (4)

and when the total number of observations is even:

x̃ = (x*_{n/2} + x*_{n/2+1}) / 2        (5)
Since a has six observations (n = 6), it is necessary for us to use Equation 5
to find the median of a:

ã = (a*_{n/2} + a*_{n/2+1})/2 = (a*_{6/2} + a*_{6/2+1})/2 = (a*_3 + a*_4)/2 = (3 + 4)/2 = 7/2 = 3.5
Finding the median of b is simply a matter of using Equation 4, since b has
an odd number of observations (n = 7):

b̃ = b*_{(n+1)/2} = b*_{(7+1)/2} = b*_{8/2} = b*_4 = 4
When we compare the means and medians of a and b, one can see that they
are not the same:
ā = 2.83, ã = 3.5
b̄ = 26.71, b̃ = 4
However, both the mean and median are fairly "typical" of a, which is to
be expected since there is no extreme outlier in a. Note that the mean of b
has been heavily skewed by the outlier, but the median of b easily mitigates
the impact of the outlier. This illustrates one of the nice properties of the
median: it tends to be resistant to outliers.
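The comparison is easy to reproduce with Python's standard statistics module, using the same a and b as above:

```python
from statistics import mean, median

a = [3, 4, -2, 4, 5, 3]
b = [3, 4, -2, 4, 5, 3, 170]

# The single outlier drags the mean far from "typical"; the median barely moves.
print(round(mean(a), 2), median(a))  # 2.83 3.5
print(round(mean(b), 2), median(b))  # 26.71 4
```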
Another measure of central tendency which is not used very often is the
mode. The mode of a set or variable is simply the value that occurs most
frequently within that set or variable. It is possible that for any given set or
variable, there may be one mode, several modes, or no modes. For example,
the mode of a is simply:
Mode (a) = {3, 4}
Modes are seldom used in practice for good reason. They are often unreliable
and misleading, as illustrated in the following example:
c = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 902, 902}
Where the mode of c is:

Mode(c) = 902

which is hardly typical of c.
Consider another example:
d = {1, 2, 3, 4, 5, 6, 7}
In this instance, there is no mode of d, because there is only one instance of
each value.
Mode (d) = ∅, where ∅ denotes an empty set.
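The mode convention used here (several modes allowed, no mode when every value is unique) can be sketched with a small helper. This is my own function, written to match the handout's convention, since Python's built-in statistics.multimode treats all-unique data differently:

```python
from collections import Counter

def modes(xs):
    """All values tied for the highest count; an empty list stands in for the
    empty set when every value occurs only once."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top) if top > 1 else []

print(modes([3, 4, -2, 4, 5, 3]))          # [3, 4]
print(modes([1, 2, 3, 4, 5, 6, 7]))        # []
print(modes(list(range(1, 21)) + [902, 902]))  # [902]
```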
Measures of Variability are mathematical operations which measure the
amount of dispersion or spread in a given set or variable. While measures of
central tendency tell you what the "typical" observation is like, measures of
variability tell you how dispersed or spread out the data in a set or variable is. There are several measures of variability available to us, each with
advantages and disadvantages.
The most basic measure of variability is the range. The range of a set or
variable is simply the largest value minus the smallest value. The range can
be denoted as:
Range(x) = x_max − x_min        (6)
So if we have two variables:
e = {3, 5, 5, 7}
f = {4, 4, 6, 6}
Finding the ranges is quite simple:
Range (e) = 7 − 3 = 4
Range (f ) = 6 − 4 = 2
Ranges are nice but are only informative about the extreme values of a variable. This means that they are susceptible to outliers, and can ultimately
provide a badly skewed picture of the variability of a variable.
A better measure of variability is the mean deviation. The mean deviation
is the average distance an observation in a set or variable is away from the
mean. This makes for a nice interpretation about the “typical” observation.
The mean deviation can be found by using the following formula:
MD(x) = (∑_{i=1}^{n} |x_i − x̄|) / n        (7)
Absolute value bars | | simply mean that after all operations inside the bars
are finished, a negative result is turned positive. For example,
|5 − 8| = |−3| = 3. The absolute value of a positive number is a positive
number: |5| = 5.
Despite the nice interpretation, the mean deviation is not used all that often.
First, absolute values are problematic (particularly for computers) when
doing more complex operations. Second, it is possible for variables with different distributions to have the same mean deviation. Consider e and f once
again:
e = {3, 5, 5, 7}
f = {4, 4, 6, 6}
Clearly they are distributed differently, but the mean deviation will not reveal this to us. Observe how both mean deviations yield the same result
(keep in mind that ē = f̄ = 5):
MD(e) = (∑_{i=1}^{n} |e_i − ē|)/n = (|3 − 5| + |5 − 5| + |5 − 5| + |7 − 5|)/4 = (|−2| + |0| + |0| + |2|)/4 = (2 + 0 + 0 + 2)/4 = 4/4 = 1

MD(f) = (∑_{i=1}^{n} |f_i − f̄|)/n = (|4 − 5| + |4 − 5| + |6 − 5| + |6 − 5|)/4 = (|−1| + |−1| + |1| + |1|)/4 = (1 + 1 + 1 + 1)/4 = 4/4 = 1
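Equation 7 translates directly into code. A minimal sketch (the function name is mine) confirms that e and f come out identical:

```python
def mean_deviation(xs):
    """Equation 7: the average absolute distance of each observation from the mean."""
    xbar = sum(xs) / len(xs)
    return sum(abs(x - xbar) for x in xs) / len(xs)

e = [3, 5, 5, 7]
f = [4, 4, 6, 6]
print(mean_deviation(e), mean_deviation(f))  # 1.0 1.0
```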
This is where the variance (commonly denoted σ², pronounced
"sigma squared") and standard deviation (denoted σ) can help out. The
formula for the variance is very similar to the mean deviation, but it avoids
the problem of taking the absolute value by simply squaring the deviations.
Additionally, it provides us with a measure that is more sensitive to variation
than the mean deviation. The formula for the variance is simply:

σ² = (∑_{i=1}^{n} (x_i − µ)²) / n        (8)

The standard deviation is simply the square root of the variance:

σ = √σ² = √( (∑_{i=1}^{n} (x_i − µ)²) / n )        (9)
All of these benefits do have a downside, however. Since the deviations are
being squared, the variance and standard deviation do not have a clean and
simple interpretation like the mean deviation does. It does have some nice
qualities which will be illustrated when we talk about distributions and hypothesis testing.
So how do the variance and standard deviation fare with e and f? Let's find
the variances:

σ_e² = (∑_{i=1}^{n} (e_i − µ_e)²)/n = ((3 − 5)² + (5 − 5)² + (5 − 5)² + (7 − 5)²)/4 = ((−2)² + 0² + 0² + 2²)/4 = (4 + 4)/4 = 8/4 = 2

σ_f² = (∑_{i=1}^{n} (f_i − µ_f)²)/n = ((4 − 5)² + (4 − 5)² + (6 − 5)² + (6 − 5)²)/4 = ((−1)² + (−1)² + 1² + 1²)/4 = (1 + 1 + 1 + 1)/4 = 4/4 = 1
And the standard deviations:

σ_e = √(σ_e²) = √2 = 1.41

σ_f = √(σ_f²) = √1 = 1

Notice that the standard deviations are close (or identical in the case of f)
to the mean deviations found, but are still different from each other, better
reflecting the true variability of e and f. In general, the larger the standard
deviation, the greater the variability.
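The statistics module's population functions reproduce these results; pvariance and pstdev divide by n, matching Equations 8 and 9:

```python
from statistics import pvariance, pstdev

e = [3, 5, 5, 7]
f = [4, 4, 6, 6]

# Unlike the mean deviation, the variance tells e and f apart.
print(pvariance(e), pstdev(e))  # variance 2, standard deviation ≈ 1.41
print(pvariance(f), pstdev(f))  # variance 1, standard deviation 1
```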
All of what we have done so far assumes that we are dealing with populations. Populations are complete sets of all observations of interest. In
reality, true populations are often unknown. Most of the time, what we have
in social science is sample data. Samples are simply subsets of a population. Because we often deal with sample data, we need to account for the
extra uncertainty that comes with a sample. Think of it like a
currency: every observation in a sample is a currency unit, and whenever
an estimate is calculated, one unit of currency is "spent". These units of "currency"
are known as degrees of freedom (referred to as "df" for short), and one
degree of freedom is lost each time we "spend" it to calculate an estimate.
More technically, degrees of freedom are any of the unrestricted, random variables that constitute a statistic. In practice, this means that we have to
make small adjustments to some of our formulas when dealing with samples.
The biggest change for us right now is to remember that the formulas for
variance and standard deviations need to be slightly corrected. The sample
variance can be found using the following formula:
s² = (∑_{i=1}^{n} (x_i − x̄)²) / (n − 1)        (10)

And the sample standard deviation is:

s = √s² = √( (∑_{i=1}^{n} (x_i − x̄)²) / (n − 1) )        (11)
You might ask, what really changed? The most noticeable change is that the
Greek letter σ is not used in either formula. Instead, the sample variance
is denoted as s2 and the sample standard deviation is denoted as s. These
are estimates which approximate the unknown population variance σ 2 and
population standard deviation σ. Since these are sample estimates, we lose
one degree of freedom, which is taken off of the denominator. So instead of
dividing by n, we divide by n − 1 when finding s² and s.
Also of note is that the typical notation for the population mean and sample
mean are different. The population mean is usually denoted by the Greek
letter µ (pronounced “mu”), and the sample mean is usually denoted with
a bar over the variable name, x̄. Once again, in practice the true value of
µ is often unknown, and the mean of the observed sample data x̄ is only an
estimate of µ.
EXCEL Commands:
Average: =AVERAGE(number1,number2,...)
Median: =MEDIAN(number1,number2,...)
Mode: =MODE(number1,number2,...)
Range: =MAX(number1,number2,...)-MIN(number1,number2,...)
Mean Deviation: =AVEDEV(number1,number2,...)
Population Variance: =VARP(number1,number2,...)
Population Standard Deviation: =STDEVP(number1,number2,...)
Sample Variance: =VAR(number1,number2,...)
Sample Standard Deviation: =STDEV(number1,number2,...)
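For those working outside of EXCEL, Python's standard statistics module covers the same ground. There is no built-in counterpart to AVEDEV, so the mean deviation is computed by hand below; the data values are an arbitrary example of mine:

```python
from statistics import mean, median, pvariance, pstdev, variance, stdev

data = [3, 4, -2, 4, 5, 3]

print(mean(data))                 # AVERAGE
print(median(data))               # MEDIAN
print(max(data) - min(data))      # MAX - MIN (the range)
m = mean(data)
print(sum(abs(x - m) for x in data) / len(data))  # AVEDEV (mean deviation)
print(pvariance(data), pstdev(data))  # VARP, STDEVP (population: divide by n)
print(variance(data), stdev(data))    # VAR, STDEV (sample: divide by n - 1)
```

A quick sanity check on the df correction: the population variance always equals the sample variance times (n − 1)/n.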
POL 221: Political Analysis
Scott Granberg-Rademacker
Handout #2
Normal Distribution
The shape of the normal distribution is the famous bell-shape shown below.
Figure 1: Normal Distribution
The normal distribution (also sometimes called the Gaussian distribution)
first appeared in print in 1733, in the work of Abraham de Moivre. It is easily the single
most important distribution in statistics.
The normal distribution has two parameters: mean (denoted µ) and variance
(denoted σ 2 ). The pdf of the normal distribution seems intimidating, but
fortunately we don’t really have to deal with it all that much in this class:
f(x; µ, σ²) = (1 / (σ√(2π))) e^(−(1/2)[(x − µ)/σ]²)        (1)

for −∞ < x < ∞, where −∞ < µ < ∞ and 0 < σ < ∞. A normally
distributed random variable, X, is denoted: X ∼ N(µ, σ²).
The importance of the normal distribution is in how it relates to most other
distributions. In fact, the central limit theorem states that if any given
distribution (normal or non-normal) has a finite mean µ and variance σ²,
then the sampling distribution of the mean will approach the normal distribution with mean µ and variance σ²/n as the sample size n increases
toward infinity (n → ∞).
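A quick simulation sketch of the theorem: means of samples drawn from a decidedly non-normal (uniform) distribution cluster around µ with variance near σ²/n. The sample size and repetition count here are arbitrary choices of mine:

```python
import random

random.seed(1)   # reproducible run
n, reps = 50, 2000

# Uniform on [0, 1): mu = 0.5, sigma^2 = 1/12
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

grand_mean = sum(means) / reps
var_of_means = sum((m - grand_mean) ** 2 for m in means) / reps

print(round(grand_mean, 3))    # close to mu = 0.5
print(round(var_of_means, 5))  # close to sigma^2 / n = (1/12)/50 ≈ 0.00167
```

A histogram of `means` would show the familiar bell shape even though the underlying distribution is flat.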
Tests of Hypotheses
What are hypotheses?
Hypotheses are sets of statements (usually two statements) which meet the
following criteria:
1. They are mutually exclusive, which means that it is not possible for
both statements to be true at the same time. If one is true,
then the other must necessarily be false, and vice versa.
2. They are collectively exhaustive, which means that all possibilities
must be accounted for.
3. There must be adequate data of sufficient quantity and quality by which
the statements in the set can be tested for truth or falsity.
Consider an example whereby you might be interested in knowing whether
the average age of children at your daycare center is significantly different
than the average age of daycare centers nationally. Let the national average
age of children at daycare centers be denoted as µ, and let the average age
of children at your daycare be denoted as x̄.
The relationship between µ and x̄ can be expressed in six possible ways:
1. µ ≠ x̄
2. µ > x̄
3. µ < x̄
4. µ = x̄
5. µ ≥ x̄
6. µ ≤ x̄
Hypothesis sets are typically denoted as two different statements, H0 and H1.
H0 is what is known as the null hypothesis (H0 is pronounced "H
naught") and H1 (pronounced "H one") is the alternative hypothesis. It is important to remember when constructing a hypothesis set that
the equals sign (which could be expressed as =, ≥, or ≤) always goes
in H0. H1, on the other hand, will never contain an equals sign. Instead, H1
directly expresses your suspicion about the relationship.
For example, if you believed that your daycare center had younger
children (on average) than daycare centers nationwide, your suspicion would
be:
Age of children at your daycare < Age of children at daycares nationwide
Which is the same as stating:
x̄ < µ
And since this is our suspicion, we can denote it as H1 :
H1 : x̄ < µ
Now that we have H1 , we need to construct H0 . We must include all other
possibilities and we must make sure that the equals sign is included in the
expression in H0 . In H1 , we stated our belief that children are on average
younger at your daycare than at daycares nationwide. If this statement is
false, then one of the following must be the case: children at your daycare
must be older or the same age as children at daycares nationwide. We could
express this formally as H0 :
H0 : x̄ ≥ µ
If we put H0 and H1 together, we have a hypothesis set that is both mutually
exclusive and collectively exhaustive:
H0 : x̄ ≥ µ
H1 : x̄ < µ
DIFFERENCE BETWEEN MEANS OF SAMPLE AND POPULATION WITH LARGE
SAMPLES (n > 30)
2-tailed test
Let's say that you are interested in knowing whether or not your sample mean is different than
the known mean of your population.¹
Example: Let's say that you are interested in knowing whether the average age of children at
your daycare center is different than the average age of children at daycares nationally. Let x be
the ages of the children at your daycare, and your daycare has 30 children (n = 30).
x = {5, 6, 6, 2, 4, 0, 9, 5, 5, 4, 4, 6, 7, 1, 2, 9, 0, 5, 6, 2, 2, 3, 8, 9, 9, 0, 0, 6, 5, 5}
The average child age at your daycare: x̄ = 4.5
The variance: s² = 8.05, and standard deviation: s = 2.84
Census Bureau data on daycares states that the average age of children at daycare is 5.7 years
old, the population variance is 5.1, and the population standard deviation is 2.26.
So our population figures are:
µ = 5.7
σ 2 = 5.1
σ = 2.26
We then state our hypotheses (H0 must always contain an equal sign):

H0: x̄ = µ
H1: x̄ ≠ µ

Or stated another way:

H0: x̄ = 5.7
H1: x̄ ≠ 5.7
Since we have 30 or more observations, we can use the large-sample approximation to assume
that our sampling distribution is approximately normal. We then use the following formula to
calculate the test statistic:

z = (x̄ − µ) / (s/√n)

So then we go through the actual calculation:

z = (4.5 − 5.7) / (2.84/√30) = −1.2/0.519 ≈ −2.31
¹ It is useful to know that most of the time, the true population mean (µ) of a sample is not known; neither is the
true population variance (σ²).
Once we have the z-score, we must determine whether or not our z-score is inside or outside of
the critical region.
We have to determine what our α-level is going to be. Think of this in terms of: how certain do
you want to be in your result? Most commonly, α = .05, though sometimes scientists want a
higher standard of proof, so they may choose a smaller α level. Basically, what this means is
that you are testing your hypothesis against a certain confidence level. This level of confidence
is:
1 - α = confidence level
So in our example, if we choose α = .05, then our confidence level is 95% (confidence level is
always 1-α, so if α = .05 like it does in this instance, 1-.05=.95, or 95% confidence).
Next we need to look at our z-table.
If we are conducting a two-tailed test (which we are in this case), we would look to see if the
following statement is true or not:

−z_{α/2} < z < z_{α/2}

where z_{α/2} and −z_{α/2} are found by looking at the z-table. The table lists the area between
0 and z, so the trick is to find the value that most closely matches .5 − α/2, which in our case is
.475, since .5 − .025 = .475. The closest match from the table is 1.96, with a value of .4749. To
find 1.96, just follow straight across from .4749 on the table to arrive at 1.9, then follow straight
up from .4749 to find .06; put them together and your z_{α/2} = 1.96.

So in our instance the statement −z_{α/2} < z < z_{α/2} is actually false, since the expression
−1.96 < −2.31 < 1.96 is false. When this is false, we REJECT H0, meaning that we can be 95%
confident that our sample mean of x̄ = 4.5 is significantly different than the population mean of µ = 5.7.
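The whole test is easy to check in Python. This is a sketch of the calculation above; the small difference from the hand-computed value comes from using the unrounded s:

```python
from math import sqrt

x = [5, 6, 6, 2, 4, 0, 9, 5, 5, 4, 4, 6, 7, 1, 2, 9, 0, 5, 6, 2, 2, 3,
     8, 9, 9, 0, 0, 6, 5, 5]
n = len(x)                                             # 30
xbar = sum(x) / n                                      # 4.5
s = sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))  # ≈ 2.84

mu = 5.7
z = (xbar - mu) / (s / sqrt(n))  # ≈ -2.32 (≈ -2.31 when s is rounded first)

# Two-tailed test at alpha = .05: reject H0 when z lands outside ±1.96
print(round(z, 2), abs(z) > 1.96)  # rejects H0
```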
1-tailed test
In the previous example, we used the two-tailed test because we didn’t know for sure whether
our mean was going to be smaller or larger than the population mean. 1-tailed tests are used
when you have a good idea which way you want to test.