Econ7810: Applied Econometrics, Fall 2021
Homework #2
Due date: 22 October 2021, 1pm.
Do not copy and paste answers from your classmates. Two identical homework submissions will be treated as
cheating. Do not copy and paste the entire output of your statistical package. Report only the relevant part
of the output. Please also submit your R script for the empirical part. Please put all your work in one single
file and upload it via Moodle.
Part I
Multiple Choice (3 points each, 21 points in total)
Please choose the answer that you think is appropriate.
1.1 When there are omitted variables in the regression, which are determinants of the dependent variable,
then
a. you cannot measure the effect of the omitted variable, but the estimator of your included variable(s)
is (are) unaffected.
b. this has no effect on the estimator of your included variable because the other variable is not included.
c. this will always bias the OLS estimator of the included variable.
d. the OLS estimator is biased if the omitted variable is correlated with the included variable.
1.2 If you had a two regressor regression model, then omitting one variable which is relevant
a. will have no effect on the coefficient of the included variable if the correlation between the excluded
and the included variable is negative.
b. will always bias the coefficient of the included variable upwards.
c. can result in a negative value for the coefficient of the included variable, even though the coefficient
will have a significant positive effect on Y if the omitted variable were included.
d. makes the sum of the product between the included variable and the residuals different from 0.
1.3 Consider the multiple regression model with two regressors X1 and X2, where both variables are
determinants of the dependent variable. You first regress Y on X1 only and find no relationship. However,
when regressing Y on X1 and X2, the slope coefficient changes by a large amount. This suggests that your
first regression suffers from
a. heteroskedasticity
b. perfect multicollinearity
c. omitted variable bias
d. dummy variable trap
1.4 Imperfect multicollinearity
a. implies that it will be difficult to estimate precisely one or more of the partial effects using the data at
hand
b. violates one of the four Least Squares assumptions in the multiple regression model
c. means that you cannot estimate the effect of at least one of the Xs on Y
d. suggests that a standard spreadsheet program does not have enough power to estimate the multiple
regression model
1.5 If you reject a joint null hypothesis using the F-test in a multiple hypothesis setting, then
a. a series of t-tests may or may not give you the same conclusion.
b. the regression is always significant.
c. all of the hypotheses are always simultaneously rejected.
d. the F-statistic must be negative.
1.6 If the estimates of the coefficients of interest change substantially across specifications,
a. then this can be expected from sample variation
b. then you should change the scale of the variables to make the changes appear to be smaller
c. then this often provides evidence that the original specification had omitted variable bias
d. then choose the specification for which your coefficient of interest is most significant
1.7 You have estimated the following equation:
$$\widehat{TestScore} = 607.3 + 3.85\,Income - 0.0423\,Income^2$$
where TestScore is the average of the reading and math scores on the Stanford 9 standardized test
administered to 5th grade students in 420 California school districts in 1998 and 1999. Income is the average
annual per capita income in the school district, measured in thousands of 1998 dollars. The equation
a. suggests a positive relationship between test scores and income for most of the sample.
b. is positive until a value of Income of 610.81.
c. does not make much sense since the square of income is entered.
d. suggests a positive relationship between test scores and income for all of the sample.
Part II
Short Questions (32 points in total)
Please limit your answer to less than or equal to 5 lines per sub-question.
(12 points) 2.1 This question is about omitted variable bias. The following model estimates the effect
of age on time spent sleeping by adults:
$$\widehat{sleep} = \underset{(59.47)}{3128.91} + \underset{(1.47)}{3.54}\,age, \quad n = 706,\ R^2 = 0.008,$$
where sleep is measured in minutes per week and age is measured in years. The standard errors are given in
the parentheses.
(2 points) (i) Interpret the coefficient estimate on age.
(2 points) (ii) It is likely that adults trade off sleep for work. If you are also given data on time spent
working, measured in minutes per week, and include this variable (say totwrk) in the above regression, what
would you expect the sign of the coefficient on totwrk to be?
(8 points) (iii) Part (ii) suggests that there might be omitted variable bias in the original simple
regression because totwrk is not included. For omitted variable bias to exist, what additional condition
needs to be met? Do you think this condition holds in reality? If yes, what sign do you expect the
omitted variable bias to have?
(20 points) 2.2 You have collected data for 104 countries to address the difficult question of what determines differences in the standard of living among the countries of the world. You recall from your macroeconomics lectures that the neoclassical growth model suggests that output per worker (per capita income)
levels are determined by, among others, the saving rate and the population growth rate. To test the predictions of
this growth model, you run the following regression:
$$\widehat{RelPersInc} = 0.339 - 12.894 \times n + 1.397 \times sk, \quad R^2 = 0.621$$
where RelPersInc is GDP per worker relative to the United States, n is the average population growth
rate, 1980-1990, and sk is the average investment share of GDP from 1960 to 1990 (remember investment
equals saving).
(6 points) (i) Interpret the results. Do the signs correspond to what you expected them to be? Explain.
(Hint: The Solow growth model predicts higher productivity with higher saving rates and lower population
growth.)
(8 points) (ii) You remember that human capital, in addition to physical capital, also plays a role in determining the standard of living of a country. You therefore collect additional data on the average educational
attainment in years for 1985, and add this variable (Educ) to the above regression. This results in the
modified regression output:
$$\widehat{RelPersInc} = 0.046 - 5.869 \times n + 0.738 \times sk + 0.055 \times Educ, \quad R^2 = 0.775$$
When Educ is omitted, what happens to the coefficient estimates of n and sk? Explain the reason
and mechanism in detail.
(2 points) (iii) Upon checking the regression output, you realize that there are only 86 observations, since
data for Educ is not available for all 104 countries in your sample. Do you have to modify some of your
statements in (ii)?
(4 points) (iv) Brazil has the following values in your sample: RelPersInc = 0.30, n = 0.021, sk = 0.169,
Educ = 3.5. Does your equation overpredict or underpredict the relative GDP per worker? What would
happen to this result if Brazil managed to double the average educational attainment?
Part III
Empirical part (47 points in total)
Please limit your answer to less than or equal to 10 lines per sub-question. PLEASE REPORT YOUR
REGRESSION OUTCOMES IN TABLES, NOT SCREENSHOTS.
(30 points) 3.1 Use the data attendance2018.dta for this exercise and check the label of each variable
for the meaning. Dr. Qin wants to study the relationship between attendance rate (attend) and some
characteristics/performance of students. attend is measured as a percent, and the score (hw1) has a maximum
possible value of 100.
(4 points) (i) Please provide a summary statistics table including the number of observations, mean,
standard deviation, minimum and maximum of all variables in the dataset.
(4 points) (ii) Use the data to estimate the population model
attend = β0 + β1 hw1 + u
Report the results in a table, including sample size and R-squared. Interpret the coefficient β1. Does hw1
explain a lot of the variation in the attendance rate?
(4 points) (iii) Dr. Qin has just found that another variable is available in this data set, entry_GPA, which is the
GPA before the students were enrolled in the program. If Dr. Qin is interested in discovering the relationship
between attendance rate and the first homework score (hw1), should Dr. Qin include this variable in the
regression? Explain.
(8 points) (iv) Dr. Qin decides to include entry_GPA. Please use the data to estimate the model
attend = β0 + β1 hw1 + β2 entry_GPA + u
Please report the result and interpret β1 and β2. Do they make sense? Please derive the sign of the possible
bias if entry_GPA is excluded.
(4 points) (v) Dr. Qin wants to see if there is a gender gap in the attendance rate. Please suggest a
regression model, estimate it and use the result to answer the question.
(6 points) (vi) Dr. Qin further wonders whether her teaching is equally attractive/boring to different
ethnic groups of students. There are in total three ethnic groups in the class, indicated by three dummy
variables: black, white and asian. Please suggest a regression model, estimate it and use the result to
answer the question. You may use some tests if appropriate.
(17 points) 3.2 Please use VOTE2016.dta to answer the following questions. The following model can be
used to study whether campaign expenditures affect election outcomes:
voteA = β0 + β1 log(expendA) + β2 log(expendB) + u (1)
voteA = β0 + β1 log(expendA) + β2 log(expendB) + β3 prtystrA + u (2)
where voteA is the percentage of the vote received by Candidate A, expendA and expendB are campaign
expenditures (in 1000 dollars) by Candidates A and B, and prtystrA is a measure of party strength for
Candidate A (the percentage of the most recent presidential vote that went to A's party).
(4 points) (i) Please run regression (1) and report your result in a table. Does A's expenditure affect the
outcome, and how? What about B's expenditure? (Hint: you first need to create the variables ln(expendA)
and ln(expendB). The R function log() can do this.)
(8 points) (ii) Please run regression (2) and report your result in the same table. Does A's expenditure
affect the outcome, and how? What about B's expenditure? Compare the results from (i) and (ii), and explain whether
we should include prtystrA in the regression or not. If we exclude it, in which direction does the coefficient of
interest tend to be biased?
(5 points) (iii) Can you tell whether a 1% increase in A's expenditures is offset by a 1% increase in B's
expenditure? How? Please suggest a regression or test and then answer the question according to your result.
Applied Econometrics
ECON7810
Fall 2021
Lecture 5
Dr. Bei QIN
SW Ch 5/6
1/42
Linear Regression with Multiple Regressors
(SW Chapter 6)
Outline
1. Omitted variable bias
2. Causality and regression analysis
3. Multiple regression and OLS
4. Measures of fit
5. Sampling distribution of the OLS estimator
Omitted Variable Bias
(SW Section 6.1)
The error u is present because of factors, or variables, that
influence Y but are not included in the regression function.
There are always omitted variables.
Sometimes, the omission of those variables can lead to bias in
the OLS estimator. Sometimes, it does not.
Omitted variable bias, ctd.
The bias in the OLS estimator that occurs as a result of an
omitted factor, or variable, is called omitted variable bias.
The two conditions for omitted variable bias
(1) Z is a determinant of Y (i.e. Z is part of u); and
(2) Z is correlated with the regressor X (i.e. corr(Z,X) ≠ 0)
Both conditions must hold for the omission of Z to result in
omitted variable bias.
Omitted variable bias, ctd.
In the class size and test score example:
1. English language ability (whether the student has
English as a second language) plausibly affects
standardized test scores: Z is a determinant of Y.
2. Immigrant communities tend to be less affluent and thus
have smaller school budgets and higher STR: Z is
correlated with X.
⇒ $\hat{\beta}_1$ is biased. What is the direction of this bias?
· What does common sense suggest?
· If common sense fails you, there is a formula…
Omitted variable bias, ctd.
A formula for omitted variable bias: recall the equation,
$$\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})\,u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\frac{1}{n}\sum_{i=1}^{n} v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$
where $v_i = (X_i - \bar{X})u_i \approx (X_i - \mu_X)u_i$. Under LSA #1,
$$E[(X_i - \mu_X)u_i] = \mathrm{cov}(X_i, u_i) = 0.$$
But what if $E[(X_i - \mu_X)u_i] = \mathrm{cov}(X_i, u_i) = \sigma_{Xu} \neq 0$?
Omitted variable bias, ctd.
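Under LSA #2 and #3 (that is, even if LSA #1 is not true),
$$\hat{\beta}_1 - \beta_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})\,u_i}{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2} \xrightarrow{p} \frac{\sigma_{Xu}}{\sigma_X^2} = \left(\frac{\sigma_u}{\sigma_X}\right) \times \left(\frac{\sigma_{Xu}}{\sigma_X \sigma_u}\right) = \left(\frac{\sigma_u}{\sigma_X}\right)\rho_{Xu},$$
where ρXu = corr(X,u). If assumption #1 is correct, then ρXu =
0, but if not we have….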
The omitted variable bias formula:
$$\hat{\beta}_1 \xrightarrow{p} \beta_1 + \left(\frac{\sigma_u}{\sigma_X}\right)\rho_{Xu}$$
· If an omitted variable Z is both:
(1) a determinant of Y (that is, it is contained in u); and
(2) correlated with X,
then ρXu ≠ 0 and the OLS estimator $\hat{\beta}_1$ is biased and is not
consistent.
· For example, districts with few ESL students (1) do better
on standardized tests and (2) have smaller classes (bigger
budgets), so ignoring the ESL factor would result in
overstating the class size effect.
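The omitted variable bias formula can be checked numerically. What follows is a minimal simulation sketch in Python (standard library only). All parameter values, function names and the data-generating process are illustrative assumptions, not taken from the California data: an omitted variable Z lowers Y (gamma < 0) and is positively correlated with X (delta > 0), so the short regression of Y on X alone should have a slope below the true β1, close to β1 + cov(X,u)/var(X).

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, beta1=-1.0, gamma=-0.5, delta=0.8, seed=0):
    """Hypothetical data: Y = beta1*X + gamma*Z + e, with X = delta*Z + noise."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [delta * zi + rng.gauss(0, 1) for zi in z]
    y = [beta1 * xi + gamma * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    short = ols_slope(x, y)  # regression that omits Z
    # With Z omitted, u = gamma*Z + e, so cov(X,u) = gamma*delta and var(X) = delta^2 + 1
    predicted = beta1 + gamma * delta / (delta ** 2 + 1)
    return short, predicted

short_slope, predicted_limit = simulate()
print(short_slope, predicted_limit)
```

In this setup the short-regression slope is more negative than the true β1 = -1, which mirrors the slide's point that omitting the ESL factor overstates the class size effect.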
· Why?
o Districts with few ESL students (1) do better on
standardized tests → The number of ESL students
enters the error term with a negative sign.
o Districts with few ESL students (2) have smaller
classes → The number of ESL students is positively
correlated with the STR.
o ρXu < 0 → $\hat{\beta}_1$ < β1. If β1 < 0, this means that the effect
of reducing the class size will be overestimated.
Is this actually going on in the CA data?
· Districts with fewer English Learners have higher test scores
· Districts with lower percent EL (PctEL) have smaller classes
· Among districts with comparable PctEL, the effect of class size
is small (recall overall “test score gap” = 7.4)
Mozart effect
The omitted variable bias is common in studies using
observational data. Here is another example:
· A study claimed that listening to Mozart for 10 to 15
minutes could temporarily raise your IQ by 8 or 9 points.
· Really?
o A review of dozens of studies found that students who
take optional music or arts courses in high school do
have higher English or math test scores than those who
don’t. But…
o Academically better students might have more time to
take optional music or arts courses.
o Those schools with a deeper musical curriculum might
be just better schools.
· A randomized, controlled experiment fails to find a
significant Mozart effect.
Causality and regression analysis
The test score/STR/English example shows that, if an omitted
variable satisfies the two conditions for omitted variable bias,
then the OLS estimator in the regression omitting that
variable is biased and inconsistent.
So, even if n is large, $\hat{\beta}_1$ will not be close to β1.
This raises a deeper question: how do we define β1? That is,
what precisely do we want to estimate when we run a
regression?
What precisely do we want to estimate when we run a
regression?
There are (at least) three possible answers to this question:
1. We want to estimate the slope of a line through a
scatterplot as a simple summary of the data to which
we attach no substantive meaning.
This can be useful at times, but isn’t very
interesting intellectually and isn’t what this course
is about.
2. We want to make forecasts, or predictions, of the value
of Y for an entity not in the data set, for which we
know the value of X.
Forecasting is an important job for economists,
and excellent forecasts are possible using
regression methods without needing to know causal
effects. We will return to forecasting later in the
course.
3. We want to estimate the causal effect on Y of a change
in X.
This is why we are interested in the class size
effect. Suppose the school board decided to cut
class size by 2 students per class (holding all other
factors fixed). What would be the effect on test
scores? This is a causal question (what is the
causal effect on test scores of STR?) so we need to
estimate this causal effect.
What, precisely, is a causal effect?
· “Causality” is a complex concept!
· In this course, we take a practical approach to defining
causality:
A causal effect is defined to be the effect measured
in an ideal randomized controlled experiment.
Ideal Randomized Controlled Experiment
· Randomized: subjects from the population of interest are
randomly assigned to a treatment or control group (so
there are no confounding factors)
· Controlled: having a control group permits measuring the
differential effect of the treatment
· Experiment: the treatment is assigned as part of the
experiment: the subjects have no choice, so there is no
“reverse causality” in which subjects choose the treatment
they think will work best.
Back to class size:
Imagine an ideal randomized controlled experiment for
measuring the effect on Test Score of reducing STR…
(1) In that experiment, students would be randomly
assigned to classes, which would have different sizes.
(2) Because they are randomly assigned, all student
characteristics (and thus ui) would be distributed
independently of STRi.
(3) Thus, E(ui|STRi) = 0 – that is, LSA #1 holds in a
randomized controlled experiment.
How does our observational data differ from this ideal?
· The treatment is not randomly assigned
· Consider PctEL – percent English learners – in the district.
It plausibly satisfies the two criteria for omitted variable
bias: Z = PctEL is:
(1) a determinant of Y; and
(2) correlated with the regressor X.
· Thus, the “control” and “treatment” groups differ in a
systematic way, so corr(STR,PctEL) ≠ 0
· (Randomized + Controlled) means that any differences (but
the treatment) between the treatment and control groups are
random – not systematically related to the treatment
· But, with observational data, we can eliminate the
difference in PctEL between the large (control) and small
(treatment) groups by examining the effect of class size
among districts with the same PctEL.
o If the only systematic difference between the large and
small class size groups is in PctEL, then we are back to
the randomized controlled experiment – within each
PctEL group.
o This is one way to “control” for the effect of PctEL
when estimating the effect of STR.
Three ways to overcome omitted variable bias
1. Run a randomized controlled experiment in which
treatment (STR) is randomly assigned: then PctEL is still
a determinant of TestScore, but PctEL is uncorrelated with
STR.
2. Adopt the “cross tabulation” approach, with finer
gradations of STR and PctEL – within each group, all
classes have the same PctEL, so we control for PctEL.
3. Use a regression in which the omitted variable (PctEL) is
no longer omitted: include PctEL as an additional
regressor in a multiple regression.
The Population Multiple Regression Model
(SW Section 6.2)
Consider the case of two regressors:
Yi = β0 + β1X1i + β2X2i + ui, i = 1,…,n
· Y is the dependent variable
· X1, X2 are the two regressors (independent variables)
· (Yi, X1i, X2i) denote the ith observation on Y, X1, and X2.
· β0 = unknown population intercept
· β1 = effect on Y of a change in X1, holding X2 constant
· β2 = effect on Y of a change in X2, holding X1 constant
· ui = the regression error (omitted factors)
Interpretation of coefficients in multiple regression
Yi = β0 + β1X1i + β2X2i + ui, i = 1,…,n
Consider changing X1 by ΔX1 while holding X2 constant:
Population regression line before the change:
Y = β0 + β1X1 + β2X2
Population regression line, after the change:
Y + ΔY = β0 + β1(X1 + ΔX1) + β2X2
Before: Y = β0 + β1X1 + β2X2
After: Y + ΔY = β0 + β1(X1 + ΔX1) + β2X2
Difference: ΔY = β1ΔX1
So:
β1 = ΔY/ΔX1, holding X2 constant
β2 = ΔY/ΔX2, holding X1 constant
β0 = predicted value of Y when X1 = X2 = 0.
The OLS Estimator in Multiple Regression
(SW Section 6.3)
With two regressors, the OLS estimator solves:
$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i})\right]^2$$
· This minimization problem is solved using calculus.
· This yields the OLS estimators of β0, β1 and β2. The
formulas are complicated and we will not derive them.
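Although the formulas are not derived here, the calculus step can be sketched in code: setting the three derivatives of the sum of squares to zero gives the normal equations (X'X)b = X'Y, which software solves numerically. Below is a minimal Python sketch using only the standard library. The function name and the tiny data set are made up for illustration, and statistical packages use more robust numerical methods than this plain Gaussian elimination.

```python
def ols_two_regressors(y, x1, x2):
    """Return (b0, b1, b2) minimizing sum of (y - b0 - b1*x1 - b2*x2)^2."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    k = 3
    # Normal equations (X'X) b = X'y, from setting the three derivatives to zero
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented matrix
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfect multicollinearity: X'X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (aug[i][k] - tail) / aug[i][i]
    return tuple(b)

# Made-up data generated exactly by y = 2 + 3*x1 - 1*x2 (no error term)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2 + 3 * a - b for a, b in zip(x1, x2)]
b0, b1, b2 = ols_two_regressors(y, x1, x2)
print(b0, b1, b2)
```

Because the made-up Y has no error term, the estimates recover the generating coefficients up to floating-point error. The singularity check is where perfect multicollinearity would surface.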
Example: the California test score data
Regression of TestScore against STR:
$\widehat{TestScore}$ = 698.9 – 2.28×STR
Now include percent English Learners in the district (PctEL):
$\widehat{TestScore}$ = 686.0 – 1.10×STR – 0.65×PctEL
· What happens to the coefficient on STR?
· Why? (Note: corr(STR, PctEL) = 0.19)
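One way to see the mechanism is the in-sample omitted variable identity: the slope from the short regression equals the slope from the long regression plus the coefficient on the omitted regressor times the slope from an auxiliary regression of PctEL on STR. The auxiliary slope, written here as $\hat{\delta}_1$, is not reported in these slides. The value below is only what the two reported regressions imply, up to rounding and assuming both were run on the same districts.

```latex
\hat{\beta}_1^{\,short} = \hat{\beta}_1^{\,long} + \hat{\beta}_2^{\,long}\,\hat{\delta}_1
\quad\Longrightarrow\quad
-2.28 = -1.10 + (-0.65)\,\hat{\delta}_1
\quad\Longrightarrow\quad
\hat{\delta}_1 = \frac{-2.28 + 1.10}{-0.65} \approx 1.82
```

The implied slope is positive, in line with corr(STR, PctEL) = 0.19 > 0. Combined with the negative coefficient on PctEL, this pushes the short-regression coefficient on STR further below zero.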
Multiple regression in R
$\widehat{TestScore}$ = 686.0 – 1.10×STR – 0.65×PctEL
More on this printout later…
Measures of Fit for Multiple Regression
(SW Section 6.4)
$Y_i = \hat{Y}_i + \hat{u}_i$
$R^2$ = fraction of variance of Y explained by X
$\bar{R}^2$ = “adjusted $R^2$” = $R^2$ with a degrees-of-freedom correction
$R^2$ and $\bar{R}^2$ (adjusted $R^2$)
The $R^2$ is the fraction of the variance explained – same
definition as in regression with a single regressor:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS},$$
where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{\hat{Y}})^2$, $SSR = \sum_{i=1}^{n} \hat{u}_i^2$, $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$.
· The $R^2$ always increases when you add another regressor (unless the
estimated coefficient on the added regressor is exactly zero) – a
bit of a problem for a measure of “fit”
$R^2$ and $\bar{R}^2$, ctd.
The $\bar{R}^2$ (the “adjusted $R^2$”) corrects this problem by
“penalizing” you for including another regressor – the $\bar{R}^2$
does not necessarily increase when you add another regressor.
$$\text{Adjusted } R^2: \quad \bar{R}^2 = 1 - \left(\frac{n-1}{n-k-1}\right)\frac{SSR}{TSS}$$
Note that $\bar{R}^2 < R^2$, however if n is large the two will be very
close.
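The degrees-of-freedom correction is easy to compute by hand. The short Python sketch below uses function names chosen for illustration. It plugs in the rounded values that appear elsewhere in these notes for the California regression with two regressors (R² = .426, n = 420 districts, k = 2), using SSR/TSS = 1 − R².

```python
def r_squared(ssr, tss):
    """R^2 = 1 - SSR/TSS."""
    return 1.0 - ssr / tss

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (n-1)/(n-k-1) * SSR/TSS, with SSR/TSS = 1 - R^2."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Rounded inputs: R^2 = .426, n = 420 districts, k = 2 regressors
adj = adjusted_r_squared(0.426, n=420, k=2)
print(round(adj, 4))
```

With n = 420 and k = 2 the factor (n−1)/(n−k−1) = 419/417 is barely above 1, so the adjustment moves R² only in the third decimal place. The result is close to the reported .424, and the small remaining gap is presumably due to rounding of the R² used as input.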
Measures of fit, ctd.
Test score example:
(1) $\widehat{TestScore}$ = 698.9 – 2.28×STR, $R^2$ = .05
(2) $\widehat{TestScore}$ = 686.0 – 1.10×STR – 0.65×PctEL, $R^2$ = .426, $\bar{R}^2$ = .424
· What – precisely – does this tell you about the fit of
regression (2) compared with regression (1)?
· Why are the $R^2$ and the $\bar{R}^2$ so close in (2)?
The Least Squares Assumptions for Multiple Regression
(SW Section 6.5)
Yi = β0 + β1X1i + β2X2i + … + βkXki + ui, i = 1,…,n
1. The conditional distribution of u given the X’s has mean
zero, that is, E(ui|X1i = x1,…, Xki = xk) = 0.
2. (X1i,…,Xki,Yi), i =1,…,n, are i.i.d.
3. Large outliers are unlikely: X1,…, Xk, and Y have finite fourth
moments: $E(X_{1i}^4) < \infty, \dots, E(X_{ki}^4) < \infty, E(Y_i^4) < \infty$.
4. There is no perfect multicollinearity.
Assumption #1: the conditional mean of u given the
included Xs is zero.
E(u|X1 = x1,…, Xk = xk) = 0
· This has the same interpretation as in regression with a
single regressor.
· Failure of this condition leads to omitted variable bias,
specifically, if an omitted variable
(1) belongs in the equation (so is in u) and
(2) is correlated with an included X
then this condition fails and there is OV bias.
· The best solution, if possible, is to include the omitted
variable in the regression.
Assumption #2: (X1i,…,Xki,Yi), i =1,…,n, are i.i.d.
This is satisfied automatically if the data are collected by
simple random sampling.
Assumption #3: large outliers are rare (finite fourth
moments)
This is the same assumption as we had before for a single
regressor. As in the case of a single regressor, OLS can
be sensitive to large outliers, so you need to check your
data (scatterplots!) to make sure there are no crazy values
(typos or coding errors).
Assumption #4: There is no perfect multicollinearity
Perfect multicollinearity is when one of the regressors is
an exact linear function of the other regressors.
Example: Suppose you accidentally include STR twice:
Perfect multicollinearity is when one of the regressors is an
exact linear function of the other regressors.
· In the previous regression, β1 is the effect on TestScore of a
unit change in STR, holding STR constant
· We will return to perfect (and imperfect) multicollinearity
shortly, with more examples…
With these least squares assumptions in hand, we now can
derive the sampling distribution of $\hat{\beta}_1, \hat{\beta}_2, \dots, \hat{\beta}_k$.
The Sampling Distribution of the OLS Estimator
(SW Section 6.6)
Under the four Least Squares Assumptions,
· $\hat{\beta}_1$ is unbiased.
· var($\hat{\beta}_1$) is inversely proportional to n.
· Other than its mean and variance, the exact (finite-n)
distribution of $\hat{\beta}_1$ is very complicated; but for large n…
o $\hat{\beta}_1$ is consistent: $\hat{\beta}_1 \xrightarrow{p} \beta_1$
o $\dfrac{\hat{\beta}_1 - E(\hat{\beta}_1)}{\sqrt{\mathrm{var}(\hat{\beta}_1)}}$ is approximately distributed N(0,1)
o These statements hold for $\hat{\beta}_1, \dots, \hat{\beta}_k$
Conceptually, there is nothing new here!
Multicollinearity, Perfect and Imperfect
(SW Section 6.7)
Perfect multicollinearity is when one of the regressors is an
exact linear function of the other regressors.
Some more examples of perfect multicollinearity
1. The example from before: you include STR twice,
2. Regress TestScore on a constant, D, and B, where: Di =
1 if STR ≤ 20, = 0 otherwise; Bi = 1 if STR >20, = 0
otherwise, so Bi = 1 – Di and there is perfect
multicollinearity.
The dummy variable trap
Suppose you have a set of multiple binary (dummy)
variables, which are mutually exclusive and exhaustive – that
is, there are multiple categories and every observation falls in
one and only one category (Freshmen, Sophomores, Juniors,
Seniors, Other). If you include all these dummy variables
and an intercept, you will have perfect multicollinearity – this
is sometimes called the dummy variable trap.
· Why is there perfect multicollinearity here?
· Solutions to the dummy variable trap:
1. Omit one of the groups (e.g. Senior), or
2. Omit the intercept
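The reason for the perfect multicollinearity can be checked directly: when the categories are mutually exclusive and exhaustive, the dummy columns add up to 1 for every observation, which is exactly the intercept column. The Python sketch below uses a small made-up sample of students, and the variable names are illustrative.

```python
categories = ["Freshman", "Sophomore", "Junior", "Senior", "Other"]
# Made-up sample: each student falls in one and only one category
students = ["Junior", "Freshman", "Other", "Senior", "Sophomore", "Junior"]

# One dummy column per category, plus an intercept column of ones
dummies = [[1 if s == c else 0 for c in categories] for s in students]
intercept = [1] * len(students)

# Each row's dummies sum to exactly 1, i.e. to the intercept column
row_sums = [sum(row) for row in dummies]
trap = row_sums == intercept
print(trap)

# Solution 1: drop one category (here "Senior"); the remaining dummies
# no longer add up to the intercept column
reduced = [[d for d, c in zip(row, categories) if c != "Senior"] for row in dummies]
reduced_sums = [sum(row) for row in reduced]
print(reduced_sums == intercept)
```

After dropping one group, the coefficients on the remaining dummies are read relative to the omitted group.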
Perfect multicollinearity, ctd.
· Perfect multicollinearity usually reflects a mistake in the
definitions of the regressors, or an oddity in the data
· If you have perfect multicollinearity, your statistical
software will let you know
· The solution to perfect multicollinearity is to modify your
list of regressors so that you no longer have perfect
multicollinearity.
Imperfect multicollinearity
Imperfect and perfect multicollinearity are quite different
despite the similarity of the names.
Imperfect multicollinearity occurs when two or more
regressors are very highly but not perfectly correlated.
Imperfect multicollinearity, ctd.
Imperfect multicollinearity implies that one or more of the
regression coefficients will be imprecisely estimated.
· The idea: the coefficient on X1 is the effect of X1 holding
X2 constant; but if X1 and X2 are highly correlated, there is
very little variation in X1 once X2 is held constant – so the
data don’t contain much information about what happens
when X1 changes but X2 doesn’t.
· Imperfect multicollinearity (correctly) results in large
standard errors for one or more of the OLS coefficients.
· The math? See SW, App. 6.2
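As a pointer to that math, here is the two-regressor result as recalled from the textbook appendix. It covers the homoskedastic case only and should be checked against SW before use. In large samples the variance of $\hat{\beta}_1$ is scaled up by a factor that grows with the squared correlation between X1 and X2.

```latex
\sigma^2_{\hat{\beta}_1} = \frac{1}{n}\left(\frac{1}{1 - \rho^2_{X_1,X_2}}\right)\frac{\sigma^2_u}{\sigma^2_{X_1}}
```

For example, a correlation of 0.9 gives a factor of 1/(1 − 0.81) ≈ 5.3, while the correlation of 0.19 between STR and PctEL gives a factor of only about 1.04.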
Next topic: hypothesis tests and confidence intervals…