elementary econometrics

Economics

Description

1. Consider the following three models of house sales, where Housing_i is the house price, POP_i is population, Y_i is the level of income and r_i is the interest rate for home mortgages (standard errors in parentheses, with t-stats reported below them). The correlation matrix is below the estimations. Each cell in the table is the simple correlation coefficient between the two variables indicated in the corresponding row and column headings.

Model 1:

$$\widehat{\text{Housing}}_i = -132 + 2225\,\ln(\text{POP}_i) + 4500\,\ln(Y_i) - 185\,r_i$$

Standard errors: (1612) (3000) (77)
t-stats: 1.38, 1.50, -2.45
Adj-R2 = 0.784, N = 50

Model 2:

$$\widehat{\text{Housing}}_i = -140 + 8700\,\ln(Y_i) - 184\,r_i$$

Standard errors: (1548) (72)
t-stats: 5.62, -2.56
Adj-R2 = 0.769

Model 3:

$$\widehat{\text{Housing}}_i = -160 + 3575\,\ln(\text{POP}_i) - 189\,r_i$$

Standard errors: (812) (72)
t-stats: 4.40, -2.62
Adj-R2 = 0.761

Correlation Matrix

             ln(POP_i)   ln(Y_i)   r_i
ln(POP_i)    1.00
ln(Y_i)      0.92        1.00
r_i          0.15        -0.34     1.00

a. Look at the simple correlation coefficients. Is there evidence of multicollinearity? If yes, for which variables?

b. Why does the coefficient on ln(POP_i) increase when dropping ln(Y_i) from the equation?

c. Even if you did not have the correlation matrix, you could have determined whether there is multicollinearity. What could you say from comparing the regression output between the three models that would indicate whether there is multicollinearity?

d. Look at the coefficient on interest rates across the models. Why does it not change much across regressions while the coefficients on the other variables do?

e. Which model is preferred and why?

f. Perform a Durbin-Watson test of first-order serial correlation for Model 1. Show the critical values and the decision rule. Suppose you obtain a DW statistic of 0.25. Is there evidence of serial correlation? Show your work.
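
The decision rule that part f asks for can be sketched in code. This is a minimal illustration, not an official solution: the function name `durbin_watson_decision` is hypothetical, and the bounds 1.42 and 1.67 are assumed approximate 5% values for N = 50 with 3 regressors, which should be checked against the Durbin-Watson table used in the course.

```python
def durbin_watson_decision(dw: float, d_lower: float, d_upper: float) -> str:
    """Classify a Durbin-Watson statistic with the usual bounds test.

    d_lower and d_upper are the tabulated critical bounds (dL, dU).
    """
    if not 0.0 <= dw <= 4.0:
        raise ValueError("the DW statistic must lie between 0 and 4")
    if dw < d_lower:
        return "reject H0: positive serial correlation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4.0 - d_upper:
        return "do not reject H0"
    if dw <= 4.0 - d_lower:
        return "inconclusive"
    return "reject H0: negative serial correlation"


# Assumed bounds (approximate 5% values for N = 50, 3 regressors);
# replace them with the values from your own table.
D_LOWER, D_UPPER = 1.42, 1.67

print(durbin_watson_decision(0.25, D_LOWER, D_UPPER))
```

Under these assumed bounds, DW = 0.25 lies well below the lower bound, so the null hypothesis of no first-order serial correlation would be rejected in favor of positive serial correlation.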

2. You are the assistant to an economist hired by the Department of Education to determine how much spending on public education affects future salaries. For this, a sample of 120 individuals who graduated from public high schools was selected, and the amount of spending on public education (the average during the time the individuals were in school) was recorded. Data from a household survey (N = 120) allowed you to estimate the following regression model (IMPORTANT: t-statistics are in parentheses):

$$\widehat{\ln(\text{sal}_i)} = 7.4 + 0.14\,\text{educ}_i + 0.025\,\ln(\text{spend}_i) + 0.03\,\text{exp}_i + 0.02\,\text{age}_i - 0.20\,\text{BLACK}_i$$

t-stats: (2.41) (3.05) (1.53) (0.97) (2.00)
R2 = 0.75

where,

LN(sal_i) = the natural log of the annual salary of individual i

educ_i = the years of education (12 = high school, 13 = one year of college, etc.)

LN(spend_i) = the natural log of the (average) real annual spending in public education in the high school district from which individual i graduated (millions of dollars)

exp_i = years of work experience

age_i = the age of individual i

BLACK_i = 1 if individual i is black, 0 otherwise

a. What are the expected signs of the coefficients? (The expected signs. Not the estimated ones.)

b. Which variables are statistically significant at the 5% level?

c. Are there any obvious signs of omitted variable bias? If yes, for which variable(s)?

d. Do you suspect multicollinearity? Explain why or why not. If yes, for which variable(s)?

e. What would be your primary concern from this dataset: serial correlation or heteroskedasticity? Why?

f. What is the interpretation of the R2?

g. What is the expected value of the dependent variable for a white 35 year old with 16 years of education, 20 years of experience and from a school district spending an average of 50 million dollars a year?

h. Consider that the correlation between experience (exp) and age is 0.85 and the correlation between experience and education is 0.46.

i. A classmate argues that the correlation of experience and age is creating a problem of multicollinearity that is affecting the hypothesis tests of experience and age. Is this possible? Why?

ii. Another classmate argues that the correlation of experience and age is also creating a problem of multicollinearity that is affecting the hypothesis test of education. Is your classmate right or wrong? Why?

i. Suppose that a White test for heteroskedasticity produces a test statistic equal to 12. Conduct a hypothesis test for heteroskedasticity at the 5% significance level.

j. Now suppose instead that you have been hired to provide an independent evaluation of the study above. You read in the conclusions section of the report: "We conclude that the amount of public spending in education has a positive and statistically significant impact on future salaries". Considering the econometric problems, do you agree with this conclusion using the information above? Explain your answer. (Hint: This question is about the quality of the regression and not about what you might think. Assume there are no omitted variables. The answer can be only one sentence.)
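
The arithmetic behind part g can be sketched as follows. This is only an illustration of plugging values into the estimated equation above; the function name `predicted_log_salary` is hypothetical, and spending is entered in millions of dollars as in the variable definition.

```python
import math


def predicted_log_salary(educ: float, spend_millions: float,
                         exp: float, age: float, black: int) -> float:
    """Fitted value of LN(sal) from the estimated regression in question 2."""
    return (7.4
            + 0.14 * educ
            + 0.025 * math.log(spend_millions)
            + 0.03 * exp
            + 0.02 * age
            - 0.20 * black)


# White (BLACK = 0), 35 years old, 16 years of education,
# 20 years of experience, district spending of 50 million dollars.
log_sal = predicted_log_salary(educ=16, spend_millions=50.0,
                               exp=20, age=35, black=0)
print(round(log_sal, 4))  # 11.0378
```

The result is in log units because the dependent variable is LN(sal), so it is the expected log salary rather than the salary itself.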

Elementary Econometrics, Homework #3. Due date: Tuesday, July 31.

Explanation & Answer

Hi, here is your assignment :)

1. Consider the following three models of house sales, where Housing_i is the house price, POP_i is
population, Y_i is the level of income and r_i is the interest rate for home mortgages (standard
errors in parentheses, with t-stats reported below them). The correlation matrix is below the estimations.
Each cell in the table is the simple correlation coefficient between the two variables indicated in
the corresponding row and column headings.

a. Look at the simple correlation coefficients. Is there evidence of multicollinearity? If yes, for
which variables?
Answer:
Yes. The simple correlation between ln(POP_i) and ln(Y_i) is 0.92, which points to
multicollinearity between these two variables; the correlations of r_i with ln(POP_i) (0.15) and
with ln(Y_i) (-0.34) are low. Looking only at pairwise correlations is limiting, however: the
pairwise correlations can be small while a linear dependence still exists among three or more
variables.
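
To give a rough sense of how strong a correlation of 0.92 is, here is a minimal sketch of the variance inflation factor implied by a single pairwise correlation. The helper name `pairwise_vif` is hypothetical, and the formula 1 / (1 - r^2) is exact only when the two variables are the only regressors; Model 1 also includes r_i, so this is an approximation, not the VIF a regression package would report.

```python
def pairwise_vif(r: float) -> float:
    """Variance inflation factor implied by one pairwise correlation r."""
    if abs(r) >= 1.0:
        raise ValueError("|r| must be strictly less than 1")
    return 1.0 / (1.0 - r ** 2)


vif_pop_income = pairwise_vif(0.92)
print(round(vif_pop_income, 2))  # 6.51
```

For comparison, the same calculation with the 0.15 correlation between ln(POP_i) and r_i gives a value close to 1, meaning almost no variance inflation from that pair.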

b. Why does the coefficient on ln(POP_i) increase when dropping ln(Y_i) from the equation?
Answer:
The coefficient on ln(POP_i) increases from 2225 (Model 1) to 3575 (Model 3) when ln(Y_i) is
dropped because ln(Y_i) and ln(POP_i) are highly correlated: the correlation matrix shows a
correlation of 0.92 between them. Since ln(Y_i) also has a positive coefficient, ln(POP_i) picks
up part of the effect of the omitted ln(Y_i).
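
The size of this shift can be tied to the standard omitted-variable algebra for OLS: the coefficient in the short regression equals the coefficient in the long regression plus the omitted variable's coefficient times the slope from regressing the omitted variable on the retained ones. The sketch below backs out that implied slope from the reported estimates; the function name `implied_auxiliary_slope` is hypothetical, and the results are approximate because the published coefficients are rounded.

```python
def implied_auxiliary_slope(short_coef: float, long_coef: float,
                            omitted_coef: float) -> float:
    """Slope delta implied by: short_coef = long_coef + omitted_coef * delta."""
    return (short_coef - long_coef) / omitted_coef


# Model 3 (ln(Y) omitted) versus Model 1 (ln(Y) included, coefficient 4500).
delta_pop = implied_auxiliary_slope(short_coef=3575, long_coef=2225,
                                    omitted_coef=4500)
delta_r = implied_auxiliary_slope(short_coef=-189, long_coef=-185,
                                  omitted_coef=4500)

print(round(delta_pop, 3))  # 0.3
print(delta_r)
```

The implied slope on ln(POP_i) is sizeable while the one on r_i is close to zero, which matches the observation that the interest-rate coefficient barely moves across the three models.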

c. Even if you did not have the correlation matrix, you could have determined whether there is
multicollinearity. What could you say from comparing the regression output between the three
models that would indicate whether there is multicollinearity?
Answer:
Even without the correlation matrix, we can still detect multicollinearity. In Model 1 the
standard errors on ln(POP_i) and ln(Y_i) are large and their t-statistics are low (1.38 and 1.50)
although the adjusted R2 is high (0.784), which indicates multicollinearity. In Model 2, where
ln(POP_i) is dropped, the standard ...
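
The comparison this answer describes can be made concrete with the reported standard errors. The sketch below is only an illustration: the dictionary layout and the helper name `se_ratio` are hypothetical, while the numbers are the coefficients and standard errors printed in the three models.

```python
# (coefficient, standard error) pairs as reported in the problem.
MODELS = {
    "Model 1": {"ln_pop": (2225, 1612), "ln_y": (4500, 3000), "r": (-185, 77)},
    "Model 2": {"ln_y": (8700, 1548), "r": (-184, 72)},
    "Model 3": {"ln_pop": (3575, 812), "r": (-189, 72)},
}


def se_ratio(variable: str, reduced: str, full: str = "Model 1") -> float:
    """Standard error in the full model divided by that in a reduced model."""
    return MODELS[full][variable][1] / MODELS[reduced][variable][1]


for variable, reduced in [("ln_y", "Model 2"), ("ln_pop", "Model 3"),
                          ("r", "Model 2"), ("r", "Model 3")]:
    print(variable, reduced, round(se_ratio(variable, reduced), 2))
```

The standard errors on ln(Y_i) and ln(POP_i) are roughly twice as large in Model 1 as in the model that drops the other variable, while the standard error on r_i is nearly unchanged.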

