## Description

The notes for Unit 1 are attached; read 1-10 before starting work on the problem set.

## Explanation & Answer

View the attached explanation and answer, and let me know if you have any questions. I have answered your questions according to the file you sent me. Please feel free to ask any related questions. I hope you get good marks. Thanks.

BSE Fundamental of Regression Analysis 2021/22

Assignment 1

Aysu Demir, Claudia Ochoa and Sandra Vicaria

1.

a) The dataset contains the variable lwklywge, defined as lwklywge = ln(wage), so we first create a new variable called wage, with wage = exp(lwklywge):

generate wage = exp(lwklywge)

wage is the weekly wage. We now compute its mean conditional on educ being 0, that is, the average wage of the people in the sample with zero years of education:

mean wage if educ == 0

The output of this command shows that the mean wage for people with no education is 241.4846. In the model $\text{wage} = \beta_0 + \beta_1 \cdot \text{educ}$ this gives $\hat{\beta}_0 = 241.4846$, which implies that the mean is an estimator of the parameter $\beta_0$.
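One way to spell out why the subgroup mean estimates the intercept, as a sketch that assumes the linear model holds with an error term $u$ satisfying $E[u \mid \text{educ}] = 0$ (the error term, this assumption, and the symbol $n_0$ are not stated in the original answer):

```latex
\begin{aligned}
\text{wage} &= \beta_0 + \beta_1\,\text{educ} + u, \qquad E[u \mid \text{educ}] = 0 \\
E[\text{wage} \mid \text{educ} = 0] &= \beta_0 + \beta_1 \cdot 0 = \beta_0 \\
\hat{\beta}_0 &= \frac{1}{n_0} \sum_{i:\,\text{educ}_i = 0} \text{wage}_i = 241.4846
\end{aligned}
```

Here $n_0$ is the number of observations with educ = 0. This estimator uses only the educ = 0 subsample, so it need not coincide with the intercept from a regression on the full sample.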

b) Using the command hist we produced the density histogram of the weekly wage variable (wage). We then used the command summarize to obtain its basic descriptive statistics.

hist wage

summarize wage

summarize wage, detail

* The sample mean (439.4709) lies above the sample median (384.7117), as expected.
* The coefficient of skewness is 26.39. Since wage is treated as lognormal, a distribution expected to exhibit some degree of right skew, this positive coefficient is as expected.
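The ordering of mean and median can be checked against the textbook lognormal case. As a sketch, assume $\ln(\text{wage}) \sim N(\mu, \sigma^2)$ exactly, which the data only approximate; $\mu$ and $\sigma^2$ are notation introduced here:

```latex
\begin{aligned}
\operatorname{median}(\text{wage}) &= e^{\mu} \\
E[\text{wage}] &= e^{\mu + \sigma^2/2} \\
\frac{E[\text{wage}]}{\operatorname{median}(\text{wage})} &= e^{\sigma^2/2} > 1 \quad \text{for } \sigma^2 > 0
\end{aligned}
```

Under that assumption the mean lies above the median, which matches the ordering of the sample values 439.4709 and 384.7117.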

c) Here we used exactly the same commands as in the previous question, but for the variable lwklywge. The following commands produce the density histogram of lwklywge and its basic descriptive statistics, using summarize and summarize, detail.

hist lwklywge

summarize lwklywge

summarize lwklywge, detail

* The sample mean (5.89) and sample median (5.95) are as expected.
* The coefficient of skewness is -2: the distribution is highly skewed to the left, that is, its left tail is longer.

d) The first step is to sort the data so that we can plot the quantity we are interested in. We use by as a prefix so that the mean is computed within each education subgroup rather than over the entire dataset:

by educ, sort: egen conditional_mean = mean(lwklywge)

To see how the conditional mean varies with education we use twoway connected, which gives a plot similar to the theoretical one from Mostly Harmless Econometrics referred to in the problem set:

twoway (connected conditional_mean educ)
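The egen approach keeps one row per observation, so the plot draws many overlapping points at each education level. A possible alternative, sketched here under the assumption that the full dataset is needed again afterwards, collapses the data to one row per education level before plotting. The variable name cond_mean and the axis titles are illustrative and do not come from the original answer:

```stata
* Sketch: plot one point per education level instead of one per observation.
* preserve/restore keeps the full dataset available for later steps.
preserve
* cond_mean is a hypothetical name for the mean of lwklywge within each educ value.
collapse (mean) cond_mean = lwklywge, by(educ)
twoway (connected cond_mean educ, sort), ///
    ytitle("Mean log weekly wage") xtitle("Years of education")
restore
```

The `///` line continuation works in a do-file; when typing into the Command window, the twoway command would go on a single line.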

e)

regress lwklywge educ

The thick black function is an estimate of $\text{lwklywge} = \beta_0 + \beta_1 \cdot \text{educ}$, with $\hat{\beta}_0 = 4.99$ and $\hat{\beta}_1 = 0.070$.
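Because the dependent variable is in logs, the slope has a percentage interpretation. A worked sketch using the rounded estimates quoted above (educ = 12 is an illustrative value, not taken from the problem set):

```latex
\begin{aligned}
\widehat{\text{lwklywge}} &= 4.99 + 0.070 \cdot \text{educ} \\
\widehat{\text{lwklywge}}\,\big|_{\text{educ}=12} &= 4.99 + 0.070 \cdot 12 = 5.83 \\
100\,\left(e^{0.070} - 1\right) &\approx 7.25\%
\end{aligned}
```

One additional year of education is therefore associated with a weekly wage roughly 7% higher.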

f) The thick black function illustrates the CEF (conditional expectation function) of lwklywge given years of education in our sample. The CEF shows that, despite large variation in individual circumstances, people with more years of education tend on average to earn more.
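To compare the sample CEF with the regression line from part e), the two can be drawn on the same axes. A minimal sketch, assuming the variable conditional_mean created in part d) is still in memory; the legend labels and axis titles are illustrative:

```stata
* Overlay the conditional means by education level and the OLS fitted line.
* The sort option orders the connected points by educ so the line does not zigzag.
twoway (connected conditional_mean educ, sort) ///
       (lfit lwklywge educ), ///
       legend(order(1 "Conditional mean" 2 "Linear fit")) ///
       ytitle("Log weekly wage") xtitle("Years of education")
```

As written, with `///` continuations, this is meant to be run from a do-file rather than typed into the Command window.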

g) We could use the thick black function for...