Question 1:

Now open RStudio.

Download the dataset HW2.

Make sure you load the libraries broom and tidyverse at the top of your R script (just like last time).

Load in the dataset HW2, which contains information on fatality risk, wages, and earnings. (You can use read_csv from the readr package, which tidyverse loads for you, or base R's read.csv.) I named the data VSL when I read it in.

First, run a regression of wages on fatality risk, controlling for the education dummy variables.

What is your estimated coefficient on the variable fat_risk?
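
If you are unsure how the pieces fit together, here is a minimal sketch of the workflow on simulated data. It is not the HW2 data and will not give you the homework answer. The names fat_risk and wages match this assignment, but educ_hs and educ_college are hypothetical names for the education dummies, so check the actual column names in VSL (for example with names(VSL)) and swap them in.

```r
library(tidyverse)
library(broom)

# Simulated stand-in for the HW2 data so this sketch runs on its own.
# fat_risk and wages are the names used in this assignment; educ_hs and
# educ_college are HYPOTHETICAL names for the education dummies.
set.seed(1)
n <- 500
VSL_demo <- tibble(
  fat_risk     = runif(n, 0, 5),
  educ_hs      = rbinom(n, 1, 0.5),
  educ_college = rbinom(n, 1, 0.3),
  wages        = 15 + 0.5 * fat_risk + 3 * educ_hs + 8 * educ_college +
    rnorm(n, sd = 2)
)

# Regress wages on fatality risk, controlling for the education dummies.
fit <- lm(wages ~ fat_risk + educ_hs + educ_college, data = VSL_demo)

# tidy() returns one row per term; keep the estimate for fat_risk.
coefs <- tidy(fit)
fat_coef <- coefs %>%
  filter(term == "fat_risk") %>%
  pull(estimate)
fat_coef
```

In your own script you would replace VSL_demo with VSL and the hypothetical dummy names with the ones in the dataset.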

Question 2:

Continuing the problem from above, what is the implied VSL based on that regression coefficient? Hint: you must scale the coefficient so that hourly wages are in the same time units as annual fatality risk.
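
One way to think about the scaling in the hint, written as a sketch under stated assumptions: H is the number of hours worked per year (2,000 is a common convention, but it is an assumption here), and N is the number of workers the risk variable is stated per (for example, N = 10,000 if fat_risk is deaths per 10,000 workers per year). Check how fat_risk is actually measured in HW2 before plugging in a value for N.

```latex
% \hat{\beta}: coefficient on fat_risk from the hourly wage regression
% H: hours worked per year (ASSUMED, e.g. 2000)
% N: workers per unit of fat_risk (ASSUMED, depends on the data's units)
\[
  p = \frac{\texttt{fat\_risk}}{N},
  \qquad
  \text{annual earnings} = H \times \text{wages}
\]
\[
  \text{VSL}
  = \frac{\partial(\text{annual earnings})}{\partial p}
  = \hat{\beta} \times H \times N
\]
```

The first factor converts dollars per hour into dollars per year, and the second converts a one-unit change in fat_risk into a change in the probability of death.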

Question 3:

Now run a regression of annual earnings on annual fatality risk controlling for education.

What is the coefficient on fat_risk now? What is the implied VSL based on this regression?

Question 4:

Now I want you to create a scatter plot using the ggplot command.

On the x-axis will be your fatality risk. On the y-axis will be your wages.

To create that scatter plot, type

sh <- ggplot(VSL, aes(x = fat_risk, y = wages)) + geom_point()

Now you can add better titles and axis labels by typing the following.

sh + labs(title = "Wages vs. Fatality Risk", x = "Fatality Risk", y = "Wages")

After you do both of those things, click on the figure and export it as a JPG with the name "HW2_yourname", where yourname is actually your name. Please upload that figure with your HW below.
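
If the export button gives you trouble, saving the figure from code is a possible alternative. The sketch below uses simulated data so it runs on its own; in your script you would use VSL and your own name in the file name. The width and height values are arbitrary choices, not requirements.

```r
library(tidyverse)

# Simulated stand-in data so this sketch runs on its own; use VSL in your script.
set.seed(1)
VSL_demo <- tibble(
  fat_risk = runif(200, 0, 5),
  wages    = 15 + 0.5 * fat_risk + rnorm(200, sd = 2)
)

# Same two steps as above: build the scatter plot, then add the labels.
sh <- ggplot(VSL_demo, aes(x = fat_risk, y = wages)) + geom_point()
sh_labeled <- sh +
  labs(title = "Wages vs. Fatality Risk", x = "Fatality Risk", y = "Wages")

# ggsave() picks the JPEG device from the .jpg extension.
# width and height are in inches and are ARBITRARY choices here.
ggsave("HW2_yourname.jpg", plot = sh_labeled, width = 6, height = 4)
```

The file is written to your current working directory, which you can check with getwd().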

Question 5:

Suppose someone doesn't get causality. You try to explain with formulas, and that fails. So now construct a DAG (directed acyclic graph). This DAG should illustrate both (1) characteristics of a job that are related to unobservables that influence wages, and (2) fatality risks on a job that are exogenous and also affect wages. Please upload a sketch of your DAG.

