## Explanation & Answer


CDS 101 Final Exam Project
1. Read in the file and select the following columns
i. INSTNM
ii. CITY
iii. STABBR
iv. ADM_RATE
v. COSTT4_A
vi. MD_EARN_WNE_P10
b. Save the dataframe to the variable “data”
2. Rename the columns to the following
i. INSTNM -> NAME
ii. STABBR -> STATE
iii. COSTT4_A -> AVG_COST
iv. MD_EARN_WNE_P10 -> MED_10YR_PAY
b. Save the dataframe to the variable “data_renamed”
3. Count how many missing values there are for each state/territory
a. Answer
i. What state/territory has the most missing values?
1. How many were there?
ii. Which states/territories have no missing values?
b. Hint
i. Google “US postal code” and the abbreviation to find the full name of the state/territory
4. Using the data_renamed dataframe, impute missing tuition costs with the median cost. Save to the variable data_imputed_cost.
5. Using the data_imputed_cost dataframe, impute missing admission rates with the median admission rate. Save to the variable data_imputed_admissions.
6. Using the data_imputed_admissions dataframe, impute missing MED_10YR_PAY with the median 10-year pay. Save to the variable data_imputed.
7. Using the data_imputed dataframe, calculate the average cost per institution by state and order the values from smallest to largest. Save to the variable state_price.
a. Which state/territory has the highest price? What is that price?
b. Which state/territory has the lowest price? What is that price?
8. Using the state_price dataframe, create a scatterplot with state as the explanatory variable and price as the response variable. Add color, a descriptive title, and axis labels.
9. Even when adjusting the x-axis labels, it is not very clear which data point belongs to which state. Use ggplotly() to make the graph interactive so it is easier to see which data point corresponds to which state/price.
10. Using the data_imputed dataset, fit a linear regression model with average price as the explanatory variable and median debt as the response variable.
a. Print out the summary statistics.
b. What is R^2?
11. Use the tidy() function to report the slope and intercept of the model.
12. Use the glance() function to find the r-squared value.
a. Colleges that have a lower admission rate are considered more selective and
prestigious. Is there a relationship between how selective/prestigious a college is and
the earnings 10 years after graduation? Use the r-squared value to justify your answer.
13. Hypothesis test: average tuition price vs. George Mason
a. You will perform a one-sided hypothesis test to determine whether the average cost of attending George Mason University is higher than $24,537.32 (the average cost of universities in this dataset). The average university cost (mean_cost) and the average cost of GMU (cost_obs_stat) have been calculated for you.
14. Generate the null distribution and p-value
15. Visualize the results and shade to the right. Give the graph a title and label both the x and y axes.
a. Using a significance level of alpha = 0.05, decide whether to reject or fail to reject the null hypothesis. What does this mean in terms of the cost of GMU compared to the average tuition cost?
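
Tasks 1 through 7 above can be sketched roughly as follows. This is a minimal illustration, not the graded solution: the exam's data file is not named in the text, so the `raw` tibble below is a made-up stand-in with invented values, and `missing_by_state` is a hypothetical helper name. It also assumes that "missing values" in task 3 means `NA`s summed across the three numeric columns, which the task does not state explicitly.

```r
library(dplyr)

# Made-up stand-in for the exam's data file (all values are invented).
raw <- tibble(
  INSTNM = c("College A", "College B", "College C", "College D", "College E"),
  CITY = c("Fairfax", "Richmond", "Austin", "Dallas", "Houston"),
  STABBR = c("VA", "VA", "TX", "TX", "TX"),
  ADM_RATE = c(0.80, NA, 0.50, 0.60, NA),
  COSTT4_A = c(20000, 30000, NA, 10000, 12000),
  MD_EARN_WNE_P10 = c(50000, NA, 40000, NA, 42000),
  EXTRA = 1:5  # a column the task does not ask for
)

# Task 1: keep only the six requested columns.
data <- raw %>%
  select(INSTNM, CITY, STABBR, ADM_RATE, COSTT4_A, MD_EARN_WNE_P10)

# Task 2: rename() takes new_name = old_name.
data_renamed <- data %>%
  rename(NAME = INSTNM, STATE = STABBR,
         AVG_COST = COSTT4_A, MED_10YR_PAY = MD_EARN_WNE_P10)

# Task 3 (assumed interpretation): total NA count per state,
# summed over the three numeric columns, largest first.
missing_by_state <- data_renamed %>%
  group_by(STATE) %>%
  summarize(n_missing = sum(is.na(ADM_RATE)) + sum(is.na(AVG_COST)) +
              sum(is.na(MED_10YR_PAY))) %>%
  arrange(desc(n_missing))

# Tasks 4-6: replace each NA with the median of the observed values.
data_imputed_cost <- data_renamed %>%
  mutate(AVG_COST = if_else(is.na(AVG_COST),
                            median(AVG_COST, na.rm = TRUE), AVG_COST))

data_imputed_admissions <- data_imputed_cost %>%
  mutate(ADM_RATE = if_else(is.na(ADM_RATE),
                            median(ADM_RATE, na.rm = TRUE), ADM_RATE))

data_imputed <- data_imputed_admissions %>%
  mutate(MED_10YR_PAY = if_else(is.na(MED_10YR_PAY),
                                median(MED_10YR_PAY, na.rm = TRUE), MED_10YR_PAY))

# Task 7: mean cost per state, ordered from smallest to largest.
state_price <- data_imputed %>%
  group_by(STATE) %>%
  summarize(price = mean(AVG_COST)) %>%
  arrange(price)
```

Chaining the three imputations through separately named dataframes mirrors the variable names the tasks ask for; with real data the same steps apply once `raw` is replaced by the file read in for task 1.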
12. Use the glance() function to find the r-squared value.
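glance(reg_model)

| Statistic | Value |
| --- | --- |
| r.squared | 0.1423654 |
| adj.r.squared | 0.1421989 |
| sigma | 14669.76 |
| statistic | 855.3862 |
| p.value | 0 |
| df | 1 |
| logLik | -56768.34 |
| AIC | 113542.7 |
| BIC | 113562.3 |
| deviance | 1.108935e+12 |
| df.residual | 5153 |
| nobs | 5155 |
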
The r-squared value is 0.1423654.
The R^2 for the model is low: average price explains only 14.24% of the variation in the response variable, which means there is no strong linear relationship between the explanatory and response variables.
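
To make "explains only 14.24% of the variation" concrete, the sketch below shows where the `r.squared` value reported by `glance()` comes from. It is an illustration on simulated data, not the exam's model: the `toy` dataframe and its `avg_cost` and `response` columns are hypothetical, and the source does not show which column `reg_model` actually used as the response.

```r
library(broom)

# Simulated data (invented): a weak linear trend plus a lot of noise.
set.seed(101)
toy <- data.frame(avg_cost = runif(200, min = 5000, max = 60000))
toy$response <- 30000 + 0.3 * toy$avg_cost + rnorm(200, sd = 12000)

# Simple linear regression: one explanatory variable, one response.
reg_model <- lm(response ~ avg_cost, data = toy)

# R^2 as reported by broom::glance().
r2_glance <- glance(reg_model)$r.squared

# The same quantity by hand: 1 - (residual sum of squares / total sum of squares).
ss_res <- sum(residuals(reg_model)^2)
ss_tot <- sum((toy$response - mean(toy$response))^2)
r2_manual <- 1 - ss_res / ss_tot

# With a single explanatory variable, R^2 is also the squared correlation.
r2_cor <- cor(toy$avg_cost, toy$response)^2

# Slope and intercept (task 11) appear in the `estimate` column.
coefs <- tidy(reg_model)
```

All three values agree, so an R^2 of 0.1423654 in the answer above corresponds to a correlation of roughly 0.38 in absolute value between the two variables.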
Colleges that have a lower admission rate are considered more selective and prestigious. Is there a relationship between how selective/prestigious
a college is and the earnings 10 years after graduation? Use the r-squared value to justify your answer.
reg_model2
