SOM 307 CSUN Logistic Regression Model Variables Mapping of Coefficients Worksheet

User Generated

JulQ

Programming

SOM 307

California State University, Northridge

SOM

Description

Answer the 14 questions in the attached Word file, using the attached R file. I have completed the first step, which was to import the data set by downloading the CSV file and running the first three lines. The next step is to answer the 14 mixed questions: some are multiple choice, some allow multiple answers, and 3 need sentence answers or written code. Finally, submit the completed R file built up from the first 13 questions (Question 14 asks for the R file).

I need help attaching the R file; I am unable to do so here.

Unformatted Attachment Preview

Question 1
How many observations (or rows) and variables (or columns) do we have in the dataset “chdData”?

Group of answer choices

Observations = 4240; Variables = 16
Observations = 4241; Variables = 16
Observations = 16; Variables = 4240
Observations = 16; Variables = 4241

Question 2
I want to inspect the first five rows of the data. Select the correct line of code.

Group of answer choices

View(chdData)
summary(chdData)
head(chdData))
head(chdData, 5)

Question 3
Based on medical experts, you learned that the following clinical variables are important to predict ten-year risk of coronary heart disease:

1- Age
2- Gender
3- Systolic blood pressure
4- Heart rate
5- Glucose
Please write a line of code to select the aforementioned variables. Store the output in a variable named chdDataSelected. (Please note that for model building purposes, you have to include the dependent variable in your selection.)

Question 4
After writing the code for Question 3 in your R file, please run lines 13 to 18, where we change the type of the variables. Please select the option that will inspect the type of the variable heart rate. (Multiple options may be correct)

Group of answer choices

sapply(chdDataSelected)
class(chdDataSelected)
class(chdDataSelected$heartRate)
sapply(chdDataSelected$heartRate)

Question 5
Considering chdDataSelected, which line of code will generate the following plot? (Multiple answers may be correct)

Group of answer choices

ggplot(chdDataSelected, aes(x = male)) + geom_bar() + labs(x = "Gender", y = "Count"))
ggplot(chdDataSelected, aes(x = male, fill = TenYearCHD)) + geom_bar() + labs(x = "Gender", y = "Count"))
ggplot(chdDataSelected, aes(x = male)) + geom_bar()
ggplot(chdDataSelected, aes(x = male)) + geom_points() + labs(x = "Gender", y = "Count"))

Question 6
What kind of insights can you draw from the following plot? Limit your answer to two or three sentences.

Question 7
Considering the following set of independent and dependent variables, please select the correct equation representing logistic regression. (Multiple options may be correct)

Independent variables: Age, Gender, Systolic blood pressure, Heart rate, Glucose
Dependent variable: Ten-year risk of coronary heart disease

Group of answer choices

log(odds of TenYearCHD) = w_0 + w_1 * age + w_2 * male + w_3 * sysBP + w_4 * heartRate + w_5 * glucose
odds of TenYearCHD = w_0 + w_1 * age + w_2 * male + w_3 * sysBP + w_4 * heartRate + w_5 * glucose
log(odds of TenYearCHD) = w_0 + w_1 * age + w_2 * male + w_3 * sysBP + w_5 * heartRate + w_4 * glucose
log(odds of TenYearCHD) = w_1 * age + w_0 + w_2 * male + w_3 * sysBP + w_4 * heartRate + w_5 * glucose

Question 8
Use an R function to determine what percent of people in the dataset showed the ten-year risk of coronary heart disease.

Group of answer choices

94.6
15.4
50
25

Question 9
I have created a complete dataset for you in the R file using the following line of code:

chdDataSelected = chdDataSelected[complete.cases(chdDataSelected), ]

Now your task is to write a line of code to compute the regression coefficients (or weights) corresponding to each variable (male, age, sysBP, heartRate, glucose), considering TenYearCHD as the dependent variable.

Group of answer choices

fit_logit = glm(TenYearCHD~ male+ age + sysBP + heartRate + glucose, family = "binomial")
fit_logit = glm(TenYearCHD~ male+ age + sysBP + heartRate + glucose, data = chdDataSelected,)
fit_logit = glm(TenYearCHD + male ~ age + sysBP + heartRate + glucose, data = chdDataSelected, family = "binomial")
fit_logit = glm(TenYearCHD~ male+ age + sysBP + heartRate + glucose, data = chdDataSelected, family = "binomial")

Question 10
After computing the weights, can you say anything about how systolic blood pressure impacts the odds of ten-year risk of coronary heart disease?

Group of answer choices

Odds of ten-year coronary heart disease increase with systolic blood pressure
Odds of ten-year coronary heart disease decrease with systolic blood pressure
No relation
Not sufficient information

Question 11
After computing the weights and their p-values, how many variables are significantly important?

Group of answer choices

2
3
4
5

Question 12
After computing the weights, please write down the final logistic regression equation considering only the significant variables.

Question 13
Let us say a patient enters the hospital with the following measurements: Age = 50, Gender = male, Systolic blood pressure = 90, Glucose = 80. What is the PROBABILITY of the patient having ten-year risk of coronary heart disease? Please select the nearest answer.

Group of answer choices

0.931
0.083
0.58
0.35

Question 14
Submit your R file.

[Figure: plot with x-axis "Gender", y-axis "Count" (ticks from 0 to 2500), legend "TenYearCHD"]
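The workflow behind Questions 8, 9 and 13 (share of positive cases, fitting with glm, turning log-odds into a probability) can be sketched on simulated data. Everything below is an assumption made for illustration: the data frame `sim`, the coefficients used to simulate it, and the reduced predictor set are not taken from chdData, so the numbers it prints will not match the quiz answer choices.

```r
# Simulated stand-in data (hypothetical; NOT the course's chdData).
set.seed(1)
n <- 500
sim <- data.frame(
  male    = rbinom(n, 1, 0.5),
  age     = round(runif(n, 35, 70)),
  sysBP   = round(rnorm(n, 130, 20)),
  glucose = round(rnorm(n, 82, 20))
)

# Simulated outcome: the log-odds are a linear function of the predictors.
true_log_odds <- -8.5 + 0.5 * sim$male + 0.07 * sim$age +
  0.02 * sim$sysBP + 0.005 * sim$glucose
sim$TenYearCHD <- rbinom(n, 1, plogis(true_log_odds))

# Question 8 style: percent of rows with a positive outcome.
pct_positive <- 100 * mean(sim$TenYearCHD)

# Question 9 style: logistic regression needs both data = and family = "binomial".
fit <- glm(TenYearCHD ~ male + age + sysBP + glucose,
           data = sim, family = "binomial")
w <- coef(fit)   # estimated weights w_0, w_1, ...
summary(fit)     # the coefficient table includes the p-values

# Question 13 style: plug one patient's values into the fitted equation.
patient <- data.frame(male = 1, age = 50, sysBP = 90, glucose = 80)
log_odds <- w["(Intercept)"] + w["male"] * patient$male +
  w["age"] * patient$age + w["sysBP"] * patient$sysBP +
  w["glucose"] * patient$glucose
prob_manual  <- 1 / (1 + exp(-log_odds))   # inverse of log(odds)
prob_predict <- predict(fit, newdata = patient, type = "response")
```

Here `prob_manual` and `prob_predict` should agree, which is a quick way to check a hand-written equation against the fitted model. On the real data, the same steps would be run on chdDataSelected after the complete.cases() line from Question 9.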

Explanation & Answer

Please find the final answer files attached. The Word document has the answers to the posted questions, highlighted in yellow for easy reference. The zip file has the R script with the code used to answer the questions in the Word document. Some of the questions did not require any R code. The R file has comments giving the question number for the code below each one. If you have any doubts or queries, or would like any revisions or changes, please feel free to reach out to me. I will be happy to help. Have a great day!

Question 1
How many observations (or rows) and variables (or columns) do we have in the dataset “chdData”?

Group of answer choices

Observations = 4240; Variables = 16
Observations = 4241; Variables = 16
Observations = 16; Variables = 4240
Observations = 16; Variables = 4241

Answer 1
Observations = 4240; Variables = 16
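As a minimal sketch of how row and column counts are read in R, the following uses a small made-up data frame named `toy` (a hypothetical stand-in, not chdData). The same calls would apply to chdData once it has been imported.

```r
# Hypothetical three-row stand-in; the real chdData comes from the course CSV.
toy <- data.frame(
  age        = c(39, 46, 48),
  male       = c(1, 0, 1),
  TenYearCHD = c(0, 0, 1)
)

dims   <- dim(toy)   # rows first, then columns
n_obs  <- nrow(toy)  # number of observations
n_vars <- ncol(toy)  # number of variables
print(dims)
```

Note that dim() reports observations before variables, which is the ordering the answer choices play on.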

Question 2
I want to inspect the first five rows of the data. Select the correct line of code.

Group of answer choices

View(chdData)
summary(chdData)
head(chdData))
head(chdData, 5)

Answer 2
head(chdData, 5)
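A short sketch of why the second argument matters, again on a made-up data frame `toy` (hypothetical, not chdData): without it, head() falls back to its default of six rows.

```r
# Hypothetical eight-row stand-in for chdData.
toy <- data.frame(
  id  = 1:8,
  age = c(39, 46, 48, 61, 46, 43, 63, 45)
)

first_five <- head(toy, 5)  # n = 5 keeps only the first five rows
default_n  <- head(toy)     # default n = 6
print(first_five)
```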

Question 3
Based on medical experts, you learned that the following clinical variables are important to predict ten-year risk of coronary heart disease:

1- Age
2- Gender
3- Systolic blood pressure
4- Heart rate
5- Glucose
Please write a line of code to select the aforementioned variables. Store the output in a variable named chdDataSelected. (Please note that for...
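One way such a selection could look is sketched below on a made-up data frame. The column names male, age, sysBP, heartRate, glucose and TenYearCHD are the ones used elsewhere in the question text; the data frame `toy`, its values, and the extra `education` column are hypothetical and only there to show a column being dropped. This is an illustration, not the code from the attached R file.

```r
# Hypothetical stand-in with one extra column that should not be selected.
toy <- data.frame(
  male       = c(1, 0, 1),
  age        = c(39, 46, 48),
  education  = c(4, 2, 1),
  sysBP      = c(106, 121, 127.5),
  heartRate  = c(80, 95, 75),
  glucose    = c(77, 76, 70),
  TenYearCHD = c(0, 0, 1)
)

# Five predictors plus the dependent variable, which the model needs later.
keep <- c("male", "age", "sysBP", "heartRate", "glucose", "TenYearCHD")
toySelected <- toy[, keep]
print(names(toySelected))
```

On the real data, the same pattern would read from chdData and store the result in chdDataSelected.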

