POLS 6380 University of Houston RStudio & Research Methods Project & Lab Report

Research Methods HW#7
Professor Shea
POLS 6380

1. (2 points) What happens to our OLS inference if we multiply X by some value, where X is our explanatory variable of interest?

2. This question relies on the following file: anorexia.csv. These data examine the effectiveness of different treatments for anorexia.

(a) (3 points) Treat the data as paired data (as they are). I’m interested in whether treatment “f” is effective in adding weight to the patients. Derive a test statistic to determine a difference. What is your decision regarding the null based on the test?

(b) (2 points) Why is it particularly difficult to infer a causal effect using a “matched pair” test?

(c) (2 points) Repeat the same test procedure for treatment “c”. Derive a test statistic to determine a difference. What is your decision regarding the null based on the test?

(d) (2 points) Repeat the same procedure for treatment “b”. Derive a test statistic to determine a difference. What is your decision regarding the null based on the test?

(e) (2 points) We just engaged in some “multiple testing” behavior. What is the potential problem of multiple testing? Propose a strategy to address this “multiple testing” problem (you don’t have to actually carry this out; just discuss it). Is there a tradeoff with your strategy?

(f) (2 points) Now suppose we failed to recognize that the data were paired. Derive a difference in means test statistic for the “b” groups (assume unequal variance).

(g) (2 points) Was unequal variance a justifiable assumption for the previous test? Use an F-test to support your case.

3. This next section will require you to use the following dataset: “gerber_green_larimer.Rdata” (this is the social pressure/civic duty data from the slides, weeks 6 and 7).

(a) (2 points) Present a descriptive graph that compares voting turnout for each treatment group.

(b) (1 point) For now, assume that the variable “voted” is normally distributed. Run a one-way ANOVA with “treatment” as the main explanatory variable. What is the null hypothesis of this test?

(c) (1 point) What is the sampling distribution of this test? Please identify the degrees of freedom and where the sampling distribution is centered.

(d) (1 point) What is the explained and unexplained variance of your ANOVA test?

(e) (1 point) What’s the F-statistic derived from the ANOVA test you ran? What does it tell us from a statistical significance standpoint?

(f) (2 points) Reconsider the outcome variable (voted). Is an ANOVA appropriate with these data?

4. This next section will require you to use the following dataset: midtermvoteloss.csv.

(a) (2 points) Present a scatter plot of Midterm Vote Loss (the dependent variable) and Change in Income (the independent variable). Show this scatter plot with the best fitted line.

(b) (2 points) Estimate the linear model associated with this scatter plot. Present the results and interpret all the important information.

(c) (2 points) Provide two point predictions with the linear model (i.e. with minimum value of Change in Income and maximum value of Change in Income, etc.).

5. This next section will require you to use the following dataset: hw7.

(a) (2 points) Ignoring what the variables are for a moment, estimate the following linear model using OLS and report the results: y1 = α + βx1 + ε

(b) (2 points) Estimate the following linear model using OLS and report the results: y2 = α + βx2 + ε

(c) (2 points) Estimate the following linear model using OLS and report the results: y3 = α + βx3 + ε

(d) (2 points) Estimate the following linear model using OLS and report the results: y4 = α + βx4 + ε

(e) (3 points) Given what you have learned in class, how would you compare the linear models you just estimated?

You have to use the data files at this link: https://github.com/Tommysd123/tommy.git
gerber_green_larimer.Rdata
hw7.csv
anorexia.csv
midtermvoteloss.csv

POLS 6480, Fall 2017
Lab assistant: Philip Waggoner
Lab Assignment 08

I. Objectives: The primary objective is to test hypotheses regarding two populations. The secondary objective is to understand research design issues involving panel studies and paired/related samples.

II. Datasets: “cereal.csv” and “anorexia.csv”

III. Packages: none

IV. Preparation

1) Open RStudio by double-clicking the icon or selecting RStudio from the Windows Start menu.

2) Clear any data in memory:

> rm(list=ls())

3) Download datasets “cereal.csv” and “anorexia.csv” and place them in your working directory.

4) Download R script “POLS 6480 Lab 08.R” and place it in your working directory.

5) Open the R script by typing Ctrl+O or by clicking on File in the upper-left corner, using the dropdown menu, and navigating to the script in your working directory.

V. Instructions for Lab 08

The first dataset you will use is a sample of 24 breakfast cereals (12 intended for children and 12 intended for adults), which you used last week. The five variables are the number of grams per serving, the number of calories per serving, the milligrams of sodium per serving, the grams of fiber per serving, and the grams of refined sugar per serving.

The second dataset you will use is from a well-designed experiment on treating eating disorders. Subjects’ weights were measured before and after treatment, allowing us to calculate the change in weight, and the study also included a control group.

A. Cereals

1. To load the first dataset, type the following lines, changing the directory if needed:

> cereal [...] children [...] adults [...] m1 [...]

[...]

> t.test(treatment.f$after, control$after, alt="greater")

Notice that I shortened the word alternative to alt. Answer the following three questions: Did the t statistic change? Did the confidence interval change? Did the p value change?

8. While it is good to know that subjects in the family treatment group had higher weight after treatment than those in the control group, it is impossible to say that the treatment had a causal effect, because it is conceivable that subjects in the treatment group had higher weight before treatment. The research design utilized for questions 6 and 7 is called the “post-test only with non-equivalent control groups” design. An alternative is the “pre-test and post-test with no control group” design, which simply examines whether weight changed from before treatment to after treatment. You can carry out this comparison using a two-sample test:

> t.test(treatment.f$after, treatment.f$before, alt="greater")

Is there a statistically significant difference between pre- and post-treatment weights? Compare these results to a one-sample t test using R’s built-in t.test command.

> treatment.f$delta <- treatment.f$after - treatment.f$before
> t.test(treatment.f$delta, mu=0, alt="greater")

Is the mean difference (reported as “mean of x” after the One Sample test) equal to the difference of means (subtracting “mean of y” from “mean of x” reported after the Two Sample test)? Is the t statistic for the One Sample test equal to the t statistic for the Two Sample test? Why not?

Data from an experiment with the same subjects having the response variable measured two times (before and after treatment) are an example of paired data. To calculate the correct standard error of the difference of means, you need to take into account that the values of the response variable (in this case, the subject’s weight) are correlated! Patients who start heavier typically also finish heavier. Find the correlation between pre- and post-treatment weights by typing:

> cor(treatment.f$after, treatment.f$before)

The paired-samples difference of means test requires adding a statement to the code:

> t.test(treatment.f$after, treatment.f$before, alt="greater", paired = TRUE)

Check again: is the t statistic for the Paired Sample test equal to the t statistic for the One Sample test that you carried out earlier?

9. The final task will be to perform what is called a differences-in-differences test. Because we have measures of pre- and post-treatment weight for the control group also, the most persuasive test is to compare the changes in weights for subjects in the treatment group against the changes in weights for subjects in the control group, which must be computed before the t test:

> control$delta <- control$after - control$before
> t.test(treatment.f$delta, control$delta, alt="greater")

Is there a statistically significant difference between the average weight change in the treatment group (= _____ ) and the average weight change in the control group (= _____ )?

10. On your own, repeat 6–9 using the cognitive behavioral therapy data. Answer the following:

Is the average post-treatment weight for subjects receiving cognitive behavioral therapy higher than the average post-treatment weight for subjects in the control group? Is the difference large enough to attain statistical significance?

Did cognitive behavioral therapy increase the average weight of subjects? Is the difference large enough to attain statistical significance?

Is the average weight gain of subjects receiving cognitive behavioral therapy larger than the average weight gain of subjects in the control group? Is the difference large enough to attain statistical significance?

Next week, we will use all three groups to perform one-way ANOVA.

11. To clear the Environment, type rm(list=ls()) or click on the broom icon. To clear the Console window, press Ctrl+L.

Lab written by Scott Basinger, sjbasinger@uh.edu
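The comparison the lab asks for in steps 8 and 9 can be sketched on made-up numbers. The block below is a minimal illustration, not the lab's script: the data are simulated (anorexia.csv is not reproduced here), the helper make.group and the group size n are hypothetical, and only the column names before, after, and delta follow the lab text. It shows that the paired test and the one-sample test on the differences give the same t statistic, and that the paired standard error can be rebuilt by hand from the two standard deviations and their correlation.

```r
# Simulated stand-in for anorexia.csv. These numbers are hypothetical,
# NOT the study's data; only the column names follow the lab text.
set.seed(6480)
n <- 17  # hypothetical group size

# Hypothetical helper: builds one group whose post-treatment weight
# equals pre-treatment weight plus a noisy gain, so the two are correlated.
make.group <- function(n, gain) {
  before <- rnorm(n, mean = 83, sd = 5)
  after  <- before + rnorm(n, mean = gain, sd = 4)
  data.frame(before = before, after = after)
}
treatment.f <- make.group(n, gain = 7)
control     <- make.group(n, gain = 0)

# Change in weight for each subject
treatment.f$delta <- treatment.f$after - treatment.f$before
control$delta     <- control$after - control$before

# Step 8: two-sample (Welch), one-sample on the differences, and paired tests
two.sample <- t.test(treatment.f$after, treatment.f$before, alt = "greater")
one.sample <- t.test(treatment.f$delta, mu = 0, alt = "greater")
paired     <- t.test(treatment.f$after, treatment.f$before,
                     alt = "greater", paired = TRUE)

# Paired standard error by hand: Var(after - before) subtracts
# twice the covariance, which is r * s.a * s.b
r   <- cor(treatment.f$after, treatment.f$before)
s.a <- sd(treatment.f$after)
s.b <- sd(treatment.f$before)
se.paired <- sqrt((s.a^2 + s.b^2 - 2 * r * s.a * s.b) / n)
t.by.hand <- mean(treatment.f$delta) / se.paired

# Step 9: differences-in-differences (treatment change vs. control change)
did <- t.test(treatment.f$delta, control$delta, alt = "greater")

print(round(c(two.sample   = unname(two.sample$statistic),
              one.sample   = unname(one.sample$statistic),
              paired       = unname(paired$statistic),
              by.hand      = t.by.hand,
              diff.in.diff = unname(did$statistic)), 3))
```

With the real data, replace the simulated data frames with the subsets created earlier in the lab. The two-sample statistic generally differs from the other three because it ignores the correlation between the before and after measurements.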