Harvard University Lab homework using R
#Lab 10
#274-Wilcox (Fall 2019)
#Name:
#Student ID:
rm(list=ls())
source('Rallfun-v33.txt')
#1) Import the dataset lab10hw1.txt in table form:
#2) For this dataset, what is our dependent variable?
#3) How many independent variables do we have?
#4) How many levels does each independent variable have (use the function unique(x) to check)?
#5) Make a boxplot for this set of data (submit the image). What problem do you see?
#6) What is our null hypothesis?
#7) Now use the classic method to analyze this dataset using the format aov(x~factor(g)).
# Save this as an object called hw1.anova.
#NOTE: MAKE SURE TO USE factor() AROUND YOUR GROUPING VARIABLE SO IT IS TREATED AS A FACTOR, NOT AS A NUMERIC VARIABLE.
# Then summarize these results using summary(hw1.anova).
#8) Do we reject or do we fail to reject the null hypothesis?
#9) Now let's use the t1way() function, which is based on trimmed means and can deal with heteroscedasticity.
#Hint 1: First, reorganize your data using fac2list(x, g). Save your new list as hw1.list.
#Hint 2: You will need to have loaded in the source code to use the t1way function.
#10) Do we reject or do we fail to reject the null hypothesis from 1.9?
----------------------------------------------------------------------------------
Lab 10 lecture notes:
#Lab 10
#Lab 10-Contents
#1. One-Way Independent Groups ANOVA (Equal Variance)
#2. One-Way Independent Groups ANOVA (Unequal Variance-Welch's Test)
#---------------------------------------------------------------------------------
# 1. One-Way Independent Groups ANOVA (Equal Variance)
#---------------------------------------------------------------------------------
#Scenario for first exercise:
# A professor is interested in the effect of visualization strategies
#on test performance. In order to study this, he tells students in
#his statistics class that they will have a 15-question exam in
#two weeks.
#Then, he randomly assigns students to three groups.
#
# The first group is told to spend 15 min each day visualizing
#the outcome of getting an A on the test: to vividly imagine
#the exam with an "A" written on it and how great it will feel.
#
# The second group is a control group that does no visualization.
#
# The third group is told to spend 15 min each day visualizing
#the process of studying for the exam: imagine the hours of studying,
#reviewing their chapters, working through chapter problems,
#quizzing themselves, etc.
# Two weeks later, the students take the exam and the professor
#records how many questions the students answer correctly out of 15.
#So, the groups are:
#Group 1: Visualize Outcome (Grade)
#Group 2: No visualization (Control)
#Group 3: Visualize Process (Studying)
######################################################
#Question: Are the groups here independent?
######################################################
#We'll introduce a few new terms:
#Factor: A variable that consists of categories.
#Levels: The categories of the Factor variable.
#In our example above, the variable that contains
#the groups is called "Group".
#So, our factor is the variable "Group"
#How many levels are there for the Group Factor?
#Let's read in LAB10A.txt
lab10a=read.table('LAB10A.txt', header=TRUE)
#While we can easily see the levels for the Group
#factor, we could also use a new command to figure out
#the number of unique levels.
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
# Number of Unique Levels: unique(data$variable)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
unique(lab10a$Group)
#As we can see, there are 3 levels.
#The levels are 1, 2, and 3.
#Look at a boxplot of each group using
#boxplot(y~group, data=data)
par(mfrow=c(1,1))
boxplot(Score~Group, data=lab10a)
#Do you think the means will be different (statistically)
#between the groups?
#Before we begin to test for differences between
#the means, let's write out our Null
#and Alternative Hypotheses
#H0: The means are equal (mu1=mu2=mu3)
#HA: At least one mean is different.
#(e.g. mu1 != mu2 OR mu1 != mu3 OR mu2 != mu3)
#To test the hypothesis we can use the ANOVA function aov():
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
# One-Way ANOVA: aov(y~factor(g), data)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
#The aov() function assumes that the
#variance is the same within each of the groups.
mod1=aov(Score ~ factor(Group), data=lab10a)
summary(mod1)
#A) If pval < alpha, then Reject the Null Hypothesis
#B) If pval > alpha, then Fail to Reject the Null Hypothesis
#Do we Reject or Fail to Reject the Null?
#Reject: p-value=0.00129 < .05, so we reject H0
#What does this tell us? That the groups are different?
#If so, how do we know which groups?
#The P-value we just got is called the Omnibus P-value,
#which tells us that there are differences somewhere.
#With this P-value we often use the term
#"Main Effect" to say that there is an effect of the
#factor on the outcome.
#In this instance we'd say that there is a Main Effect
#of Group on the Score.
#To answer which groups are different, we need to first
#convert the data into List Mode (a different way
#of storing the data). We can convert the factor Group
#to a list using the function fac2list(y, g)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
# Convert Factors to List Data: fac2list(data$y, data$g)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
listA=fac2list(lab10a$Score, lab10a$Group)
listA
#Once the data is in List Mode we have to use the
#lincon() command from Dr.
#Wilcox's source code.
#The lincon() function is used to compare the groups while
#controlling for the experimentwise Type I error rate.
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
# Compare Groups: lincon(list_name, tr=0.2)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
#By default lincon() compares groups using 20% trimming.
#We will set this to 0 for now:
lincon(listA, tr=0)
#result:
# H0_1: mu1=mu2 --- p=0.32 ---Fail to reject
# H0_2: mu1=mu3 --- p=0.0009 ---Reject
# H0_3: mu2=mu3 --- p=0.008 ---Reject
#---------------------------------------------------------------------------------
# 2. One-Way Independent Groups ANOVA (Unequal Variance-Welch's Test)
#---------------------------------------------------------------------------------
# We just learned how to conduct a One-Way ANOVA
#when the variances are equal within each group.
# Now, we will learn how to conduct a One-Way ANOVA
#for when the variance is not equal.
# Let's start by reading in the LAB10B.txt datafile.
lab10b=read.table('LAB10B.txt', header=TRUE)
# Then examine a boxplot of all of it.
boxplot(Score~factor(Group), data=lab10b)
# What do we notice about this boxplot?
#-----
# Let's start by running the equal variance ANOVA
#on the data (which of course is WRONG!)
mod2=aov(Score ~ factor(Group), data=lab10b) #---DON'T
summary(mod2)
#A) If pval < alpha, then Reject the Null Hypothesis
#B) If pval > alpha, then Fail to Reject the Null Hypothesis
# Do we Reject or Fail to Reject the Null?
#Fail to reject: p-value=0.0895 > .05 !!!INCORRECT----
#----
# Now let's try to run the correct test that assumes
#unequal variance.
#We call this Welch's test (just like in the t-test)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
# Welch's One-Way ANOVA: t1way(list_name, tr=0.20)
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^#
#In order to use this t1way function,
#we will first need to convert the data to
#List Mode using fac2list()
listB=fac2list(lab10b$Score, lab10b$Group)
t1way(listB, tr=0.2)
# Do we Reject or Fail to Reject the Null?
#Reject: p-value=0.04966583 < .05
#Again, we can use the lincon() command to
#find out where the group differences are.
#This time we will use the 20% trimming.
lincon(listB, tr=0.2)
# G1 and G2: p-value=0.92210409 > .05 Fail to reject
# G1 and G3: p-value=0.19451518 > .05 Fail to reject
# G2 and G3: p-value=0.03227316 < .05 Reject
#
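#---------------------------------------------------------------------------------
# Optional sketch (not part of the original lab): the same equal-variance vs.
# unequal-variance comparison using only base R, so it runs without
# Rallfun-v33.txt. Assumptions: the data below are simulated for illustration
# and are NOT the LAB10B.txt values; the names sim, simList, classic and welch
# are made up for this sketch. split() is used here as a base-R stand-in for
# fac2list(), and oneway.test(..., var.equal=FALSE) is base R's Welch test on
# ordinary means (no trimming), so its p-value should not be expected to match
# t1way(listB, tr=0.2), which compares 20% trimmed means.
#---------------------------------------------------------------------------------

```r
# Simulated example: three groups with clearly unequal spread
set.seed(1)
sim = data.frame(
  Group = rep(1:3, each = 20),
  Score = c(rnorm(20, mean = 10, sd = 1),
            rnorm(20, mean = 10, sd = 4),
            rnorm(20, mean = 12, sd = 8))
)

# One numeric vector per group (base-R stand-in for fac2list)
simList = split(sim$Score, sim$Group)

# Group variances: large differences suggest heteroscedasticity
sapply(simList, var)

# Classic ANOVA, which assumes equal variances across groups
classic = summary(aov(Score ~ factor(Group), data = sim))
classic

# Welch's heteroscedastic one-way test on the untrimmed means
welch = oneway.test(Score ~ factor(Group), data = sim, var.equal = FALSE)
welch
```

# Comparing the two printed p-values shows how much the equal-variance
# assumption can matter when the boxplots have very different spreads.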