PA 3010 Central Washington University Central Tendency and Variability Worksheet


Description

Requirement

  • This assignment should be completed in Word. However, you are required to record your statistical analysis in Excel, with the formulas or functions kept inside the cells. In other words, you need to submit two files (Word and Excel) to me for grading. This is required because I want to make sure you clearly know how to run formulas or functions in Excel to conduct the statistical analysis, which is a core purpose of this course.

Content

The ongoing pandemic provides us with a proper scenario in which the relationship between politics, public health crisis, and economic development can be investigated. The given data set includes the confirmed cases of COVID-19 per 100,000 people (case rate per 100,000) and the gross domestic product (GDP) per capita as of 2020 in three groups of states with different political affiliations (swing states, solid red states, and solid blue states). In this context, case rate per 100,000 and GDP per capita in 2020 are two relative strength indicators that take population size into consideration. Compared with the absolute strength indicators (total confirmed cases and total GDP), the relative ones best represent the status of the public health crisis and of economic development, enabling a valuable comparative analysis across states. Specifically, the given data set is used to answer the following questions:

  • Calculate and compare central tendency and variability of case rate per 100,000 and GDP per capita in 2020 among swing, solid blue, and solid red states.
    • Please write out your observations and comments below the table regarding the comparison of confirmed cases of COVID-19 and economic development among swing, solid red, and solid blue states. For example, do swing states, on average, have more (or fewer) confirmed cases of COVID-19 and better (or worse) economic development than solid red and solid blue states? Are these observations or findings interesting to you? How would you justify them? (Notes: Complete it in Word.)
    • Insert a new sheet in the given data set to place the new grouped data by political affiliation and case rate per 100,000. (Notes: Complete it in Excel.)
    • Follow the famous eight steps to test the hypothesis above. (Notes: Complete it in Word.)
  • Use the flowchart in the textbook to select the most appropriate test statistic to examine whether the means of case rate per 100,000 among the three groups of states (swing states, solid red states, and solid blue states) are different.
  • The ongoing pandemic raises two critical questions requiring further examination. First, is there a relationship between the confirmed cases of COVID-19 (case rate per 100,000) and economic development (GDP per capita in 2020)? Second, to what degree have the confirmed cases of COVID-19 (case rate per 100,000) affected economic development (GDP per capita in 2020)? The given sample data (24 states) will be used to answer these two questions.
    • Use the correlation coefficient to examine the first question. You are NOT required to follow the famous eight steps to conduct the test statistic. Instead, you can use “Data Analysis Tools” in Excel to calculate the correlation. (Notes: Complete it in Excel.) However, you are REQUIRED to interpret and evaluate the correlation coefficient using the “Thumb Rule”, the “Coefficient of Determination”, and “Association versus Causality”. (Notes: Complete it in Word.)
    • Use the regression analysis to examine the second question. Again, you are NOT required to follow the famous eight steps to conduct the test statistic. Instead, you can use “Data Analysis Tools” in Excel to get a regression equation or line that reflects how the confirmed cases of COVID-19 (case rate per 100,000) affect economic development (GDP per capita in 2020). (Notes: Complete it in Excel.) (2 pts). However, you are REQUIRED to clearly and precisely state the independent variable, the dependent variable, the slope and intercept, and the regression equation, and to include a scatterplot that visualizes how the confirmed cases of COVID-19 (case rate per 100,000) affect economic development (GDP per capita in 2020). Finally, please calculate the predicted GDP per capita when the case rate per 100,000 goes up to 15,000. (Notes: Complete it in Word.)
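For anyone who wants to sanity-check their Excel results, the whole pipeline above (group descriptives, correlation, regression, and the prediction at a case rate of 15,000) can be sketched in Python. This is an illustrative cross-check, not a substitute for the required Excel work. One assumption to flag: the data set's figures appear to use periods as thousands separators (e.g., 11.702 means 11,702 cases per 100,000 and 50.187 means $50,187), so the regression below runs in the table's own units and the prediction point 15,000 is entered as 15.000.

```python
# Illustrative sketch mirroring the Excel analysis (AVERAGE, MEDIAN,
# STDEV.S, CORREL, Data Analysis > Regression). Values transcribed
# from the provided data set.
from statistics import mean, median, stdev

data = {
    "Swing": [("Arizona", 11.702, 50.187), ("Florida", 9.815, 50.424),
              ("Georgia", 10.172, 57.819), ("Michigan", 8.589, 51.766),
              ("Nevada", 10.055, 54.998), ("North Carolina", 8.974, 55.292),
              ("Pennsylvania", 8.537, 61.031), ("Wisconsin", 11.150, 58.066)],
    "Solid Red": [("Arkansas", 11.033, 42.591), ("Indiana", 10.447, 55.165),
                  ("Kansas", 10.480, 59.475), ("Kentucky", 9.753, 46.909),
                  ("Nebraska", 11.156, 66.480), ("Oklahoma", 11.242, 46.871),
                  ("Tennessee", 12.141, 52.925), ("Wyoming", 9.884, 62.236)],
    "Solid Blue": [("California", 9.134, 78.538), ("Colorado", 8.428, 67.169),
                   ("Connecticut", 9.200, 78.971), ("Delaware", 10.261, 76.522),
                   ("Maryland", 7.194, 69.805), ("Massachusetts", 9.234, 84.722),
                   ("New Jersey", 10.892, 69.695), ("Virginia", 7.536, 64.229)],
}

# Central tendency and variability per group
for group, rows in data.items():
    cases = [row[1] for row in rows]
    gdp = [row[2] for row in rows]
    print(f"{group}: case rate mean={mean(cases):.3f}, median={median(cases):.3f}, "
          f"sd={stdev(cases):.3f}; GDP mean={mean(gdp):.3f}, sd={stdev(gdp):.3f}")

# Pooled across all 24 states: Pearson correlation and simple regression
x = [row[1] for rows in data.values() for row in rows]  # case rate (independent)
y = [row[2] for rows in data.values() for row in rows]  # GDP per capita (dependent)
n = len(x)
mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / (sxx * syy) ** 0.5
slope = sxy / sxx
intercept = my - slope * mx

print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"GDP = {intercept:.3f} + {slope:.3f} * case_rate")
print(f"Predicted GDP per capita at a case rate of 15,000: "
      f"{intercept + slope * 15.000:.3f}")
```

In Excel, the same numbers come from AVERAGE, MEDIAN, STDEV.S, CORREL, and the Data Analysis ToolPak's Regression tool.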

Unformatted Attachment Preview

Political Affiliation   State            Case Rate per 100,000   GDP Per Capita in 2020 ($)
Swing States            Arizona          11.702                  50.187
                        Florida          9.815                   50.424
                        Georgia          10.172                  57.819
                        Michigan         8.589                   51.766
                        Nevada           10.055                  54.998
                        North Carolina   8.974                   55.292
                        Pennsylvania     8.537                   61.031
                        Wisconsin        11.150                  58.066
Solid Red States        Arkansas         11.033                  42.591
                        Indiana          10.447                  55.165
                        Kansas           10.480                  59.475
                        Kentucky         9.753                   46.909
                        Nebraska         11.156                  66.480
                        Oklahoma         11.242                  46.871
                        Tennessee        12.141                  52.925
                        Wyoming          9.884                   62.236
Solid Blue States       California       9.134                   78.538
                        Colorado         8.428                   67.169
                        Connecticut      9.200                   78.971
                        Delaware         10.261                  76.522
                        Maryland         7.194                   69.805
                        Massachusetts    9.234                   84.722
                        New Jersey       10.892                  69.695
                        Virginia         7.536                   64.229

Data Source: https://covid.cdc.gov/covid-data-tracker; U.S. Bureau of Economic Analysis

Using-Your-Thumb Rule

Perhaps the easiest (but not the most informative) way to interpret the value of a correlation coefficient is by eyeballing it and using the information in Table 6.3. This is based on customary interpretations of the size of a correlation in the behavioral sciences.

Table 6.3 Interpreting a Correlation Coefficient

Size of the Correlation Coefficient   General Interpretation
.5 to 1.0                             Strong relationship
.4                                    Moderate to strong relationship
.3                                    Moderate relationship
.2                                    Weak to moderate relationship
0 to .1                               Weak or no relationship

So, if the correlation between two variables is .3, you could safely conclude that the relationship is a moderate one—not strong, but certainly not weak enough to say that the variables in question don’t share anything in common. This eyeball method is perfectly acceptable for a quick assessment of the strength of the relationship between variables, such as when you briefly evaluate data presented visually. But because this rule of thumb depends on a subjective judgment (of what’s “strong” or “weak”), we would like a more precise method. That’s what we’ll look at now.

Special Effects!
Correlation Coefficient

Throughout the book, we will learn about various effect sizes and how to interpret them. An effect size is an index of the strength of the relationship among variables, and with most statistical procedures we learn about, there will be an associated effect size that should be reported and interpreted. The correlation coefficient is a perfect example of an effect size as it quite literally is a measure of the strength of a relationship. Thanks to Table 6.3, we already know how to interpret it. Some of you already know that the correlation coefficient can be checked for its significance, or how likely it is to have occurred by something other than chance, and we will get to that in Chapter 16. That’s another valuable way to interpret correlation coefficients, but for now, let’s stick with the simple, more descriptive way that follows. Once you have this under your belt, you’ll be ready to move on to more sophisticated ideas.

A Determined Effort: Squaring the Correlation Coefficient

Here’s the much more precise way to interpret the correlation coefficient: computing the coefficient of determination. The coefficient of determination is the percentage of variance in one variable that is accounted for by the variance in the other variable. Quite a mouthful, huh? Earlier in this chapter, we pointed out how variables that share something in common tend to be correlated with one another. If we correlated math and language arts grades for 100 fifth-grade students, we would find the correlation to be moderately strong, because many of the reasons why children do well (or poorly) in math tend to be the same reasons why they do well (or poorly) in language arts.
The number of hours they study, how bright they are, how interested their parents are in their schoolwork, the number of books they have at home, and more are all related to both math and language arts performance and account for differences between children (and that’s where the variability comes in). The more these two variables share in common, the more they will be related. These two variables share variability—or the reason why children differ from one another. And on the whole, the brighter child who studies more will do better.

To determine exactly how much of the variance in one variable can be accounted for by the variance in another variable, the coefficient of determination is computed by squaring the correlation coefficient. For example, if the correlation between GPA and number of hours of study time is .70 (or rGPA·time = .70), then the coefficient of determination, represented by r²GPA·time, is .70², or .49. This means that 49% of the variance in GPA “can be explained by” or “is shared by” the variance in studying time. And the stronger the correlation, the more variance can be explained (which only makes good sense). The more two variables share in common (such as good study habits, knowledge of what’s expected in class, and lack of fatigue), the more information about performance on one score can be explained by the other score.

However, if 49% of the variance can be explained, this means that 51% cannot—so even for a very strong correlation of .70, many of the reasons why scores on these variables tend to be different from one another go unexplained. This amount of unexplained variance is called the coefficient of alienation (also called the coefficient of nondetermination). Don’t worry. No aliens here. This isn’t X-Files or Walking Dead stuff—it’s just the amount of variance in Y not explained by X (and, of course, vice versa since the relationship goes both ways). How about a visual presentation of this sharing variance idea? Okay.
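The thumb rule and the coefficient of determination are easy to automate. Below is a minimal sketch: the thresholds follow Table 6.3, the function name is our own, and r = .70 is the GPA-and-study-time example from the text.

```python
# Interpret a correlation coefficient two ways: the eyeball rule of
# Table 6.3 and the coefficient of determination (r squared).

def thumb_rule(r: float) -> str:
    """Map |r| to the customary interpretation in Table 6.3."""
    size = abs(r)
    if size >= 0.5:
        return "Strong relationship"
    if size >= 0.4:
        return "Moderate to strong relationship"
    if size >= 0.3:
        return "Moderate relationship"
    if size >= 0.2:
        return "Weak to moderate relationship"
    return "Weak or no relationship"

r = 0.70  # e.g., correlation between GPA and hours of study time
determination = round(r ** 2, 2)        # variance shared: .49, i.e., 49%
alienation = round(1 - determination, 2)  # variance unexplained: .51

print(thumb_rule(r))                    # Strong relationship
print(determination, alienation)        # 0.49 0.51
```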
In Figure 6.12, you’ll find a correlation coefficient, the corresponding coefficient of alienation, and a diagram that represents how much variance is shared between the two variables. The larger the shaded area in each diagram (and the more variance the two variables share), the more highly the variables are correlated.

Figure 6.12 How Variables Share Variance and the Resulting Correlation

The first diagram shows two circles that do not touch. They don’t touch because they do not share anything in common. The correlation is 0. The second diagram shows two circles that overlap. With a correlation of .5 (and r²XY = .25), they share about 25% of the variance between themselves. Finally, the third diagram shows the two circles placed almost one on top of the other. With an almost perfect correlation of rXY = .90 (r²XY = .81), they share about 81% of the variance between themselves.

AS MORE ICE CREAM IS EATEN, THE CRIME RATE GOES UP (OR ASSOCIATION VERSUS CAUSALITY)

Now, here’s the really important thing to be careful about when computing, reading about, or interpreting correlation coefficients. Imagine this. In a small Midwestern town, a phenomenon was discovered that defied any logic. The local police chief observed that as ice cream consumption increased, crime rates tended to increase as well. Quite simply, if you measured both, you would find the relationship was direct, which means that as people ate more ice cream, the crime rate increased. And as you might expect, as they ate less ice cream, the crime rate went down. The police chief was baffled until he recalled the Stat 1 class he took in college and still fondly remembered. He wondered how this could be turned into an aha! “Very easily,” he thought. The two variables must share something or have something in common with one another. Remember that it must be something that relates to both level of ice cream consumption and level of crime rate. Can you guess what that is?
The outside temperature is what they both have in common. When it gets warm outside, such as in the summertime, more crimes are committed (it stays light longer, people leave the windows open, bad guys and girls are out more, etc.). And because it is warmer, people enjoy the ancient treat and art of eating ice cream. Conversely, during the long and dark winter months, less ice cream is consumed and fewer crimes are committed as well.

Joe, though, recently elected as a city commissioner, learns about these findings and has a great idea, or at least one that he thinks his constituents will love. (Keep in mind, he skipped the statistics offering in college.) Why not just limit the consumption of ice cream in the summer months to reduce the crime rate? Sounds good, right? Well, on closer inspection, it really makes no sense at all. That’s because of the simple principle that correlations express the association that exists between two or more variables; they have nothing to do with causality. In other words, just because level of ice cream consumption and crime rate increase together (and decrease together as well) does not mean that a change in one results in a change in the other.

For example, if we took all the ice cream out of all the stores in town and no more was available, do you think the crime rate would decrease? Of course not, and it’s preposterous to think so. But strangely enough, that’s often how associations are interpreted—as being causal in nature—and complex issues in the social and behavioral sciences are reduced to trivialities because of this misunderstanding. Did long hair and hippiedom have anything to do with the Vietnam conflict? Of course not. Does the rise in the number of crimes committed have anything to do with more efficient and safer cars? Of course not. But they all happen at the same time, creating the illusion of being associated.

OTHER COOL CORRELATIONS

There are different ways in which variables can be assessed.
For example, nominal-level variables are categorical in nature; examples are race (e.g., black or white) and political affiliation (e.g., Independent or Republican). Or, if you are measuring income and age, you are measuring interval-level variables, because the underlying continuum on which they are based has equally appearing intervals. As you continue your studies, you’re likely to come across correlations between data that occur at different levels of measurement. And to compute these correlations, you need some specialized techniques. Table 6.3 summarizes what these different techniques are and how they differ from one another.

How Inference Works

Here are the general steps of a research project to see how the process of inference might work. We’ll stay with adolescents’ attitudes toward mothers working as an example. Here’s the sequence of events that might happen:

1. The researcher selects representative samples of adolescents who have mothers who work and adolescents who have mothers who do not work. These are selected in such a way that the samples represent the populations from which they are drawn. For example, they might be chosen randomly from a long list of potential participants.
2. Each adolescent is administered a test to assess their attitude. The mean scores for groups are computed and compared using some test.
3. A conclusion is reached as to whether the difference between the scores is the result of chance (meaning some factor other than moms working is responsible for the difference) or the result of “true” and statistically significant differences between the two groups (meaning the results are due to moms working).
4. A conclusion is reached as to the relationship between maternal employment and adolescents’ attitudes in the population from which the sample was originally drawn. In other words, an inference, based on the results of an analysis of the sample data, is made about the population of all adolescents.
How to Select What Test to Use

Step 3 above brings us to ask the question, “How do I select the appropriate statistical test to determine whether a difference between groups exists?” Heaven knows, there are plenty of them, and you have to decide which one to use and when to use it. Well, the best way to learn which test to use is to be an experienced statistician who has taken lots of courses in this area and participated in lots of research. Experience is still the greatest teacher. In fact, there’s no way you can really learn what to use and when to use it unless you’ve had the real-life, applied opportunity to actually use these tools. And as a result of taking this course, you are learning how to use these very tools.

But the basic reasons for why a particular statistical test is the right one to use can be simplified into a few characteristics about your research question. So, for our purposes and to get started, we’ve created this nice little flowchart (a.k.a. cheat sheet) of sorts that you see in Figure 10.1. You have to have some idea of what you’re doing, so selecting the correct statistical test does not put the rest of your study on autopilot, but it certainly is a good place to get started.

Figure 10.1 A Quick (But Not Always Great) Approach to Determining Which Statistical Test to Use

Don’t think for a second that Figure 10.1 takes the place of your need to learn about when these different tests are appropriate. The flowchart is here only to help you get started. This is really important. We just wrote that selecting the appropriate statistical test is not necessarily an easy thing to do. And the best way to learn how to do it is to do it, and that means practicing and even taking more statistics courses. The simple flowchart we present here works, in general, but use it with caution.
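To make the idea concrete, here is a rough sketch of the kind of decision logic such a flowchart encodes. The branches below are simplified assumptions on our part, not a reproduction of Figure 10.1, so treat the function as a mnemonic only.

```python
# A simplified sketch of test-selection logic: first ask whether you
# are examining a relationship or a difference, then (for differences)
# count the groups and ask whether they are independent.

def pick_test(examining: str, n_groups: int = 1, independent: bool = True) -> str:
    if examining == "relationship":
        return "correlation coefficient / regression"
    if examining == "difference":
        if n_groups == 1:
            return "one-sample Z test"  # one sample vs. a population mean
        if n_groups == 2:
            return ("t test for independent means" if independent
                    else "t test for dependent means")
        return "analysis of variance (ANOVA)"  # three or more groups
    raise ValueError("examining must be 'relationship' or 'difference'")

# The assignment's three-group comparison of case-rate means:
print(pick_test("difference", n_groups=3))  # analysis of variance (ANOVA)
```

For the assignment's question of whether mean case rates differ among three groups of states, this sketch lands on ANOVA, the conventional choice for comparing three or more independent means; confirm against the textbook's actual flowchart.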
When you make a decision, check with your professor or some other person who has been through this stuff and feels more confident than you might (and who also knows more!).

Here’s How to Use the Chart

1. Assume that you’re very new to this statistics stuff (which you are) and that you have some idea of what these tests of significance are, but you’re pretty lost as far as deciding which one to use when.
2. Answer the question at the top of the flowchart.
3. Proceed down the chart by answering each of the questions until you get to the end of the chart. That’s the statistical test you should use.

This is not rocket science, and with some practice (which you will get throughout this part of the book), you’ll be able to quickly and reliably select the appropriate test. Each of the remaining chapters in this part of the book will begin with a chart like the one you see in Figure 10.1 and take you through the specific steps to get to the test statistic you should use. Does the flowchart in Figure 10.1 contain all the statistical tests there are? Not by a long shot. There are hundreds, but the ones in Figure 10.1 are the ones used most often. And if you are going to become familiar with the research in your own field, you are bound to run into these.

THE PATH TO WISDOM AND KNOWLEDGE

Here’s how you can use Figure 11.1, the flowchart introduced in Chapter 10, to select the appropriate test statistic, the one-sample Z test. Follow along the highlighted sequence of steps in Figure 11.1. Now this is pretty easy (and they are not all this easy) because this is the only inferential comparison procedure in all of Part IV of this book where we have only one group. (We compare the mean of that one group to a theoretical invisible population.) Plus, there’s lots of stuff here that will take you back to Chapter 9 and standard scores, and because you’re an expert on those. . . .

Figure 11.1 Determining That a One-Sample Z Test Is the Correct Statistic
1. We are examining differences between a sample and a population.
2. There is only one group being tested.
3. The appropriate test statistic is a one-sample Z test.

As in sooooo many statistical procedures, different symbols and words are used to represent the same thing (remember mean, X̄, and average?). So it is with one-sample Z tests. Sometimes you’ll see the lowercase z used and sometimes the uppercase Z. We’re sticking with the uppercase Z because we like it (and so do many other stats-type folks) and that’s the way that Excel does it in its Z.TEST function. So, z scores and z values and Z tests.

COMPUTING THE Z TEST STATISTIC

The formula used for computing the value for the one-sample Z test is shown in Formula 11.1. Remember that we are testing whether a sample mean belongs to or is a fair estimate of a population. The difference between the sample mean (X̄) and the population mean (μ) makes up the numerator (the value on top) for the Z test value. The denominator (the value on the bottom that you divide by), an error term, is called the standard error of the mean and is the value we would expect by chance, given all the variability that surrounds the selection of all possible sample means from a population. Using this standard error of the mean (and the key term here is standard) allows us once again (as we showed in Chapter 10) to use the table of z scores to determine the probability of an outcome. It turns out that sample means drawn randomly from populations are normally distributed, so we can use a z table because it assumes a normal curve.

(11.1) z = (X̄ − μ) / SEM

where

  • X̄ is the mean of the sample,
  • μ is the population average, and
  • SEM is the standard error of the mean.

Now, to compute the standard error of the mean, which you need in Formula 11.1, use Formula 11.2:

(11.2) SEM = σ / √n

where

  • σ is the standard deviation for the population, and
  • n is the size of the sample.
The standard error of the mean is the standard deviation of all the possible means selected from the population. It’s the best estimate of a population mean that we can come up with, given that it is impossible to compute all the possible means. If our sample selection were perfect, and the sample fairly represents the population, the difference between the sample and the population averages would be zero, right? Right. If the sampling from a population were not done correctly (randomly and representatively), however, then the standard deviation of all the means of all these samples could be huge, right? Right. So we try to select the perfect sample, but no matter how diligent we are in our efforts, there’s always some error. The standard error of the mean gives a range (remember that confidence interval from Chapter 10?) of where the mean for the entire population probably lies. There can be (and are) standard errors for other measures as well.

Time for an Example

Dr. McDonald thinks that his group of earth science students is particularly special (in a good way), and he is interested in knowing whether their class average falls within the boundaries of the average score for the larger group of students who have taken earth science over the past 20 years. Because he’s kept good records, he knows the means and standard deviations for his current group of 36 students and the larger population of 1,000 past enrollees. Here are the data.

             Size    Mean   Standard Deviation
Sample       36      100    5.0
Population   1,000   99     2.5

Here are the famous eight steps and the computation of the Z test statistic.

1. State the null and research hypotheses. The null hypothesis states that the sample average is equal to the population average. If the null is not rejected, it suggests that the sample is representative of the population. If the null is rejected in favor of the research hypothesis, it means that the sample average is probably different from the population average.
The null hypothesis is

(11.3) H0: X̄ = μ

The research hypothesis in this example is

(11.4) H1: X̄ ≠ μ

2. Set the level of risk (or the level of significance or Type I error) associated with the null hypothesis. The level of risk or Type I error or level of significance (any other names?) here is .05, but this is totally at the discretion of the researcher.

3. Select the appropriate test statistic. Using the flowchart shown in Figure 11.1, we determine that the appropriate test is a one-sample Z test.

4. Compute the test statistic value (called the obtained value). Now’s your chance to plug in values and do some computation. The formula for the z value was shown in Formula 11.1. The specific values are plugged in (first for SEM in Formula 11.5 and then for z in Formula 11.6). With the values plugged in, we get the following results:

(11.5) SEM = 2.5 / √36 = 0.42

(11.6) z = (100 − 99) / 0.42 = 2.38

The z value for a comparison of the sample mean to this population mean, given Dr. McDonald’s data, is 2.38.

5. Determine the value needed for rejection of the null hypothesis using the appropriate table of critical values for the particular statistic. Here’s where we go to Table B.1 in Appendix B, which lists the probabilities associated with specific z values, which are the critical values for the rejection of the null hypothesis. This is exactly the same thing we did with several examples in Chapter 10. We can use the values in Table B.1 to see if two means “belong” to one another by comparing what we would expect by chance (the tabled or critical value) with what we observe (the obtained value). From our work in Chapter 10, we know that a z value of +1.96 has associated with it a probability of .025, and if we consider that the sample mean could be bigger, or smaller, than the population mean, we need to consider both ends of the distribution (and a range of ±1.96) and a total Type I error rate of .05.

6. Compare the obtained value and the critical value. The obtained z value is 2.38.
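As a cross-check, the arithmetic of Formulas 11.5 and 11.6 can be reproduced in a few lines. One detail worth noting: carrying the unrounded SEM gives z = 2.40, while the text's 2.38 comes from rounding SEM to 0.42 before dividing.

```python
# One-sample Z test for Dr. McDonald's class (Formulas 11.1 and 11.2).
from math import sqrt

x_bar, mu = 100, 99   # sample mean and population mean
sigma, n = 2.5, 36    # population standard deviation and sample size

sem = sigma / sqrt(n)                     # 2.5 / 6 = 0.4167 (text rounds to 0.42)
z = (x_bar - mu) / sem                    # 1 / 0.4167 = 2.40
z_rounded = (x_bar - mu) / round(sem, 2)  # 1 / 0.42 = 2.38, as in the text

print(round(z, 2), round(z_rounded, 2))   # 2.4 2.38
abs_critical = 1.96                       # two-tailed test at alpha = .05
print("reject H0" if abs(z) > abs_critical else "fail to reject H0")
```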
So, for a test of this null hypothesis at the .05 level with 36 participants, the critical value is ±1.96. This value represents the value at which chance is the most attractive explanation of why the sample mean and the population mean differ. A result beyond that critical value in either direction (remember that the research hypothesis is nondirectional and this is a two-tailed test) means that we need to provide an explanation as to why the sample and the population means differ.

7 and 8. Decision time! If the obtained value is more extreme than the critical value (remember Figure 10.2), the null hypothesis should not be accepted. If the obtained value does not exceed the critical value, the null hypothesis is the most attractive explanation. In this case, the obtained value (2.38) does exceed the critical value (1.96), and it is absolutely extreme enough for us to say that the sample of 36 students in Dr. McDonald’s class is different from the previous 1,000 students who have also taken the course. If the obtained value were less than 1.96, it would mean that there is no difference between the test performance of the sample and that of the 1,000 students who have taken the test over the past 20 years. In this case, the 36 students would have performed basically at the same level as the previous 1,000. And the final step? Why, of course. We wonder: why does this group of students differ? Perhaps McDonald is right in that they are smarter, but they may also be better users of technology or more motivated. Perhaps they just studied harder. All these are questions to be tested some other time.

COMPUTING THE T TEST STATISTIC

The formula for computing the t value for the t test for independent means is shown in Formula 12.1. All those symbols really just provide two important values. The difference between the two means makes up the numerator, the top of the equation; the amount of variance within and between each of the two groups makes up the denominator.
(12.1) t = (X̄1 − X̄2) / √{[((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)] × [(n1 + n2) / (n1 × n2)]}

where

  • X̄1 is the mean for Group 1,
  • X̄2 is the mean for Group 2,
  • n1 is the number of participants in Group 1,
  • n2 is the number of participants in Group 2,
  • s1² is the variance for Group 1, and
  • s2² is the variance for Group 2.

This is a bigger formula than we’ve seen before, but there’s really nothing new here at all. It’s just a matter of plugging in the correct values.

Time for an Example

Here are some data (from Chapter 12 Data Set 1) reflecting the number of words remembered following a program designed to help Alzheimer’s patients remember the order of daily tasks. Group 1 was taught using visuals, and Group 2 was taught using visuals and intense verbal rehearsal. We’ll use the data to compute the test statistic in the following example.

Group 1: 7, 5, 3, 3, 7, 2, 3, 1, 5, 2, 9, 4, 3, 2, 4, 8, 5, 6, 8, 2, 7, 5, 12, 7, 8, 15, 5, 5, 4, 6
Group 2: 5, 5, 4, 4, 4, 3, 6, 4, 2, 10, 5, 7, 10, 5, 6, 5, 7, 2, 1, 8, 8, 1, 8, 9, 4, 9, 7, 3, 8, 6

Here are the famous eight steps and the computation of the t test statistic.

1. State the null and research hypotheses. As represented by Formula 12.2, the null hypothesis states that there is no difference between the means for Group 1 and Group 2. For our purposes, the research hypothesis (shown as Formula 12.3) states that there is a difference between the means of the two groups. The research hypothesis is two-tailed and nondirectional because it posits a difference but in no particular direction.

The null hypothesis is

(12.2) H0: μ1 = μ2

The research hypothesis is

(12.3) H1: X̄1 ≠ X̄2

2. Set the level of risk (or the level of significance or chance of Type I error) associated with the null hypothesis. The level of risk or probability of Type I error or level of significance (any other names?) here is .05, but this is totally the decision of the researcher.

3. Select the appropriate test statistic. Using the flowchart shown in Figure 12.1, we determined that the appropriate test is a t test for independent means, because the groups are independent of one another.
4. Compute the test statistic value (called the obtained value). Now’s your chance to plug in values and do some computation. The formula for the t value was shown in Formula 12.1. When the specific values are plugged in, we get the equation shown in Formula 12.4. (We already computed the mean and standard deviation.)

(12.4) t = (5.43 − 5.53) / √{[((30 − 1)·3.42² + (30 − 1)·2.06²) / (30 + 30 − 2)] × [(30 + 30) / (30 × 30)]}

With the numbers plugged in, Formula 12.5 shows how we get the final value of −0.137. The value is negative because a larger value (the mean of Group 2, which is 5.53) is being subtracted from a smaller number (the mean of Group 1, which is 5.43). Remember, though, that because the test is nondirectional—the research hypothesis is that any difference exists—the sign of the difference is meaningless. Just pay attention to which group mean was larger so you can understand what happened.

(12.5) t = −0.10 / √{[(339.20 + 123.06) / 58] × [60 / 900]} = −0.137

5. Determine the value needed for rejection of the null hypothesis using the appropriate table of critical values for the particular statistic. Here’s where we go to Table B.2 in Appendix B, which lists the critical values for the t test. We can use this distribution to see whether two independent means differ from one another by comparing what we would expect by chance if there was no difference in the population (the tabled or critical value) with what we observe (the obtained value). Our first task is to determine the degrees of freedom (df), which approximates the sample size (but, for fancy technical reasons, adjusts it slightly to make for a more accurate outcome). For this particular test statistic, the degrees of freedom is n1 − 1 + n2 − 1 or n1 + n2 − 2 (putting the terms in either order results in the same value). So for each group, add the size of the two samples and subtract 2. In this example, 30 + 30 − 2 = 58. This is the degrees of freedom for this particular application of the t test but not necessarily for any other.
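Formulas 12.4 and 12.5 can be reproduced from the summary statistics alone (the means, standard deviations, and group sizes as stated in the text):

```python
# t test for independent means (Formula 12.1), computed from the
# summary statistics of the memory-training example.
from math import sqrt

m1, m2 = 5.43, 5.53   # group means
s1, s2 = 3.42, 2.06   # group standard deviations (as given in the text)
n1, n2 = 30, 30       # group sizes

# Pooled variance: the bracketed left-hand term in Formula 12.1
pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t = (m1 - m2) / sqrt(pooled * (n1 + n2) / (n1 * n2))
df = n1 + n2 - 2

print(round(t, 3), df)   # -0.137 58
critical = 2.001         # two-tailed, alpha = .05, df = 58 (Table B.2)
print("reject H0" if abs(t) > critical else "fail to reject H0")
```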
Using this number (58), the level of risk you are willing to take (earlier defined as .05), and a two-tailed test (because there is no direction to the research hypothesis), you can use the t test table to look up the critical value. At the .05 level, with 58 degrees of freedom for a two-tailed test, the value needed for rejection of the null hypothesis is . . . oops! There’s no 58 degrees of freedom in the table! What do you do? Well, if you select the value that corresponds to 55, you’re being conservative in that you are using a value for a sample smaller than what you have (and the critical t value you need to reject the null hypothesis will be larger). If you go for 60 degrees of freedom (the closest to your value of 58), you will be closer to the size of the population, but you’ll be a bit liberal in that 60 is larger than 58. Although statisticians differ in their viewpoint as to what to do in this situation, let’s always go with the value that’s closer to the actual sample size. So the value needed to reject the null hypothesis with 58 degrees of freedom at the .05 level of significance is 2.001.

6. Compare the obtained value and the critical value. The obtained value is −0.14 (−0.137 rounded to the nearest hundredth), and the critical value for rejection of the null hypothesis that Group 1 and Group 2 performed differently is 2.001. The critical value of 2.001 represents the largest value at which chance is the most attractive explanation for any of the observed sample differences between the two groups, given 30 participants in each group and the willingness to take a .05 level of risk.

7 and 8. Decision time! Now comes our decision. If the obtained value is more extreme than the critical value (remember Figure 10.2), the null hypothesis should not be accepted. If the obtained value does not exceed the critical value, the null hypothesis is the most attractive explanation.
In this case, the obtained value (−0.14) does not exceed the critical value (2.001); it is not extreme enough for us to say that the difference between Groups 1 and 2 occurred due to anything other than chance. If the value were greater than 2.001, that would be just like getting 8, 9, or 10 heads in a coin toss: too extreme a result for us to believe that mere chance is at work. In the case of the coin, the cause would be an unfair coin; in this example, it would be that there is a better way to teach memory skills to these older people. So, to what can we attribute the small difference between the two groups? If we stick with our current argument, then we could say the difference is due to something like sampling error or simple variability in participants' scores. Most important, we're pretty sure (but, of course, not 100% sure; that's what level of significance and Type I errors are all about, right?) that the difference is not due to anything in particular that one group or the other experienced to make its scores better.
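The last two steps reduce to a simple decision rule, sketched below. Both numbers are taken from the text (obtained t = −0.137; critical t = 2.001 at df = 58, α = .05, two-tailed):

```python
# Decision rule for a two-tailed t test: reject the null hypothesis
# only if the obtained value is more extreme than the critical value.
t_obtained = -0.137  # from Formula 12.5
t_critical = 2.001   # from Table B.2, df = 58, alpha = .05, two-tailed

# Two-tailed, so compare absolute values; the sign is meaningless here
reject_null = abs(t_obtained) > t_critical
print(reject_null)  # False: chance remains the most attractive explanation
```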

Explanation & Answer


Political Affiliation   State            Case Rate per 100,000   GDP Per Capita in 2020 ($)
Swing States            Arizona          11,702                  50,187
                        Florida           9,815                  50,424
                        Georgia          10,172                  57,819
                        Michigan          8,589                  51,766
                        Nevada           10,055                  54,998
                        North Carolina    8,974                  55,292
                        Pennsylvania      8,537                  61,031
                        Wisconsin        11,150                  58,066
Solid Red States        Arkansas         11,033                  42,591
                        Indiana          10,447                  55,165
                        Kansas           10,480                  59,475
                        Kentucky          9,753                  46,909
                        Nebraska         11,156                  66,480
                        Oklahoma         11,242                  46,871
                        Tennessee        12,141                  52,925
                        Wyoming           9,884                  62,236
Solid Blue States       California        9,134                  78,538
                        Colorado          8,428                  67,169
                        Connecticut       9,200                  78,971
                        Delaware         10,261                  76,522
                        Maryland          7,194                  69,805
                        Massachusetts     9,234                  84,722
                        New Jersey       10,892                  69,695
                        Virginia          7,536                  64,229
Data Source: https://covid.cdc.gov/covid-data-tracker; U.S. Bureau of Economic Analysis

Row Labels          Average of Case Rate per 100,000   StdDev of Case Rate per 100,000
Solid Blue States    8,985                             1,254
Solid Red States    10,767                               786
Swing States         9,874                             1,155
Grand Total          9,875                             1,275

Row Labels          Average of GDP Per Capita in 2020 ($)   StdDev of GDP Per Capita in 2020 ($)
Solid Blue States   $73,706.38                              $7,012.26
Solid Red States    $54,081.50                              $8,343.41
Swing States        $54,947.88                              $3,934.23
Grand Total         $60,911.92                              $11,242.81
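The pivot-table figures above come from Excel's AVERAGE and STDEV.S functions. As a cross-check, the same central tendency and variability numbers can be reproduced with Python's statistics module; this sketch uses the swing-state case rates copied from the data table:

```python
import statistics

# Case rate per 100,000 for the eight swing states (from the data table)
swing = [11702, 9815, 10172, 8589, 10055, 8974, 8537, 11150]

mean = statistics.mean(swing)    # central tendency (Excel AVERAGE)
stdev = statistics.stdev(swing)  # sample standard deviation (Excel STDEV.S)

print(round(mean), round(stdev))  # 9874 1155
```

Note that statistics.stdev divides by n − 1 (the sample standard deviation), matching Excel's STDEV.S rather than STDEV.P.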

Swing States   Solid Red States   Solid Blue States
11,702         11,033              9,134
 9,815         10,447              8,428
10,172         10,480              9,200
 8,589          9,753             10,261
10,055         11,156              7,194
 8,974         11,242              9,234
 8,537         12,141             10,892
11,150          9,884              7,536

Anova: Single Factor

SUMMARY
Groups              Count   Sum      Average   Variance
Swing States        8       78,994    9,874    1,334,799.9
Solid Red States    8       86,136   10,767      618,318.9
Solid Blue States   8       71,879    8,985    1,573,254.7

ANOVA
Source of Variation   SS           df   MS             F        P-value   F crit
Between Groups        12,703,893    2   6,351,946.63   5.4038   0.0128    3.4668
Within Groups         24,684,614   21   1,175,457.83
Total                 37,388,508   23
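As a sketch of what Excel's Anova: Single Factor tool computes, the F statistic above can be reproduced by hand from the three groups of case rates (values copied from the data table):

```python
# One-way ANOVA on case rate per 100,000, reproducing the Excel output.
swing = [11702, 9815, 10172, 8589, 10055, 8974, 8537, 11150]
red   = [11033, 10447, 10480, 9753, 11156, 11242, 12141, 9884]
blue  = [9134, 8428, 9200, 10261, 7194, 9234, 10892, 7536]

groups = [swing, red, blue]
all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# Between-groups sum of squares: group sizes times squared mean deviations
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-groups sum of squares: squared deviations from each group's own mean
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1               # 2
df_within = len(all_values) - len(groups)  # 21

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(round(f_stat, 4))  # 5.4038
```

Because the p-value (0.0128) is below .05, at least one pair of state groups differs in mean case rate.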


                             Case Rate per 100,000   GDP Per Capita in 2020 ($)
Case Rate per 100,000        1
GDP Per Capita in 2020 ($)   -0.3880                 1

Coefficient of determination (r²): 0.1506
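As a cross-check on Excel's CORREL output, the correlation and coefficient of determination can be recomputed from the raw data. This sketch copies the 24 value pairs from the data table (ordered swing, red, blue states):

```python
import math

# Case rate per 100,000 and GDP per capita for all 24 states
case_rate = [11702, 9815, 10172, 8589, 10055, 8974, 8537, 11150,
             11033, 10447, 10480, 9753, 11156, 11242, 12141, 9884,
             9134, 8428, 9200, 10261, 7194, 9234, 10892, 7536]
gdp = [50187, 50424, 57819, 51766, 54998, 55292, 61031, 58066,
       42591, 55165, 59475, 46909, 66480, 46871, 52925, 62236,
       78538, 67169, 78971, 76522, 69805, 84722, 69695, 64229]

n = len(case_rate)
mx, my = sum(case_rate) / n, sum(gdp) / n

# Pearson r: covariance divided by the product of the standard deviations
cov = sum((x - mx) * (y - my) for x, y in zip(case_rate, gdp))
r = cov / math.sqrt(sum((x - mx) ** 2 for x in case_rate)
                    * sum((y - my) ** 2 for y in gdp))

print(round(r, 4), round(r ** 2, 4))  # -0.388 0.1506
```

The negative r indicates that, across these states, higher case rates tend to go with lower GDP per capita, though r² shows the relationship explains only about 15% of the variance.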


SUMMARY OUTPUT

Regression Statistics
Multiple R            0.3880
R Square              0.1506
Adjusted R Square     0.1120
Standard Error        10,594.7440
Observations          24

ANOVA
             df   SS              MS            F        Significance F
Regression    1   437,748,305     437,748,305   3.8998   0.0610
Residual     22   2,469,469,213   112,248,601
Total        23   2,907,217,518

                        Coefficients   Standard Error   t Stat    P-value
Intercept               94,702.584     17,247.11         5.4909   0.0000
Case Rate per 100,000   -3.422         1.73             -1.9748   0.0610

Predicted GDP per capita for a case rate per 100,000 of 15,000: $43,376.94

[Scatterplot: GDP Per Capita in 2020 ($) versus Case Rate per 100,000, with fitted trendline y = -3.4217x + 94703, R² = 0.1506]
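The slope, intercept, and prediction in the regression output can be verified with the ordinary least-squares formulas. A sketch in Python (the value pairs are copied from the data table; 15,000 is the prediction point used in the worksheet):

```python
# Ordinary least squares: GDP per capita regressed on case rate,
# reproducing Excel's SUMMARY OUTPUT coefficients.
case_rate = [11702, 9815, 10172, 8589, 10055, 8974, 8537, 11150,
             11033, 10447, 10480, 9753, 11156, 11242, 12141, 9884,
             9134, 8428, 9200, 10261, 7194, 9234, 10892, 7536]
gdp = [50187, 50424, 57819, 51766, 54998, 55292, 61031, 58066,
       42591, 55165, 59475, 46909, 66480, 46871, 52925, 62236,
       78538, 67169, 78971, 76522, 69805, 84722, 69695, 64229]

n = len(case_rate)
mx, my = sum(case_rate) / n, sum(gdp) / n

# Slope = covariance of x and y over the variance of x
slope = (sum((x - mx) * (y - my) for x, y in zip(case_rate, gdp))
         / sum((x - mx) ** 2 for x in case_rate))
intercept = my - slope * mx  # regression line passes through the means

# Predicted GDP per capita at a case rate of 15,000 per 100,000
prediction = intercept + slope * 15000
print(round(slope, 4), round(intercept, 1), round(prediction, 2))
```

Note the slope's p-value (0.0610) just misses the .05 threshold, so the negative relationship between case rate and GDP per capita is suggestive rather than statistically significant at that level.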


Central tendency and variability
Below we...

