BST 322 CNUAS Week 3 Crosstabulation Table Creation Discussion

User Generated

Qrynavr19

Mathematics

BST 322

California National University for Advanced Studies

Description

Now that we are more familiar with StatCrunch and making cross-tabulation tables, discuss how the textbook author (Dr. Polit) created Figure 8.2 (the cross-tabulation table on p. 174). Can you get the data into StatCrunch and perform the Chi-Square test? All the answers are there for you in the table; can you get StatCrunch to generate them?

If you need help doing this analysis, you can find a video on how to do the work in StatCrunch by clicking on this link: "How to do a Chi-Square in StatCrunch."

Unformatted Attachment Preview

BST 322 Week Three Slides
Revised March 2020
Brooks Ensign, M.B.A.

Week Three
• More hypothesis testing, with many of the same assumptions, but:
• This week is EASIER
• One chapter, with the focus on one test

Search YouTube
• In YouTube, search for: Chi Square Bozeman Science
• Excellent video! An excellent introduction to Chi-Square. Have you watched it?

Types of Statistical Tests
We have two broad types of tests:
• Last week: parametric tests (e.g., t-tests). The dependent variable is interval or ratio, and the population is assumed to have a normal distribution, etc.
• This week: nonparametric tests (e.g., Chi-Square, pronounced "KAI-square"). Variables are measured on a nominal or ordinal scale, with no assumption about a normal distribution. We are usually talking about NOMINAL DVs.

Parametric tests are more powerful
• Parametric tests are more powerful
• But sometimes (i.e., NOW) we cannot meet the more stringent requirements for parametric tests, so we use nonparametric tests instead.

Week Three: Easier
• Inferential statistics (like last week)
• But this week the D.V. is "non"-parametric (i.e., nominal or ordinal); last week it was parametric (interval or ratio)
• 70% of effort: one test, Chi-Square, pages 170-179
• 30% of effort: the "other" nonparametric tests, pages 180-192 (do not calculate)

Week Three ("Chi-squared") χ²
• The easiest week of the course! (Week four is easy also)
• One chapter: chapter 8 (mostly the first half)
• One test: the χ² (Chi, pronounced "kai") square test
• Focus on the first half of the chapter
• Skim the second half, lightly. Look for the concepts explained in this slide deck. You need to know "just a little bit" about these "other nonparametric tests"

Review
• Test statistic: computed from a formula and compared with a critical value that marks off the critical region
• If the absolute value of the test statistic is greater than the critical value, then the null hypothesis is rejected and the result is significant.
• Statistical significance: the results seen are "PROBABLY not" attributable to chance
• P value: the probability of getting a result at least this extreme by chance alone, if the null hypothesis is true

"They Work Together"
Think of the test statistic and the p value as the opposite ends of a seesaw. They work in opposite directions. For statistical significance, we want a large test statistic (larger than the table value) and a small p value (smaller than 0.05, i.e., alpha).

Chi-Square χ² ("KAI-squared")
• Test statistic greater than table value
• P value less than alpha (0.05), the level of significance

Review (cont.): Interpreting your p value
• For small p values (usually < 0.05), reject Ho: your data don't support Ho, and your evidence is beyond a reasonable doubt
• For large p values (usually > 0.05), you can't reject Ho: you don't have enough evidence against it
• If your p value is close to 0.05, your results are marginal (could go either way)

Selecting the Test (next slide)
• The Next Slide is Very Important! (course roadmap for hypothesis test selection)
• Select the column first: the DV (output data, the dependent data). Nominal, Ordinal, Interval or Ratio?
• Last week: the tests in the far right column, t test and ANOVA
• This week: look for the Small Blue Box
• Select the Row: 2 groups? 3 (or more)?
• Select the Row: Are groups independent or dependent?
• Look in the middle column ("nominal DV"): look for the Blue Box (middle, top)

Level of Measurement of Dependent Variable
(Nominal and Ordinal: this week, non-parametric tests. Interval/Ratio: last week, Week 2.)
Number of Groups | Nominal | Ordinal | Interval/Ratio
2 Independent | χ² test or Fisher's Exact | Mann-Whitney U | t-test
3+ Independent | χ² test | Kruskal-Wallis | ANOVA
2 Dependent / repeated measures | McNemar | Wilcoxon signed-ranks | Paired t-test
3+ Dependent / repeated measures | Cochran's Q | Friedman | RANOVA

Tests for Independent Groups
Level of Measurement (Outcome Variable)
Number of Groups | Nominal Measures | Ordinal Measures
Two Groups | Chi-square test | Mann-Whitney U test
Three or More Groups | Chi-square test | Kruskal-Wallis test

Chi ("kai") Square Test: χ²
• Most important topic this week
• Friendlier name (English, not Greek): Comparison of Proportions (observed vs. expected)

Military lesson
• Military expression: "You don't get what you expect, you get what you inspect."
• So: (Observed minus Expected)
• How different is what we observe (inspect) from what we expect?
• Chi-Square: (observed minus expected)
• Null hypothesis: observed - expected = 0 (because observed = expected)

Null hypothesis: O = E
• What is the null hypothesis? O - E = 0
• The opposite of our experimental idea (the alternative hypothesis).
• NULL HYP: There is no ("null") relationship between the independent and dependent variables
• In Chi-Square: the "expected proportionality" will hold in each group, because group membership does not matter.

Apply hypothesis tests to Chi-Square
Deciding to accept or reject the Null hypothesis (Polit p. 105): If the absolute value of the computed statistic is greater than the tabled value, the null hypothesis can be rejected and the result is said to be statistically significant at the specified probability level.
Apply to Chi-Square (Polit p. 171): Null Hypothesis here is ______________
A. The variables are not independent (they are related).
B. The variables are independent (unrelated).
C. The variables are moderately related.
Vote now!
Answer: B. The variables are independent (unrelated). So the distribution should be "proportionate / pro rata," and the expected results should be the observed results!

If the null hypothesis is true?
• (O minus E): The χ² statistic will be zero or close to zero, because there will be no large difference between the expected values and the observed values.
• The null hypothesis says that the expected values should be "proportionately" (evenly) distributed (so, take the proportion, the percentage from the "overall" column, and apply it to each subgroup = Expected Values)

Chi-squared Test Statistic Value
• χ² = 0 for two variables that are completely unrelated (see the distribution curve, p. 171)
• TABLED VALUE: Page 416, Table A.4
• If the χ² value is less than the tabled value, we fail to reject ("accept") the null hypothesis: not statistically significant
• If the χ² value is greater than the tabled value, we reject the null hypothesis: statistically significant

Discussion Question 1, Fig. 8.2
From our textbook, in the first six pages of chapter 8.
We compare two protocols: change the heparin lock after 72 hours vs. after 96 hours.
We suspect that the 72-hour protocol may result in fewer complications (infections? blockage?). We see a small difference in the data, but: IS THE DIFFERENCE STATISTICALLY SIGNIFICANT?
What is the nature of the dependent variable? Nominal data: Complications, Yes or No (count the complications). Counting is the only calculation we can make with nominal data.

Teaser: the critical values
• The critical values (table values) are easy to find, on page 416
• For 2X2 tables the critical value is 3.84
• For 3X2 tables the critical value is 5.99
• Why? Stay tuned

The results (from the book)
• At the bottom of the next page:
• Chi-squared value is 0.25 (very low), < 3.84
• P value is 0.6 (very high), > 0.05
• These results are NOT SIGNIFICANT; they probably could have occurred randomly

Discussion Question 1, Fig. 8.2: Chi-Square, use StatCrunch
StatCrunch is much easier to use, as explained in the video (#9). Most of the time: use the "WITH SUMMARY" method.

StatCrunch this week, W3:
• Use "With Summary" for both DQs and for the first few homework questions. No data file is needed if you already have summarized data in a small table.
• Ask for help (my written guidance): THE WITH SUMMARY METHOD
• For other HW questions: use "with data"; I will help you. There is a data file in the Blackboard course shell. The "with data" approach is the same as W1 DQ2 (see video).

StatCrunch Results: remember to add the "finishing touches" (see the Cross-tabulation table video in DQ-2, Week One). Try to make your table look like the table on page 61.
Contingency table results:
Rows: Complic
Columns: Group
Cell format: Count, (Row percent), (Column percent), (Total percent), Expected count

Complic | Group | Count | Row percent | Column percent | Total percent | Expected count
no | 72hr | 41 | 51.25% | 82% | 41% | 40
no | 96hr | 39 | 48.75% | 78% | 39% | 40
no | Total | 80 | 100.00% | 80% | 80% |
yes | 72hr | 9 | 45% | 18% | 9% | 10
yes | 96hr | 11 | 55% | 22% | 11% | 10
yes | Total | 20 | 100.00% | 20% | 20% |
Total | 72hr | 50 | 50% | 100.00% | 50% |
Total | 96hr | 50 | 50% | 100.00% | 50% |
Total | Total | 100 | 100.00% | 100.00% | 100.00% |

Chi-Square test:
Statistic | DF | Value | P-value
Chi-square | 1 | 0.25 | 0.6171

Chi-squared value is 0.25. P value is 0.6.

DQ2 Week Three
• For this second question, let's try to make a crosstabulation table for a problem we don't have in the textbook.
• Using the data set provided above from a Pew research study on dementia in 400 Nashville elderly residents (labeled "Week3DiscussionQuestion2Data"), can you create a StatCrunch crosstabulation table (copied and pasted, as well as updated in Word) result for "used pot" and "Forgot date"? And the Chi-Square analysis? What hypotheses did you test? What can you conclude about marijuana use and dementia in this sample?

W3 DQ2
Contingency table results: Rows: Forgot Date; Columns: Usedpot
Null hypothesis: There is no relationship between marijuana use and forgetting the date.
Alternative hypothesis: Marijuana use and forgetting the date are related.
χ² = 0.56017047, p-value = 0.7557, alpha = 0.05, critical value = 5.99, df = 2
The results are not statistically significant because χ² is less than the critical value and the p-value is greater than the level of significance. Therefore, we fail to reject the null hypothesis: this sample gives no evidence of a relationship between marijuana use and forgetting the date.
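The Figure 8.2 numbers reported above (chi-square 0.25, p-value 0.6171) can also be checked by hand. The following is a minimal sketch in plain Python, not the StatCrunch procedure itself; the function name `chi_square` is made up for illustration, and the p-value line relies on the fact that with df = 1 the upper-tail chi-square probability equals erfc(sqrt(x/2)).

```python
import math

# Observed counts from the StatCrunch output for Figure 8.2.
# Rows: complication (no, yes); columns: protocol (72hr, 96hr).
observed = [[41, 39],
            [9, 11]]

def chi_square(table):
    """Pearson chi-square statistic for a table of counts (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            stat += (o - e) ** 2 / e
    return stat

stat = chi_square(observed)

# Valid only for df = 1 (a 2X2 table): upper-tail probability of chi-square.
p_value = math.erfc(math.sqrt(stat / 2))

print("chi-square:", round(stat, 2))
print("p-value:", round(p_value, 4))
```

Both values should agree with the StatCrunch output up to rounding, and since the statistic is far below the tabled 3.84, the difference between protocols is not significant.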
Start with O minus E
• O: the observed value
• E: the expected value
• The essence of Chi-Square (χ²) is: subtract E from O (O minus E)
• Then, for each cell, calculate: (O - E)² / E
• The problem gives us the O values. How do we find the E values? (Stay tuned: we apply "expected proportionality.")

Contingency Table Example
Does this experimental therapy reduce incontinence? (These are the Observed values.)
 | Experimental Group (E) | Control Group (C) | Total
Incontinent | 10 (20.0%) | 20 (40.0%) | 30 (30.0%)
Not Incontinent | 40 (80.0%) | 30 (60.0%) | 70 (70.0%)
Total | 50 (100.0%) | 50 (100.0%) | 100 (100.0%)
• A higher proportion of Cs than Es were incontinent, but is this just random fluctuation?

Divide by E??
• E cannot be zero... why not? What is the title of this slide? Dividing by zero is not allowed.
• E may also be a problem if it is too low. Look for the explanation about the "corrections" (Yates and Fisher) on page 193 and later in these slides.

Why can't E be zero?
• The formula: for each cell in the table, compute (O - E)² ÷ E
• χ²: say "kai-squared" ("Chi-squared"), not "ex-squared"
• Can you divide by zero? NO!!!
• We actually want E, the expected value, to be at least 10. If it is greater than zero but less than 10, we need to use either the Yates Correction or the Fisher Exact Test (stay tuned, and see p. 193)

Computation of Chi-Square
• Page 173: for each cell, compute (O - E)² ÷ E
• Cell A in our example: (10 - 15)² ÷ 15 = 1.67
• Then add all the cell components together to obtain χ²
• In our example with four cells: χ² = 1.67 + 1.67 + 0.71 + 0.71 = 4.76

Q: How Do We Get the Expected Values? (Proportionality, even though StatCrunch does this for us!!!)
A: We apply the "overall proportion" (the expected proportion) to each cell. But how???

Expected Values (page 172)
$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, with $df = (r - 1)(c - 1)$
where
$O_{ij}$ = observed cell frequencies
$E_{ij}$ = expected cell frequencies (from the expected proportions) = (row total × column total) / total count (N)
r = number of rows
c = number of columns

Calculating the Expected Value
$E_{ij}$ = (row total × column total) / total count (N)
Or (preferred explanation, intuitive): (row total / total count (N)) × column total
This provides "proportionality."

Observed Versus Expected Frequencies
(Same observed table as above; the Incontinent / E Group cell, 10 (20.0%), is CELL A.)
• If the null is true, both Es and Cs would have 30% incontinent (see the Total percentage for the Incontinent row): 15 each
• Cell A: Observed (O_A) = 10, Expected (E_A) = 15
• E_A: Row Total_A (30) ÷ N (100) = 30%, and 30% × Column Total_A (50) = 15, the expected value (E)

"phi" in a 2X2: Required vocabulary
Know this definition from Chapter 8:
• phi is the measure of strength of relationship most commonly used with a 2X2 table.

Testing Significance of Chi-Square
• Why is your d.f. probably 1 or 2?
• Why is the table value 3.84 or 5.99?
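The expected-value and chi-square arithmetic for the incontinence example can be sketched in a few lines of plain Python. This is only an illustration of the hand calculation, not what StatCrunch does internally; the variable names are made up, the 3.84 critical value is the tabled figure quoted in the slides, and the phi line uses the standard formula phi = sqrt(χ² / N), which the slides mention but do not spell out.

```python
import math

# Observed counts from the incontinence example in the slides.
# Rows: incontinent, not incontinent; columns: experimental (E), control (C).
observed = [[10, 20],
            [40, 30]]

row_totals = [sum(row) for row in observed]        # [30, 70]
col_totals = [sum(col) for col in zip(*observed)]  # [50, 50]
n = sum(row_totals)                                # 100

# Expected count per cell: (row total / N) * column total ("proportionality").
expected = [[rt / n * ct for ct in col_totals] for rt in row_totals]

# Per-cell components (O - E)^2 / E, then their sum.
components = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
              for o_row, e_row in zip(observed, expected)]
chi_sq = sum(sum(row) for row in components)

df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_DF1_ALPHA_05 = 3.84  # tabled value from the slides for df = 1, alpha = .05
phi = math.sqrt(chi_sq / n)   # assumption: standard phi formula for a 2X2 table

print("expected:", expected)
print("components:", [[round(c, 2) for c in row] for row in components])
print("chi-square:", round(chi_sq, 2), "df:", df)
print("significant at .05:", chi_sq > CRITICAL_DF1_ALPHA_05)
print("phi:", round(phi, 2))
```

The four components come out to roughly 1.67, 1.67, 0.71 and 0.71, which sum to about 4.76; that exceeds 3.84, so the null hypothesis is rejected for this example.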
• Page 416: the table of critical values requires knowing (1) df and (2) the significance criterion (e.g., .05)
• Can also use, in Excel: =CHIINV(.05,df)
• df is "usually" just 1 or 2, because...
• In χ², df = (#Rows - 1) × (#Columns - 1). Here: df = (2 - 1) × (2 - 1) = 1
• If calculated χ² > tabled value, results are significant. Critical value for df = 1 and α = .05: 3.84. χ² = 4.76, so the null hypothesis is rejected.
• 3.84 or 5.99 will often be the critical values. WHY??

Page 416: The critical value
• Next slide: Page 416, the "table" (critical) value
• Will usually be 3.84 or 5.99 (Why?)
• Use the shaded column (α = 0.05)
• Row: usually one or two. Why?
• d.f. = (r - 1) × (c - 1) = probably 1 or 2
• With a 2X2 table: d.f. = 1, so: 3.84
• With a 3X2 table: d.f. = 2, so: 5.99

Remember the Happy(?) Married Men?
• We asked last week about "happiness," measured with a continuous variable that we assumed to have a normal distribution (the "happiness score")
• Some of us questioned whether this score was a valid measurement. Good question!
• Let's use a nominal dependent variable: are you, or are you not, on Prozac?

Another Example (2 X 3 table) (same as Assignment question #1)
Happiness scores for 3 groups of men:
Married: 42 45 43 48 45 47 48 46 35 50
Single: 49 50 45 38 47 34 49 44 41 42
Divorced: 37 47 50 44 41 41 42 46 45 50
The same 3 groups of men (10 in each group) were asked if they are on Prozac or not.
Data: 2 of the married men, 2 of the single men, and 7 of the divorced men were on Prozac.
Is the chi-square value statistically significant?

Observed Frequencies
 | Married | Single | Divorced | Total
Prozac: Yes | 2 | 2 | 7 | 11
Prozac: No | 8 | 8 | 3 | 19
Total | 10 | 10 | 10 | 30

Expected Frequencies
 | Married | Single | Divorced | Total
Prozac: Yes | 3.7 | 3.7 | 3.7 | 11
Prozac: No | 6.3 | 6.3 | 6.3 | 19
Total | 10 | 10 | 10 | 30

Excel solution to the Prozac question (look for the StatCrunch solution in the Discussion Questions)
Test Results
Correction | 0
χ² | 7.177
Rows | 2
Columns | 3
df | 2
p (χ²) | 0.028
V (or φ) | 0.489

This example is shown in the video by Dr. Myers, with StatCrunch; watch this video! Prozac example: see the Discussion Questions (#1). SOLVED IN THE DISCUSSION QUESTION ONE POSTING.

Homework, final HW question: Chi-Square "Corrections"
• Homework question guidance, page 193: check the smallest expected value
• When an expected frequency is low in a 2X2 table (only relevant for 2X2 tables):
• Look at the "lowest cell value" in the expected table
• If less than 10 (but at least 5): Yates Correction
• If less than 5: Fisher's Exact test
• If ten or greater: Chi-Square with no correction
• If the expected value is zero: CAN'T USE CHI-SQUARE

The "Other" Nonparametrics
SKIM THE SECOND HALF OF CHAPTER EIGHT, FOLLOWING THE SLIDES IN THIS PRESENTATION

Page 180: Ready for the easier stuff? The "other" tests
Introducing (just introducing) the "Other Nonparametric Tests" (harder, but easier; why?) (pages 180-190)
• More advanced tests, used much less frequently
• Easier: in this class, we do not "use" these tests, we just "talk about them"

Selecting the Statistical Test:
1. Identify your dependent variable: is it "continuous / scale" (i.e., interval or ratio)? If so, then use the t test or ANOVA, the parametric tests from Week Two
2. With your dependent variable: choose the column in the Selecting the Statistical Test table
3. How many groups in the independent variable? Two or "three or more"?
4. Are the groups in the independent variable related or unrelated?
See the Selecting the Statistical Test table, from Chapter 8 (page 182). Also see the inside cover of the textbook:
- Week Two tests are on the left side
- Week Three tests are on the right side
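Before moving on to test selection, the Prozac example from the slides above (2 of 10 married, 2 of 10 single and 7 of 10 divorced men on Prozac) can be reproduced with a short script. This is a hedged sketch in plain Python rather than the Excel or StatCrunch solution; `chi_square_table` is a hypothetical helper name, the p-value line uses the fact that with df = 2 the upper-tail chi-square probability is exactly exp(-x/2), and V is computed with the standard Cramér's V formula, which is an assumption about what the "V (or φ)" line in the Excel output reports.

```python
import math

# Observed counts from the Prozac example in the slides.
# Rows: on Prozac (yes, no); columns: married, single, divorced.
observed = [[2, 2, 7],
            [8, 8, 3]]

def chi_square_table(table):
    """Return (chi-square, df, expected counts) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = sum((o - e) ** 2 / e
               for o_row, e_row in zip(table, expected)
               for o, e in zip(o_row, e_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

chi_sq, df, expected = chi_square_table(observed)
n = sum(sum(row) for row in observed)

# Valid only for df = 2: upper-tail chi-square probability is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

# Assumption: Cramer's V = sqrt(chi-square / (N * min(r - 1, c - 1))).
v = math.sqrt(chi_sq / (n * min(len(observed) - 1, len(observed[0]) - 1)))

# Smallest expected count, the quantity the "corrections" guidance looks at.
# Per the slides, the Yates / Fisher rule itself only applies to 2X2 tables.
smallest_expected = min(min(row) for row in expected)

print("chi-square:", round(chi_sq, 3), "df:", df)
print("p-value:", round(p_value, 3))
print("V:", round(v, 3))
print("smallest expected count:", round(smallest_expected, 1))
```

The results should match the Test Results listed above (χ² about 7.177, df = 2, p about 0.028, V about 0.489), and 7.177 exceeds the tabled 5.99, so the result is significant at the .05 level.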
Selecting the Statistical Test
Level of Measurement of Dependent Variable
Number of Groups | Nominal | Ordinal | Interval/Ratio
2 Independent | χ² test or Fisher's Exact | Mann-Whitney U | t-test
3+ Independent | χ² test | Kruskal-Wallis | ANOVA
2 Dependent / repeated measures | McNemar | Wilcoxon signed-ranks | Paired t-test
3+ Dependent / repeated measures | Cochran's Q | Friedman | RANOVA

Ready for the "easy" stuff?
• The rest of chapter 8 is "harder" because the tests seem obscure and technical, BUT
• EASIER: Back away!!! We just want to "SKIM"
• Seek a "light / conceptual" (no math) understanding of the "other nonparametric tests" (from these slides)
• Pages 180-192: a "light skim" will do

Let's skim the second half of Chapter Eight (the "other" tests)
HOW FAST DO YOU WANT TO GO?

"Analogs" / "Counterparts": tests that are in the same row, because they share some of the same inputs

Parametric-Nonparametric Analogs (analogs are in the same row)
Parametric Test | Nonparametric Test
Independent groups t-test | Mann-Whitney U test
Dependent groups t-test | Wilcoxon signed-ranks test
One-way ANOVA | Kruskal-Wallis test
RM-ANOVA | Friedman test

After mastering Chi-Square tests, what do you need to know about the "Other" Non-Parametric Tests (after page 182)? ["not much"]
Learn "WHY / WHEN" (not HOW)
• What is the nature of the Dependent Variable?
• In the IV: 2 groups or 3 (or more)?
• Are the groups in the IV independent or related?
• Find your place in the master table, from page 178
• What is the name of the "other" non-parametric test?
• What is the name of the variable?
• What information is on the following slides?
Just learn the information in these slides. No math is required!

Mann-Whitney U test
• Non-parametric tests that involve ordinal dependent variables use some form of ranking.
• The Mann-Whitney U test is commonly used when the dependent variable is ordinal; it is the nonparametric analog of the independent groups t-test.

Kruskal-Wallis Test
• Tests the null hypothesis that three or more population distributions are identical
  - The nonparametric analog of one-way ANOVA
• Compares the ranks of the values for the groups
• Test statistic is H, which follows a chi-square distribution

Tests for Dependent Groups
Level of Measurement (Outcome Variable)
Number of Groups (or Measurement Periods) | Nominal Measures | Ordinal Measures
Two | McNemar Test | Wilcoxon Signed-Ranks Test
Three or More | Cochran's Q Test | Friedman Test

McNemar Test
• Tests differences in proportions for the same people measured twice (or for paired groups, like mothers/daughters)
• Yields a statistic distributed as a chi-square, with df = 1

Wilcoxon Signed-Ranks Test
• Tests differences in ordinal-level measures for the same people measured twice (or for paired groups, like Sibling A/Sibling B)
  - The nonparametric analog of a paired t-test
• Another example of a rank test
• For n > 10, it follows a normal distribution, so the test statistic is z

Cochran's Q Test
• Tests differences in proportions for the same people measured three or more times (or correlated groups)
• Yields a statistic distributed as a chi-square, with df = k - 1 (k = number of measurements)
• Not many applications in the nursing literature

Friedman Test
• Tests differences in ordinal-level measures for the same people measured three or more times (or for correlated groups)
  - The nonparametric analog of an RM-ANOVA
• Another example of a rank test
• Test statistic is a chi-square with (k - 1) degrees of freedom (k = number of measurements)

Independent Project
• Let's talk about:
  - The Independent Project!!
  - Not due until the very end of the course, but let's understand what is required.
  - Data file is in the course shell
  - I will send specific guidance (you know I will!)
  - Chi-Square IS REQUIRED in the project (Q3)

Independent Project
Use the data set supplied in docsharing and, using StatCrunch, provide the following for the variable(s) of your choice (the capitalized words and trailing notes are the hints shown in purple on the slide):
1. Frequency distribution of a NOMINAL variable and bar graph of the same variable
2. Descriptives of a continuous variable: mean, median, mode, skewness, kurtosis, standard deviation. Define terms and provide a histogram.
3. Cross tabulation of two variables, with Chi-Square test and analysis
4. Comparison of the effect of three or more groups (single variable) on a single continuous variable: One-Way ANOVA
5. Scatterplot of two continuous variables
6. Correlation between the two continuous variables (from #5), with significance testing (page 199); easy in StatCrunch
Think carefully about what kind of variables to choose for the given tasks. Ask questions now; we can discuss. I will send you a lot of information to help you.

Independent Project (cont.)
REQUIRED WRITTEN PARAGRAPH: A descriptive written paragraph should accompany each of the above, including a description of the variables used and any meaning that may be attached to the results. The student must show that she or he is able to synthesize and apply the materials learned in class. Part of the class computer time is expected to be spent on this project. Submit to the appropriate Drop Box in a Word document.

Grading on this project is as follows:
• 3 points for each task 1-6: 1 point each for variable choice, appropriate display/test, and description of result
• 2 points for overall format/readability/construct (so make it neat and tidy)

Project guidelines, more guidance:
• ANOVA cannot use a "discrete quantitative" variable as the dependent variable. Example: number of pregnancies is not approximately normal and not continuous; the same problem applies to number of miscarriages.
• Write a paragraph for each question, and define your terms
• Variable selection is critical; what is appropriate / inappropriate?

Explanation & Answer

Attached.

Chi-Square in StatCrunch
Dr. Polit created Figure 8.2 (the cross-tabulation table on p. 174) using SPSS. The same table can
also be calculated with StatCrunch given the original data. However, since the original data is not
available, we can sti...

