TIM 7101 NCU Dataset Analysis Interpret Statistical Output Paper

Description

Now that you have analyzed your data, you will need to interpret the output that you obtained from your data analysis. Specifically, you need to discuss what the data analysis findings mean in relation to your research questions and hypotheses, and what actions should be taken as a result.

For this assignment, you must provide a narrative that discusses the key insights from your data analysis findings and highlights the limitations of your analysis. Limitations should pertain to weaknesses in your design and limits on your ability to make conclusions. For example, if you are not able to determine cause and effect, that would be a limitation. If your dataset is small, that would be another limitation.

Length: 5-7 pages, not including title and reference pages

References: Include a minimum of 5 scholarly resources.

The completed assignment should demonstrate thoughtful consideration of the ideas and concepts presented in the course by providing new thoughts and insights relating directly to this topic. The content should reflect scholarly writing and current APA standards and should adhere to Northcentral University's Academic Integrity Policy.

Unformatted Attachment Preview

Week 5 - Assignment: Interpret Statistical Output
This assignment will be submitted to Turnitin®. Due Date: Jan 9, 2022, 11:59 PM.

What is the proper way to apply the multiple comparison test?

Korean Journal of Anesthesiology (KJA), Statistical Round. pISSN 2005-6419 • eISSN 2005-7563

Sangseok Lee (1) and Dong Kyu Lee (2)
Department of Anesthesiology and Pain Medicine, (1) Sanggye Paik Hospital, Inje University College of Medicine, (2) Guro Hospital, Korea University School of Medicine, Seoul, Korea

Abstract

Multiple comparison tests (MCTs) are performed several times on the means of experimental conditions. When the null hypothesis is rejected in a validation, MCTs are performed to determine whether certain experimental conditions have a statistically significant mean difference or whether there is a specific pattern among the group means. A problem occurs because the error rate increases while multiple hypothesis tests are performed simultaneously. Consequently, in an MCT, it is necessary to control the error rate to an appropriate level. In this paper, we discuss how to test multiple hypotheses simultaneously while limiting the type I error rate, which is caused by α inflation. To choose the appropriate test, we must maintain the balance between statistical power and type I error rate. If the test is too conservative, a type I error is not likely to occur; however, the test may have insufficient power, resulting in an increased probability of type II error. Most researchers hope to find the best way of adjusting the type I error rate to discriminate real differences in observed data without wasting too much statistical power. It is expected that this paper will help researchers understand the differences between MCTs and apply them appropriately.

Keywords: Alpha inflation; Analysis of variance; Bonferroni; Dunnett; Multiple comparison; Scheffé; Statistics; Tukey; Type I error; Type II error.

Multiple Comparison Test and Its Limitations

We are not always interested in the comparison of two groups per experiment. Sometimes (in practice, very often), we may have to determine whether differences exist among the means of three or more groups.
The most common analytical method used for such determinations is analysis of variance (ANOVA). 1) When the null hypothesis (H0) is rejected after ANOVA, that is, in the case of three groups, H0: μA = μB = μC, we do not know how one group differs from another. The result of ANOVA does not provide detailed information regarding the differences among various combinations of groups. Therefore, researchers usually perform additional analysis to clarify the differences between particular pairs of experimental groups. If the null hypothesis (H0) is rejected in the ANOVA for the three groups, the following cases are considered:

μA ≠ μB ≠ μC, or μA ≠ μB = μC, or μA = μB ≠ μC, or μA = μC ≠ μB

In which of these cases is the null hypothesis rejected? The only way to answer this question is to apply the 'multiple comparison test' (MCT), which is sometimes also called a 'post-hoc test.'

1) In this paper, we do not discuss the fundamental principles of ANOVA. For more details on ANOVA, see Kim TK. Understanding one-way ANOVA using conceptual figures. Korean J Anesthesiol 2017; 70: 22-6.

(Corresponding author: Dong Kyu Lee, M.D., Ph.D., Department of Anesthesiology and Pain Medicine, Guro Hospital, Korea University School of Medicine, 148 Gurodong-ro, Guro-gu, Seoul 08308, Korea. Tel: 82-2-2626-3237, Fax: 82-2-2626-1438, Email: entopic@naver.com, ORCID: https://orcid.org/0000-0002-4068-2363. Received: August 19, 2018; Revised: August 26, 2018; Accepted: August 27, 2018. Korean J Anesthesiol 2018 October; 71(5): 353-360. https://doi.org/10.4097/kja.d.18.00242. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, http://creativecommons.org/licenses/by-nc/4.0/. Copyright © The Korean Society of Anesthesiologists, 2018.)

There are several methods for performing an MCT, such as the Tukey method, Newman-Keuls method, Bonferroni method, Dunnett method, Scheffé's test, and so on. In this paper, we discuss the best multiple comparison method for analyzing given data, clarify how to distinguish between these methods, and describe how to adjust the P value to prevent α inflation in general multiple comparison situations. Further, we describe the increase in type I error (α inflation), which should always be considered in multiple comparisons, and the method for controlling type I error that is applied in each corresponding multiple comparison method.

Meaning of P value and α Inflation

In a statistical hypothesis test, the significance probability, asymptotic significance, or P value (probability value) denotes the probability of observing a result at least as extreme as the one obtained if H0 is true. The significance of an experiment is a random variable that is defined in the sample space of the experiment and has a value between 0 and 1. Type I error occurs when H0 is statistically rejected even though it is actually true, whereas type II error refers to a false negative: H0 is statistically accepted even though it is false (Table 1).

In the situation of comparing three groups, they may form the following three pairs: group 1 versus group 2, group 2 versus group 3, and group 1 versus group 3. A pair for this comparison is called a 'family.' The type I error that occurs when each family is compared is called the 'family-wise error' (FWE), and a multiple comparison method is a method developed to appropriately adjust the FWE. The α inflation can occur when the same (unadjusted) significance level is applied to the statistical analysis of one family and the other families simultaneously [2]. For example, if one performs a Student's t-test between two given groups A and B under a 5% α error and obtains a non-significant result, the probability of trueness of H0 (the hypothesis that groups A and B are the same) is 95%. At this point, let us consider another group, group C, which we want to compare with groups A and B.
If one performs another Student's t-test between groups B and C and its result is also non-significant, the real probability of a non-significant result for both families (A versus B, and B versus C) is 0.95 × 0.95 = 0.9025, or 90.25%, and, consequently, the testing α error is 1 − 0.9025 = 0.0975, not 0.05. At the same time, if the statistical analysis between groups A and C also has a non-significant result, the probability of non-significance of all three pairs (families) is 0.95 × 0.95 × 0.95 = 0.857, and the actual testing α error is 1 − 0.857 = 0.143, which is more than 14%.

Table 1. Types of Erroneous Conclusions in Statistical Hypothesis Testing

Statistical inference      Actual fact: H0 true     Actual fact: H0 false
H0 true (not rejected)     Correct                  Type II error (β)
H0 false (rejected)        Type I error (α)         Correct

Inflated α = 1 − (1 − α)^N, where N = number of hypotheses tested (equation 1)

The inflation of the probability of type I error increases with the number of comparisons (Fig. 1, equation 1). Table 2 shows the increase in the probability of rejecting H0 according to the number of comparisons.

[Fig. 1. Depiction of the increasing error rate of multiple comparisons. The X-axis represents the number of simultaneously tested hypotheses, and the Y-axis represents the probability of rejecting at least one true null hypothesis. The curved line follows the function 1 − (1 − α)^N, where N is the number of hypotheses tested.]

Table 2. Inflation of Significance Level according to the Number of Multiple Comparisons

Number of comparisons    1      2      3      4      5      6
Significance level*      0.05   0.098  0.143  0.185  0.226  0.265

*Significance level (α) = 1 − (1 − α)^N, where N = number of hypothesis tests (adapted from Kim TK. Korean J Anesthesiol 2017; 70: 22-6).

Unfortunately, controlling the significance level for an MCT will probably increase the number of false negative cases, which are not detected as statistically significant although they are really different (Table 1). False negatives (type II errors) can lead to an increase in cost. Therefore, if this is the case, we may not even want to attempt to control the significance level for the MCT. Clearly, such deliberate avoidance increases the possibility of false positive findings.
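The α-inflation relationship in equation 1 and the values in Table 2 are easy to verify numerically. The short sketch below (Python, not part of the original article) computes the inflated family-wise error rate for a growing number of comparisons and the corresponding Bonferroni-adjusted per-test level.

```python
# Family-wise error inflation, 1 - (1 - alpha)^N, and the Bonferroni-adjusted
# per-comparison alpha. Values for N = 1..6 should match Table 2 above.
alpha = 0.05

for n in range(1, 7):
    inflated = 1 - (1 - alpha) ** n      # probability of at least one false positive
    bonferroni = alpha / n               # per-test level that keeps FWER <= alpha
    print(f"N = {n}: inflated alpha = {inflated:.3f}, Bonferroni-adjusted alpha = {bonferroni:.4f}")
```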
Classification (or Type) of Multiple Comparison: Single-step versus Stepwise Procedures

As mentioned earlier, repeated testing with given groups results in the serious problem known as α inflation. Therefore, numerous MCT methods have been developed in statistics over the years. 2) Most researchers in the field are interested in understanding the differences between relevant groups. These groups could be all pairs in the experiment, or one control and the other groups, or more than two groups (one subgroup) versus other experimental groups (another subgroup). Irrespective of the type of pairs to be compared, all post hoc subgroup comparison methods should be applied only under a significant overall ANOVA result. 3)

Usually, MCTs are categorized into two classes, single-step and stepwise procedures; stepwise procedures are further divided into step-up and step-down methods. This classification depends on the method used to handle type I error. As indicated by its name, a single-step procedure assumes one hypothetical type I error rate; under this assumption, almost all pairwise comparisons (multiple hypotheses) are performed and tested using one critical value. In other words, every comparison is independent. A typical example is Fisher's least significant difference (LSD) test. Other examples are the Bonferroni, Sidak, Scheffé, Tukey, Tukey-Kramer, Hochberg's GT2, Gabriel, and Dunnett tests.

The stepwise procedure handles type I error according to previously selected comparison results; that is, it processes pairwise comparisons in a predetermined order, and each comparison is performed only when the previous comparison result is statistically significant. In general, this approach improves the statistical power of the process while preserving the type I error rate throughout. Among the comparison test statistics, the most significant test (for step-down procedures) or least significant test (for step-up procedures) is identified, and comparisons are performed successively as long as the previous test result is significant. If one comparison during the process fails to reject its null hypothesis, the remaining comparisons are not pursued. This approach does not apply the same level of significance as single-step methods; rather, it classifies all relevant groups into statistically similar subgroups. The stepwise methods include the Ryan-Einot-Gabriel-Welsch Q (REGWQ), Ryan-Einot-Gabriel-Welsch F (REGWF), Student-Newman-Keuls (SNK), and Duncan tests. These methods have different uses; for example, the SNK test starts by comparing the two groups with the largest difference, and the two groups with the second-largest difference are compared only if the prior comparison is significant. This is therefore called a step-down method, because the extent of the differences is reduced as the comparisons proceed. It is noted that the critical value for comparison varies for each pair; that is, it depends on the range of the mean differences between groups. The smaller the range of comparison, the smaller the critical value for that range; hence, although the power increases, the probability of type I error also increases.

All the aforementioned methods can be used only when the equal variance assumption holds. If the equal variance assumption is violated in the ANOVA process, pairwise comparisons should be based on the statistics of Tamhane's T2, Dunnett's T3, Games-Howell, or Dunnett's C tests.

Tukey method

This test uses pairwise post-hoc testing to determine whether there is a difference between the means of all possible pairs using a studentized range distribution. This method tests every possible pair of all groups. Initially, the Tukey test was called the 'honestly significant difference' test, or simply the 'T test,' 4) because this method was based on the t-distribution. It is noted that the Tukey test is based on the same sample counts between groups (balanced data), as is ANOVA.
Subsequently, Kramer modified this method to apply it to unbalanced data, and it became known as the Tukey-Kramer test. This method uses the harmonic mean of the cell sizes of the two comparisons. The statistical assumptions of ANOVA should be applied to the Tukey method as well. 5) Fig. 2 depicts example results of a one-way ANOVA and the Tukey test for multiple comparisons.

2) There are four criteria for evaluating and comparing the methods of post-hoc multiple comparison: 'conservativeness,' 'optimality,' 'convenience,' and 'robustness.' Conservativeness involves making a strict statistical inference throughout an analysis; in other words, the statistical result of a multiple comparison method has significance only with a certain controlled type I error, and a non-conservative method could produce a reckless result when there are small differences between groups. The second criterion is optimality. The optimal statistic is the one with the smallest CI among conservative statistics; in other words, its standard error is the smallest among conservative statistics. Conservativeness is more important than optimality because the latter is a characteristic evaluated among conservative methods. The third criterion, convenience, literally means being easy to calculate. Most statistical computer programs will handle this; however, extensive mathematics is required to understand a method's nature, which means that a method is less convenient to use if it is too complicated. The fourth criterion is 'insensitivity to assumption violation,' which is commonly referred to as robustness. In other words, in the case of violation of the assumption of equal variance in ANOVA, some of the methods presented below are more sensitive. Therefore, in this context, it is appropriate to use methods such as Tamhane's T2, Games-Howell, Dunnett's T3, and Dunnett's C, which are available in some statistical applications [3].

3) This is true only if conducted as the post-hoc test of ANOVA.

4) It is different from, and should not be confused with, Student's t-test.

5) Independent variables must be independent of each other (independence), dependent variables must satisfy the normal distribution (normality), and the variance of the dependent variable distribution by independent variables should be the same for each group (equivalence of variance).

One-way ANOVA (dependent variable: Value)
Source            Sum of squares   df   Mean square   F       Sig.
Between groups    85.929           2    42.964        5.694   .020
Within groups     83.000           11   7.545
Total             168.929          13

Post hoc tests: multiple comparisons (Tukey HSD), dependent variable: Value
(I) Group   (J) Group   Mean difference (I-J)   Std. error   Sig.   95% CI lower   95% CI upper
A           B           -5.70000*               1.84268      .026   -10.6768       -0.7232
A           C           -1.10000                1.84268      .825   -6.0768        3.8768
B           A           5.70000*                1.84268      .026   0.7232         10.6768
B           C           4.60000                 1.73729      .055   -0.0922        9.2922
C           A           1.10000                 1.84268      .825   -3.8768        6.0768
C           B           -4.60000                1.73729      .055   -9.2922        0.0922
*The mean difference is significant at the 0.05 level.

Homogeneous subsets (Tukey HSD), subsets for alpha = 0.05
Group   N   Subset 1   Subset 2
A       4   4.5000
C       5   5.6000     5.6000
B       5              10.2000
Sig.        .819       .065
Means for groups in homogeneous subsets are displayed. a. Uses harmonic mean sample size = 4.615. b. The group sizes are unequal; the harmonic mean of the group sizes is used. Type I error levels are not guaranteed.

Fig. 2. An example of a one-way analysis of variance (ANOVA) result with the Tukey test for multiple comparison, performed using IBM SPSS Statistics (ver. 23.0, IBM Co., USA). Groups A, B, and C are compared. The Tukey honestly significant difference (HSD) test was performed under the significant result of the ANOVA. The multiple comparison results show a statistical difference between groups A and B, but not between groups A and C or between groups B and C. However, in the last table, 'Homogeneous subsets,' there is a contradictory result: the differences between groups A and C and between groups B and C are not significant, although a significant difference exists between groups A and B. This inconsistent interpretation could have originated from insufficient evidence.
According to this figure, the Tukey test is performed with one critical level, as described earlier, and the results of all pairwise comparisons are presented in one table under the section 'post hoc tests.' The results conclude that groups A and B are different, whereas groups A and C are not different and groups B and C are also not different. These odd results continue in the last table, named 'Homogeneous subsets': groups A and C are similar and groups B and C are also similar; however, groups A and B are different. An inference of this type differs from syllogistic reasoning. In mathematics, if A = B and B = C, then A = C. However, in statistics, when A = B and B = C, A is not necessarily the same as C, because all these results are probable outcomes based on statistics. Such contradictory results can originate from inadequate statistical power, that is, a small sample size. The Tukey test is a generous (less conservative) method for detecting differences in pairwise comparisons; to avoid this illogical result, an adequate sample size should be guaranteed, which gives rise to smaller standard errors and increases the probability of rejecting the null hypothesis.
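The kind of output shown in Fig. 2 can also be produced outside SPSS. The sketch below (Python with SciPy and statsmodels, not part of the original article) runs a one-way ANOVA followed by the Tukey HSD test on three illustrative, unbalanced groups; the values are invented for illustration and are not the data behind Fig. 2.

```python
# One-way ANOVA followed by Tukey's HSD on three illustrative groups.
# The values below are invented for illustration only.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

group_a = [2.0, 4.0, 5.0, 7.0]            # n = 4
group_b = [8.0, 9.0, 10.0, 11.0, 13.0]    # n = 5
group_c = [3.0, 5.0, 6.0, 6.0, 8.0]       # n = 5

# Overall ANOVA: is at least one group mean different?
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA: F = {f_stat:.3f}, p = {p_value:.4f}")

# Tukey HSD (Tukey-Kramer for unequal group sizes): all pairwise comparisons
# with the family-wise error rate held at alpha = 0.05.
values = np.concatenate([group_a, group_b, group_c])
labels = ["A"] * len(group_a) + ["B"] * len(group_b) + ["C"] * len(group_c)
print(pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05))
```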
Bonferroni method: α splitting (Dunn's method)

The Bonferroni method can be used to compare different groups at baseline, study the relationship between variables, or examine one or more endpoints in clinical trials. It is applied as a post-hoc test in many statistical procedures such as ANOVA and its variants, including analysis of covariance (ANCOVA) and multivariate ANOVA (MANOVA); multiple t-tests; and Pearson's correlation analysis. It is also used in several nonparametric tests, including the Mann-Whitney U test, Wilcoxon signed rank test, and Kruskal-Wallis test by ranks [4], and as a test for categorical data, such as the Chi-squared test. When used as a post hoc test after ANOVA, the Bonferroni method uses thresholds based on the t-distribution; the Bonferroni method is more rigorous than the Tukey test, which tolerates type I errors, and more generous than the very conservative Scheffé's method. However, it has disadvantages as well, since it is unnecessarily conservative (with weak statistical power). The adjusted α is often smaller than required, particularly if there are many tests and/or the test statistics are positively correlated; therefore, this method often fails to detect real differences. If the proposed study requires that type II error be avoided and possible effects not be missed, we should not use the Bonferroni correction. Rather, we should use a more liberal method like Fisher's LSD, which does not control the family-wise error rate (FWER). 6) Another alternative to the Bonferroni correction, which tends to yield overly conservative results, is to use a stepwise (sequential) method, for which the Bonferroni-Holm and Hochberg methods are suitable; these are less conservative than the Bonferroni test [5].

Dunnett method

This is a particularly useful method for analyzing studies having control groups, based on modified t-test statistics (Dunnett's t-distribution). It is a powerful statistic and, therefore, can discover relatively small but significant differences among groups or combinations of groups. The Dunnett test is used by researchers interested in testing two or more experimental groups against a single control group. However, the Dunnett test has the disadvantage that it does not compare the groups other than the control group among themselves at all. As an example, suppose there are three experimental groups A, B, and C, in which an experimental drug is used, and a control group in a study. In the Dunnett test, a comparison of the control group with A, B, C, or their combinations is performed; however, no comparison is made between the experimental groups A, B, and C. Therefore, the power of the test is higher because the number of tests is reduced compared to the 'all pairwise comparison' approach. On the other hand, the Dunnett method is capable of 'two-tailed' or 'one-tailed' testing, which makes it different from other pairwise comparison methods. For example, if the effect of a new drug is not known at all, a two-tailed test should be used to confirm whether the effect of the new drug is better or worse than that of a conventional control; subsequently, a one-sided test can be used to compare the new drug with the control. Since the two-sided or one-sided test can be performed according to the situation, the Dunnett method can be used without any restrictions.
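As a sketch of the many-to-one comparison the Dunnett test performs, the snippet below uses scipy.stats.dunnett (available in SciPy 1.11 and later; not used in the original article) to compare two hypothetical treatment groups against a single control. The data are invented for illustration.

```python
# Dunnett's test: compare each experimental group against one control group.
# Requires SciPy >= 1.11; the data below are invented for illustration.
from scipy import stats

control = [10.1, 9.8, 10.4, 10.0, 9.9]
drug_a = [10.9, 11.2, 10.8, 11.5, 11.0]
drug_b = [10.2, 10.0, 10.5, 10.3, 10.1]

result = stats.dunnett(drug_a, drug_b, control=control, alternative="two-sided")
for name, stat, p in zip(["drug_a vs control", "drug_b vs control"],
                         result.statistic, result.pvalue):
    print(f"{name}: statistic = {stat:.3f}, p = {p:.4f}")
```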
Scheffé's method: exploratory post-hoc method

Scheffé's method is not a simple pairwise comparison test. Based on the F-distribution, it is a method for performing simultaneous, joint pairwise comparisons for all possible pairwise combinations of the group means [6]. It controls the FWER after considering every possible pairwise combination, whereas the Tukey test controls the FWER only when all pairwise comparisons are made. 7) This is why Scheffé's method is more conservative than other methods and has less power to detect differences. Since Scheffé's method generates hypotheses based on all possible comparisons to confirm significance, this method is preferred when a theoretical background for differences between groups is unavailable or previous studies have not been completely implemented (exploratory data analysis). The hypotheses generated in this manner should be tested by subsequent studies that are specifically designed to test the new hypotheses. This is important in exploratory data analysis or the theory-testing process (e.g., if a type I error is likely to occur in this type of study and the differences should be identified in subsequent studies). Follow-up studies testing specific subgroup contrasts discovered through the application of Scheffé's method should use Bonferroni methods, which are appropriate for theory-testing studies. It is further noted that Bonferroni methods are less sensitive to type I errors than Scheffé's method. Finally, Scheffé's method enables simple or complex mean comparisons in both balanced and unbalanced data.

6) In this paper, we do not discuss Fisher's LSD, Duncan's multiple range test, and the Student-Newman-Keuls procedure. Since these methods do not control the FWER, they do not suit the purpose of this paper.

7) Basically, a multiple pairwise comparison should be designed according to the planned contrasts. A classical deductive multiple comparison is performed using predetermined contrasts, which are decided early in the study design step. By assigning a contrast to each group, pairing can be varied from some or all pairs of two selected groups to subgroups, including several groups that are independent or partially dependent on each other.

Violation of the assumption of equivalence of variance

One-way ANOVA is performed only in cases where the assumption of equivalence of variance holds. However, it is a robust statistic that can be used even when there is a deviation from the equivalence assumption. In such cases, the Games-Howell, Tamhane's T2, Dunnett's T3, and Dunnett's C tests can be applied.

The Games-Howell method is an improved version of the Tukey-Kramer method and is applicable in cases where the equivalence of variance assumption is violated. It is a t-test using Welch's degrees of freedom. This method uses a strategy for controlling the type I error for the entire set of comparisons and is known to maintain the preset significance level even when the sample sizes differ. However, the smaller the number of samples in each group, the more tolerant the type I error control becomes; thus, this method can be applied when the number of samples is six or more.

Tamhane's T2 method gives a test statistic using the t-distribution by applying the concept of 'multiplicative inequality' introduced by Sidak. Sidak's multiplicative inequality theorem implies that the probability of occurrence of the intersection of the events is greater than or equal to the product of the probabilities of the individual events. Compared to the Games-Howell method, Sidak's theorem provides a more rigorous multiple comparison method by adjusting the significance level; in other words, it is more conservative in its type I error control. Contrarily, Dunnett's T3 method does not use the t-distribution but uses a quasi-normalized maximum-magnitude distribution (studentized maximum modulus distribution), which always provides a narrower CI than T2. The degrees of freedom are calculated using the Welch method, as in Games-Howell or T2. Dunnett's T3 test is understood to be more appropriate than the Games-Howell test when the number of samples in each group is less than 50. It is noted that Dunnett's C test uses the studentized range distribution, which generates a slightly narrower CI than the Games-Howell test for a sample size of 50 or more in the experimental group; however, the power of Dunnett's C test is better than that of the Games-Howell test.
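When the equal-variance assumption fails, the pairwise procedures named above are available in common statistics packages. The sketch below (Python, assuming the third-party pingouin package is installed; it is not referenced in the original article) runs the Games-Howell procedure on a long-format data frame with invented values.

```python
# Games-Howell pairwise comparisons (no equal-variance assumption).
# Assumes the `pingouin` package is installed; the data are invented.
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 6 + ["C"] * 6,
    "value": [4.1, 5.0, 4.6, 5.2, 4.8, 4.4,
              9.5, 10.2, 11.1, 9.8, 10.6, 10.0,
              5.5, 6.1, 5.8, 6.4, 5.2, 6.0],
})

# Each row of the result is one pairwise comparison with Welch-type df.
print(pg.pairwise_gameshowell(data=df, dv="value", between="group"))
```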
Methods for Adjusting the P value

Many research designs involve numerous sources of multiple comparison, such as multiple outcomes, multiple predictors, subgroup analyses, multiple definitions for exposures and outcomes, multiple time points for outcomes (repeated measures), and multiple looks at the data during sequential interim monitoring. Therefore, multiple comparisons performed in situations such as these are accompanied by an increased type I error problem, and it is necessary to adjust the P value accordingly. Various methods are used to adjust the P value; however, there is no universally accepted single method for controlling multiple test problems. Therefore, we introduce two representative approaches for multiple test adjustment: the FWER and the false discovery rate (FDR).

Controlling the family-wise error rate: Bonferroni adjustment

The classic approach for solving a multiple comparison problem involves controlling the FWER. A threshold value of α lower than the conventionally used 0.05 can be set such that, if H0 is true for all tests, the probability of obtaining a significant result from this new, lower critical value is 0.05. In other words, if all the null hypotheses are true, the probability that the family of tests includes one or more false positives due to chance is 0.05. Usually, these methods are used when it is important not to make any type I errors at all. The methods belonging to this category are the Bonferroni, Holm, Hochberg, and Hommel adjustments, and so on.

The Bonferroni method is one of the most commonly used methods to control the FWER. With an increase in the number of hypotheses tested, type I error increases; therefore, the significance level is divided by the number of hypothesis tests, and in this manner the type I error rate can be lowered. In other words, the higher the number of hypotheses to be tested, the more stringent the criterion, the lower the probability of type I errors, and the lower the power. For example, when performing 50 t-tests, one would set the level for each t-test to 0.05 / 50 = 0.001; therefore, one should consider a test significant only for P < 0.001, not P < 0.05 (equation 2).

Adjusted α = α / k, where k = number of hypotheses tested (equation 2)

The advantage of this method is that the calculation is straightforward and intuitive. However, it is too conservative: when the number of comparisons increases, the level of significance becomes very small and the power decreases [7]. The Bonferroni correction is strongly recommended for testing a single universal null hypothesis (H0) that all tests are not significant. This is true for the following situations as well: to avoid type I error, or when performing many tests without a preplanned hypothesis for the purpose of obtaining significant results [8].
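Most statistics packages implement these FWER adjustments directly. The sketch below (Python with statsmodels, not part of the original article) applies the Bonferroni and Holm corrections to a small set of illustrative P values.

```python
# Bonferroni and Holm adjustments of a set of raw P values (FWER control).
# The P values below are invented for illustration.
from statsmodels.stats.multitest import multipletests

raw_p = [0.001, 0.008, 0.020, 0.041, 0.240]

for method in ("bonferroni", "holm"):
    reject, adjusted, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    print(method, [f"{p:.3f}" for p in adjusted], list(reject))
```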
In other words, it is a method to filter the hypotheses that have errors in the test from the hypotheses that are judged important, rather than testing all the hypotheses like FWER. The Benjamini-Hochberg adjustment is very popular due to its simplicity. Rearrange all the P values in order from the smallest to largest value. The smallest P value has a rank of i = 1, the next smallest has i = 2, and so on. p(1) ≤ p(2) ≤ p(3) ≤…≤ p(i) ≤ p(N) Compare each individual P value to its Benjamini-Hochberg critical value (equation 3). Benjamini-Hochberg critical value = (i / m)∙Q (equation 3) (i, rank; m, total number of tests; Q, chosen FDR) The largest P value for which P < (i / m)∙Q is significant, and all the P values smaller than the largest value are also significant, even the ones that are not less than their Benjamini-Hochberg critical value. When you perform this correcting procedure with an FDR ≧ 0.05, it is possible for individual tests to be significant, even though their P ≧ 0.05. Finally, only the hypothesis smaller than the individual P value among the listed rejected regions adjusted by FDR will be rejected. One should be careful while choosing FDR. If we decide to proceed with more experiments on interesting individual results and if the additional cost of the experiments is low and the cost of false positives (missing potentially important findings) is high, then we should use a high FDR, such as 0.10 or 0.20, to ensure that important things are not missed. Moreover, it is noted that both Bonferroni correction and Benjamini-Hochberg procedure assume the individual tests to be independent. Conclusions and Implications The purpose of the multiple comparison methods mentioned in this paper is to control the ‘overall significance level’ of the set of inferences performed as a post-test after ANOVA or as a pairwise comparison performed in various assays. The overall significance level is the probability that all the tested null hypotheses are conditional, at least one is denied, or one or more CIs do not contain a true value. In general, the common statistical errors found in medical research papers arise from problems with multiple comparisons [11]. This is because researchers attempt to test multiple hypotheses concurrently in a single experiment, the authors of this Range test Test statistics Pairwise multiple comparison test Range test Pairwise multiple comparison test With control group Stepwise Procedures Single-step F-distribution Sample distribution t-distribution t-distribution Range distribution based on error rate Conservativeness Reckless Dunnett Online access in http://ekja.org Strict Newman-Keuls Tukey HSD Bonferroni (Dunn) Scheffe' Fig. 3. Comparative chart of multiple comparison tests (MCTs). Five repre­ sentative methods are listed along the X-axis, and the parameters to be compared among these methods are listed along the Y-axis. Some methods use the range test and pairwise MCT concomitantly. The Dunnett and New­ man­- Keuls methods are comparable with respect to conservativeness. The Dunnett method uses one significance level, and the Newman-Keuls method compares pairs using the stepwise procedure based on the changes in range test statistics during the procedure. According to the range between the groups, the significance level is changed in the Newman-Keuls method. HSD: honestly significant difference. 359 VOL. 71, NO. 5, October 2018 Applying the multiple comparison test paper have already pointed out this issue. 
Conclusions and Implications

The purpose of the multiple comparison methods mentioned in this paper is to control the 'overall significance level' of a set of inferences performed as a post-test after ANOVA or as pairwise comparisons performed in various assays. The overall significance level is the probability that, when all the tested null hypotheses are true, at least one is rejected or one or more CIs do not contain the true value. In general, the common statistical errors found in medical research papers arise from problems with multiple comparisons [11]. This is because researchers attempt to test multiple hypotheses concurrently in a single experiment, an issue the authors of this paper have already pointed out.

[Fig. 3. Comparative chart of multiple comparison tests (MCTs). Five representative methods are listed along the X-axis, and the parameters compared among these methods (single-step versus stepwise procedure, range test versus pairwise comparison, use of a control group, sampling distribution, and conservativeness) are listed along the Y-axis. On the conservativeness axis, from reckless to strict, the methods are ordered Dunnett, Newman-Keuls, Tukey HSD, Bonferroni (Dunn), and Scheffé. Some methods use the range test and pairwise MCT concomitantly. The Dunnett and Newman-Keuls methods are comparable with respect to conservativeness. The Dunnett method uses one significance level, and the Newman-Keuls method compares pairs using the stepwise procedure based on the changes in range test statistics during the procedure; according to the range between the groups, the significance level changes in the Newman-Keuls method. HSD: honestly significant difference.]

Since biomedical papers have emphasized the importance of multiple comparisons, a growing number of journals have started including a process of separately ascertaining whether multiple comparisons are used appropriately during the submission and review process. According to the results of a study on the appropriateness of multiple comparisons in articles published in three medical journals over 10 years, 33% (47/142) of papers did not use multiple comparison correction, in 61% (86/142) of papers a correction was applied without rationale, and only 6.3% (9/142) of the examined papers used suitable correction methods [8]. The Bonferroni method was used in 35.9% of papers. Most (71%) of the papers provided little or no discussion, whereas only 29% showed some rationale for and/or discussion of the method [8]. The implications of these results are very significant. Some authors decide not to use adjusted P values, or compare the results of corrected and uncorrected P values, which results in a potentially complicated interpretation of the results. This decision reduces the reliability of the results of published studies.

In a study, many situations may occur that affect the choice of MCT. For example, groups might have different sample sizes; several multiple comparison tests were specifically developed to handle such non-identical groups. Power can also be a problem in a study, and some tests have more power than others. Whereas all pairwise comparisons are important in some studies, only predetermined combinations of experimental groups or comparators should be tested in others. When a special situation affects a particular pairwise analysis, the selection of the multiple comparison test should be guided by the ability of the specific statistic to address the questions of interest and the types of data to be analyzed. Therefore, it is important that researchers select the tests that best suit their data, the type of information they need on group comparisons, and the power required for the analysis (Fig. 3).

In general, most pairwise MCTs are based on balanced data. Therefore, when there are large differences in the numbers of samples, care should be taken when selecting multiple comparison procedures. The LSD, Sidak, Bonferroni, and Dunnett tests, which use the t-statistic, do not pose any problems, since there is no assumption that the number of samples in each group is the same. The Tukey test, which uses the studentized range distribution, can be problematic since it presumes that all sample sizes are the same under the null hypothesis; therefore, the Tukey-Kramer test, which uses the harmonic mean of the sample numbers, can be used when the sample numbers differ. Finally, we must check whether the equality of variance assumption is satisfied. The multiple comparison methods mentioned previously all assume equal variances; Tamhane's T2, Dunnett's T3, Games-Howell, and Dunnett's C are multiple comparison tests that do not assume equal variances. Although the Korean Journal of Anesthesiology has not formally examined this view, it is expected that the journal's view on this subject is not significantly different from the view expressed in this paper [8]. Therefore, it is important that all authors are aware of the problems posed by multiple comparisons, and further research is required to spread awareness regarding these problems and their solutions.
ORCID

Sangseok Lee, https://orcid.org/0000-0001-7023-3668
Dong Kyu Lee, https://orcid.org/0000-0002-4068-2363

References

1. Lee DK. Alternatives to P value: confidence interval and effect size. Korean J Anesthesiol 2016; 69: 555-62.
2. Kim TK. Understanding one-way ANOVA using conceptual figures. Korean J Anesthesiol 2017; 70: 22-6.
3. Stoline MR. The status of multiple comparisons: simultaneous estimation of all pairwise comparisons in one-way ANOVA designs. Am Stat 1981; 35: 134-41.
4. Dunn OJ. Multiple comparisons among means. J Am Stat Assoc 1961; 56: 52-64.
5. Chen SY, Feng Z, Yi X. A general introduction to adjustment for multiple comparisons. J Thorac Dis 2017; 9: 1725-9.
6. Scheffé H. A method for judging all contrasts in the analysis of variance. Biometrika 1953; 40: 87-110.
7. Dunnett CW. A multiple comparison procedure for comparing several treatments with a control. J Am Stat Assoc 1955; 50: 1096-121.
8. Armstrong RA. When to use the Bonferroni correction. Ophthalmic Physiol Opt 2014; 34: 502-8.
9. Streiner DL, Norman GR. Correction for multiple testing: is there a resolution? Chest 2011; 140: 16-8.
10. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Method) 1995; 57: 289-300.
11. Lee S. Avoiding negative reviewer comments: common statistical errors in anesthesia journals. Korean J Anesthesiol 2016; 69: 219-26.

Dataset Analysis

Uchechukwu Ohiri
TIM-7101 V1: Statistics with Technology Applications
North Central University (NCU)
Dr. Nicholas Harkiolakis
January 2, 2021

Description of the Problem

The video game dataset provided comprises columns labeled Date, Visits, VisitTime, TotalTime, Game, and Advertising. The main variables used in this study are Visits, Game, and Advertising. The independent variables are the type of player (police officer or thief) and the advertising period (advertising period or no advertising period), while the dependent variable is the number of video game visits. The first objective is to determine whether the number of video game visits differs by type of player, and the second objective is to determine whether the number of video game visits differs by advertising period. Since the data is normally distributed, independent t-tests will be used as the inferential model, comparing the two levels of each independent variable.

Hypotheses to be Tested

Given that there are two objectives, the null and alternative hypotheses to be tested are:

H0:
• There is no statistically significant difference in the number of video game visits for the type of player.
• There is no statistically significant difference in the number of video game visits for the advertising period.

H1:
• There is a statistically significant difference in the number of video game visits for the type of player.
• There is a statistically significant difference in the number of video game visits for the advertising period.
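The paper states that independent-samples t-tests were planned for comparing the two levels of each independent variable. A minimal sketch of that plan is shown below (Python with pandas and SciPy, not part of the original assignment); the file name is hypothetical and the column names are assumed from the dataset description above.

```python
# Independent-samples t-tests for Visits by Game and by Advertising.
# The file name is hypothetical; columns follow the dataset description above.
import pandas as pd
from scipy import stats

df = pd.read_csv("video_game_dataset.csv")   # assumed file name

for factor in ("Game", "Advertising"):
    level_a, level_b = df[factor].dropna().unique()[:2]
    visits_a = df.loc[df[factor] == level_a, "Visits"]
    visits_b = df.loc[df[factor] == level_b, "Visits"]
    # Welch's version is used here, which does not assume equal group variances.
    t_stat, p_value = stats.ttest_ind(visits_a, visits_b, equal_var=False)
    print(f"Visits by {factor} ({level_a} vs {level_b}): t = {t_stat:.3f}, p = {p_value:.4f}")
```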
Data Characteristics

The descriptive statistics output report below is for the number of video game visits. The data properties included in the report are central tendency, variability measures, outlier detection, and other distribution attributes. The mean (average) is a measure of location for all the observations, obtained by dividing the sum of all the observations by the number of observations. The number of observations (Count) is 44. The mean for the data is 1.45, while the standard error is 0.40; the standard error indicates the precision of the sample mean relative to the population mean. Therefore, a small standard error like the one obtained for this data shows that the sample mean provides a more precise estimate of the population value. The median and mode values for the number of video game visits are both zero. The median is a measure of location that splits the frequency distribution of the ordered data values into two equal parts, while the mode is the value that occurs most frequently (Trajkovski, 2016). The standard deviation, which is a measure of spread (scatter) between the individual data values and the sample mean, is 2.67. The sample variance of 7.14 shows how widely the individual data values (observations) vary from the sample mean. The kurtosis for the data is 2.60, while the skewness is 1.93; these two measures compare the distribution's shape to a normal, symmetric distribution. Since the two values do not deviate significantly from zero, it is an indication that the data follows a normal distribution. The minimum and maximum values for the data set are 0 and 10, respectively, giving a data range of 10. The range is based on the two most extreme values within the data, and it increases with the sample size. The sum of all values for video game visits is 64.

Table 1
Dataset of Visits (descriptive statistics)

Statistic             Visits
Mean                  1.454545455
Standard Error        0.402758125
Median                0
Mode                  0
Standard Deviation    2.671595164
Sample Variance       7.137420719
Kurtosis              2.601907486
Skewness              1.93103248
Range                 10
Minimum               0
Maximum               10
Sum                   64
Count                 44

Table 2 below is a pivot table showing that the total number of video game visits for Police is 31, while the total number of video game visits for Thief is 33. Table 3 shows that the total number of video game visits during the advertising period is 59, while the total number of visits when there is no advertising is 5.

Table 2
Visits vs Game (Sum of Visits)

Visits        Police   Thief   Grand Total
0             0        0       0
1             2        6       8
3             3        3       6
5             5        -       5
6             6        6       12
7             7        -       7
8             8        8       16
10            -        10      10
Grand Total   31       33      64

Table 3
Visits vs Advertising (Sum of Visits)

Visits        No    Yes   Grand Total
0             0     0     0
1             2     6     8
3             3     3     6
5             -     5     5
6             -     12    12
7             -     7     7
8             -     16    16
10            -     10    10
Grand Total   5     59    64
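The descriptive statistics in Table 1 and the pivot tables in Tables 2 and 3 were produced in a spreadsheet, but the same summaries are straightforward to reproduce programmatically. The sketch below (Python with pandas, not part of the original assignment) assumes the same hypothetical file and column names as the earlier t-test sketch.

```python
# Descriptive statistics and pivot tables for the Visits variable.
# The file name is hypothetical; columns follow the dataset description above.
import pandas as pd

df = pd.read_csv("video_game_dataset.csv")   # assumed file name
visits = df["Visits"]

summary = {
    "Mean": visits.mean(),
    "Standard Error": visits.sem(),
    "Median": visits.median(),
    "Mode": visits.mode().iloc[0],
    "Standard Deviation": visits.std(ddof=1),
    "Sample Variance": visits.var(ddof=1),
    "Kurtosis": visits.kurt(),          # excess kurtosis, as in Excel's output
    "Skewness": visits.skew(),
    "Range": visits.max() - visits.min(),
    "Sum": visits.sum(),
    "Count": visits.count(),
}
print(pd.Series(summary))

# Pivot tables of summed visits, analogous to Tables 2 and 3.
df = df.assign(VisitLevel=df["Visits"])      # duplicate column used as the row index
for factor in ("Game", "Advertising"):
    print(pd.pivot_table(df, values="Visits", index="VisitLevel",
                         columns=factor, aggfunc="sum", margins=True))
```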
Statistical Assumptions & Findings

T-tests assess the means of one or two data sets. The standard form tests the null hypothesis (that the two sample means are equal) against the alternative hypothesis (that the two sample means are not equal). The assumption is that the significance level is 0.05; if the p-value is less than the significance level, the null hypothesis is rejected, because this indicates that the difference between the two means is statistically significant, and vice versa. It is essential to note that the p-value is the probability of observing a result at least as extreme as the one obtained if the null hypothesis is true. Given that the data contains both numerical and non-numerical values, a Chi-square test is appropriate for the categorical data (Lee & Lee, 2018). The Chi-square test is a non-parametric test used on discrete data to assess the probability that non-random factors account for the association observed between non-numeric variables (Turhan, 2020). One of the applications of the Chi-square test is the test of independence, like the one conducted in this scenario, whose goal is to establish an association between variables. The main assumptions linked to the Chi-square test include (Rana & Singhal, 2015; McHugh, 2013):

• The data used is randomly obtained from a population.
• The data used is in frequencies or counts, as opposed to percentages or other data transformations.
• The cell values are adequate when the expected counts are 5 or more, and there are no cells containing zero values.
• The sample size is large enough to avoid a type II error: this type of error arises when the null hypothesis is accepted when it is actually false. A sample size of 20-50 is considered the adequate minimum, and our analyses are being done using a sample size of 64.
• The variables under consideration are mutually exclusive: they are counted only once in a given category.

To analyze the data, two contingency tables were created, one for the type of player (Police/Thief) and the second for the advertising period (Yes/No). The Chi-square statistic is based on the relationship between the observed and the expected counts. The sum totals for the Chi-square statistic calculations are 23.9609 (type of player) and 22.34576 (advertising period). In order to determine whether we should reject our null hypotheses based on the Chi-square statistic, the degrees of freedom (df) are obtained; the df is in turn used to obtain the p-value and the Chi-square critical value. The df for both data sets is 6 because it is obtained by multiplying the number of rows minus one by the number of columns minus one. If the Chi-square statistic is higher than the critical value, then we reject the null hypothesis. Additionally, if the p-value is less than the 0.05 significance level, we also reject the null hypothesis. The findings show that the Chi-square critical value for both the type of player and the advertising period is 12.59159, because both variables have the same df and significance level. Lastly, the p-values obtained from the calculations are 0.000530978 for the type of player and 0.00104799 for the advertising period.

Results of the Inferential Analyses

The results of the inferential analyses show that the p-values are lower than the 0.05 significance level. This result was observed after analyzing both the data for the type of player and for the advertising period. For the first null hypothesis, the aim was to test whether there is a difference in the number of video game visits for the type of player (Police/Thief); the p-value from the Chi-square test is 0.000531. For the second null hypothesis, the aim was to test whether there is a difference in the number of video game visits for the advertising period (advertising period/no advertising); the Chi-square test for this second analysis yielded a p-value of 0.001048.
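A sketch of the chi-square test of independence described above is shown below (Python with SciPy, not part of the original assignment). The contingency table here is illustrative: the paper does not show the exact observed-count table it used, so the cell values are placeholders arranged as visit categories by player type. With a 7-by-2 table the degrees of freedom equal 6, which matches the critical value of 12.59159 reported above; the chi-square statistic and p-value from the placeholder counts will not match the paper's figures.

```python
# Chi-square test of independence on an illustrative contingency table
# (rows: visit categories, columns: Police vs. Thief). The counts are
# placeholders; the paper does not show the exact table it used.
import numpy as np
from scipy import stats

observed = np.array([
    [5, 4],   # counts in one visits category for Police / Thief
    [3, 6],
    [4, 2],
    [2, 5],
    [6, 3],
    [1, 3],
    [2, 2],
])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.4f}, df = {dof}, p = {p_value:.6f}")

# Critical value at the 0.05 significance level for this df.
critical = stats.chi2.ppf(0.95, dof)
print(f"critical value (alpha = 0.05) = {critical:.5f}")
```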
References

Lee, S., & Lee, D. K. (2018). What is the proper way to apply the multiple comparison test? Korean Journal of Anesthesiology, 71(5), 353–360. https://doi.org/10.4097/kja.d.18.00242

McHugh, M. L. (2013). The Chi-square test of independence. Biochemia Medica, 23(2), 143–149. https://doi.org/10.11613/bm.2013.018

Rana, R., & Singhal, R. (2015). Chi-square test and its application in hypothesis testing. ResearchGate. https://www.researchgate.net/publication/277935900_Chisquare_test_and_its_application_in_hypothesis_testing

Trajkovski, V. (2016). How to select appropriate statistical test in scientific articles. ProQuest; Skopje. https://www.proquest.com/openview/9cf20c8ed2794218965e44ac4196385f/1?pqorigsite=gscholar&cbl=52199

Nihan, S. T. (2020). Karl Pearson's chi-square tests. Educational Research and Reviews, 15(9), 575–580. https://doi.org/10.5897/err2019.3817

How to Select Appropriate Statistical Test in Scientific Articles (Editorial)

Vladimir Trajkovski
Journal of Special Education and Rehabilitation, Institute of Special Education and Rehabilitation, Faculty of Philosophy, Skopje, Republic of Macedonia
Received: 08.07.2016. Accepted: 20.07.2016.
Corresponding address: Vladimir Trajkovski, "Ss Cyril and Methodius" University, Faculty of Philosophy, Institute of Special Education and Rehabilitation, Bull. Goce Delchev 9A, 1000 Skopje, Republic of Macedonia. E-mail: vladotra@fzf.ukim.edu.mk

Abstract

Statistics is the mathematical science dealing with the collection, analysis, interpretation, and presentation of masses of numerical data in order to draw relevant conclusions. Statistics is a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies. Students and young researchers in biomedical sciences and in special education and rehabilitation often declare that they have chosen to enroll in those study programs because they lack knowledge of or interest in mathematics. This is a sad statement, but there is much truth in it. The aim of this editorial is to help young researchers select the statistics or statistical techniques and statistical software appropriate for the purposes and conditions of a particular analysis. The most important statistical tests are reviewed in the article.
Journal of Special Education and Rehabilitation 2016; 17(3–4): 5–28. DOI: 10.19057/jser.2016.7

Knowing how to choose the right statistical test is an important asset and decision in research data processing and in the writing of scientific papers. Young researchers and authors should know how to choose and how to use statistical methods. The competent researcher needs knowledge of statistical procedures. That might include an introductory statistics course, and it most certainly includes using a good statistics textbook. For this purpose, there is a need to return Statistics as a mandatory subject in the curriculum of the Institute of Special Education and Rehabilitation at the Faculty of Philosophy in Skopje. Young researchers need additional courses in statistics. They need to train themselves to use statistical software in an appropriate way.

Keywords: statistical test selection, statistics, scientific article, statistical software

Introduction

Statistics is the mathematical science dealing with the collection, analysis, interpretation, and presentation of masses of numerical data in order to draw relevant conclusions. Statistics is a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies.
Statistics is used in several different disciplines (both scientific and non-scientific) to make decisions and draw conclusions based on data (1). Students and young researchers in biomedical sciences and in special education and rehabilitation often declare that they have chosen to enroll in those study programs because they lack knowledge of or interest in mathematics. This is a sad statement, but there is much truth in it. They often do not know how to carry out the statistical processing of the data for their undergraduate, master's, and doctoral theses, and they seek help from a statistician, for which they have to pay a certain amount of money. Very often the statistical methods selected in these theses are wrong, which then leads to erroneous conclusions.

Selecting the right statistical test may represent a huge problem for younger researchers. In research, meaningful conclusions can only be drawn based on data collected from a valid scientific design using appropriate statistical tests. Regarding the selection of a statistical test, the most important question is "what is the main study hypothesis?"
In some cases there is no hypothesis; the investigator just wants to "see what is there". For example, in a prevalence study there is no hypothesis to test, and the size of the study is determined by how accurately the investigator wants to determine the prevalence. If there is no hypothesis, there is no statistical test. It is important to decide a priori which hypotheses are confirmatory (that is, they test some presupposed relationship) and which are exploratory (they are suggested by the data) (2). In research studies, wrong statistical tests appear in many forms, such as using a paired test on unpaired data, using a parametric test on data that do not follow the normal distribution, or using a test that is incompatible with the type of data (3). The availability of many types of statistical software makes performing statistical tests easy, but selecting the appropriate test is still a problem. A systematic, step-by-step approach is the best way to decide how to analyze data. It is recommended to follow these steps (4):

• Specify the question you are asking.
• Put the question in the form of a statistical null hypothesis and an alternate hypothesis.
• Determine which variables are relevant to the question.
• Determine what kind of variable each one is.
• Design a study that controls or randomizes the confounding variables.
• Based on the number of variables, the kinds of variables, the expected fit to the parametric assumptions, and the hypothesis to be tested, choose the best statistical test to use.
• If possible, do a power analysis to determine a good sample size for the study.
• Do the study.
• Examine the data to see whether they meet the assumptions of the statistical test you chose. If they do not, choose a more appropriate test.
• Apply the statistical test you chose, and interpret the results.
• Show your results effectively, usually with a table or a figure.

Marusteri and Bacarea mention other things we should keep in mind when analyzing the data from a study:

• a decent understanding of basic statistical terms and concepts;
• some knowledge of a few aspects of the data collected during the research/experiment (e.g., what types of data we have: nominal, ordinal, interval, or ratio; how the data are organized; how many study groups we have (usually at least an experimental and a control group); whether the groups are paired or unpaired; and whether the sample(s) are drawn from a normally distributed/Gaussian population);
• a good understanding of the goal of our statistical analysis;
• parsing the entire statistical protocol in a well-structured, decision-tree/algorithmic manner, in order to avoid mistakes (5).

The aim of this editorial is to help young researchers select the statistical techniques and statistical software appropriate for the purposes and conditions of a particular analysis. Some of these steps are explained in the text that follows.

Types of scales

Before we can conduct a statistical analysis, we need to measure our dependent variable. Exactly how the measurement is carried out depends on the type of variable involved in the analysis. Different types are measured differently. Although procedures for measurement differ in many ways, they can be classified using a few fundamental categories. Within a given category, all of the procedures share some properties that are important to know about. There are four types of scales.
Nominal scales

When measuring on a nominal scale, one simply names or categorizes responses. Gender, marital status, handedness, favorite color, and religion are examples of variables measured on a nominal scale. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people according to their favorite color, there is no sense in which red is placed "ahead of" yellow. Responses are merely categorized. Nominal scales embody the lowest level of measurement (6).

Ordinal scales

A researcher wishing to measure parents' satisfaction with the treatment of their child in a regular classroom might ask them to describe their feelings as "very dissatisfied," "somewhat dissatisfied," "somewhat satisfied," or "very satisfied." The items in this scale are ordered, ranging from least to most satisfied; this is what distinguishes ordinal from nominal scales. Unlike nominal scales, ordinal scales allow comparisons of the degree to which two subjects possess the dependent variable. For example, our satisfaction ordering makes it meaningful to assert that one person is more satisfied than another with their microwave oven; such an assertion reflects the first person's use of a verbal label that comes later in the list than the label chosen by the second person. On the other hand, ordinal scales fail to capture important information that is present in the other scales we examine. In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels. In a satisfaction scale, for example, the difference between the responses "very dissatisfied" and "somewhat dissatisfied" is probably not equivalent to the difference between "somewhat dissatisfied" and "somewhat satisfied." Nothing in our measurement procedure allows us to determine whether the two differences reflect the same difference in psychological satisfaction (6).
Interval scales

Interval scales are numerical scales, such as age (years), weight (kg), or bone length (cm), in which intervals have the same interpretation throughout. Interval data have a meaningful order, and equal intervals between measurements represent equal changes in the quantity being measured. However, these data have no natural zero. An example is the Celsius temperature scale: because it has no natural zero, we cannot say that 50 °C is double 25 °C. On an interval scale, the zero point can be chosen arbitrarily. An IQ score is also interval data, since it has no natural (absolute) zero (7).

Ratio scales

The ratio scale of measurement is the most informative scale. It is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured. You can think of a ratio scale as the three earlier scales rolled into one. Like a nominal scale, it provides a name or category for each object (the numbers serve as labels). Like an ordinal scale, the objects are ordered (in terms of the ordering of the numbers). Like an interval scale, the same difference at two places on the scale has the same meaning; in addition, the same ratio at two places on the scale also carries the same meaning. An example of a ratio scale is the amount of money you have in your pocket right now (500 denars, 1000 denars, etc.). Money is measured on a ratio scale because, in addition to having the properties of an interval scale, it has a true zero point: if you have zero money, this implies the absence of money. Since money has a true zero point, it makes sense to say that someone with 1000 denars has twice as much money as someone with 500 denars (or that Mark Zuckerberg has a million times more money than you do) (6).
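To make the four scale types concrete, here is a minimal, hypothetical sketch in Python with pandas (the column names and values are invented for illustration; the editorial itself does not prescribe any particular software). Nominal and ordinal variables map naturally onto unordered and ordered categoricals, while interval and ratio variables are plain numeric columns.

    import pandas as pd

    # Illustrative only: the column names and values are not from the editorial.
    df = pd.DataFrame({
        # Nominal: named categories with no ordering
        "favorite_color": pd.Categorical(["red", "yellow", "red"], ordered=False),
        # Ordinal: ordered categories, but distances between levels are undefined
        "satisfaction": pd.Categorical(
            ["somewhat satisfied", "very dissatisfied", "very satisfied"],
            categories=["very dissatisfied", "somewhat dissatisfied",
                        "somewhat satisfied", "very satisfied"],
            ordered=True),
        # Interval: equal intervals but no true zero (e.g., temperature in Celsius)
        "temperature_c": [25.0, 50.0, 10.0],
        # Ratio: true zero, so ratios are meaningful (e.g., money)
        "money_denars": [500, 1000, 0],
    })

    print(df.dtypes)
    print(df["satisfaction"].min())                        # ordered categories support min/max
    print(df["money_denars"][1] / df["money_denars"][0])   # 2.0, a meaningful ratio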
Normal distribution or not

This is another issue in selecting the right statistical test. If you know the type of data (nominal, ordinal, interval, or ratio) and the distribution of the data (normal or not), selecting the statistical test is very easy. There is no need to check the distribution in the case of ordinal and nominal data; the distribution should only be checked for interval and ratio data. If your data follow the normal distribution, a parametric statistical test should be used; non-parametric tests should be used only when the normal distribution is not followed. There are various methods for checking normality, such as plotting a histogram, plotting a box-and-whisker plot, plotting a Q-Q plot, measuring skewness and kurtosis, and using a formal statistical test of normality (Kolmogorov-Smirnov test, Shapiro-Wilk test, etc.). Formal tests such as the Kolmogorov-Smirnov and Shapiro-Wilk tests are used frequently to check the distribution of data. All of these tests are based on the null hypothesis that the data are taken from a population that follows the normal distribution. The P value is examined to assess the alpha error: if the P value is less than 0.05, the data do not follow the normal distribution, and a non-parametric test should be used for such data. If the sample size is small, the chances of a non-normal distribution are increased (7).
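As a hedged illustration of this check, the sketch below runs the Shapiro-Wilk and Kolmogorov-Smirnov tests in Python with scipy. The data are randomly generated placeholders; with real data you would pass your own interval or ratio variable. A P value below 0.05 would argue against normality and for a non-parametric test, as described above.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=50, scale=10, size=40)   # placeholder data

    # Shapiro-Wilk: H0 = the sample comes from a normally distributed population
    w_stat, w_p = stats.shapiro(sample)

    # Kolmogorov-Smirnov against a normal distribution with estimated parameters
    # (estimating the parameters from the same sample makes this version conservative)
    ks_stat, ks_p = stats.kstest(sample, "norm", args=(sample.mean(), sample.std(ddof=1)))

    print(f"Shapiro-Wilk:       W = {w_stat:.3f}, p = {w_p:.3f}")
    print(f"Kolmogorov-Smirnov: D = {ks_stat:.3f}, p = {ks_p:.3f}")

    if w_p < 0.05:
        print("Evidence against normality: prefer a non-parametric test.")
    else:
        print("No evidence against normality: a parametric test is defensible.")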
Parametric and non-parametric procedures

Parametric statistical procedures rely on assumptions about the shape of the distribution in the underlying population (they assume a normal distribution) and about the form or parameters (means and standard deviations) of the assumed distribution. Non-parametric statistical procedures rely on no or few assumptions about the shape or parameters of the population distribution from which the sample was drawn (8). Non-parametric methods are typically less powerful and less flexible than their parametric counterparts, so parametric methods are preferred when their assumptions can be justified. Sometimes a transformation, such as a log transformation, can be applied to the data to satisfy the assumptions (9). Table 1 summarizes the use of parametric and non-parametric statistical methods.

Table 1. Parametric vs. non-parametric methods

                                   Parametric                            Non-parametric
Assumed distribution               Normal                                Any
Assumed variance                   Homogeneous                           Any
Typical data                       Ratio or interval                     Ordinal or nominal
Data set relationships             Independent                           Any
Usual central measure              Mean                                  Median
Benefits                           Can draw more conclusions             Simplicity; less affected by outliers
Tests:
Correlation test                   Pearson                               Spearman
Independent measures, 2 groups     Independent-measures t-test           Mann-Whitney test
Independent measures, >2 groups    One-way independent-measures ANOVA    Kruskal-Wallis test
Repeated measures, 2 conditions    Matched-pair t-test                   Wilcoxon test
Repeated measures, >2 conditions   One-way repeated measures ANOVA       Friedman's test
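The test rows of Table 1 can be read as a small decision rule. The sketch below (an illustration, not part of the editorial) encodes that rule in Python; it assumes the caller has already decided whether the design is paired, how many groups there are, and whether the parametric assumptions hold (for example, via the normality checks above).

    def choose_test(groups: int, paired: bool, parametric: bool) -> str:
        """Return the test suggested by Table 1 for a given comparison of means."""
        if groups == 2 and not paired:
            return "Independent-measures t-test" if parametric else "Mann-Whitney test"
        if groups > 2 and not paired:
            return "One-way independent-measures ANOVA" if parametric else "Kruskal-Wallis test"
        if groups == 2 and paired:
            return "Matched-pair t-test" if parametric else "Wilcoxon test"
        if groups > 2 and paired:
            return "One-way repeated measures ANOVA" if parametric else "Friedman's test"
        raise ValueError("Design not covered by Table 1")

    # For correlation, the table suggests Pearson (parametric) or Spearman (non-parametric).
    print(choose_test(groups=2, paired=False, parametric=True))    # Independent-measures t-test
    print(choose_test(groups=3, paired=True, parametric=False))    # Friedman's test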
Arithmetic mean (or average): a measure of location for a batch of data values; the sum of all data values divided by the number of elements in the distribution. Its accompanying measure of spread is usually the standard deviation. Unlike the median and the mode, the mean is not an appropriate measure for characterizing a skewed distribution.

Median: another measure of location, like the mean; the value that divides the frequency distribution in half when all data values are listed in order. It is insensitive to small numbers of extreme scores in a distribution. Therefore, it is the preferred measure of central tendency for a skewed distribution (in which the mean would be biased) and is usually paired with the interquartile range (dQ) as the accompanying measure of spread.

Interquartile range (dQ): a measure of spread and the counterpart of the standard deviation for skewed distributions. dQ is the distance between the upper and lower quartiles (QU - QL).

Variance: a numerical value used to indicate how widely individuals in a group vary. If individual observations vary greatly from the group mean, the variance is big, and vice versa. It is important to distinguish between the variance of a population and the variance of a sample; they have different notation and are computed differently. The variance of a population is denoted by σ², and the variance of a sample by s².
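A short worked example (the numbers are invented, shown only to illustrate the definitions above): for a right-skewed batch of values, the median and interquartile range summarize the data robustly, and the population and sample variances differ only in the divisor (n versus n - 1).

    import numpy as np

    data = np.array([1, 2, 2, 3, 4, 5, 40])        # illustrative, right-skewed values

    median = np.median(data)                       # robust centre for a skewed batch
    q_l, q_u = np.percentile(data, [25, 75])       # lower and upper quartiles
    dq = q_u - q_l                                 # interquartile range, dQ = QU - QL

    pop_var = np.var(data)                         # population variance (divides by n)
    samp_var = np.var(data, ddof=1)                # sample variance s^2 (divides by n - 1)

    print(median, dq, round(pop_var, 2), round(samp_var, 2))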
Standard deviation (SD): a measure of spread (scatter) of a set of data. Unlike the variance, which is expressed in squared units of measurement, the SD is expressed in the same units as the original data. It is calculated from the deviations between each data value and the sample mean, and it is the square root of the variance. For different purposes, n (the total number of values) or n - 1 may be used in computing the variance/SD. If you have an SD calculated by dividing by n and want to convert it to the SD corresponding to a denominator of n - 1, multiply the result by the square root of n/(n - 1). If a distribution's SD is greater than its mean, the mean is inadequate as a representative measure of central tendency. For normally distributed data, approximately 68% of the distribution lies within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD (the empirical rule).

Standard error (SE), or as it ...
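The n versus n - 1 distinction described in the standard deviation entry, and the conversion factor it mentions, can be checked numerically with a small sketch (the numbers are arbitrary):

    import numpy as np

    data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])   # arbitrary values
    n = len(data)

    sd_n = data.std(ddof=0)                  # SD computed with denominator n
    sd_n1 = data.std(ddof=1)                 # SD computed with denominator n - 1

    # Conversion described in the text: multiply the n-based SD by sqrt(n / (n - 1)).
    converted = sd_n * np.sqrt(n / (n - 1))

    print(round(sd_n, 4), round(sd_n1, 4), round(converted, 4))  # converted equals sd_n1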

Explanation & Answer

View attached explanation and answer. Let me know if you have any questions.


Dataset Analysis

Uchechukwu Ohiri
TIM-7101 V1: Statistics with Technology Applications
Northcentral University (NCU)
Dr. Nicholas Harkiolakis
January 2, 2021

Dataset Analysis
Description of the Problem
The video game dataset provided comprises columns labeled Date, Visits, VisitTime, TotalTime, Game, and Advertising. The three variables used for this study are Visits, Game, and Advertising. The independent variables are the type of player (police officer or thief) and the advertising period (advertising or no advertising), while the dependent variable is the number of video game visits. The first objective is to determine whether the mean number of video game visits differs significantly between the two types of player, and the second objective is to determine whether the mean number of video game visits differs significantly between the advertising and non-advertising periods. Since the data are normally distributed, independent t-tests will be used as the inferential model, comparing the two levels of each independent variable.
Hypotheses to be Tested
Given the two objectives, the null and alternative hypotheses to be tested are:

H0:
• There is no statistically significant difference in mean video game visits between the two types of player.
• There is no statistically significant difference in mean video game visits between the advertising and non-advertising periods.

H1:
• There is a statistically significant difference in mean video game visits between the two types of player.
• There is a statistically significant difference in mean video game visits between the advertising and non-advertising periods.
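A minimal sketch of how these hypotheses could be tested in Python with scipy is shown below. The file name and the exact labels used for Game ("Police"/"Thief") and Advertising ("Yes"/"No") are assumptions based on the dataset description and the pivot tables later in this paper, not a confirmed file layout; the original analysis may equally have been run in a spreadsheet.

    import pandas as pd
    from scipy import stats

    # Hypothetical file name; columns assumed: Date, Visits, VisitTime, TotalTime, Game, Advertising
    df = pd.read_excel("video_game_dataset.xlsx")

    # Hypothesis set 1: mean Visits by type of player
    police = df.loc[df["Game"] == "Police", "Visits"]
    thief = df.loc[df["Game"] == "Thief", "Visits"]
    t1, p1 = stats.ttest_ind(police, thief, equal_var=True)

    # Hypothesis set 2: mean Visits by advertising period
    ads = df.loc[df["Advertising"] == "Yes", "Visits"]
    no_ads = df.loc[df["Advertising"] == "No", "Visits"]
    t2, p2 = stats.ttest_ind(ads, no_ads, equal_var=True)

    print(f"Player type: t = {t1:.3f}, p = {p1:.3f}")
    print(f"Advertising: t = {t2:.3f}, p = {p2:.3f}")
    # Reject the corresponding null hypothesis when p < 0.05.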
Data Characteristics
Table 1 below presents the descriptive statistics output for the number of video game visits. The data properties included in the report are measures of central tendency, measures of variability, outlier detection, and other distribution attributes. The mean (average) is a measure of location for all the observations, obtained by dividing the sum of all the observations by the number of observations. The number of observations (Count) is 44. The mean for the data is 1.45, while the standard error is 0.40; the standard error indicates how precisely the sample mean estimates the population mean, so a small standard error such as the one obtained here suggests that the sample mean provides a relatively precise estimate of the population value. The median and mode for the number of video game visits are both zero. The median is a measure of location that splits the frequency distribution into two equal parts when the data values are ordered, while the mode is the value that occurs most frequently (Trajkovski, 2016). The standard deviation, which is a measure of spread (scatter) of the individual data values around the sample mean, is 2.67. The sample variance of 7.14 shows how widely the individual data values (observations) vary from the sample mean. The kurtosis for the data is 2.60, while the skewness is 1.93; these two measures compare the shape of the distribution to a normal, symmetric distribution. Since the two values do not deviate greatly from zero, this is taken as an indication that the data approximately follow a normal distribution. The minimum and maximum values for the data set are 0 and 10, respectively, giving a range of 10. The range is based on the two most extreme values in the data and tends to increase with the sample size. The sum of all values for video game visits is 64.
Table 1
Dataset of Visits

                       Visits
Mean                   1.454545455
Standard Error         0.402758125
Median                 0
Mode                   0
Standard Deviation     2.671595164
Sample Variance        7.137420719
Kurtosis               2.601907486
Skewness               1.93103248
Range                  10
Minimum                0
Maximum                10
Sum                    64
Count                  44
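For readers who want to reproduce a summary like Table 1, the sketch below computes the same descriptive statistics in Python with pandas and scipy. It assumes the same hypothetical file and column names as the t-test sketch above; the bias-corrected skewness and kurtosis options are used so that the values are comparable to typical spreadsheet output.

    import pandas as pd
    from scipy import stats

    visits = pd.read_excel("video_game_dataset.xlsx")["Visits"]   # hypothetical file name

    summary = pd.Series({
        "Mean": visits.mean(),
        "Standard Error": visits.sem(),                   # SD / sqrt(n)
        "Median": visits.median(),
        "Mode": visits.mode().iloc[0],
        "Standard Deviation": visits.std(ddof=1),
        "Sample Variance": visits.var(ddof=1),
        "Kurtosis": stats.kurtosis(visits, bias=False),   # sample-adjusted excess kurtosis
        "Skewness": stats.skew(visits, bias=False),
        "Range": visits.max() - visits.min(),
        "Minimum": visits.min(),
        "Maximum": visits.max(),
        "Sum": visits.sum(),
        "Count": visits.count(),
    })
    print(summary.round(3))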

Table 2 below is a pivot table of the total number of video game visits by type of player: the total for Police is 31, while the total for Thief is 33. Table 3 shows that the total number of video game visits during the advertising period is 59, while the total when there is no advertising is 5.
Table 2
Visits vs Game (Sum of Visits)

Row Labels     Police    Thief    Grand Total
0              0         0        0
1              2         6        8
3              3         3        6
5              5                  5
6              6         6        12
7              7                  7
8              8         8        16
10                       10       10
Grand Total    31        33       64

Table 3
Visits vs Advertising (Sum of Visits)

Row Labels     No    Yes    Grand Total
0              0     0      0
1              2     6      8
3              3     3      6
5                    5      5
6                    12     12
7                    7      7
8                    16     16
10                   10     10
Grand Total    5     59     64
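Pivot tables like Tables 2 and 3 can be reproduced with pandas, as in the hedged sketch below (same hypothetical file and column labels as before; the helper column is introduced only because Visits serves as both the row label and the summed value).

    import pandas as pd

    df = pd.read_excel("video_game_dataset.xlsx")      # hypothetical file name
    helper = df.assign(VisitSum=df["Visits"])          # value column to be summed

    # Table 2: sum of Visits by visit value and type of player
    table2 = helper.pivot_table(values="VisitSum", index="Visits", columns="Game",
                                aggfunc="sum", margins=True, margins_name="Grand Total")

    # Table 3: sum of Visits by visit value and advertising period
    table3 = helper.pivot_table(values="VisitSum", index="Visits", columns="Advertising",
                                aggfunc="sum", margins=True, margins_name="Grand Total")

    print(table2)
    print(table3)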

Statistical Assumptions & Findings
T-tests assess the means of one or two data sets. The st...
