Data Analytics

User Generated

MFZW

Mathematics

Description

Data Analytics Exercise 4: ANOVA using SAS & Excel. Submit your work here also when finished. Attached are the assignment instructions, and the Business Memo Template.

Unformatted Attachment Preview

Current Date From: Your Name To: InstructorSalutation InstructorLastName (e.g. Dr. Sherman) Subj: Name of your issue analysis (edit this title) 1. Identify the issue. In this paragraph, summarize the analysis project. In other words, identify the issue under investigation so that the reviewer knows what this analysis is about. This could be a problem or an opportunity. What is the “issue” or “opportunity” that is better understood by examining this question (you can be creative here, tie it back to a business function or need). 2. Analyze the issue. In this paragraph, briefly describe: a. where the data came from b. the steps you took to prepare the data for analysis c. how you analyzed the data d. the results of your analysis i. you should have a table with summary statistics ii. you should have some type of graphic (histogram, box plot, scatter graph, etc.) that illustrates your findings 3. Discussion. In this paragraph, in clear language, what did you find in your analysis, and what are the implications of your findings (for this exercise, you can be creative). Also, briefly discuss which tool you liked better to do the ANOVA (Excel, or SAS). Why? 4. Conclusion. In this paragraph, briefly discuss what you learned in doing this exercise. Data Analysis Exercise 4 (DAX4) ANOVA Comparison between SAS and Excel In this exercise, we will compare doing an analysis on the same dataset in Excel and SAS, and then report on our analysis. The data that we will use for this analysis is called SASHELP.CARS, and the analysis we will do is an ANOVA. I have explored the dataset and have developed a research question: Is there a statistically significant difference in the weight of sedans based on their origin? To find the answer to this question, I am going to analyze the dataset and perform an ANOVA comparing the sample means of the weight variable by continent of origin (USA, Asia, or Europe). An ANOVA is a set of statistical methods used mainly to compare the means of two or more samples. Estimates of variance are the key intermediate statistics calculated, hence the reference to variance in the title ANOVA.1 In testing the null hypothesis for a One-Way ANOVA, we generally follow these steps: 1. State the plain language research question: Is the average weight of sedans in this dataset statistically similar regardless of origin? 2. State the hypotheses: • Null hypothesis – HO: µAsia = µEurope = µUSA • Alternative hypothesis – HA: Not all means are equal. 3. State the criteria for rejecting HO: • α = 0.05 4. Test the assumptions for the One-Way ANOVA: • Assumption of Independence (Research Design) • Assumption of Normality (Shapiro-Wilk Pr α, then retain the null hypothesis • If p < α, then reject the null hypothesis, and accept the alternative hypothesis Please watch the video linked to this module for detailed instructions. In general, you are going to: 1. Download “Business Memo Template DAX4.docx” from the DAX4 Assignment. 2. Save the downloaded file as a Word file and name it “LastName DAX4” without the quotation marks. The file extension will be added automatically. For example, my file would be “Bohler DAX4.docx”. Again, ignore the quotation marks in this part of the instructions. 3. Edit the date, from, and to fields of the memo to have the correct information. 4. For the subject area, change the title to something like “Examination of Sedan Weight by Origin” or something similar. 5. In paragraph 1, Identify the issue, discuss your research question. Why do you care if there is any difference by weight of sedans by country of origin? Be creative, but logical. Try to tie your interest in this question to some business function (Finance & Accounting, Sales & Marketing, Human Resources, or Manufacturing & 1 http://www.statsdirect.com/help/analysis_of_variance/anova.htm 1 Data Analysis Exercise 4 (DAX4) ANOVA Comparison between SAS and Excel Operations). Clearly state your research question “Is there a statistically significant difference in the weight of sedans based on their origin?” Clearly state the statistical question (you can copy and paste the following): • Null hypothesis – HO: µAsia = µEurope = µUSA • Alternative hypothesis – HA: Not all means are equal. 6. In paragraph 2, Analyze the issue, discuss your analysis. This is your “methodology” or recipe for analyzing the data to answer your research question. The following steps will help you fill out paragraph 2. 7. Open SAS. 8. Find the SASHELP.CARS data 9. Explore the SASHELP.CARS data (e.g. use the Characterize Data task). Choose the Origin and Weight variables to explore. Run the task (F3). Check the LOG for Errors, Warnings, and Notes, if there are any issues, fix the problem and rerun the task. Look at the OUTPUT. What are the values for “Origin”? What is the frequency for each “Origin” value? What is the N, Minimum, Mean, Maximum, and Standard Deviation of Weight for the dataset? Are there any missing values? Add any appropriate charts and tables to your Business Memo. 10. To perform an ANOVA, we need to make a few assumptions: a. Independence: The observations are random and independent samples from the population. b. Normality: Each group sample is drawn from a normally distributed population (errors are normally distributed). c. Homogeneity of variance: The variances of the distributions in the populations are equal. 11. Test for Independence: This is a methodological concern and is determined by the set up for the study. For this assignment, please consider that the assumption of independence has been met in the research study design. 12. Test for Normality: In SAS, run the “Summary Statistics” task. On the DATA tab, select the “SASHELP.CARS” dataset. Use Weight for the “Analysis Variable” and “Origin” for the “Classification Variable”. On the OPTIONS tab, expand “Statistics”, expand “Basic Statistics” and choose all options (e.g. Mean, Standard Deviation, etc.). Expand “Additional Statistics” and choose “Standard Error”, “Variance”, “Confidence limits for the mean” (Confidence level: 95%), “Coefficient of variation”, “Skewness”, and “Kurtosis”. Expand PLOTS and choose “Histogram” and “Add normal density curve”, “Comparative box plot”, and add inset statistics (overall and group). Run the task (F3). Check the LOG for Errors, Warnings, and Notes, if there are any issues, fix the problem and rerun the task. Look at the OUTPUT. Do you think we met the assumption of Normality? There are many things to consider (sample size for example) when determining the answer to this question, but some general rules of thumb to consider include: a. Is Skewness more than ± 0.8? If yes, that is a concern. b. Is Kurtosis more than ± 3.0? If yes, that is a concern. 2 Data Analysis Exercise 4 (DAX4) ANOVA Comparison between SAS and Excel c. Do the histograms appear normal (more or less)? If they do not approximate the normal distribution curve, that could be an issue that indicates that the normality assumption for an ANOVA is invalid. d. Look at the boxplots. Are there any extreme outliers? e. If your data does not violate these rules of thumb, you can accept the ANOVA assumptions have been satisfied for this analysis. Additional test for normality (Optional for this exercise) Open the “t Test” task, and on the DATA tab, select the SASHELP.CARS dataset, One-sample test, and select “Weight” as the “Analysis variable”. Click on the Filter icon and the Type = “Sedan” filter by typing it into the text box. On the OPTIONS tab, make sure “Tails:” is “Two-tailed test” and the “Tests for normality” box is checked. Run the tasks. Scroll down the input to you find the “Tests for Normality” table and review the “p Value” for the “Shapiro-Wilk” statistic. If this value is greater than alpha (α = 0.05) then, accept the assumption that the data came from a normally distributed population. If the Shapiro-Wilk p Value is less than α, then you need to consider if your assumption of normality is valid. In our example, the Shapiro-Wilk p Value is 0.0113 which is less than 0.05, which would mean that we should reject the null hypothesis that the population that Weight came from was normally distributed. However, Shapiro-Wilk is very conservative, and ANOVA is robust against the assumption of normality given a large enough data set and the group sizes are approximately equal. For our example: Asia n=94, Europe n=78, and USA n=90. That means in our case, we can consider our population as nearly normal in distribution and proceed with our analysis. If there no other concerns from the examination of the distribution of the data, then you can consider the assumption of normality to have been met. 13. Test for homogeneity of variance: We will examine this assumption by running the ANOVA task, alternatively you can run the “Distribution Analysis” task, but it is more involved and most of the information we will need can be found in the OUTPUT from the ANOVA Diagnostic Plots. 14. Select the “One-Way ANOVA” task. 15. Perform an ANOVA on the SASHELP.CARS. On the DATA tab, select the SASHELP.CARS dataset. 16. Click on the filter icon and in the text box that opens input the following: Type = “Sedan” 17. Select data using looking at “Weight” as the Dependent variable and the “Origin” variable as providing the categories. Under the OPTIONS tab, select all plots as Individual Plots, and leave the other options to default. 18. Run the task (F3). Is the mean Weight statistically different for the different values of “Origin” in the filtered (Sedan) data set? Report your findings in the Business Memo. 19. Download the SASHELP.CARS data set from the SAS Library. Refer to DAX 3 if you need help remembering how to download data from SAS. 20. Open Excel. If you have not already done so, enable the Data Analysis tools in Excel. 21. Open the SASHELP.CARS data file in Excel. 3 Data Analysis Exercise 4 (DAX4) ANOVA Comparison between SAS and Excel 22. Prepare the data for analysis in Excel (Excel requires the data be in columns), by putting the Origin and Weight data into a separate sheet. Rename the sheet “ANOVA Data”. I suggest using the Excel Table function to help organize and sort your data. 23. Run the Excel “Descriptive Statistics” task on the data, compare your results to the SAS descriptive stats. 24. Perform an ANOVA on the SASHELP.CARS data looking at “Origin” (Asia, Europe, and USA) and Weight. Is the Weight of the three values of “Origin” statistically different in this data set? 25. In paragraph 3, Discussion, record the findings of your analysis. 26. Compare the results from both Excel and SAS. Which tool did you like best for your analysis? Why? Report your observations in the Discussion area. 27. Any table, chart, or figure you include in your report must have a caption, and that table, chart, or figure must be referenced in your discussion. Include any relevant details (P values, tables, etc.) in your discussion to adequately justify any conclusions. 28. Spell check and grammar check your report. Your report should not exceed three pages, have at least one table, and one graphic. 29. In paragraph 4, Conclusion, clearly answer your research question, “Is the average weight of sedans in this dataset statistically similar regardless of origin?” and tie it back to your interest in this analysis (your creative writing from paragraph 1). 30. Save your Word file as a PDF file as “LastName DAX4” without the quotation marks. The file extension will be added automatically. For example, my file would be “Bohler DAX4.pdf”. Again, ignore the quotation marks in the instructions. 31. Upload your completed Business Memo to this assignment. Now you know how to do an ANOVA in SAS and Excel! Try this exercise using a different dataset to see if you understand how to apply this new knowledge and skills to other real-world questions. 4
Purchase answer to see full attachment
User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.

Explanation & Answer

Attached.

1

Current Date
From: [Name Here]
To: [Name Here]
Hypothesis
Sub: Examination of Sedan Weight by Origin

Research question: Is the average weight of sedans in this dataset statistically similar regardless
of background?
It is estimated 1.4 tons as the average weight of a car, and such investigation action is
channeled to truth excavation of the average weight of sedans with similar issues apart from its
originality. Many nations manufacture different vehicles with different weight. The problem in
hand is the argument on the average sedan of each car brands offered by SASHELP Cars. These
types of cars have a curb weight of about 2,073 pounds that is about 23 pounds more than the
well-known smart vehicle that only accommodates two people (Evera, 2018). Thus such data
makes the marketability of sedans from various nations not the same.
• HO: µAsia same to µEurope and equal to µUSA
• HA: means are not the same.
2. Issue Analyze
By applying SASHELP Cars database, full scrutiny was carried out. Weight and Origin were
important explored variables. First, weight-saving technologies effects on the pressure on some
cars that were anticipated from SASHELP Cars. Others vehicles have a lower weight something
that is credited for ensuring they are much fuel efficient. The major primary opposing factors that
lead to such changes in pressure were vehicle capability improvements like higher performance

2

and higher equipment. This is not the same for all the three blocks, i.e., Asia, America, and
Europe. Material enhancements, better design as well as man...


Anonymous
Super useful! Studypool never disappoints.

Studypool
4.7
Trustpilot
4.5
Sitejabber
4.4

Related Tags