statistics

Week 2 Project - STAT 3001

Student Name: <Type your name here>

Date: <Enter the date on which you began working on this assignment.>

Instructions: To complete this project, you will need the following materials:

  • STATDISK User Manual (found in the classroom in DocSharing)
  • Access to the Internet to download the STATDISK program.

This assignment is worth a total of 60 points.

Part I. Histograms and Frequency Tables

Instructions

Answers

  • Open the file Freshmen 15 using menu option Datasets and then Elementary Stats, 13th Edition. This file contains data on sex, weight (in kg), and BMI in September and April of college freshmen. What are the names of the variables in this file?
  • Create a histogram for the weight in September (column 2) using the Auto-fit option. Once your histogram displays, click Turn on Labels to get the height of the bars. Paste the chart here.
  • Using the information in the above histogram, complete this table. Be sure to estimate the range of weights for each bar as well as the frequency, relative frequency, and cumulative frequency.

Weight (kg)    Frequency    Relative Frequency    Cumulative Frequency
35-44.9
45-54.9
55-64.9
65-74.9
75-84.9
85-94.9
95-104.9

a. Using the frequency table above, how many of the freshmen had a weight of 64.9 kg or less? How do you know?

b. Using the frequency table above, how many of the freshmen weighed between 55 and 74.9 kg? Show your work.

c. What percent of the freshmen have a weight of 75 kg or more?
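The arithmetic behind the frequency table and questions a to c can be sketched in Python. This is only an illustration: the `weights` list is made-up data, not the Freshmen 15 values, and the helper name `frequency_table` is hypothetical. The class limits follow the table above.

```python
from itertools import accumulate

# Hypothetical September weights in kg (NOT the Freshmen 15 data).
weights = [42.0, 47.5, 51.2, 53.9, 55.0, 58.3, 61.7, 64.9,
           66.4, 70.1, 72.8, 78.5, 83.0, 91.6, 97.2]

# Lower class limits matching the table: 35-44.9, 45-54.9, ..., 95-104.9.
lower_limits = list(range(35, 105, 10))
class_width = 10


def frequency_table(data, lowers, width):
    """Return rows of (label, frequency, relative frequency, cumulative frequency)."""
    freqs = [sum(1 for x in data if lo <= x < lo + width) for lo in lowers]
    total = sum(freqs)
    cumulative = list(accumulate(freqs))
    return [
        (f"{lo}-{lo + width - 0.1:.1f}", f, f / total, c)
        for lo, f, c in zip(lowers, freqs, cumulative)
    ]


table = frequency_table(weights, lower_limits, class_width)
for label, freq, rel, cum in table:
    print(f"{label:<10}{freq:>4}{rel:>8.3f}{cum:>4}")

# (a) 64.9 kg or less: cumulative frequency through the 55-64.9 class.
at_most_64_9 = table[2][3]
# (b) between 55 and 74.9 kg: add the two class frequencies.
between_55_and_74_9 = table[2][1] + table[3][1]
# (c) 75 kg or more: share of the last three classes, as a percent.
pct_75_or_more = 100 * sum(row[1] for row in table[4:]) / len(weights)
```

The cumulative frequency answers "how many at or below this class" directly, which is why question a reads one cell while question b adds two frequencies.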

Part II. Comparing Datasets

Instructions

Answers

  • Create a boxplot that compares the weight in September and April (Columns 2 and 3). Paste it here.
  • Describe the similarities and differences in the data sets. Please be specific to the graph created. You might compare the percentiles, range, and median.

Part III. Finding Descriptive Numbers

Instructions

Answers

  • Open the file named Word Counts (using Datasets and then Elementary Stats, 13th Edition). This gives information on the word count of different age groups for male and female psychology students. List all the variables in the dataset.
  • Find the mean, median, and midrange for the data in Column 3.
  • Find the range, variance, and standard deviation for the data in Column 3.
  • List any values for Column 3 that you think may be outliers. Why do you think that? [Hint: You may want to sort the data and look at the smallest and largest values.]
  • Find the mean, median, and midrange for the data in Column 4.
  • Find the range, variance, and standard deviation for the data in Column 4.
  • List any values for Column 4 that you think may be outliers. Why do you think that?
  • Find the five-number summary for the number of words data in Columns 3 and 4. You will need to label each of the columns with an appropriate measure in the top row for clarity. (Refer to page 167 in the text.)
  • Compare the number of words for females and males using a boxplot of Columns 3 and 4. Paste your boxplot here.
  • Create a histogram for the Column 3 data.
  • Create a histogram for the Column 4 data.
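As a cross-check on the descriptive numbers requested in Part III, the following is a minimal Python sketch using only the standard library. The `words` list is invented for illustration and is not the Word Counts dataset. The 1.5 × IQR outlier rule and the default quartile method of `statistics.quantiles` are assumptions, so Statdisk or the textbook may report slightly different quartiles.

```python
import statistics

# Hypothetical daily word counts (NOT the Word Counts dataset).
words = [5200, 8100, 9700, 11300, 12800, 14100, 15900, 17400, 21000, 39500]


def describe(data):
    """Measures of center, measures of variation, five-number summary, outliers."""
    ordered = sorted(data)
    # Quartiles via the default 'exclusive' method; other conventions exist.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    low, high = ordered[0], ordered[-1]
    iqr = q3 - q1
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "midrange": (low + high) / 2,
        "range": high - low,
        "variance": statistics.variance(ordered),  # sample variance (n - 1)
        "stdev": statistics.stdev(ordered),        # sample standard deviation
        "five_number": (low, q1, q2, q3, high),
        # Assumed rule: flag values beyond 1.5 * IQR from the quartiles.
        "outliers": [x for x in ordered
                     if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr],
    }


summary = describe(words)
for name, value in summary.items():
    print(f"{name:>12}: {value}")
```

Sorting first mirrors the hint above: the smallest and largest values sit at the ends of the list, which is where candidate outliers show up.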

Part IV.Interpreting Statistical Information

The Word Counts by Males and Females dataset contains the number of words spoken in a day by males and females in six different age groups.

Group A: Column 3 contains the number of words spoken in a day by male students aged 17 to 25 in introductory psychology classes in Mexico.

Group B: Column 4 contains the number of words spoken in a day by female students aged 17 to 25 in introductory psychology classes in Mexico.

Using the descriptive statistics found above, what conclusions can you make comparing the number of words spoken in Group A with Group B? Address EACH of the points below. Please be sure to use SPECIFIC values to support your reasoning (hint: you may want to consider the descriptive statistics found in Part III as well as the histograms). You must justify your conclusions with Statdisk output from Part III of this project for each portion below.

  • One conclusion about a measure of center (mean, median, midrange).
  • One conclusion about the variability in the two datasets (variance, standard deviation, range)
  • One conclusion about the shape of the distribution (mention direction of skew and relationship of the mean and median).
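The relationship between mean and median mentioned in the last point can be sketched as a rough rule of thumb in Python. The function name `skew_direction`, the `tolerance` cutoff, and both sample lists are hypothetical. The rule is only a heuristic and should be confirmed against the histogram.

```python
import statistics


def skew_direction(data, tolerance=0.1):
    """Label skew from the gap between mean and median.

    `tolerance` is a hypothetical cutoff measured in standard deviations;
    it is not a textbook or Statdisk value.
    """
    mean = statistics.mean(data)
    median = statistics.median(data)
    gap = (mean - median) / statistics.stdev(data)
    if gap > tolerance:
        return "right-skewed"   # mean pulled above the median by large values
    if gap < -tolerance:
        return "left-skewed"    # mean pulled below the median by small values
    return "roughly symmetric"


# Made-up samples, not Groups A and B from the dataset.
sample_high_tail = [4, 5, 6, 7, 8, 9, 30]
sample_low_tail = [1, 20, 22, 23, 24, 25, 26]

print(skew_direction(sample_high_tail))
print(skew_direction(sample_low_tail))
```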

Submit the final draft of your Word file by going to Week 2, Project, and following the directions under Week 2 Assignment 2. Please use the naming convention "WK2Assgn2+first initial+last name" as the Submission Title.

Unformatted Attachment Preview

Statdisk User Manual 13.0.0 for STAT 3001

Table of Contents

  • Open a File
  • Edit Column Titles
  • Copy a Dataset
  • Paste a Dataset
  • Sort a Dataset
  • Sample Transformations
  • Descriptive Statistics
  • Creating a Histogram
  • Creating Boxplots
  • Normal Distribution
  • Confidence Intervals
  • Hypothesis Testing
  • Correlation and Regression
  • Multiple Regression
  • Chi-Square Goodness-of-Fit
  • Chi-Square Test of Independence
  • One-Way Analysis of Variance

When you open the Statdisk program you will see the screen shown in Figure 1. Be certain that you are using Version 13.0.0. Click on the OK button to close the Statdisk information screen.

Figure 1

You can perform all Statdisk functions from the Sample Editor screen. The top of the screen has the following menus: File, Edit, Analysis, Data, Datasets, Window, and Help, as shown in Figure 2.
Figure 2

Along with performing statistical calculations, Statdisk is also compatible with many popular software application packages. You can import, copy, paste, save, print, and transform data sets. You can also copy, paste, save, or print any of the Statdisk numerical or graphical outputs and export them into other programs such as Microsoft Word. Those options are available as clickable buttons at the top of the Sample Editor screen, as shown in Figure 3.

Figure 3

Opening a File

Statdisk has numerous datasets stored in the program; they can be accessed by clicking on Datasets at the top of the Sample Editor window. After opening Datasets, go to Elementary Statistics 13th Edition. The names of the datasets will appear to the right. Click on Body Data and the data values will appear in the Sample Editor, as shown in Figure 4.

Figure 4

You can preview the datasets quickly by opening a dataset, reviewing the data, and then selecting Clear to move on to the next file. You can also access datasets that Statdisk has available online by going to Help and then Triola Statistics Series.

Using Data Tools

After you have opened a dataset or typed data into the Sample Editor, you can edit column titles, sort data, delete columns, add columns or rows, or explore the data set by opening the Data Tools menu. The Data Tools button is located at the top of the Sample Editor page. To edit column titles, open Data Tools and then select Edit column titles. Type the names of the column titles into the box shown in Figure 5.

Figure 5

Click on the Save Changes button to enter the new column titles.

Copy and Paste

The Copy button is at the top of the Sample Editor screen. To copy columns from a data set, click on the Copy button and a screen will appear asking which column of data you want to copy (see Figure 6). You can copy all of the columns or select columns. To paste the column of data values into another column, click on the column title (or number), then open the Edit menu and select Paste.

Figure 6

Sort Data

To sort data, select Data Tools and then select Sort Data. Use the drop-down arrow to select Sort One column, then select the column title and order from A to Z (see Figure 7). Then click on Sort. The data values in that column will be sorted from lowest value to highest value.

Figure 7

The Data Menu

The two main menus in Statdisk are Analysis and Data. The Data menu is used to sort data, add data, transform data, generate descriptive statistics including charts and graphs, assess normality, and generate sets of data values that emulate one of the standard types of statistical distributions. The Analysis menu is used to find the area under the curve for many of the standard statistical distributions, determine sample size, create confidence intervals, and perform hypothesis tests for parametric and nonparametric models.

Using the Data Menu

To transform a dataset you first need to type data into the Sample Editor or select an existing dataset. Open the Body Data file that was referenced in Opening a File earlier in the manual. Select Data and then Sample Transformations to open the Sample Transformer window (see Figure 8). The Source column is the column containing the dataset that you want to transform. Select the operation that will be used to change the data values (add, subtract, multiply, divide, mod, or raise to a power) and type in the constant to use with it. After you click on Basic Transform, the new data set will appear in the Sample Transformer window.

Figure 8

Descriptive Statistics

The descriptive statistics of a data set can be found by opening the Data menu and selecting Descriptive Statistics. Select the column that the data set is in and then click on Evaluate. A list of the most commonly used numerical descriptive statistics will be shown (see Figure 9).
Figure 9

Histogram

A visual display of a single set of data values can be shown by opening the Data menu and then selecting Histogram. Select the column that the data values are in. If you would like Statdisk to select the class width and the class start automatically, select Auto-fit. You can display the count or the frequency for each class by selecting Bar Labels. Click on Plot to display the graph (see Figure 10).

Figure 10

You can also Print, Copy, or Save the histogram and later paste the display into a Word file.

Boxplots

If you would like to compare two or more sets of data values, you can plot them on one graph by using boxplots. Open the Data menu and select Boxplot. Then select the columns containing the data values that you would like to compare. You can then select Boxplot to show a standard view of the boxplots or Modified Boxplot, which will emphasize outliers (see Figure 11).

Figure 11

Using the Analysis Menu

Statdisk can perform many basic statistical functions relating to probability distributions, confidence intervals, hypothesis testing, correlation and regression, chi-square and other nonparametric tests, and sample-size determination. This manual explains how to perform many of those basic statistical functions.

Normal Distribution

You do not need to have a set of data values in the Sample Editor to use the probability distribution functions. Open the Analysis menu and select Probability Distributions. The first four functions, Normal Distribution, Student-t Distribution, Chi-Square Distribution, and F Distribution, perform the same type of tasks. Select Normal Distribution. The screen shown in Figure 12 will appear.

Figure 12

You can enter a z value into the box to the right of "z Value:", or you can enter the area to the left of some z value under the standard normal distribution in the box to the right of "Cumulative area from the left:".
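The two-way lookup described here (z value to cumulative area, and cumulative area back to z value) can be reproduced outside Statdisk. The sketch below uses Python's standard-library `statistics.NormalDist`. It is an independent cross-check, not Statdisk's own implementation, and the choice of z = -1 is only an example input.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

z = -1.0

# Cumulative area to the left of z.
area_left = standard_normal.cdf(z)

# Height of the density curve at z (a density, not a probability).
density = standard_normal.pdf(z)

# Going the other way: recover the z value from a cumulative area.
z_back = standard_normal.inv_cdf(area_left)

print(f"area to the left of z = {z}: {area_left:.6f}")
print(f"density at z = {z}: {density:.7f}")
print(f"z for that area: {z_back:.4f}")
```

Note that the density value is the height of the curve at a single point. Probabilities for a continuous distribution come from areas, which is what `cdf` returns.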
Figure 13 shows the standard normal distribution with z values along the bottom axis and the area under the curve between the given z values. Statdisk will find the given values and any other values that are not shown on the table.

Figure 13

Open the Analysis menu and then select Probability Distributions and then Normal Distribution. Enter -1 into the box for z Value and then click on Evaluate. Figure 14 shows the Statdisk output.

Figure 14

The output gives the height of the standard normal density curve at z = -1, which is .2419707. It also gives the cumulative area to the left of -1, which is .158655. If you add the areas to the left of -1 shown in Figure 13 you will get the same amount. If you enter any value between 0 and 1 representing the area to the left of a z score and then press Evaluate, you will get the associated z value.

Confidence Intervals

To find a confidence interval from sample statistics you do not need to type in any data values or have a dataset in the Sample Editor. For example, to find a confidence interval for a one-sample mean, open the Analysis menu, then select Confidence Intervals and then Mean-One Sample. Figure 15 shows the Statdisk output screen for a 95% confidence interval with a sample mean of 26.7, a sample standard deviation of 4.1, and a sample size of 40. The confidence interval of 25.39 to 28.01 is given. The margin of error is the distance from the sample mean to either limit of the confidence interval.

Figure 15

If you are given a set of data values and not given any of the sample statistics such as the mean and standard deviation, you must first use Descriptive Statistics to find the values needed to enter into the Confidence Interval: Mean-One Sample window that is shown in Figure 15.

Hypothesis Testing

The hypothesis testing procedures in Statdisk are very similar to the confidence interval procedures.
To perform a hypothesis test about a one-sample mean, open the Analysis menu and then select Hypothesis Testing, and then Mean-One Sample. Figure 16 shows the Statdisk output for the claim that the population mean is equal to the claimed mean (an equality claim serves as the null hypothesis). The claimed mean is 25 and the sample mean is 23.7, with a sample standard deviation of 4.5 and a sample size of 32. The hypothesis is tested at the .05 level of significance. After you select Evaluate, you get the information shown in Figure 16. The results for the provided inputs appear on the right of the screen.

Figure 16

As with confidence intervals, if you are given a set of data values and not given any of the sample statistics such as the mean and standard deviation, you must first use Descriptive Statistics to find the values needed to enter into the Hypothesis Testing: One Mean window that is shown in Figure 16. Figure 17 shows a normal probability plot representing the visual interpretation of the hypothesis test.

Figure 17

Correlation and Regression

To compute a correlation or create a regression equation you first need to type data into the Sample Editor or select an existing dataset. Open Datasets and select Elementary Stats 13th Edition. Open the IQ and Brain Size dataset. Select Analysis and then Correlation and Regression. Select column 4 for the x-variable and column 5 for the y-variable and then click on Evaluate (see Figure 18). The information for both the correlation and the regression is shown in the output window on the right.

Figure 18

If you click on Plot, a scatterplot of the correlation data and the line of best fit from the regression will be displayed (see Figure 19).

Figure 19

Multiple Regression

To generate a multiple regression equation you first need to type data into the Sample Editor or select an existing dataset. Open Datasets and select Elementary Stats 13th Edition. Open the IQ and Brain Size dataset.
Select Analysis and then Multiple Regression. Select columns 4, 5, and 8 to be included in the regression analysis. Select 4 for the Dependent variable column. Click on Evaluate to generate the multiple regression statistics (see Figure 20). Your regression equation with rounded coefficients would be y = 29.4 - .019x1 + 1.65x2. The adjusted R² value indicates how well the regression equation fits the data.

Figure 20

Chi-Square Goodness-of-Fit: Equal Expected Frequencies

To generate a Goodness-of-Fit test you must first type data into the Sample Editor or select an existing dataset. Imagine that a company wants to know if auto accidents occur equally throughout the days of the week. Use the Clear button at the top-left of the Sample Editor screen to erase any existing data. The number of accidents that occur each day of the week are as follows:

M     T     W     TR    F
45    36    17    29    52

Type the data into column 1, then use the Edit Column Titles option under the Data Tools button at the bottom of the Sample Editor screen to name List 1: Accidents (see Figure 21).

Figure 21

Select Analysis and then Goodness-of-Fit. Choose Equal Expected Frequencies since the company is testing to see if accidents occur equally. Set the significance level to 0.05 and select 1 as the column to be the Observed Frequencies. Click on Evaluate to generate the Goodness-of-Fit test. The results are shown in the output window to the right (see Figure 22).

Figure 22

Press Plot to view a visual representation of the chi-square distribution of the data. The graph shows the critical value, χ² = 9.488, and the test statistic, χ² = 20.860 (see Figure 22).

Goodness-of-Fit: Unequal Expected Frequencies

An ice cream company wishes to discover the popularity of the ice cream flavors it offers.
The expected frequencies are given:

Vanilla    Chocolate    Strawberry    Other
42%        33%          14%           11%

The University of Florida surveyed a sample of n = 250 students about their preferred ice cream flavor. The observed data collected is shown in the table below.

Vanilla    Chocolate    Strawberry    Other
114        68           47            21

In order to generate the goodness-of-fit test, the data must be entered into the Sample Editor. Use the Clear button at the bottom of the Sample Editor screen to erase any existing data. Enter the observed values into List 1 and enter the expected frequencies into List 2. Click on Analysis and then Goodness-of-Fit. Choose the Unequal Expected Frequencies option since the company is not testing to see if the flavors are equally popular. Because the expected frequencies were given as proportions, choose the As Proportions option under Enter Expected Frequencies (as decimals). Set the Observed Column option as 1 and the Expected Column option as 2. We will set the significance level to 0.05. Click Evaluate (see Figure 25).

Figure 25

Click Plot to view a visual representation of the chi-square distribution. The critical value, χ², is shown as 7.815 and the test statistic, χ², is shown as 8.971 (see Figure 26).

Figure 26

Chi-Square Test of Independence (Contingency Tables)

To generate a contingency table test you must first type data into the Sample Editor or select an existing data set. A company seeks to discover which color of car males prefer and which color of car females prefer. Use the Clear button at the bottom of the Sample Editor screen to erase any existing data. The data collected is as follows:

          Red    Blue    Green    White
Male      21     17      44       8
Female    28     24      14       18

Enter the data into the Sample Editor exactly as it is shown in the table (see Figure 27).

Figure 27

Select Analysis and then Contingency Tables. Then choose columns 1, 2, 3, and 4 to include in the analysis. We will set the significance level to 0.05.
Click Evaluate to view the results shown in the output window to the right (see Figure 28).

Figure 28

Click Plot to display a visual of the chi-square distribution. The critical value, χ², is shown to be 7.815 and the test statistic, χ², is shown to be 21.377 (see Figure 29).

Figure 29

One-Way Analysis of Variance (ANOVA)

To use the Analysis of Variance (ANOVA) function in Statdisk you first need to type data into the Sample Editor or select an existing dataset. Go to Elementary Statistics 13th Edition and select the Garbage Weights dataset. Go to the Analysis menu and then select One-Way Analysis of Variance. Select columns 2, 3, and 4 and click on Evaluate. The hypothesis testing results are shown in the box on the right (see Figure 30).

Figure 30
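The chi-square test statistics quoted in the manual's goodness-of-fit and contingency-table examples can be recomputed by hand from the tabulated counts. The following is a plain-Python sketch of the standard formula, the sum of (observed - expected)² / expected. The function names `chi_square_gof` and `chi_square_independence` are hypothetical, and this is not Statdisk's code. Critical values are not computed here because the standard library has no chi-square distribution.

```python
def chi_square_gof(observed, expected_proportions=None):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E.

    With no proportions given, equal expected frequencies are assumed.
    """
    total = sum(observed)
    if expected_proportions is None:
        expected = [total / len(observed)] * len(observed)
    else:
        expected = [total * p for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Test-of-independence statistic for a table given as a list of rows.

    Each expected count is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, r in zip(table, row_totals):
        for obs, c in zip(row, col_totals):
            expected = r * c / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Counts taken from the manual's three worked examples.
accidents = chi_square_gof([45, 36, 17, 29, 52])
flavors = chi_square_gof([114, 68, 47, 21], [0.42, 0.33, 0.14, 0.11])
car_colors = chi_square_independence([[21, 17, 44, 8], [28, 24, 14, 18]])

print(f"accidents:  {accidents:.3f}")
print(f"flavors:    {flavors:.3f}")
print(f"car colors: {car_colors:.3f}")
```

Each result exceeds the critical value quoted for its example (9.488, 7.815, and 7.815 respectively), so in all three cases the null hypothesis would be rejected at the 0.05 level.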