Week 2 Project - STAT 3001
Student Name:
Date:
Instructions: To complete this project, you will need the following materials:
• STATDISK User Manual (found in the classroom in DocSharing)
• Access to the Internet to download the STATDISK program.
This assignment is worth a total of 60 points.
Part I. Histograms and Frequency Tables
Instructions
Answers
1. Open the file Freshmen 15
using menu option Datasets
and then Elementary Stats,
13th Edition. This file
contains data on sex, weight
(in kg) and BMI in September
and April of college
freshmen. What are the
names of the variables in this
file?
2. Create a histogram for the
weight in September (column
2) using the Auto-fit option.
Paste the chart here. Once
your histogram displays, click
Turn on Labels to get the
height of the bars.
3. Using the information in the
above histogram, complete
this table. Be sure to estimate
the range of weights for each
bar as well as the frequency,
relative frequency, and
cumulative frequency.
Weight (kg)    Frequency    Relative Frequency    Cumulative Frequency
35-44.9
45-54.9
55-64.9
65-74.9
75-84.9
85-94.9
95-104.9
a. Using the frequency table
above, how many of the
freshmen had a weight of
64.9 kg or less?
How do you know?
b. Using the frequency table
above, how many of the
freshmen weighed
between 55 and 74.9 kg?
Show your work.
c. What percent of the
freshmen have a weight
of 75 kg or more?
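The three table columns are linked by simple arithmetic: relative frequency is a class count divided by the total count, and cumulative frequency is a running total of the counts. The sketch below illustrates this with made-up weights (not the Freshman 15 data); the function name and the class start and width are assumptions chosen to match the table's class limits.

```python
def frequency_table(values, start, width, n_classes):
    """Return rows of (lower limit, frequency, relative frequency, cumulative frequency)."""
    counts = [0] * n_classes
    for v in values:
        k = int((v - start) // width)  # index of the class that contains v
        if 0 <= k < n_classes:
            counts[k] += 1
    total = sum(counts)
    rows, running = [], 0
    for k, f in enumerate(counts):
        running += f  # cumulative frequency is a running total
        rows.append((start + k * width, f, f / total, running))
    return rows

# Hypothetical weights in kg, for illustration only.
weights = [42.0, 51.5, 53.0, 58.2, 61.0, 63.9, 66.4, 70.1, 72.3, 88.0]
table = frequency_table(weights, start=35, width=10, n_classes=7)
for lower, f, rel, cum in table:
    print(f"{lower}-{lower + 9.9:.1f}  {f:>3}  {rel:.2f}  {cum:>3}")
```

The cumulative frequency in the 55-64.9 row answers a question of the form "how many values are 64.9 or less", and the percent at or above a limit is 100 times the sum of the relative frequencies of the classes from that limit upward.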
Part II. Comparing Datasets
Instructions
Answers
1. Create a boxplot that compares
the weight in September and
April. (Columns 2 and 3)
Paste it here.
2. Describe the similarities and
differences in the data sets.
Please be specific to the graph
created. You might compare
the percentiles, range, and
median.
Part III. Finding Descriptive Numbers
Instructions
Answers
3. Open the file named Word
Counts (using Datasets and
then Elementary Stats, 13th
Edition). This gives
information on the word count
of different age groups for
male and female psychology
students. List all the variables
in the dataset.
4. Find the Mean, median, and
midrange for the Data in
Column 3.
5. Find the Range, variance, and
standard deviation for Column
3.
6. List any values for column 3
that you think may be outliers.
Why do you think that?
[Hint: You may want to sort the
data and look at the smallest and
largest values.]
7. Find the Mean, median, and
midrange for the data in
Column 4.
8. Find the Range, variance, and
standard deviation for the data
in Column 4.
9. List any values for column 4
that you think may be outliers.
Why do you think that?
10. Find the five-number summary
for the number of words data
in Columns 3 and 4. You will
need to label each of the
columns with an appropriate
measure in the top row for
clarity. (Refer to page 167 in
the text.)
11. Compare the number of words
for females and males using a
boxplot of Columns 3 and 4.
Paste your boxplot here
12. Create a histogram for the
Column 3 data.
13. Create a histogram for the
Column 4 data.
Part IV. Interpreting Statistical Information
The Word Count by Males and Females contains the count of the number of words spoken in a day by males
and females in six different age groups.
Group A: Column 3 contains the number of words spoken in a day for male students aged 17 to 25 in
introductory psychology classes in Mexico.
Group B: Column 4 contains the number of words spoken in a day for female students aged 17 to 25 in
introductory psychology classes in Mexico.
Using the descriptive statistics found above, what conclusions can you make comparing the number of
words spoken in Group A with Group B? Address EACH of the points below.
Be sure to use SPECIFIC values to support your reasoning (hint: you may want to consider the
descriptive statistics found in Part III as well as the histograms). You must justify your conclusions with
Statdisk output from Part III of this project for each portion below.
a. One conclusion about a measure of center (mean, median, midrange).
b. One conclusion about the variability in the two datasets (variance, standard deviation, range)
c. One conclusion about the shape of the distribution (mention direction of skew and relationship of
the mean and median).
Submit the final draft of your Word file by going to Week 2, Project, and following the directions under
Week 2 Assignment 2. Please use the naming convention "WK2Assgn2+first initial+last name" as the
Submission Title.
Statdisk User Manual
Statdisk User Manual 13.0.0
for
STAT 3001
Table of Contents
Open a File
Edit Column Titles
Copy a Dataset
Paste a Dataset
Sort a Dataset
Sample Transformations
Descriptive Statistics
Creating a Histogram
Creating Boxplots
Normal Distribution
Confidence Intervals
Hypothesis Testing
Correlation and Regression
Multiple Regression
Chi-Square Goodness-of-Fit
Chi-Square Test of Independence
One-Way Analysis of Variance
When you open the Statdisk program you will see the screen shown in Figure 1. Be certain that you are
using Version 13.0.0. Click on the OK button to close the Statdisk information screen.
Figure 1
You can perform all Statdisk functions from the Sample Editor Screen. The top of the screen has the
following menus: File, Edit, Analysis, Data, Datasets, Window, and Help as shown in Figure 2.
Figure 2
Along with performing statistical calculations, Statdisk is also compatible with many popular software
application packages. You can import, copy, paste, save, print and transform data sets. You can also
copy, paste, save, or print any of the Statdisk numerical or graphical outputs and export them into other
programs such as Microsoft Word. Those options are available as clickable buttons at the top of the
Sample Editor screen as shown in Figure 3.
Figure 3
Opening a File
Statdisk has numerous datasets stored in the program; they can be accessed by clicking on Datasets at the
top of the Sample Editor window. After opening Datasets, go to Elementary Statistics 13th Edition.
The names of the datasets will appear to the right. Click on Body Data and the data values will appear
in the Sample Editor as shown in Figure 4.
Figure 4
You can preview the datasets quickly by opening a dataset, reviewing the data, and then selecting Clear to
move on to the next file. You can also access datasets that Statdisk has available online by going to
Help and then Triola Statistics Series.
Using Data Tools
After you have opened a dataset or have typed in data to the Sample Editor, you can edit column titles,
sort data, delete columns, add columns or rows, or explore the data set by opening the Data Tools menu.
The Data Tools button is located at the top of the Sample Editor page.
To edit column titles, open Data Tools and then select Edit column titles. Type the names of the column
titles into the box shown in Figure 5.
Figure 5
Click on the Save Changes button to enter the new column titles.
Copy and Paste
The Copy button is at the top of the Sample Editor Screen.
To copy columns from a data set, click on the Copy button and a screen will appear asking you
which column of data you want to copy (see Figure 6). You can copy all of the columns or selected
columns. To paste the column of data values into another column, click on the column title (or
number), then open the Edit menu and select Paste.
Figure 6
Sort Data
To sort data, select Data Tools and then select Sort Data. Use the drop-down arrow to select Sort One
column, then select the column title and order from A to Z (see Figure 7). Then click on Sort. The data
values in that column will be sorted from lowest value to highest value.
Figure 7
The Data Menu
The two main menus in Statdisk are Analysis and Data.
The Data menu is used to sort data, add data, transform data, generate descriptive statistics including
charts and graphs, assess normality, and generate sets of data values that emulate one of the standard
types of statistical distributions.
The Analysis menu is used to find the area under the curve for many of the standard statistical distributions,
determine sample size, create confidence intervals, and perform hypothesis tests for parametric and nonparametric models.
Using the Data Menu
To transform a dataset you first need to type data into the Sample Editor or select an existing dataset.
Open the Body Data file that was referenced in Opening a File earlier in the manual. Select Data and
then Sample Transformations to open the Sample Transformer window (see Figure 8). The Source
column is the column containing the dataset that you want to transform. Select the operation that will be
applied to the data values (add, subtract, multiply, divide, mod, or raise to a power) and type in the
constant to use with it. After you click on Basic Transform the new data set will
appear in the Sample Transformer window.
Figure 8
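A basic transform applies one operation and one constant to every value in the source column. The following is a rough sketch of that idea outside Statdisk; the function name, the operation labels, and the sample values are assumptions for illustration, not Statdisk's own names.

```python
import operator

# Operation labels are hypothetical; they mirror the choices described for a basic transform.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
    "mod": operator.mod,
    "power": operator.pow,
}

def basic_transform(source, operation, constant):
    """Apply the chosen operation with the constant to every value in the source column."""
    op = OPERATIONS[operation]
    return [op(x, constant) for x in source]

# Hypothetical column of weights in kg, converted to approximate pounds.
weights_kg = [50.0, 62.5, 80.0]
weights_lb = basic_transform(weights_kg, "multiply", 2.2)
print(weights_lb)
```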
Descriptive Statistics
The descriptive statistics of a data set can be found by opening the Data menu and selecting Descriptive
Statistics. Select the column that the data set is in and then click on Evaluate. A list of the most
commonly used numerical descriptive statistics will be shown (see Figure 9).
Figure 9
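To check a few of the reported values by hand, the measures of center and variation can be recomputed with Python's standard library. This is a minimal sketch on made-up data; it assumes the sample formulas (division by n - 1) for variance and standard deviation, and the function name is hypothetical.

```python
import statistics

def describe(data):
    """Common measures of center and variation for one column of data."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "midrange": (min(data) + max(data)) / 2,   # halfway between the extremes
        "range": max(data) - min(data),
        "variance": statistics.variance(data),     # sample variance, divides by n - 1
        "st_dev": statistics.stdev(data),          # square root of the sample variance
    }

# Hypothetical values, for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(values)
for name, value in summary.items():
    print(f"{name}: {value}")
```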
Histogram
A visual display of a single set of data values can be shown by opening the Data menu and then
selecting Histogram.
Select the column that the data values are in. If you would like the Statdisk program to automatically
select the class width and the class start, select Auto-fit. You can display the count or the frequency for
each class by selecting Bar Labels. Click on Plot to display the graph (see Figure 10).
Figure 10
You can also Print, Copy or Save the histogram and later paste the display in a Word file.
Boxplots
If you would like to compare two or more sets of data values you can plot them on one graph by using
boxplots. Open the Data menu and select Boxplot. Then select the columns containing the data values
that you would like to compare. You can then select Boxplot to show a standard view of the boxplots or
Modified Boxplot, which will emphasize outliers (see Figure 11).
Figure 11
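A boxplot is drawn from the five-number summary (minimum, first quartile, median, third quartile, maximum). The sketch below computes that summary and flags candidate outliers, assuming the common rule that a value is an outlier when it lies more than 1.5 times the interquartile range beyond a quartile. Quartile conventions differ between programs, so the quartiles here (Python's default method) may not match Statdisk's exactly; the data and function names are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum (quartiles by Python's default method)."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

def outliers(data):
    """Values beyond 1.5 * IQR from the quartiles (an assumed, commonly used rule)."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical data with one unusually large value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40]
print(five_number_summary(sample))
print(outliers(sample))
```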
Using the Analysis Menu
Statdisk can perform many basic statistical functions relating to probability distributions, confidence
intervals, hypothesis testing, correlation and regression, Chi-square and other non-parametric tests, and
sample-size determination. This manual will explain how to perform many of those basic statistical
functions.
Normal Distribution
You do not need to have a set of data values in the Sample Editor to use the probability distribution
functions. Open the Analysis menu and select Probability Distributions. The first four functions,
Normal Distribution, Student-t Distribution, Chi-Square Distribution, and the F Distribution perform the
same type of tasks. Select Normal Distribution. The screen shown in Figure 12 will appear.
Figure 12
You can enter a Z value into the box to the right of z Value: or you can enter an amount of area to the
left of some Z value under the standard normal distribution in the box to the right of Cumulative area
from the left:. Figure 13 shows the standard normal distribution with Z-values along the bottom axis
and the area under the curve between the given Z-values. Statdisk will find the given values and any
other values that are not shown on the table.
Figure 13
Open the Analysis menu and then select Probability Distributions and then Normal Distribution.
Enter -1 into the box for Z Value and then click on Evaluate. Figure 14 shows the Statdisk output.
Figure 14
The output gives the probability density (the height of the curve) at z = -1, which is .2419707. It also
gives the cumulative area to the left of -1, which is .158655. If you add the areas to the left of -1 shown
in Figure 13 you will get the same amount.
If you put in any value between 0 and 1 representing the area to the left of a Z score and then press
Evaluate you will get the associated Z value.
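The two values reported for a Z value of -1, and the reverse lookup from an area back to a Z value, can be cross-checked with the standard normal distribution in Python's standard library. This is only an independent check, not a description of how Statdisk computes its output.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

density = std_normal.pdf(-1)              # height of the curve at z = -1
area_left = std_normal.cdf(-1)            # cumulative area to the left of z = -1
z_back = std_normal.inv_cdf(area_left)    # Z value that has this area to its left

print(round(density, 7))    # 0.2419707
print(round(area_left, 6))  # 0.158655
print(z_back)
```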
Confidence Intervals
To find a confidence interval for a sample statistic you do not need to type in any data values or have a
dataset in the Sample Editor. For example, to find a confidence interval for a one-sample mean, open
the Analysis menu, then select Confidence Intervals and then Mean-One Sample. Figure 15 shows the
Statdisk output screen for a 95% confidence interval with a sample mean of 26.7, a sample standard
deviation of 4.1, and a sample size of 40. The confidence interval of 25.39 to 28.01 is given. The
margin of error is the distance from the mean to the upper value, which equals the distance from the
mean to the lower value of the confidence interval.
Figure 15
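The margin of error can be worked out by hand from the same three inputs. The sketch below assumes a t-based interval (population standard deviation unknown) and takes the critical value for 39 degrees of freedom from a printed t table rather than computing it; the function name is hypothetical.

```python
import math

def mean_confidence_interval(mean, s, n, t_crit):
    """Return (lower limit, upper limit, margin of error) for a one-sample mean."""
    margin = t_crit * s / math.sqrt(n)   # E = t * s / sqrt(n)
    return mean - margin, mean + margin, margin

# Assumption: 2.023 is the two-tailed t critical value for 95% confidence, df = 39, from a t table.
lower, upper, margin = mean_confidence_interval(26.7, 4.1, 40, t_crit=2.023)
print(f"margin of error = {margin:.2f}")
print(f"{lower:.2f} < mean < {upper:.2f}")
```

Because the same margin is subtracted and added, the two limits are always the same distance from the sample mean.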
If you are given a set of data values and not given any of the sample statistics such as the mean and
standard deviation you must first use Descriptive Statistics to find the values needed to enter into the
Confidence Interval: Mean-One Sample window that is shown in Figure 15.
Hypothesis Testing
The hypothesis testing procedures in Statdisk are very similar to the confidence interval procedures. To
perform a hypothesis test about a one-sample mean, open the Analysis menu and then select
Hypothesis Testing, and then Mean-One Sample. Figure 16 shows the Statdisk output for a
test of the claim that the population mean is equal to the claimed mean. The claimed mean is
25 and the sample mean is 23.7, with a sample standard deviation of 4.5 and a sample size of 32. The
hypothesis is tested at the .05 level of significance. After you select Evaluate, you get the information
shown in Figure 16. The results for the provided inputs appear on the right of the screen.
Figure 16
As with confidence intervals if you are given a set of data values and not given any of the sample
statistics such as the mean and standard deviation you must first use Descriptive Statistics to find the
values needed to enter into the Hypothesis Testing: One Mean window that is shown in Figure 16.
Figure 17 shows a normal probability plot representing the visual interpretation of the hypothesis test.
Figure 17
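The test statistic for the example inputs can be recomputed by hand. The sketch below assumes a two-tailed t test and uses a critical value read from a printed t table for 31 degrees of freedom; both the function name and that table lookup are assumptions, not Statdisk output.

```python
import math

def one_sample_t(sample_mean, claimed_mean, s, n):
    """t = (sample mean - claimed mean) / (s / sqrt(n))"""
    return (sample_mean - claimed_mean) / (s / math.sqrt(n))

t_stat = one_sample_t(23.7, 25, 4.5, 32)

# Assumption: 2.040 is the two-tailed critical value for alpha = 0.05, df = 31, from a t table.
T_CRIT = 2.040
reject = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}")
print("reject the null hypothesis" if reject else "fail to reject the null hypothesis")
```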
Correlation and Regression
To compute a correlation or create a regression equation you first need to type data into the Sample
Editor or select an existing dataset. Open Datasets and select Elementary Stats 13th Edition. Open
the IQ and Brain Size dataset. Select Analysis and then Correlation and Regression. Select column 4
for the x-variable and column 5 for the y-variable and then click on Evaluate (see Figure 18). The
information for both the correlation and the regression is shown in the output window on the right.
Figure 18
If you click on Plot, a scatterplot of the correlation data and the line of best fit from the regression will
be displayed (see Figure 19).
Figure 19
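The correlation coefficient and the line of best fit both come from the same sums of deviations. The following sketch shows the arithmetic on a small made-up data set (not the IQ and Brain Size data); the function name is hypothetical.

```python
import math

def correlation_and_regression(xs, ys):
    """Return (r, intercept, slope) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)          # linear correlation coefficient
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x     # the line passes through (mean_x, mean_y)
    return r, intercept, slope

# Hypothetical paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r, intercept, slope = correlation_and_regression(xs, ys)
print(f"r = {r:.4f}, y = {intercept:.2f} + {slope:.2f}x")
```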
Multiple Regression
To generate a multiple regression equation you first need to type data into the Sample Editor or select
an existing dataset. Open Datasets and select Elementary Stats 13th Edition. Open the IQ and Brain
Size dataset. Select Analysis and then Multiple Regression. Select columns 4, 5, and 8 to be included
in the regression analysis. Select 4 for the Dependent variable column. Click on Evaluate to generate
the multiple regression statistics (see Figure 20). Your regression equation with rounded coefficients
would be y = 29.4 - .019X1 + 1.65X2. The Adjusted R² value indicates how well the regression
equation fits the data.
Figure 20
Chi-Square Goodness-of-Fit: Equal Expected Frequencies
To generate a Goodness-of-Fit test you must first type data into the Sample Editor or select an existing
dataset. Imagine that a company wants to know if auto accidents occur equally throughout the days of
the week. Use the Clear button at the top-left of the Sample Editor screen to erase any existing data.
The number of accidents that occur each day of the week are as follows:
Day:        M     T     W     TR    F
Accidents:  45    36    17    29    52
Type the data into column 1, then use the Edit Column Titles option under the Data Tools button at the
bottom of the Sample Editor screen to name List 1: Accidents (see Figure 21).
Figure 21
Select Analysis and then Goodness-of-fit. Choose Equal Expected Frequencies since the company is
testing to see if accidents occur equally. Set the significance level to 0.05 and select 1 as the column to
be the Observed Frequencies. Click on Evaluate to generate the Goodness-of-Fit test. The results are
shown in the output window to the right (see Figure 22).
Figure 22
Press Plot to view a visual representation of the Chi-Square Distribution of the data. The graph shows
the Critical Value, χ²: 9.488 and the Test Statistic, χ²: 20.860 (see Figure 22).
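The test statistic for the accident counts can be verified by hand. With equal expected frequencies, every day has the same expected count (the total divided by the number of days). The sketch below is an independent check using the data from the example; the function name is hypothetical.

```python
def chi_square_equal(observed):
    """Chi-square statistic when every category has the same expected frequency."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

accidents = [45, 36, 17, 29, 52]   # M, T, W, TR, F
chi2 = chi_square_equal(accidents)

CRITICAL_VALUE = 9.488             # reported critical value (4 degrees of freedom, 0.05 level)
print(f"test statistic = {chi2:.3f}")
print("reject equal frequencies" if chi2 > CRITICAL_VALUE else "fail to reject equal frequencies")
```

Since the test statistic exceeds the critical value, the data do not support the claim that accidents occur equally on each day.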
Goodness-of-Fit: Unequal Expected Frequencies
An ice cream company wishes to discover the popularity of their offered ice cream flavors. The
Expected frequencies are given:
Flavor:    Vanilla   Chocolate   Strawberry   Other
Expected:  42%       33%         14%          11%
The University of Florida surveyed a sample of n = 250 students about their preferred ice
cream flavor. The observed data are shown in the table below.
Flavor:    Vanilla   Chocolate   Strawberry   Other
Observed:  114       68          47           21
In order to generate the goodness-of-fit test, the data must be entered into the Sample Editor. Use the
Clear button at the bottom of the Sample Editor screen to erase any existing data. Enter the observed
values into List 1 and enter the expected frequencies into List 2. Click on Analysis and then
Goodness-of-Fit. Choose the Unequal Expected Frequencies option since the company is not testing to see if the
flavors are equally popular. Because the expected frequencies were given as proportions, choose the As
Proportions option under Enter Expected Frequencies (as decimals). Set the Observed Column
option as 1 and the Expected Column option as 2. We will set the Significance level to 0.05. Click
Evaluate (see Figure 25).
Figure 25
Click Plot to view a visual representation of the Chi-Square Distribution. The Critical Value, χ² is
shown as 7.815 and the Test Statistic, χ² is shown as 8.971 (see Figure 26).
Figure 26
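When the expected frequencies are given as proportions, each expected count is the sample size times the proportion. The sketch below recomputes the test statistic for the ice cream example as an independent check; the function name is hypothetical.

```python
def chi_square_gof(observed, proportions):
    """Return (test statistic, expected counts) for expected proportions that sum to 1."""
    n = sum(observed)
    expected = [n * p for p in proportions]   # expected count = n * proportion
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

observed = [114, 68, 47, 21]                  # Vanilla, Chocolate, Strawberry, Other
proportions = [0.42, 0.33, 0.14, 0.11]
stat, expected = chi_square_gof(observed, proportions)

print([round(e, 1) for e in expected])
print(f"test statistic = {stat:.3f}")
```

The statistic is larger than the critical value of 7.815, so the observed preferences differ from the expected proportions at the 0.05 level.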
Chi-Square Test of Independence (Contingency Tables)
To generate a Contingency table test you must first type data into the Sample Editor or select an existing
data set. A company seeks to discover which color of car males prefer and which color of car
females prefer. Use the Clear button at the bottom of the Sample Editor screen to erase any existing
data. The data collected are as follows:
          Red    Blue    Green    White
Male      21     17      44       8
Female    28     24      14       18
Enter the data into the Sample Editor exactly as it is shown in the table (see Figure 27).
Figure 27
Select Analysis and then Contingency Tables. Then choose columns 1, 2, 3 and 4 to include in the
analysis. We will set the significance level to 0.05. Click Evaluate to view the results shown in the
output window to the right (see Figure 28).
Figure 28
Click Plot to display a visual of the Chi-Square Distribution. The Critical Value, χ² is shown to be
7.815 and the Test Statistic, χ² is shown to be 21.377 (see Figure 29).
Figure 29
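For a contingency table, the expected count in each cell is (row total times column total) divided by the grand total, and the degrees of freedom are (rows - 1) times (columns - 1). The sketch below recomputes the test statistic for the car color table as an independent check; the function name is hypothetical.

```python
def chi_square_independence(table):
    """Return (test statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

car_colors = [
    [21, 17, 44, 8],    # Male:   Red, Blue, Green, White
    [28, 24, 14, 18],   # Female: Red, Blue, Green, White
]
stat, df = chi_square_independence(car_colors)
print(f"test statistic = {stat:.3f}, df = {df}")
```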
One-Way Analysis of Variance (ANOVA)
To use the Analysis of Variance (ANOVA) function in Statdisk you first need to type data into the
sample editor or select an existing dataset. Go to Elementary Statistics 13th Edition and select the
Garbage Weights dataset. Go to the Analysis menu and then select One-Way Analysis of Variance.
Select columns 2, 3, and 4 and click on Evaluate.
Figure 30.
The hypothesis testing results are shown in the box on the right (see Figure 30).
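The F test statistic in one-way ANOVA is the ratio of the variation between the group means to the variation within the groups. The following sketch shows that calculation on small made-up groups (not the Garbage Weights data); the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """F = (mean square between groups) / (mean square within groups)."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total number of values
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)                # numerator df = k - 1
    ms_within = ss_within / (n - k)                  # denominator df = n - k
    return ms_between / ms_within

# Hypothetical groups, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = one_way_anova_f(groups)
print(f"F = {f_stat:.3f}")
```

A large F value relative to the critical value suggests that at least one group mean differs from the others.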