Phase 5 Course Project Inferential Statistics and Analytics

Content Type

User Generated

User

gevivn_03

Subject

Mathematics

Description

This week you will submit Phase 5, the final phase of your course project. For Phase 5, review your instructor's feedback on your Phase 4 submission and make any necessary corrections. If you have questions about the feedback, remember to ask your instructor for assistance.

Once you have made your corrections, you will make your final submission for the course project. Below is a summary of the expectations for Phase 5 of the course project:

Introduce your scenario and data set.

Discuss the importance of the Measures of Center and the Measures of Variation.
- What are the measures of center and why are they important?
- What are the measures of variation and why are they important?

Calculate the measures of center and measures of variation. Interpret your results in context of the selected topic.
- Mean
- Median
- Mode
- Midrange
- Range
- Variance
- Standard Deviation
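
The measures listed above can be sketched with Python's standard `statistics` module. This is only an illustration: the `salaries` list is hypothetical, invented for the example, and is not taken from the course data set. The sample (n - 1) forms of the variance and standard deviation are assumed.

```python
import statistics

# Hypothetical yearly salaries in dollars, invented for illustration only;
# they are NOT values from the course data set.
salaries = [42000, 48000, 55000, 55000, 61000, 67000, 73000, 90000]

# Measures of center
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)
midrange = (min(salaries) + max(salaries)) / 2

# Measures of variation (sample versions, dividing by n - 1)
data_range = max(salaries) - min(salaries)
variance = statistics.variance(salaries)
std_dev = statistics.stdev(salaries)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
print("range:", data_range)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

If the data were treated as an entire population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the matching calls.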

Discuss the importance of constructing confidence intervals for the population mean.
- What are confidence intervals?
- What is a point estimate?
- What is the best point estimate for the population mean? Explain.
- Why do we need confidence intervals?

Based on your selected topic, evaluate the following:
- Find the best point estimate of the population mean.
- Construct a 95% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.
  - Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.
- Write a statement that correctly interprets the confidence interval in context of your selected topic.

Based on your selected topic, evaluate the following:
- Find the best point estimate of the population mean.
- Construct a 99% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.
  - Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.
- Write a statement that correctly interprets the confidence interval in context of your selected topic.

Compare and contrast your findings for the 95% and 99% confidence interval.
- Did you notice any changes in your interval estimate? Explain.
- What conclusion(s) can be drawn about your interval estimates when the confidence level is increased? Explain.
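
When σ is unknown, the interval for the mean is usually built as x̄ ± t · s / √n, with the critical value t taken from the t-distribution with n - 1 degrees of freedom. The sketch below applies that formula at both confidence levels. The sample is hypothetical (not the course data), the helper name `t_interval` is made up for this example, and the critical values 2.262 and 3.250 are standard t-table entries for df = 9; a different sample size needs different table values.

```python
import math
import statistics


def t_interval(sample, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # best point estimate of the mean
    s = statistics.stdev(sample)      # sample standard deviation (n - 1)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of 10 salaries, invented for illustration only.
sample = [52000, 58000, 61000, 63000, 64000,
          66000, 68000, 70000, 75000, 83000]

# Two-tailed critical values for df = n - 1 = 9, read from a t-table.
ci_95 = t_interval(sample, 2.262)
ci_99 = t_interval(sample, 3.250)

print("95% CI:", [round(v, 2) for v in ci_95])
print("99% CI:", [round(v, 2) for v in ci_99])
print("99% interval is wider:", (ci_99[1] - ci_99[0]) > (ci_95[1] - ci_95[0]))
```

Both intervals share the same center, the sample mean; only the margin of error grows as the confidence level rises.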

Discuss the process for hypothesis testing.
- Discuss the 8 steps of hypothesis testing.
- When performing the 8 steps of hypothesis testing, which method do you prefer: the P-value method or the critical value method? Why?

Perform the hypothesis test.
- If you selected Option 1:
  - Original Claim: The average salary for all jobs in Minnesota is less than $65,000.
  - Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.
- If you selected Option 2:
  - Original Claim: The average age of all patients admitted to the hospital with infectious diseases is less than 65 years of age.
  - Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.

Based on your selected topic, answer the following:
1. Write the null and alternative hypotheses symbolically and identify which hypothesis is the claim.
2. Is the test two-tailed, left-tailed, or right-tailed? Explain.
3. Which test statistic will you use for your hypothesis test: z-test or t-test? Explain.
4. What is the value of the test statistic? What is the P-value? What is the critical value?
5. What is your decision: reject the null or do not reject the null?
  - Explain why you made your decision, including the results for your P-value and the critical value.
6. State the final conclusion in non-technical terms.
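
For the Option 1 claim, the critical value method can be sketched as a left-tailed one-sample t-test with H0: μ = 65,000 and H1: μ < 65,000 (the claim). Everything numeric here is an assumption for illustration: the sample is hypothetical, and the critical value -1.833 is the standard t-table entry for α = 0.05 and df = 9. The P-value is not computed, because the standard library has no t-distribution CDF; a statistics package or a t-table would be needed for that step.

```python
import math
import statistics

# Hypothetical sample of 10 salaries, invented for illustration only.
sample = [48000, 52000, 55000, 57000, 58000,
          60000, 61000, 63000, 66000, 70000]

mu_0 = 65000        # value of the mean under H0
t_critical = -1.833 # left-tailed, alpha = 0.05, df = 9 (from a t-table)

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1)

# Test statistic for a one-sample t-test with sigma unknown
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Left-tailed test: reject H0 when the statistic falls below the critical value
reject_null = t_stat < t_critical

print("test statistic:", round(t_stat, 3))
print("critical value:", t_critical)
print("reject H0:", reject_null)
```

With this made-up sample the statistic falls in the rejection region, so the sample would support the claim that the mean is below $65,000; real data may lead to the opposite decision.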

Conclusion



Explanation & Answer

Attached.

Running head: COURSE PROJECT


COURSE PROJECT:
Name:
Institution affiliation:
Date:

PHASE 1
Overview of the scenario
A client is interested in the salary distribution of jobs in the state of Minnesota.
Therefore, this paper analyzes a data set comprising 364 records of job
listings by title, with yearly salaries that range from approximately $40,000 to $120,000.
Variables
There are two variables in the dataset. The first variable is the job title, while the
second variable is the salary for each job title in the state of
Minnesota. The job title is the qualitative variable, since it describes data
that are not numerical and that fit into categories. The salary is the quantitative variable,
since it consists of numerical data.
The salary variable is a continuous variable since it can take almost any numerical value.
Additionally, it can be subdivided into finer increments, depending on the precision of
measurement. The different numerical values within the salary variable
reflect the fact that they belong to different job titles.
Because the job title variable is qualitative and has no numerical significance,
its level of measurement is nominal. The salary variable is at the ratio level of
measurement, since it has a true zero point: a salary of zero means no pay at all,
so ratios between salaries are meaningful.
Measures of center
These are measures that provide a representative value that summarizes the data set. They
include the mean, median, mode, and midrange. To begin with, the mean, which is also
referred to as the average, is the most common measure of center. However, it tends to be
affected by extreme values, which makes it unreliable for a skewed distribution (Witte,
2017).
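
The effect of an extreme value on the mean can be seen in a small sketch. The numbers are hypothetical, chosen only to make the contrast visible, and are not from the Minnesota data set.

```python
import statistics

# Hypothetical salaries, invented for illustration only.
typical = [50000, 52000, 54000, 56000, 58000]
with_outlier = typical + [300000]  # add one extreme value

mean_before = statistics.mean(typical)
median_before = statistics.median(typical)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean before / after:", mean_before, mean_after)
print("median before / after:", median_before, median_after)
```

One extreme salary pulls the mean far above most of the values, while the median moves only slightly.
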
Secondly, the median is the value at the center of a sorted dataset. Half of
the values in the dataset are less than the median while the remaining half are greater than the
median. For this reason, the median is the most suitable measure of center for a skewed
distribution. Thirdly, the mode is the value that appears more than any other v...