# Phase 5 Course Project Inferential Statistics and Analytics

Anonymous

### Question Description

This week you will submit Phase 5, the final phase, of your course project. For Phase 5, review your instructor's feedback on your Phase 4 submission and make any necessary corrections. If you have questions about the feedback, remember to ask your instructor for assistance.

Once you have made your corrections, you will make your final submission for the course project. Below is a summary of the expectations for Phase 5 of the course project:

1. Introduce your scenario and data set.
   • Provide a brief overview of the scenario you are given above and the data set that you will be analyzing.
   • Classify the variables in your data set.
      • Which variables are quantitative/qualitative?
      • Which variables are discrete/continuous?
   • Describe the level of measurement for each variable included in your data set.
2. Discuss the importance of the Measures of Center and the Measures of Variation.
   • What are the measures of center and why are they important?
   • What are the measures of variation and why are they important?
3. Calculate the measures of center and measures of variation. Interpret your results in context of the selected topic.
   • Mean
   • Median
   • Mode
   • Midrange
   • Range
   • Variance
   • Standard Deviation
4. Discuss the importance of constructing confidence intervals for the population mean.
   • What are confidence intervals?
   • What is a point estimate?
   • What is the best point estimate for the population mean? Explain.
   • Why do we need confidence intervals?
5. Based on your selected topic, evaluate the following:
   • Find the best point estimate of the population mean.
   • Construct a 95% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.
   • Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.
   • Write a statement that correctly interprets the confidence interval in context of your selected topic.
6. Based on your selected topic, evaluate the following:
   • Find the best point estimate of the population mean.
   • Construct a 99% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.
   • Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.
   • Write a statement that correctly interprets the confidence interval in context of your selected topic.
   • Compare and contrast your findings for the 95% and 99% confidence interval.
      • Did you notice any changes in your interval estimate? Explain.
      • What conclusion(s) can be drawn about your interval estimates when the confidence level is increased? Explain.
7. Discuss the process for hypothesis testing.
   • Discuss the 8 steps of hypothesis testing.
   • When performing the 8 steps for hypothesis testing, which method do you prefer: the P-Value method or the Critical Value method? Why?
8. Perform the hypothesis test.
   • If you selected Option 1:
      • Original Claim: The average salary for all jobs in Minnesota is less than \$65,000.
      • Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.
   • If you selected Option 2:
      • Original Claim: The average age of all patients admitted to the hospital with infectious diseases is less than 65 years of age.
      • Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.

   1. Write the null and alternative hypothesis symbolically and identify which hypothesis is the claim.
   2. Is the test two-tailed, left-tailed, or right-tailed? Explain.
   3. Which test statistic will you use for your hypothesis test: z-test or t-test? Explain.
   4. What is the value of the test statistic? What is the P-value? What is the critical value?
   5. What is your decision: reject the null or do not reject the null?
      • Explain why you made your decision, including the results for your P-value and the critical value.
   6. State the final conclusion in non-technical terms.
9. Conclusion
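
The confidence-interval and hypothesis-test steps listed above can be sketched in Python. This is only an illustration under stated assumptions: the ten salaries are made-up values, not the project's data set, and the helper names (`t_pdf`, `t_cdf`, `t_quantile`, `t_interval`, `left_tailed_t_test`) are hypothetical. Because σ is unknown, the sketch uses Student's t distribution with n − 1 degrees of freedom, so the interval is x̄ ± t·s/√n and the test statistic is t = (x̄ − μ₀)/(s/√n). To stay within the standard library, the t distribution is evaluated numerically rather than looked up in a table.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Value t such that P(T <= t) = p, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(sample, confidence):
    """Confidence interval for the population mean when sigma is unknown."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    t_crit = t_quantile(1 - (1 - confidence) / 2, n - 1)
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


def left_tailed_t_test(sample, mu0, alpha):
    """Test H0: mu = mu0 against H1: mu < mu0 (the 'less than' claim)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)
    t_stat = (mean - mu0) / (s / math.sqrt(n))
    p_value = t_cdf(t_stat, n - 1)       # area to the left of t_stat
    crit = t_quantile(alpha, n - 1)      # negative critical value
    return t_stat, p_value, crit, p_value < alpha


# Hypothetical salaries in dollars, for illustration only.
sample = [52000, 61000, 58000, 70000, 49000,
          66000, 55000, 63000, 60000, 57000]

ci95 = t_interval(sample, 0.95)
ci99 = t_interval(sample, 0.99)
t_stat, p_value, crit, reject = left_tailed_t_test(sample, 65000, 0.05)

print("point estimate:", statistics.mean(sample))
print("95% CI:", ci95)
print("99% CI:", ci99)
print("t =", t_stat, "P-value =", p_value, "critical value =", crit)
print("reject H0" if reject else "do not reject H0")
```

With the same sample, the 99% interval is always wider than the 95% interval, because a higher confidence level requires a larger critical value. The P-value method and the critical value method also lead to the same decision: the P-value falls below α exactly when the test statistic falls to the left of the critical value.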

Chucks574
School: UC Berkeley

Attached.


COURSE PROJECT:
Name:
Institution affiliation:
Date:


PHASE 1
Overview of the scenario
A client is interested in the distribution of salaries for jobs in the state of Minnesota.
Therefore, this paper analyzes a data set comprising 364 records of job listings by title,
together with yearly salaries that range from approximately \$40,000 to \$120,000.
Variables
There are two variables in the dataset. The first variable is the job title, while the second
variable is the salary for each job title in the state of Minnesota. The qualitative variable in
the dataset is job title, since its values are not numerical and instead fit into categories. The
quantitative variable is salary, since it consists of numerical data.
The salary variable is a continuous variable since it can take almost any numerical value.
Additionally, it can be subdivided into finer increments, depending on the precision of
measurement. The different numerical values within the salary variable arise because they
correspond to different job titles.
Because the job title variable is qualitative and its categories have no numerical significance
or natural order, its level of measurement is nominal. The level of measurement of the salary
variable is the ratio level, since the variable has a true zero point: a salary of zero means
the absence of any salary.
Measures of center
These are measures that provide a representative value that summarizes the data set. They
include the mean, median, mode, and midrange. To begin with, the mean, which is also
referred to as the average, is the most common measure of center. However, it tends to be
affected by extreme values, which makes it unreliable for a skewed distribution (Witte,
2017).
Secondly, the median is the value at the center of an ordered dataset. Half of the values in
the dataset are less than the median, while the remaining half are greater than it. For this
reason, the median is the most suitable measure of center for a skewed distribution. Thirdly,
the mode is the value that appears more often than any other v...
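
The measures of center described above, along with the measures of variation the assignment asks for, can be computed with Python's standard `statistics` module. The following is a minimal sketch, assuming a small made-up list of salaries rather than the 364-record data set; the variable names are illustrative only. The variance and standard deviation here are the sample versions (dividing by n − 1).

```python
import statistics

# Hypothetical salaries in dollars, for illustration only.
# One high value (118000) is included to show its pull on the mean.
salaries = [42000, 55000, 55000, 61000, 68000, 73000, 118000]

# Measures of center
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)
midrange = (min(salaries) + max(salaries)) / 2

# Measures of variation
data_range = max(salaries) - min(salaries)
variance = statistics.variance(salaries)  # sample variance (n - 1)
std_dev = statistics.stdev(salaries)      # sample standard deviation

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

In this made-up sample the single high salary pulls the mean above the median, which is the sensitivity to extreme values noted above. The midrange is affected even more strongly, since it depends only on the smallest and largest values.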
