STAT250 Northern Virginia Target versus Safeway Food Prices Data Analysis

Anonymous

Question Description

Data Analysis

Your submitted document should include the following items. Points will be deducted if the following are not included.

  1. Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx) right justified, and then Data Analysis Assignment #1 centered on the top of page 1 below your name; then begin your document.
  2. Number your pages across your entire solutions document.
  3. Your document should include the ANSWERS ONLY with each answer labeled by its corresponding number and subpart. Keep the answers in order. Do not include the questions in your submitted document.
  4. Generate all requested graphs and tables using StatCrunch.
  5. Upload your document onto Blackboard as a Word (docx) file or pdf file using the link provided by your instructor. You are responsible for uploading a readable file.

The full assignment instructions, as well as an example, are attached as a Word file.

Access to StatCrunch is required.

https://www.statcrunch.com/5.0/group.php?groupid=8220

I will provide the login info...

Extra Notes:

- Each graph title should start with "Distribution of..."

- For the questions that require calculation, you can work them out on paper, but you must type the solution into the Word document.

- Please complete each part.

Unformatted Attachment Preview

STAT 250 Summer 2019 Data Analysis Assignment 4

Your submitted document should include the following items. Points will be deducted if the following are not included.

  1. Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx) right justified, and then Data Analysis Assignment #4 centered on the top of page 1 below your name; then begin your document.
  2. Number your pages across your entire solutions document.
  3. Your document should include the ANSWERS ONLY with each answer labeled by its corresponding number and subpart. Keep the answers in order. Do not include the questions in your submitted document.
  4. Generate all requested graphs and tables using StatCrunch.
  5. Upload your document onto Blackboard as a Word (docx) file or pdf file using the link provided by your instructor. You are responsible for uploading a readable file.
  6. You may not work with other individuals on this assignment. It is an honor code violation if you do. In addition, using materials from a previous semester of STAT 250 (whether your own or someone else’s) is cheating.

Elements of good technical writing:

- Use complete and coherent sentences to answer the questions.
- Graphs must be appropriately titled and should refer to the context of the question.
- Graphical displays must include labels with units if appropriate for each axis.
- Units should always be included when referring to numerical values.
- When making a comparison you must use comparative language, such as “greater than”, “less than”, or “about the same as.”
- Ensure that all graphs and tables appear on one page and are not split across two pages.
- Type all mathematical calculations when directed to compute an answer ‘by-hand.’ Pictures of actual handwritten work are not accepted on this assignment.
- When writing mathematical expressions into your document you may use either an equation editor or common shortcuts such as: √x can be written as sqrt(x), p̂ can be written as p-hat, x̄ can be written as x-bar.

Problem 1: Appropriateness of Inference

For the following scenarios, answer the questions for each part. In each part, the underlined text is the name of the StatCrunch data set to be used for that part. Please note, do not conduct inference in either of these parts; just answer each question.

a) Food Prices: Target versus Safeway. Grocery prices of the same randomly selected items were collected and compared from Target and Safeway. Imagine you were interested in conducting a hypothesis test to determine whether the mean prices were significantly different. Note: to answer the questions below, subtract Target price – Safeway price (i.e. subtract Safeway price from Target price).

  i) What is (are) the parameter(s) of interest? Choose one of the following symbols: μ (the mean of one sample), μ_D (the mean difference from paired (dependent) samples), or μ_1 − μ_2 (the mean difference of two independent samples), and describe the parameter in context of this question in one sentence.
  ii) Depending on your answer to part (i), construct one or two relative frequency histograms. Remember to properly title and label the graph(s). Copy and paste these graphs into your document.
  iii) Describe the shape of the histogram(s) in one sentence.
  iv) Depending on your answer to part (i), construct one or two boxplots and copy and paste these graphs into your document.
  v) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one sentence and identify any outliers if they are present.
  vi) Considering your answers to parts (iii) and (v), is inference appropriate in this case? Why or why not? Defend your answer using the graphs in two to three sentences.

b) GMU Health Center Waiting Time. During the flu season, it is known that the waiting time at the GMU Health Center can be extreme. A statistics student wanted to test her claim that the wait time was greater than 100 minutes.
She took a random sample of wait times during the flu season and recorded them in StatCrunch.

  i) What is (are) the parameter(s) of interest? Choose one of the following symbols: μ (the mean of one sample), μ_D (the mean difference of two paired (dependent) samples), or μ_1 − μ_2 (the mean difference of two independent samples), and describe the parameter in context of this question in one sentence.
  ii) Depending on your answer to part (i), construct one or two relative frequency histograms. Remember to properly title and label the graph(s). Copy and paste the graph(s) into your document.
  iii) Describe the shape of the histogram(s) in one sentence.
  iv) Depending on your answer to part (i), construct one or two boxplots and copy and paste these graphs into your document.
  v) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one sentence and identify any outliers if they are present.
  vi) Considering the answers provided in parts (iii) and (v), is inference appropriate in this case? Why or why not? Defend your answer using the graphs in two to three sentences.

Problem 2: GPA of Students Depending on Where They Sit

A professor wanted to know whether there was a difference in students’ grade point averages (GPA) depending on whether they sit in the front half of the classroom versus the back half of the classroom. In a previous semester, a random sample of students was selected from the front of a classroom and another random sample was selected from the back of a classroom and the student’s current GPA was recorded. The data provided in StatCrunch represent the GPAs from each random sample. The file is called “GPA Versus Seating Location.” At the 0.01 significance level, can the professor conclude from these data that the mean GPA for front sitters is higher than back sitters? Assume all conditions for conducting inference are satisfied. Conduct a full hypothesis test by following the steps below.
Enter an answer for each of these steps in your document.

a) Define the population parameter of interest in context of this question in one sentence.
b) State the null and alternative hypotheses using correct notation.
c) State the significance level for this problem.
d) Calculate the test statistic in StatCrunch using STAT → T Stats → 2 Sample → With Data. Copy and paste the output table into your document.
e) Label the p-value seen in your output table produced in part (d) using the probability notation (it begins with P(…)).
f) State whether you reject or do not reject the null hypothesis and your reason for your answer in one sentence.
g) State your conclusion in context of the problem (i.e. interpret your results and/or answer the question being posed) in one or two complete sentences.

Problem 3: Metal Hardness Testing

The manufacturer of hardness testing equipment uses steel-ball indenters to indent metal that is being tested. However, the manufacturer thinks there might be a difference in hardness reading when using a diamond indenter. The metal specimens to be tested are large enough so that two indentations can be made. Therefore, the manufacturer wants to use both indenters on each specimen and compare the readings. The order of the indentations will be random. This particular design is called the paired design (or matched pairs design or dependent samples design). Assume all conditions are satisfied in this problem. The data set used for this problem is called “Metal Hardness Testing”.

a) Calculate the difference between specimens by subtracting Steel Ball – Diamond. For example, the first difference is 51 – 52 = -1. List the difference for each of the 14 pairs in your document.
b) For the first piece of metal, which indenter produced the larger hardness reading? Answer this question in a complete sentence.
c) Obtain the mean of these differences and the standard deviation of these differences in StatCrunch.
You may copy and paste the box that you obtain from StatCrunch or list the values. Please round these values to four decimal places.
d) Construct a 95% confidence interval using the above data. Please do this “by hand” using the formula and showing your work (please type your work). Use your t-table (found in the last page of our formula packet) to obtain your t* critical value needed for the confidence interval. Present this confidence interval as (lower limit, upper limit).
e) Use StatCrunch to obtain a 95% confidence interval for the above data by selecting: Stat → T Stats → Paired. Enter Steel Ball for Sample 1 and Diamond for Sample 2. Copy and paste your output into your document.
f) Does your confidence interval capture 0? Answer this question and briefly explain what this implies in one or two sentences in the context of the question.
g) Using your answer to part (f), imagine you were using a hypothesis test to determine if a significant difference exists in mean hardness reading between the two indenters (the hypotheses would be H0: μ_D = 0 vs Ha: μ_D ≠ 0). What decision and conclusion can be made in this case? Provide an answer and a reason for your choice in one or two sentences. Please only use your confidence interval to answer this question (i.e. do not run this hypothesis test).

Problem 4: Lego Prices

The data set named “Lego Prices” contains a selection of Lego sets sold on the Lego website in August 2016. The goal of this problem is to explore one variable (the number of Pieces a set contains) that may help a buyer predict the price of a Lego Set. The Price variable is the response variable in this problem.

a) Investigate the relationship between the explanatory variable “Pieces” and response variable “Price” by doing the following:
  i) Make a scatterplot and copy and paste it in your solutions (use Graph → Scatter Plot in StatCrunch).
  ii) Calculate the correlation coefficient (use Stat → Summary Stats → Correlation in StatCrunch).
Provide this value in your document.
  iii) Interpret the scatterplot and correlation coefficient in terms of trend, strength, and shape (form) in one complete sentence.
b) Using the “Pieces” variable as the explanatory variable, run a Simple Linear Regression analysis in StatCrunch. Use Stat → Regression → Simple Linear. Copy and paste only the StatCrunch results output (no tables).
c) Add the fitted line plot to your document. This graph appears on page 2 of your output.
d) Type the regression equation into your document.
e) Interpret the slope of the regression line (in context of this data set).
f) Is it meaningful to interpret the y-intercept? Why or why not?
g) State r-squared (i.e., the coefficient of determination) and explain what this value means in context of the data set.
h) Use the regression equation from part (d) to predict the price of a randomly selected set containing 556 pieces. State your predicted value in a sentence that is in context of the data. Do not forget to mention the units. Note: You can do this calculation “by hand” or using StatCrunch.
i) Is your prediction in part (h) an example of extrapolation? Why or why not?

Sample Solution to Display Formatting

Problem X: Students’ Grades

A random sample of 30 students was selected from a STAT 250 course taught during the summer session and their first exam scores were recorded.

a) Create a histogram in StatCrunch. Be sure to title and label it correctly.
b) Interpret the histogram’s shape.

See the sample solution and formatting below.

Notes about submission

Following the main points will help you submit a professionally completed assignment.

  1) Right justify your name and provide your correct section and the due date.
  2) Center the specific homework assignment title.
  3) Bold each complete problem number.
  4) The graph can be around the below size for readability (click on the graph once and only adjust the size of the graph by using the bottom right dot).
  5) Remember not to include the questions in your answer. Only provide answers. Please keep the assignment in problem and part order (present 1a, then 1b, and so on).

Kenneth Strazzeri
STAT 250-0xx (your correct section)
Data Analysis Assignment 1

Problem X
a)
b) The shape of this distribution is left skewed because I see the majority of the data values falling in the upper end of the distribution and a few 50s and 60s skewing the shape. There do not seem to be any outliers visible on the graph. ...
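
For the "by hand" confidence interval requested in Problem 3(d), the arithmetic can be sketched as follows. This is a minimal illustration, not the assignment's solution: the function name `paired_t_interval` and the five pairs of readings are hypothetical (they are not the "Metal Hardness Testing" data), and the t* value is the standard t-table entry for 95% confidence with 4 degrees of freedom. The interval is computed as d-bar ± t* × s_d / sqrt(n), where d-bar and s_d are the mean and standard deviation of the paired differences.

```python
import math
import statistics


def paired_t_interval(sample1, sample2, t_star):
    """Return (differences, mean, sd, lower, upper) for sample1 - sample2."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    margin = t_star * s_d / math.sqrt(n)
    return diffs, d_bar, s_d, d_bar - margin, d_bar + margin


# Hypothetical readings for five specimens (illustration only).
steel_ball = [51, 60, 55, 70, 64]
diamond = [52, 58, 57, 73, 65]

# t* for 95% confidence with df = n - 1 = 4, read from a standard t-table.
T_STAR_DF4 = 2.776

diffs, d_bar, s_d, lower, upper = paired_t_interval(steel_ball, diamond, T_STAR_DF4)
print("differences:", diffs)
print(f"mean = {d_bar:.4f}, sd = {s_d:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("interval captures 0:", lower <= 0 <= upper)
```

With the assignment's 14 pairs, the degrees of freedom would be 13 and the corresponding t-table value is 2.160. If the resulting interval captures 0, a two-sided test of H0: μ_D = 0 at the 0.05 level would not reject H0.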

Tutor Answer

TabbyK
School: UT Austin

Looks good now. Just let me know.

Name
Course
Data Analysis Assignment
Problem 1: Appropriateness of Inference
a) Food Prices: Target versus Safeway
i)
μ_D, because the same items were priced at both stores, so the parameter is the mean of the paired (dependent) differences in price (Target price – Safeway price); it equals zero if there is no difference in mean prices.
ii)

iii) The distribution of price differences is only slightly skewed to the left, meaning that only a few differences fall in the lower tail.
iv)

v) No, it does not show any outliers.
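vi) Inference is appropriate because the histogram of price differences is only slightly skewed and the boxplot shows no outliers, so the distribution of differences is reasonably close to normal. The items were also randomly selected.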
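
The outlier check in part (v) can also be done numerically. The sketch below assumes the common boxplot convention of flagging values more than 1.5 × IQR beyond the quartiles; the function name `iqr_outliers` and both data lists are hypothetical, and Python's default quartile method may differ slightly from the one StatCrunch uses, so fences can shift a little on small samples.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical price differences in dollars (illustration only).
price_diffs = [-0.50, -0.20, -0.10, 0.00, 0.05, 0.10, 0.20, 0.30, 0.40]

# Hypothetical wait times in minutes with one extreme value (illustration only).
wait_times = [40, 55, 60, 70, 75, 80, 90, 95, 260]

print("price difference outliers:", iqr_outliers(price_diffs))
print("wait time outliers:", iqr_outliers(wait_times))
```

An empty list corresponds to a boxplot with no points drawn beyond the whiskers.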
b) GMU Health Center Waiting Time.
i)
μ, because a single sample is used to test the mean waiting time (in minutes) at the GMU Health Center during the flu season.
ii)

iii) The distribution is skewed to the left, implying that more than 50% of the participants experienced a waiting time of less than 100 minutes.

iv)

v) Yes, the boxplot shows one outlier above 250 minutes of waiting time.
vi) Yes, inference ...
