Cass Business School Statistics ANOVA & P Value Questions

Question Description

I need help with a Statistics question. All explanations and answers will be used to help me learn.

You do not need to solve Q1 and Q2.

I did Q1 and Q2 already.

You just have to do Q3 and Q4 for me.

Thanks my tutor!

Unformatted Attachment Preview

Question 1

A number, $n$, of contestants are registered to take part in an archery contest. The distance between the centre of the target and the point that the $i$-th archer's arrow hits is given by the random variable $X_i$, for $i = 1, \dots, n$. The random variables $X_1, \dots, X_n$ are independent and identically distributed, each following an exponential distribution with a mean of 10 cm. Each archer has one shot and the archer whose arrow hits closest to the centre of the target wins the contest.

a) Determine the probability density function of the winner's distance from the centre of the target, that is, the density of the random variable $Y = \min\{X_1, \dots, X_n\}$. (3 marks)

b) Because of an outbreak of food poisoning in the hotel where the contestants are staying, it is possible that not all registered archers can participate in the contest. The number of archers taking part is given by the discrete random variable $N$, such that $P(N = 10) = 0.8$, $P(N = 9) = 0.1$, $P(N = 8) = 0.1$ and $N$ is independent of $X_1, X_2, \dots$ Hence, the winner's distance from the centre of the target is given by the random variable $Z = \min\{X_1, \dots, X_N\}$. Using the properties of conditional expectation, calculate $E(Z)$. (4 marks)

Total: 7 marks

Question 2

Consider the random variables $X_1$ and $X_2$, with means $E(X_1) = E(X_2) = 0$ and variances $Var(X_1) = Var(X_2) = 1$. The random variables follow the bivariate normal distribution, which means that their joint probability density function is given by

$$f(x_1, x_2) = \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\left\{ -\frac{x_1^2 - 2 x_1 x_2 \rho + x_2^2}{2(1 - \rho^2)} \right\}, \quad x_1, x_2 \in \mathbb{R},$$

where $\rho \in (-1, 1)$. You can take as given that each of $X_1$, $X_2$ follows a standard normal distribution and that their correlation coefficient is $\rho$.

a) Show that if $\rho = 0$, the random variables $X_1$, $X_2$ are statistically independent. (1 mark)

b) Show that the conditional density $f_{X_1|X_2}$ takes the following form:

$$f_{X_1|X_2}(x_1|x_2) = \frac{1}{\sqrt{2\pi}\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2} \left( \frac{x_1 - x_2 \rho}{\sqrt{1 - \rho^2}} \right)^2 \right\}$$

(1 mark)

c) For $\rho = 0.99$, state the value of $Var(X_2|X_1 = x)$ for some $x$ and interpret it. (2 marks)

d) Define the random variable $X_3 = (X_1)^2$. Show that $Cov(X_1, X_3) = 0$, carefully justifying all steps, and interpret your finding. (4 marks)

Total: 8 marks

Question 3

Four university lecturers (A, B, C, and D) teach four modules each within a given academic year. The sample mean and variance of each lecturer's module evaluation score, calculated across each lecturer's modules, are given in the table below.

              Number of modules   Average score   Variance of scores
Lecturer A    4                   2.60            0.2196
Lecturer B    4                   3.13            0.3751
Lecturer C    4                   3.56            0.1851
Lecturer D    4                   3.92            0.2416

a) Perform a one-way Analysis of Variance for the above data, stating clearly the hypotheses tested and reporting your test result at the 5% significance level. (You may assume that all assumptions of the one-way ANOVA model are satisfied. You are given the following critical values of the F distribution, one of which will be needed to answer this question: $F_{3,12,0.025} = 4.474$, $F_{12,3,0.05} = 8.745$, $F_{3,12,0.05} = 3.490$, $F_{4,12,0.05} = 3.259$.) (4 marks)

b) After being called in by his Head of Department to discuss his low feedback scores, Lecturer A claims that the reason his scores are comparatively low is that his class sizes were large. The following scatter-plot shows all lecturers' scores plotted against the sizes of the four classes they each taught, together with the line of best fit, obtained via a simple regression model of the form

$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2),$$

where $Y_i$ are the evaluation scores for individual modules and $x_i$ are the corresponding class sizes.

[Figure 1]

i. The estimate of the variance $\sigma^2$ in the simple linear regression model is $s^2 = 0.2744$. Calculate the values of $R^2$ and of the correlation coefficient of the evaluation scores with the class size. (4 marks)

ii. From the plot, estimate the value of the intercept, $b_1$. You are given that the standard error of $b_1$ is $s_{B_1} = 0.0039$. Calculate a 95% confidence interval and state your conclusion. (You may assume that the relevant critical value of the t distribution is approximately 2.) (3 marks)

c) The Head of Department is not convinced that class size explains poor evaluation scores. She states that it may just be a coincidence that the worst performing lecturers teach larger classes. Explain what further analysis could be carried out to explore the issue further. (2 marks)

Total: 13 marks

Question 4

Let the random variable $X$ represent the effort that a randomly chosen actuarial science student puts towards studying for a statistics module (on a scale from 0 to 5) and the random variable $Y$ represent that student's final exam mark. Assume that the conditional expectation of $Y$ given $X = x$ is given by the following formula:

$$E(Y|X = x) = 20 + 10x + 20 \cdot \tanh(x - 2),$$

where

$$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$$

is the hyperbolic tangent function. The graph of the function $g(x) = E(Y|X = x)$ is represented by the solid line in Figure 2 below.

a) Let $X$ be normally distributed, with mean equal to 2 and standard deviation equal to 0.5. Provide a simulation algorithm for calculating numerically the unconditional expectation $E(Y)$, starting from $n$ standard normal observations $z_1, \dots, z_n$. (5 marks)

b) Calculate $E(Y|X = 0)$ and interpret the result. (1 mark)

c) An education researcher, who is not aware of the formula for $E(Y|X = x)$ given above, tries to understand the relationship between students' effort and their final exam mark. The researcher manages to collect data from $(X, Y)$ for 20 students. The data are shown in Figure 2 as points. The lecturer is fitting two regression models to the data, with prediction equations:

Model 1: $\hat{y} = -23.88 + 32.56 \cdot x$
Model 2: $\hat{y} = 19.83 \cdot x$

Model 1 is a standard linear regression model. Model 2 is fitted by fixing the intercept to zero, that is, 19.83 is the minimiser of the expression $\min_b \sum_{j=1}^{20} (y_j - b \cdot x_j)^2$. Draw the regression lines corresponding to the two models on (a print-out of) Figure 2 and include this in your coursework submission. (3 marks)

d) Comment on the rationale behind Model 2. Explain whether Model 1 or Model 2 is preferable. (3 marks)

Total: 12 marks

[Figure 2]

...

Final Answer

Please find the answer below. Let me know if you need any clarifications. It was a pleasure working with you. Goodbye.


One-way ANOVA
H0: The mean module evaluation scores of the four lecturers are equal.
Ha: At least one of the mean scores is different.
Mathematically, this can be written as:
H0: $\mu_A = \mu_B = \mu_C = \mu_D$
Ha: At least one $\mu_i$ differs from the others.

Calculation of the F-statistic

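Because every lecturer has the same number of modules, the overall mean is the simple average of the four group means: $\bar{\bar{X}} = (2.60 + 3.13 + 3.56 + 3.92)/4 = 3.3025$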

Sum of squares between groups: $SSB = \sum_{i=1}^{4} n_i (\bar{x}_i - \bar{\bar{X}})^2 = 4 \times (2.60 - 3.3025)^2 + 4 \times (3.13 - 3.3025)^2 + 4 \times (3.56 - 3.3025)^2 + 4 \times (3.92 - 3.3025)^2 = 3.8835$

Sum of squares within groups: $SSE = \sum_{i=1}^{4} (n_i - 1) \times s_i^2 = (4 - 1) \times 0.2196 + (4 - 1) \times 0.3751 + (4 - 1) \times 0.1851 + (4 - 1) \times 0.2416 = 3.0642$

SST = SSB + SSE = 3.8835 + 3.0642 = 6.9477
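The sums of squares can be checked numerically from the summary table in Question 3. The following is a minimal sketch in plain Python; the variable names (`n`, `means`, `variances`, and so on) are illustrative choices, not part of the original answer.

```python
# Summary statistics taken from the Question 3 table (lecturers A, B, C, D).
n = [4, 4, 4, 4]                               # modules per lecturer
means = [2.60, 3.13, 3.56, 3.92]               # average score per lecturer
variances = [0.2196, 0.3751, 0.1851, 0.2416]   # sample variance per lecturer

N = sum(n)  # total number of modules

# Overall mean: weighted average of the group means.
grand_mean = sum(ni * m for ni, m in zip(n, means)) / N

# Between-group sum of squares: n_i * (group mean - overall mean)^2.
ssb = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(n, means))

# Within-group sum of squares: (n_i - 1) * sample variance.
sse = sum((ni - 1) * v for ni, v in zip(n, variances))

# Total sum of squares.
sst = ssb + sse

print(f"grand mean = {grand_mean:.4f}")
print(f"SSB = {ssb:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
```

The printed values should agree with the hand calculation to four decimal places.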

Df (between groups) = k - 1 = 4 - 1 = 3
Df (within groups) = N - k = 16 - 4 = 12
Total degree of fr...
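
The answer is cut off here. As a hedged sketch of how the test could be completed, the snippet below assumes the standard one-way ANOVA statistic F = MSB / MSE, built from the sums of squares computed above, and compares it with the critical value $F_{3,12,0.05} = 3.490$ supplied in the question. The variable names are illustrative, and this is not necessarily how the original answer continued.

```python
# Sums of squares as computed in the answer above.
ssb = 3.8835   # between groups
sse = 3.0642   # within groups

k = 4          # number of lecturers (groups)
N = 16         # total number of modules (4 lecturers x 4 modules)

df_between = k - 1   # 3
df_within = N - k    # 12

# Mean squares and the F statistic (assumed standard one-way ANOVA form).
msb = ssb / df_between
mse = sse / df_within
f_stat = msb / mse

# Critical value F_{3,12,0.05} as given in Question 3(a).
f_crit = 3.490

reject_h0 = f_stat > f_crit
print(f"MSB = {msb:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}")
print("Reject H0 at the 5% level" if reject_h0 else "Do not reject H0 at the 5% level")
```

Under these assumptions F is roughly 5.07, which exceeds 3.490, so the hypothesis of equal mean scores would be rejected at the 5% significance level.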
