Intermediate Statistical Theory Questions

Mathematics

University of Toronto

Question Description

The sample questions and notes are attached; please look them over before accepting this work.

Eight questions (similar to the practice final) will be posted on 04/16/2020 at 8:30 am (GMT-3) and will be due 3 hours later, at 11:30 am.

Please feel free to ask if you have any questions.

Unformatted Attachment Preview

MATH/STAT 3460, Intermediate Statistical Theory — Formula Sheet

Discrete Random Variables

Binomial (parameters n, p): P(X = i) = C(n, i) p^i (1 − p)^(n−i); E(X) = np; Var(X) = np(1 − p); MLE: p̂ = X/n.

Poisson (parameter λ): P(X = i) = e^(−λ) λ^i / i!; E(X) = λ; Var(X) = λ; MLE: λ̂ = ΣX_i / n.

Geometric (parameter θ): P(X = i) = θ(1 − θ)^(i−1); E(X) = 1/θ; Var(X) = (1 − θ)/θ².

Continuous Random Variables

Uniform (parameters a, b): pdf f(x) = 1/(b − a) if a < x < b and 0 otherwise; E(X) = (a + b)/2; Var(X) = (b − a)²/12; MLE: â = min(X_i), b̂ = max(X_i).

Normal (parameters µ, σ²): pdf (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)); cdf Φ(x) (see table); E(X) = µ; Var(X) = σ²; MLE: µ̂ = X̄ = ΣX_i/n, σ̂ = √(Σ(X_i − X̄)²/n).

Exponential (parameter λ): pdf λe^(−λx) for x > 0; cdf 1 − e^(−λx); E(X) = 1/λ; Var(X) = 1/λ²; MLE: λ̂ = n/ΣX_i.

Gamma (parameters α, β): pdf (β^α/Γ(α)) x^(α−1) e^(−βx) for x > 0; E(X) = α/β; Var(X) = α/β².

Beta (parameters α, β): pdf (Γ(α+β)/(Γ(α)Γ(β))) x^(α−1) (1 − x)^(β−1) for 0 < x < 1; E(X) = α/(α+β); Var(X) = αβ/((α+β)²(α+β+1)).

Chi-square (parameter ν): pdf (1/(2^(ν/2) Γ(ν/2))) x^(ν/2−1) e^(−x/2) for x > 0; E(X) = ν; Var(X) = 2ν.

t distribution (parameter ν): pdf (Γ((ν+1)/2)/(√(πν) Γ(ν/2))) (1 + t²/ν)^(−(ν+1)/2); E(X) = 0; Var(X) = ν/(ν − 2) for ν > 2.

F distribution (parameters ν1, ν2): pdf (Γ((ν1+ν2)/2)/(Γ(ν1/2)Γ(ν2/2))) (ν1/ν2)^(ν1/2) f^(ν1/2−1) (1 + (ν1/ν2)f)^(−(ν1+ν2)/2) for f > 0; E(X) = ν2/(ν2 − 2).

Newton's Method

To solve the equation g(θ) = 0, start at θ0 and iterate θ_(n+1) = θ_n − g(θ_n)/g′(θ_n) until convergence.

Order Statistics

Distribution of the rth order statistic: for random samples of size n from an infinite population that has the value f(x) at x, the probability density of the rth order statistic Y_r is given by

g_r(y_r) = [n!/((r − 1)!(n − r)!)] [∫_(−∞)^(y_r) f(x) dx]^(r−1) [∫_(y_r)^∞ f(x) dx]^(n−r) f(y_r), for −∞ < y_r < ∞.

Efficiency

• Theorem (Rao-Cramér Lower Bound). Let X1, X2, ..., Xn be iid with common pdf f(x; θ) for θ ∈ Ω. Suppose that the set of x values where f(x; θ) ≠ 0 does not depend on θ (the pdfs have common support for all θ), and assume certain regularity conditions hold. Let Y = u(X1, X2, ..., Xn) be a statistic with mean E(Y) = E[u(X1, X2, ..., Xn)] = k(θ). Then

Var(Y) ≥ [k′(θ)]² / (n I(θ)).

• Corollary: If Θ̂ is an unbiased estimator of θ and Var(Θ̂) = 1/(n I(θ)), then Θ̂ is a minimum variance unbiased estimator of θ.

Sampling distributions

• If Y and Z are independent random variables, Y has a chi-square distribution with ν degrees of freedom, and Z has the standard normal distribution, then T = Z/√(Y/ν) has the t distribution with ν degrees of freedom.

• If X̄ and S² are the mean and the variance of a random sample of size n from a normal population with mean µ and standard deviation σ, then X̄ and S² are independent, and (n − 1)S²/σ² has a chi-square distribution with n − 1 degrees of freedom.

• If U and V are independent random variables having chi-square distributions with ν1 and ν2 degrees of freedom, then F = (U/ν1)/(V/ν2) is a random variable having an F distribution with degrees of freedom ν1 and ν2.

Bayesian Inference

Before the data y are considered, the prior predictive distribution of y is

p(y) = ∫ p(y, θ) dθ = ∫ p(θ) p(y|θ) dθ.

After the data y have been observed, the posterior predictive distribution of ỹ is

p(ỹ|y) = ∫ p(ỹ, θ|y) dθ = ∫ p(ỹ|θ, y) p(θ|y) dθ = ∫ p(ỹ|θ) p(θ|y) dθ,

where p(θ|y) ∝ p(θ) p(y|θ).

Some useful properties of the Gamma function

The Gamma function is defined as Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx.

• Γ(α + 1) = αΓ(α)
• Γ(1) = 1
• Γ(1/2) = √π
• Γ(n) = (n − 1)! for positive integers n
• B(x, y) = Γ(x)Γ(y)/Γ(x + y)
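The Newton recipe above is used repeatedly in the homework solutions that follow. As a minimal sketch in Python (the function names, starting value, and tolerance are my own choices, not part of the course material):

```python
# Newton's method for a score equation g(theta) = 0, as on the formula sheet:
# iterate theta_{n+1} = theta_n - g(theta_n) / g'(theta_n) until convergence.

def newton(g, g_prime, theta0, tol=1e-10, max_iter=100):
    theta = theta0
    for _ in range(max_iter):
        step = g(theta) / g_prime(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton's method did not converge")

# Example: the Poisson score 161/lambda - 10 from Homework Sheet 1, Question 1.
print(newton(lambda t: 161 / t - 10, lambda t: -161 / t**2, theta0=10.0))  # 16.1
```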
MATH/STAT 3460, Intermediate Statistical Theory, Winter 2020
Homework Sheet 1 — Model Solutions

Basic Questions

1. The number of car accidents on a given working day is believed to follow a Poisson distribution with parameter λ. Over a series of 10 working days, the numbers of accidents observed are 13, 17, 14, 21, 14, 19, 16, 17, 15, and 15. What are the moment estimate and maximum likelihood estimate for λ?

The Poisson mean is λ and the mean of the observed data is x̄ = 16.1, so the moment estimate is 16.1. The likelihood of the given data is e^(−10λ) λ^(13+17+14+21+14+19+16+17+15+15) (after multiplying by the constant 13!·17!·14!·21!·14!·19!·16!·17!·15!·15!). The log-likelihood is therefore 161 log λ − 10λ. To find the maximum likelihood estimate for λ, we set its derivative equal to zero: 161/λ − 10 = 0, so λ̂ = 161/10 = 16.1.

2. A team of doctors wants to know how common a particular disease is. They test 10,000 people at random. The test is 99% accurate, so it gives the true answer 99% of the time (and the wrong answer 1% of the time). The number of people with positive test results (the test indicates they have the disease) is 175. What are the moment estimate and the maximum likelihood estimate for the proportion of people who actually have the disease?

If the proportion of people who have the disease is p, then the probability that an individual has a positive test result is 0.99p + 0.01(1 − p) = 0.01 + 0.98p. This is the theoretical mean for the proportion of positive results; the observed proportion is 175/10000 = 0.0175. Solving 0.01 + 0.98p = 0.0175 gives the moment estimate p̃ = 0.00765. The likelihood of the observed result is (0.01 + 0.98p)^175 (0.99 − 0.98p)^9825. Its derivative with respect to p is

0.98 × 175 (0.01 + 0.98p)^174 (0.99 − 0.98p)^9825 − 0.98 × 9825 (0.01 + 0.98p)^175 (0.99 − 0.98p)^9824.

Setting this equal to zero gives 175(0.99 − 0.98p) − 9825(0.01 + 0.98p) = 0, that is 9800p = 173.25 − 98.25 = 75, which gives p̂ = 75/9800 = 0.00765.

3. A team of ecologists wants to know how many of a certain species of frog live in a forest. They perform the following experiment: they capture a group of frogs, mark them, and release them; a week later they capture more frogs and count how many of those are marked. Suppose the first group captured (and marked) contains 114 frogs, and the second group contains 106 frogs, of which 31 are marked. Calculate the moment estimate and the maximum likelihood estimate for the number of frogs in the forest.

Let N be the total number of frogs and let X denote the number of marked frogs in the second capture. Then X follows a hypergeometric distribution with parameters (N, K = 114, n = 106). Since E(X) = n(K/N) = 106 × 114/N, setting 106 × 114/N = 31 gives the moment estimate Ñ = 389.8065 ≈ 390.

For the MLE, the likelihood of having 31 marked frogs in the second group is C(114, 31) C(N − 114, 75) / C(N, 106). Ignoring the factors that are constant in N, this is proportional to (N − 114)(N − 115)⋯(N − 188) / [N(N − 1)⋯(N − 105)], so the log-likelihood is

l(N) = Σ_(i=114)^(188) log(N − i) − Σ_(j=0)^(105) log(N − j).

Following the steps of the class example, one can derive the closed-form approximation N̂ = ((n − 0.5)(K − 0.5) + 0.5(n + K − k − 0.5))/k, where n = 106, K = 114, and k = 31; this gives N̂ = 389.3065. Alternatively, we can start from the moment estimate and tabulate the log-likelihood in its neighbourhood:

N     l(N)
380   −206.2401
381   −206.2364
382   −206.2331
383   −206.2303
384   −206.2279
385   −206.2259
386   −206.2244
387   −206.2232
388   −206.2225
389   −206.2222
390   −206.2223
391   −206.2227
392   −206.2236
393   −206.2248
394   −206.2264
395   −206.2284
396   −206.2308
397   −206.2335
398   −206.2365
399   −206.2399
400   −206.2437

Thus the MLE of N is N̂ = 389.
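The tabulation in Question 3 is easy to reproduce programmatically. A sketch (the helper names log_binom and loglik are my own; terms constant in N are dropped, which does not change the argmax):

```python
# Reproducing the log-likelihood table for Question 3 and picking the best N.
from math import lgamma

def log_binom(a, b):
    # log C(a, b) via log-gamma, valid for a >= b >= 0
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def loglik(N, K=114, n=106, k=31):
    # P(X = k) = C(K, k) C(N - K, n - k) / C(N, n); C(K, k) is constant in N
    return log_binom(N - K, n - k) - log_binom(N, n)

print(max(range(380, 401), key=loglik))  # 389, matching the table
```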
4. Let X1, ..., Xn be normally distributed with mean µ and variance µ², where µ > 0. What are the moment estimate and the maximum likelihood estimate for µ?

The moment estimate is µ̃ = x̄. For µ > 0 the likelihood of the data is Π_(i=1)^n (1/(µ√(2π))) e^(−(X_i−µ)²/(2µ²)), so the log-likelihood is −n log µ − Σ(X_i − µ)²/(2µ²), up to a constant. Expanding,

Σ(X_i − µ)²/(2µ²) = ΣX_i²/(2µ²) − ΣX_i/µ + n/2.

Differentiating the log-likelihood with respect to µ gives −n/µ + ΣX_i²/µ³ − ΣX_i/µ² = 0. Multiplying through by −µ³/n yields the quadratic µ² + (ΣX_i/n)µ − ΣX_i²/n = 0, with solutions

µ = −ΣX_i/(2n) ± √((ΣX_i/(2n))² + ΣX_i²/n).

Of these, only µ = −ΣX_i/(2n) + √((ΣX_i/(2n))² + ΣX_i²/n) is positive, so this must be the MLE.

5. Let X1, X2, X3, X4 be distributed as the sum of two exponential random variables with parameters λ1 and λ2, with λ1 < λ2. If the observed values are 1.4, 2.3, 3.2, and 3.9, show that the maximum likelihood estimate (λ̂1, λ̂2) satisfies 1/λ̂1 + 1/λ̂2 = 2.7. [You do not need to find the values of λ̂1 and λ̂2.]

The sum of two independent exponential random variables with parameters λ1 and λ2 has pdf

f_X(x) = ∫_0^x λ1 λ2 e^(−λ1 t) e^(−λ2 (x−t)) dt = λ1 λ2 e^(−λ2 x) ∫_0^x e^((λ2−λ1)t) dt = (λ1 λ2/(λ2 − λ1)) e^(−λ2 x) (e^((λ2−λ1)x) − 1) = (1/λ1 − 1/λ2)^(−1) (e^(−λ1 x) − e^(−λ2 x)).

The likelihood of the given data is therefore

(1/λ1 − 1/λ2)^(−4) (e^(−1.4λ1) − e^(−1.4λ2))(e^(−2.3λ1) − e^(−2.3λ2))(e^(−3.2λ1) − e^(−3.2λ2))(e^(−3.9λ1) − e^(−3.9λ2)).

The derivative of the log-likelihood with respect to λ1 is

4λ2/(λ1(λ2 − λ1)) − 1.4 e^(−1.4λ1)/(e^(−1.4λ1) − e^(−1.4λ2)) − 2.3 e^(−2.3λ1)/(e^(−2.3λ1) − e^(−2.3λ2)) − 3.2 e^(−3.2λ1)/(e^(−3.2λ1) − e^(−3.2λ2)) − 3.9 e^(−3.9λ1)/(e^(−3.9λ1) − e^(−3.9λ2)),

and the derivative with respect to λ2 is

−4λ1/(λ2(λ2 − λ1)) + 1.4 e^(−1.4λ2)/(e^(−1.4λ1) − e^(−1.4λ2)) + 2.3 e^(−2.3λ2)/(e^(−2.3λ1) − e^(−2.3λ2)) + 3.2 e^(−3.2λ2)/(e^(−3.2λ1) − e^(−3.2λ2)) + 3.9 e^(−3.9λ2)/(e^(−3.9λ1) − e^(−3.9λ2)).

At the MLE both derivatives vanish, so their sum is zero. In the sum, the two terms for each observation x combine to −x(e^(−xλ1) − e^(−xλ2))/(e^(−xλ1) − e^(−xλ2)) = −x, giving

4λ2/(λ1(λ2 − λ1)) − 4λ1/(λ2(λ2 − λ1)) = 1.4 + 2.3 + 3.2 + 3.9 = 10.8.

The left-hand side equals 4(λ2² − λ1²)/(λ1 λ2 (λ2 − λ1)) = 4(λ1 + λ2)/(λ1 λ2), so (λ1 + λ2)/(λ1 λ2) = 10.8/4, that is, 1/λ1 + 1/λ2 = 2.7.

Standard Questions

6. A scientist believes that the time in years between earthquakes in a particular region is exponentially distributed with parameter λ. He obtains the following data:

Time between earthquakes   Frequency
< 1 year                   87
1–2 years                  65
2–3 years                  47
3–4 years                  33
> 4 years                  59

(a) Find the maximum likelihood estimate for λ.

For a given value of λ, the probabilities of the five categories are:

Time between earthquakes   Probability
< 1 year                   1 − e^(−λ)
1–2 years                  e^(−λ) − e^(−2λ)
2–3 years                  e^(−2λ) − e^(−3λ)
3–4 years                  e^(−3λ) − e^(−4λ)
> 4 years                  e^(−4λ)

So the likelihood of the data is (1 − e^(−λ))^87 (e^(−λ) − e^(−2λ))^65 (e^(−2λ) − e^(−3λ))^47 (e^(−3λ) − e^(−4λ))^33 (e^(−4λ))^59. Differentiating the logarithm of this with respect to λ gives

87 e^(−λ)/(1 − e^(−λ)) + (130e^(−2λ) − 65e^(−λ))/(e^(−λ) − e^(−2λ)) + (141e^(−3λ) − 94e^(−2λ))/(e^(−2λ) − e^(−3λ)) + (132e^(−4λ) − 99e^(−3λ))/(e^(−3λ) − e^(−4λ)) − 236.

We evaluate this derivative of the log-likelihood at trial values of λ:

λ        dl(λ)/dλ
0.3850   0.0224124512
0.4342   −67.3157423758
0.4721   −109.4851498782
0.5110   −146.1516544333

So the MLE is approximately 0.3850.
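The tabulated search in part (a) can also be done with a general-purpose optimizer applied to the interval-censored log-likelihood. A sketch (the bracketing interval (0.01, 5) is an arbitrary assumption of mine):

```python
# Maximizing the interval-censored exponential likelihood of part (a) directly.
import numpy as np
from scipy.optimize import minimize_scalar

counts = np.array([87, 65, 47, 33, 59])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0, np.inf])

def neg_loglik(lam):
    # cell probabilities P(a < X <= b) = exp(-lam*a) - exp(-lam*b)
    probs = np.exp(-lam * edges[:-1]) - np.exp(-lam * edges[1:])
    return -np.sum(counts * np.log(probs))

res = minimize_scalar(neg_loglik, bounds=(0.01, 5.0), method="bounded")
print(res.x)  # approximately 0.3850
```

The same approach extends to part (b) below by adding the second scientist's cells to neg_loglik.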
(b) Another scientist has tabulated the following independent data:

Time between earthquakes   Frequency
< 2 years                  132
2–4 years                  84
4–6 years                  49
> 6 years                  68

What is the maximum likelihood estimate for λ for the combined data?

The likelihood of this scientist's data is (1 − e^(−2λ))^132 (e^(−2λ) − e^(−4λ))^84 (e^(−4λ) − e^(−6λ))^49 (e^(−6λ))^68. Differentiating its logarithm with respect to λ gives

264 e^(−2λ)/(1 − e^(−2λ)) + (336e^(−4λ) − 168e^(−2λ))/(e^(−2λ) − e^(−4λ)) + (294e^(−6λ) − 196e^(−4λ))/(e^(−4λ) − e^(−6λ)) − 408.

Adding this to the derivative from part (a) gives the score for the combined data, which we evaluate at trial values of λ:

λ        dl(λ)/dλ
0.2426   427.6714612375
0.2728   204.0937303493
0.3079   0.1312667433
0.3831   −308.7860941338

So the MLE for the combined data is approximately 0.3079.

7. The lifetime of a battery is believed to be exponentially distributed with parameter λ. 30 batteries are tested and last for the following times: 1.2, 1.3, 1.5, 1.8, 1.9, 2.2, 2.4, 2.5, 2.7, 3.0, 3.5, 3.7, 4.1, 4.4, 4.8, 5.3, 5.7, 6.0, 6.8, 7.2, 7.8, 8.3, 8.9, 9.4, 9.7, 10.3, 11.2, 11.9, 12.5, and 12.9.

(a) What is the maximum likelihood estimate for λ?

The likelihood for these data is λ^30 e^(−1.2λ − ⋯ − 12.9λ) = λ^30 e^(−174.9λ). The log-likelihood is 30 log λ − 174.9λ, whose derivative is 30/λ − 174.9. Setting this equal to zero gives λ̂ = 30/174.9 = 0.1715.

(c) If the data are censored at 3.5, what is the new maximum likelihood estimate for λ?

If the data are censored at 3.5, then 10 of the batteries lasted less than 3.5, with lifetimes 1.2, 1.3, 1.5, 1.8, 1.9, 2.2, 2.4, 2.5, 2.7, and 3.0, and 20 batteries lasted the full 3.5. The likelihood of this is λ^10 e^(−1.2λ − ⋯ − 3.0λ) (e^(−3.5λ))^20, so the log-likelihood is 10 log λ − 20.5λ − 70λ. Differentiating and setting to zero gives a maximum likelihood estimate of λ̂ = 10/90.5 = 0.1105.

MATH/STAT 3460, Intermediate Statistical Theory
Homework Sheet 2 — Model Solutions

Basic Questions

1. Let X1, X2, X3 be independent samples from a normal distribution with mean µ and variance 1, conditional on X_i > 0 for all i. That is, X_i has density function e^(−(x−µ)²/2)/(√(2π) Φ(µ)) for x > 0. Given X1 + X2 + X3 = 5, use Newton's method to find the maximum likelihood estimate for µ.

The likelihood of the data is e^(−((x1−µ)² + (x2−µ)² + (x3−µ)²)/2) Φ(µ)^(−3), which is proportional to e^((2(x1+x2+x3)µ − 3µ²)/2) Φ(µ)^(−3). Its derivative with respect to µ is

e^((2(x1+x2+x3)µ − 3µ²)/2) Φ(µ)^(−4) [(x1 + x2 + x3 − 3µ)Φ(µ) − (3/√(2π)) e^(−µ²/2)].

The maximum likelihood estimate is therefore found by solving g(µ) = (x1 + x2 + x3 − 3µ)Φ(µ) − (3/√(2π)) e^(−µ²/2) = 0. The derivative of this quantity with respect to µ is

g′(µ) = −3Φ(µ) + (x1 + x2 + x3 − 3µ) e^(−µ²/2)/√(2π) + (3/√(2π)) µ e^(−µ²/2).

Starting Newton's method at the untruncated estimate µ0 = (x1 + x2 + x3)/3 = 5/3:

µ      g(µ) = (5 − 3µ)Φ(µ) − 3e^(−µ²/2)/√(2π)   g′(µ)
1.67   −0.298431416                              −2.360114306
1.54   −0.009116611                              −2.205212315

So the maximum likelihood estimate is µ̂ ≈ 1.54.

2. Let X1, X2, ..., Xn be distributed as some unknown constant a plus an exponential random variable with parameter λ. If n = 50, ΣX_i = 220, and min X_i = 1.4, find the maximum likelihood estimates for λ and a.

The pdf of the distribution is λe^(−λ(x−a)) for x > a. The log-likelihood of the data is therefore l(λ, a) = n log λ − λΣX_i + nλa. This is clearly an increasing function of a, so it is maximised by taking a as large as possible, namely â = min(X_i). Differentiating with respect to λ and setting to zero gives 50/λ = 220 − 50a, so λ = 1/(4.4 − a). The maximum likelihood estimates are â = 1.4 and λ̂ = 1/3.
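The Newton iteration in Question 1 is easily automated. A sketch using scipy.stats.norm for Φ and φ, with g and g′ exactly as derived above (only the iteration count is my own arbitrary choice):

```python
# Automating the Newton iteration of Homework Sheet 2, Question 1.
from scipy.stats import norm

def g(mu):
    # score equation: (5 - 3*mu) * Phi(mu) - 3 * phi(mu)
    return (5 - 3 * mu) * norm.cdf(mu) - 3 * norm.pdf(mu)

def g_prime(mu):
    # derivative of g, using phi'(mu) = -mu * phi(mu)
    return -3 * norm.cdf(mu) + (5 - 3 * mu) * norm.pdf(mu) + 3 * mu * norm.pdf(mu)

mu = 5 / 3  # starting value, the untruncated MLE
for _ in range(20):
    mu -= g(mu) / g_prime(mu)
print(round(mu, 2))  # 1.54
```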
3. The exponential pdf is a measure of lifetimes of devices that do not age. However, it is a special case of the Weibull distribution, which measures time to failure of devices where the probability of failure increases as time does. A Weibull random variable Y has pdf f_Y(y; α, β) = αβ y^(β−1) e^(−αy^β), y ≥ 0 (α > 0, β > 0).

(a) Find the maximum likelihood estimator for α, assuming that β is known.

L(α, β) = Π_(i=1)^n αβ y_i^(β−1) e^(−αy_i^β) = α^n β^n (Π y_i)^(β−1) exp(−α Σ y_i^β),
l(α, β) = n log α + n log β + (β − 1) log(Π y_i) − α Σ y_i^β.

Setting ∂l(α, β)/∂α = n/α − Σ y_i^β = 0 gives α̂ = n / Σ y_i^β.

(b) Suppose α and β are both unknown. Write down the equations that would be solved simultaneously to find the maximum likelihood estimators of α and β.

Setting ∂l(α, β)/∂β = n/β + log(Π y_i) − α Σ y_i^β log y_i = 0 provides the other equation. Solving the two simultaneously would be done by numerical methods (a sketch appears at the end of this sheet).

4. Let X be a random sample from a binomial distribution with n = 100 and p unknown.

(a) Show that the maximum likelihood estimate for p is unbiased.

The maximum likelihood estimate for p is X/100. Its expected value is E(X)/100 = 100p/100 = p, so this estimator is unbiased.

(b) The variance of X is 100p(1 − p). Find the bias of the maximum likelihood estimate of this variance.

The maximum likelihood estimate of the variance is X(100 − X)/100. Its expected value is E(X(100 − X))/100 = E(X) − E(X²)/100 = 100p − ((100p)² + 100p(1 − p))/100 = 100p − 100p² − p(1 − p) = 99p(1 − p). The bias of this estimator is therefore 99p(1 − p) − 100p(1 − p) = −p(1 − p).

5. Show that (X + 1)/(n + 2) is a biased estimator of the binomial parameter θ. Is this estimator asymptotically unbiased?

E((X + 1)/(n + 2)) − θ = (E(X) + 1)/(n + 2) − θ = (nθ + 1)/(n + 2) − θ = (1 − 2θ)/(n + 2).

When θ ≠ 1/2, (X + 1)/(n + 2) is therefore a biased estimator of θ. As n → ∞, the bias tends to 0, so the estimator is asymptotically unbiased.

6. Let Y_min be the smallest order statistic in a random sample of size n drawn from the uniform pdf f_Y(y; θ) = 1/θ, 0 ≤ y ≤ θ. Find an unbiased estimator for θ based on Y_min.

f_(Y_min)(y) = (n/θ)(1 − y/θ)^(n−1), so E(Y_min) = (n/θ) ∫_0^θ y(1 − y/θ)^(n−1) dy. Integration by parts yields E(Y_min) = θ/(n + 1). An unbiased estimator is therefore (n + 1)Y_min.

Bonus question

7. Let X_i = A_i + B_i, where the A_i are uniformly distributed on [1, a] and the B_i are uniformly distributed on [0, b], with a < b + 1. If the data are X1 = 1.3, X2 = 1.9, X3 = 2.3, and X4 = 3.4, show that the maximum likelihood estimates for a and b are 2.2 and 1.6.

For 1.9 < a < 2.3, b > 1.3, and a + b > 3.4, the likelihood of the data is (0.3/(b(a − 1))) × (0.9/(b(a − 1))) × (1/b) × ((a + b − 3.4)/(b(a − 1))), which is proportional to (a + b − 3.4)/(b⁴(a − 1)³) (after multiplying by a suitable constant). The log-likelihood is therefore log(a + b − 3.4) − 4 log b − 3 log(a − 1). Its derivative with respect to a is 1/(a + b − 3.4) − 3/(a − 1), and its derivative with respect to b is 1/(a + b − 3.4) − 4/b. Substituting a = 2.2 and b = 1.6 makes both derivatives equal to 1/0.4 − 2.5 = 0, verifying that these are the MLEs for a and b.
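As promised in Question 3(b), here is a sketch of how the two Weibull score equations could be solved numerically, by profiling α out and root-finding in β alone. The simulated sample and bracketing interval are hypothetical choices of mine, not course material:

```python
# Numerical solution of the Weibull score equations from Question 3(b):
# substitute alpha_hat(beta) = n / sum(y^beta) into the beta equation and
# root-find. The sample below has true beta = 2 and, after rescaling by 2,
# true alpha = 2**(-2) = 0.25 in the course's parameterization.
import numpy as np
from scipy.optimize import brentq

y = 2.0 * np.random.default_rng(0).weibull(2.0, size=200)
n = len(y)

def profile_score(beta):
    alpha = n / np.sum(y**beta)  # the MLE for alpha from part (a)
    return n / beta + np.sum(np.log(y)) - alpha * np.sum(y**beta * np.log(y))

beta_hat = brentq(profile_score, 0.5, 10.0)  # bracketing interval is a guess
alpha_hat = n / np.sum(y**beta_hat)
print(beta_hat, alpha_hat)  # close to 2 and 0.25
```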
MATH/STAT 3460, Intermediate Statistical Theory
Homework Sheet 3 — Model Solutions

Basic Questions

1. A random sample of size 2, Y1 and Y2, is drawn from the pdf f_Y(y; θ) = 2yθ², 0 < y < 1/θ. What must c equal if the statistic c(Y1 + 2Y2) is to be an unbiased estimator for 1/θ?

E(Y) = ∫_0^(1/θ) 2y²θ² dy = (2/3)(1/θ).

E[c(Y1 + 2Y2)] = c[E(Y1) + 2E(Y2)] = c[(2/3)(1/θ) + (4/3)(1/θ)] = 2c(1/θ).

For the estimator to be unbiased, 2c = 1, so c = 1/2.

2. Show that the sample proportion X/n is a minimum variance unbiased estimator of the binomial parameter θ. (Hint: treat X/n as the mean of a random sample of size n from a Bernoulli population with parameter θ.)

For the Bernoulli pmf f(x; θ) = θ^x (1 − θ)^(1−x), we have E(X) = θ and E(X²) = θ, and

∂ log f(x; θ)/∂θ = x/θ − (1 − x)/(1 − θ) = (x − θ)/(θ(1 − θ)),

so the Fisher information is

I(θ) = E[(∂ log f(x; θ)/∂θ)²] = E(X − θ)²/(θ²(1 − θ)²) = 1/(θ(1 − θ)).

Then Var(X/n) = θ(1 − θ)/n = 1/(nI(θ)), which attains the Rao-Cramér lower bound, so X/n is a minimum variance estimator. Since E(X/n) = θ, it is also unbiased.

3. If X̄1 is the mean of a random sample of size n from a normal population with mean µ and variance σ1², X̄2 is the mean of a random sample of size n from a normal population with mean µ and variance σ2², and the two samples are independent, show that

(a) ωX̄1 + (1 − ω)X̄2, where 0 ≤ ω ≤ 1, is an unbiased estimator of µ:

E[ωX̄1 + (1 − ω)X̄2] = ωµ + (1 − ω)µ = µ.

(b) the variance of this estimator is a minimum when ω = σ2²/(σ1² + σ2²):

Var[ωX̄1 + (1 − ω)X̄2] = ω²σ1²/n + (1 − ω)²σ2²/n = g(ω).

Setting dg(ω)/dω = 2ωσ1²/n − 2(1 − ω)σ2²/n = 0 gives ω = σ2²/(σ1² + σ2²).

(c) find the efficiency of the estimator of part (a) with ω = 1/2 relative to this estimator with ω = σ2²/(σ1² + σ2²):

Var1 = (1/4)σ1²/n + (1/4)σ2²/n = (σ1² + σ2²)/(4n),
Var2 = (σ2²/(σ1² + σ2²))² σ1²/n + (σ1²/(σ1² + σ2²))² σ2²/n = σ1²σ2²/(n(σ1² + σ2²)),

so the relative efficiency is Var2/Var1 = 4σ1²σ2²/(σ1² + σ2²)².
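A quick Monte Carlo check makes the comparison in Question 3(c) concrete. A sketch, where σ1 = 1, σ2 = 3, and n = 50 are hypothetical values of my choosing:

```python
# Monte Carlo check of Homework Sheet 3, Question 3: the optimally weighted
# combination of the two sample means has smaller variance than equal weights.
import numpy as np

rng = np.random.default_rng(1)
mu, s1, s2, n, reps = 0.0, 1.0, 3.0, 50, 20_000

x1 = rng.normal(mu, s1, (reps, n)).mean(axis=1)  # replicated X-bar_1
x2 = rng.normal(mu, s2, (reps, n)).mean(axis=1)  # replicated X-bar_2

w = s2**2 / (s1**2 + s2**2)  # optimal weight from part (b): 0.9
print(np.var(0.5 * (x1 + x2)))        # near (s1^2 + s2^2)/(4n) = 0.05
print(np.var(w * x1 + (1 - w) * x2))  # near s1^2 s2^2/(n(s1^2 + s2^2)) = 0.018
```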
