I need help with statistics. The book I use is
APPLIED REGRESSION ANALYSIS and GENERALIZED LINEAR MODELS
McMaster University, Hamilton, Ontario, Canada
The exercises I need answered are copied below. Can you provide that help within, say, one day, and at what cost? Thanks.
Exercise 5.2. ∗Suppose that the means and standard deviations of Y and X are the same: $\bar{Y} = \bar{X}$
and $S_Y = S_X$.
(a) Show that, under these circumstances,
$B_{Y|X} = B_{X|Y} = r_{XY}$
where $B_{Y|X}$ is the least-squares slope for the simple regression of Y on X; $B_{X|Y}$ is the
least-squares slope for the simple regression of X on Y; and $r_{XY}$ is the correlation between
the two variables. Show that the intercepts are also the same, $A_{Y|X} = A_{X|Y}$.
(b) Why, if $A_{Y|X} = A_{X|Y}$ and $B_{Y|X} = B_{X|Y}$, is the least-squares line for the regression of Y
on X different from the line for the regression of X on Y (as long as $r^2 < 1$)?
(c) “Regression toward the mean” (the original sense of the term “regression”): Imagine that
X is father’s height and Y is son’s height for a sample of father-son pairs. Suppose that
$S_Y = S_X$, that $\bar{Y} = \bar{X}$, and that the regression of sons’ heights on fathers’ heights is linear.
Finally, suppose that $0 < r_{XY} < 1$ (i.e., fathers’ and sons’ heights are positively correlated,
but not perfectly so). Show that the expected height of a son whose father is shorter than
average is also less than average, but to a smaller extent; likewise, the expected height of a
son whose father is taller than average is also greater than average, but to a smaller extent.
Does this result imply a contradiction—that the standard deviation of son’s height is in fact
less than that of father’s height?
(d) What is the expected height for a father whose son is shorter than average? Of a father
whose son is taller than average?
(e) Regression effects in research design: Imagine that educational researchers wish to assess
the efficacy of a new program to improve the reading performance of children. To test the
program, they recruit a group of children who are reading substantially below grade level;
after a year in the program, the researchers observe that the children, on average, have
improved their reading performance. Why is this a weak research design? How could it be
Exercise 5.7. Consider the general multiple-regression equation
$Y = A + B_1 X_1 + B_2 X_2 + \cdots + B_k X_k + E$
An alternative procedure for calculating the least-squares coefficient $B_1$ is as follows:
1. Regress Y on $X_2$ through $X_k$, obtaining residuals $E_{Y|2 \ldots k}$.
2. Regress $X_1$ on $X_2$ through $X_k$, obtaining residuals $E_{1|2 \ldots k}$.
3. Regress the residuals $E_{Y|2 \ldots k}$ on the residuals $E_{1|2 \ldots k}$. The slope for this simple regression
is the multiple-regression slope for $X_1$, that is, $B_1$.
(a) Apply this procedure to the multiple regression of prestige on education, income, and
percentage of women in the Canadian occupational prestige data, confirming that the
coefficient for education is properly recovered.
(b) Note that the intercept for the simple regression in Step 3 is 0. Why is this the case?
(c) In light of this procedure, is it reasonable to describe $B_1$ as the “effect of $X_1$ on Y when
the influence of $X_2, \ldots, X_k$ is removed from both $X_1$ and Y”?
(d) The procedure in this problem reduces the multiple regression to a series of simple
regressions (in Step 3). Can you see any practical application for this procedure? (See
the discussion of added-variable plots in Section 11.6.1.)
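
The three-step procedure in Exercise 5.7 can be sketched in code as follows. This is a minimal illustration on simulated data, not the Canadian occupational prestige data, which is not reproduced here: the variables `x1`, `x2`, `x3` and the coefficients used to generate `y` are made-up stand-ins, and `ols` is a hypothetical helper written for this example (in practice a statistics package would be used instead).

```python
import random


def ols(y, columns):
    """Fit y on an intercept plus the given columns by least squares.

    Returns (coefficients, residuals); coefficients[0] is the intercept.
    Hypothetical helper: solves the normal equations by Gauss-Jordan
    elimination, which is adequate for small, well-conditioned problems.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[j] * row[k] for row in X) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(X, y))]
         for j in range(p)]
    for j in range(p):
        pivot_row = max(range(j, p), key=lambda i: abs(A[i][j]))
        A[j], A[pivot_row] = A[pivot_row], A[j]
        pivot = A[j][j]
        A[j] = [v / pivot for v in A[j]]
        for i in range(p):
            if i != j:
                factor = A[i][j]
                A[i] = [vi - factor * vj for vi, vj in zip(A[i], A[j])]
    coefs = [A[j][p] for j in range(p)]
    residuals = [yi - sum(b * xij for b, xij in zip(coefs, row))
                 for row, yi in zip(X, y)]
    return coefs, residuals


random.seed(1)
n = 200
# Simulated stand-ins for three correlated explanatory variables.
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.6 * a - 0.3 * b + random.gauss(0, 1) for a, b in zip(x2, x3)]
y = [2.0 + 1.5 * u + 0.8 * v - 0.5 * w + random.gauss(0, 1)
     for u, v, w in zip(x1, x2, x3)]

b_full, _ = ols(y, [x1, x2, x3])   # full multiple regression
_, e_y = ols(y, [x2, x3])          # step 1: residuals of Y on X2, X3
_, e_1 = ols(x1, [x2, x3])         # step 2: residuals of X1 on X2, X3
b_step3, _ = ols(e_y, [e_1])       # step 3: residuals on residuals

print("B1 from full regression:", b_full[1])
print("slope from step 3:      ", b_step3[1])
print("intercept from step 3:  ", b_step3[0])
```

The step 3 intercept comes out as zero up to rounding error because both sets of residuals have mean zero, which is the point of part (b).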
Exercise 6.7. Consider the regression model $Y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$. How can the incremental
sum-of-squares approach be used to test the hypothesis that the two population slopes are equal
to each other, $H_0\colon \beta_1 = \beta_2$? [Hint: Under $H_0$, the model becomes $Y = \alpha + \beta x_1 + \beta x_2 + \varepsilon =
\alpha + \beta(x_1 + x_2) + \varepsilon$, where $\beta$ is the common value of $\beta_1$ and $\beta_2$.] Under what circumstances
would a hypothesis of this form be meaningful? (Hint: Consider the units of measurement of $x_1$
and $x_2$.) Now, test the hypothesis that the “population” regression coefficients for education and
income in Duncan’s occupational prestige regression are equal to each other. Is this test sensible?
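
The incremental sum-of-squares test that the hint in Exercise 6.7 points to can be sketched as below. This is a hedged illustration on simulated data, not Duncan's occupational prestige data: `x1`, `x2`, and the true slopes of 2.0 and 0.5 are assumptions chosen so that the null hypothesis is false, and `ols` is a hypothetical helper written for this example. The F statistic compares the residual sum of squares of the full model with that of the null model that regresses Y on the single variable x1 + x2.

```python
import random


def ols(y, columns):
    """Fit y on an intercept plus the given columns by least squares.

    Returns (coefficients, residuals). Hypothetical helper that solves
    the normal equations by Gauss-Jordan elimination.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(X, y))]
         for j in range(p)]
    for j in range(p):
        pivot_row = max(range(j, p), key=lambda i: abs(A[i][j]))
        A[j], A[pivot_row] = A[pivot_row], A[j]
        pivot = A[j][j]
        A[j] = [v / pivot for v in A[j]]
        for i in range(p):
            if i != j:
                factor = A[i][j]
                A[i] = [vi - factor * vj for vi, vj in zip(A[i], A[j])]
    coefs = [A[j][p] for j in range(p)]
    residuals = [yi - sum(b * xij for b, xij in zip(coefs, row))
                 for row, yi in zip(X, y)]
    return coefs, residuals


def rss(y, columns):
    """Residual sum of squares for the regression of y on the columns."""
    _, residuals = ols(y, columns)
    return sum(e * e for e in residuals)


random.seed(2)
n = 100
# Simulated data in which the two slopes really do differ (2.0 vs 0.5).
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a + 0.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Full model: separate slopes for x1 and x2.
rss_full = rss(y, [x1, x2])
# Null model under H0: one common slope on the sum x1 + x2.
x_sum = [a + b for a, b in zip(x1, x2)]
rss_null = rss(y, [x_sum])

df_num = 1        # one restriction: beta1 = beta2
df_den = n - 3    # residual degrees of freedom of the full model
F = ((rss_null - rss_full) / df_num) / (rss_full / df_den)

print("RSS full:", rss_full, " RSS null:", rss_null)
print("F =", F, "on", df_num, "and", df_den, "degrees of freedom")
```

The resulting F would then be referred to an F distribution with 1 and n - 3 degrees of freedom; the p-value is left out here because the Python standard library has no F distribution function.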