STAT 415 Dummy Variables Statistics Worksheet


STAT 415/615 Miller
Assignment #10
Due: Wednesday April 15, 2020 by 5 PM EST

Guidelines for your submission:
• Your responses must be submitted as a single PDF (Portable Document Format) file. Include your name at the top.
• Please copy and paste any R (or similar) output or graphics that you create into your assignment document. Work by hand can also be included.
• All responses must be easy to read and labeled with the appropriate problem number.
• Please save your work regularly (both your work in R and the solutions to the assignment questions).
• Submit the .pdf file with your responses via Blackboard.

Part 1: Working with Dummy Variables
Instructions: You must show all work and/or provide a full explanation for the following problems. You should use R or other software for plots.

1. (based on text p. 337: 8.13) Consider a regression model 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝜀, where 𝑋1 is a numerical variable and 𝑋2 is a dummy variable. Plot the response functions (the graphs of 𝐸(𝑌) as a function of 𝑋1 for different values of 𝑋2), if 𝛽0 = 25, 𝛽1 = 0.2, and 𝛽2 = −12.

2. Continue the previous exercise. Sketch the response curves for the model with interaction, 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝛽3𝑋1𝑋2 + 𝜀, given that 𝛽3 = −0.2.

3. (based on text p. 340: 8.34) In a regression study, three types of banks were involved, namely, (1) commercial, (2) mutual savings, and (3) savings and loan. Consider the following dummy variables for the type of bank:
a) Develop the first-order linear regression model (with no interactions) for relating last year’s profit or loss (𝑌) to the size of the bank (𝑋1) and type of bank (𝑋2, 𝑋3).
b) State the response function for the three types of banks.
c) Interpret each of the following quantities: (1) 𝛽2, (2) 𝛽3, (3) 𝛽2 − 𝛽3.

Part 2: Intro to Model-Building
Instructions: Please read sections 9.1 & 9.2 in the text (pp. 343-353) before answering the following questions. You must provide a full explanation for each.

4. (based on text p. 376: 9.1) A speaker stated: “In well-designed experiments involving quantitative explanatory variables, a procedure for reducing the number of explanatory variables after the data are obtained is not necessary.” Do you agree? Discuss.

5. (based on text p. 376: 9.2) The dean of a graduate school wishes to predict the GPA in graduate work for recent applicants. List eight variables that might be useful explanatory variables here.

Part 3: Mini-Project
Instructions: Use statistical software to answer the following questions. For each, please provide any relevant output and your answer to the question.

6. (based on text pp. 337-8: 8.16, 8.20) This problem returns to our old GPA data. Previously the GPA of the graduate students was predicted based on their ACT score. An assistant to the director of admissions conjectured that the predictive power of the model could be improved by adding information on whether the student had chosen a major field of concentration at the time the application was submitted. The data set for this problem is GPA-1.19-8.16.csv (on Blackboard).
a) Fit the regression model 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝜀, where 𝑋1 is the ACT score and 𝑋2 = 1 if a student has indicated a major at the time of application and 𝑋2 = 0 if the major was undecided. State the estimated regression function.
b) Test whether 𝑋2 can be dropped from the model, using 𝛼 = 0.05.
c) Fit the regression model 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝛽3𝑋1𝑋2 + 𝜀 and state the estimated regression function.
d) Interpret 𝛽3.
e) Test whether the interaction term can be dropped from the model, using 𝛼 = 0.05.
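The workflow that problem 6 asks for can be sketched in R as follows. This is only an illustration on simulated data: the column names `gpa`, `act`, and `major` are hypothetical, the simulated coefficients are arbitrary, and the real GPA-1.19-8.16.csv file may be laid out differently.

```r
# Simulated stand-in for the GPA data (all names and numbers are assumptions).
# With the real file one would start from something like
#   dat <- read.csv("GPA-1.19-8.16.csv")
# and adapt the column names below.
set.seed(415)
n     <- 120
act   <- round(runif(n, 14, 35))        # hypothetical ACT scores (X1)
major <- rbinom(n, 1, 0.5)              # 1 = major declared, 0 = undecided (X2)
gpa   <- 2.1 + 0.04 * act - 0.1 * major + rnorm(n, sd = 0.6)
dat   <- data.frame(gpa, act, major)

# Parts a) and b): additive model; the t test on `major` tests H0: beta2 = 0
fit_add <- lm(gpa ~ act + major, data = dat)
summary(fit_add)$coefficients

# Parts c) and e): interaction model; act * major expands to act + major + act:major
fit_int <- lm(gpa ~ act * major, data = dat)
summary(fit_int)$coefficients

# Partial F test of H0: beta3 = 0 (reduced model vs. full model)
anova(fit_add, fit_int)
```

Because only one coefficient is being tested in each case, the partial F test from `anova()` and the t test in the coefficient table give the same p-value.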
REGRESSION MODELS FOR QUANTITATIVE AND QUALITATIVE PREDICTORS (8.2-8.5)
INTRO TO THE MODEL-BUILDING PROCESS (CHAPTER 9)
STAT 415/615 MILLER

INTERACTION REGRESSION MODELS

REVIEW: GENERAL LINEAR REGRESSION MODEL

• The general linear regression model, with Normal error terms, in terms of predictor variables $X$ is:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \quad \text{for each } i = 1, \ldots, n$$

  $\beta_0, \beta_1, \ldots, \beta_{p-1}$: parameters
  $X_{i1}, X_{i2}, \ldots, X_{i,p-1}$: predictor variables in the $i$th trial (known constants)
  $\varepsilon_i$: independent $N(0, \sigma^2)$ random variables

• The regression function for this model is:

$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{p-1} X_{p-1}$$

∴ The general linear model with Normal error terms implies that the responses $Y_i$ are independent Normal random variables with:
• Mean: $E(Y_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1}$
• (Constant) variance: $\sigma^2$

REVIEW: GENERAL LINEAR REGRESSION MODEL

• $X_1, X_2, \ldots, X_{p-1}$ can be raised to higher powers (i.e., the predictors can appear as squared or higher-order terms) and can be nonadditive (have an interacting effect). $Y_i$ can be transformed. (Note: “Linear” refers to the fact that the regression function is a linear combination of the parameters $\beta_0, \beta_1, \ldots, \beta_{p-1}$.)

Examples:
1. Polynomial regression models are general linear regression models. (They yield a curvilinear response function.)
   a) $Y_i = \beta_0 + \beta_1 X_i^2 + \varepsilon_i$
   b) $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$

[Figure: curvilinear response function, Y plotted against X_i]

ADDITIVE EFFECTS VS INTERACTION EFFECTS

A regression model with $p - 1$ predictors contains additive effects if the response (regression) function can be written as:

$$E(Y) = f_1(X_1) + f_2(X_2) + \cdots + f_{p-1}(X_{p-1})$$

where $f_1(X_1), f_2(X_2), \ldots, f_{p-1}(X_{p-1})$ are functions of the predictors (they can be simple or complicated).
Examples (adapted from examples in the Chapter 6 notes):

a) $E(Y) = \beta_0 + \underbrace{\beta_1 X_1^2}_{f_1(X_1)}$ (Additive)

b) $E(Y) = \beta_0 + \underbrace{\beta_1 X_1^2 + \beta_2 X_1}_{f_1(X_1)}$ (Additive)

c) $E(Y) = \beta_0 + \underbrace{\beta_1 \ln X_1}_{f_1(X_1)} + \underbrace{\beta_2 X_2}_{f_2(X_2)}$ (Additive)

d) $E(Y) = \beta_0 + \underbrace{\beta_1 X_1^2}_{f_1(X_1)} + \underbrace{\beta_2 X_1 X_2}_{f_2(X_1, X_2)}$ (NOT additive; $\beta_2 X_1 X_2$ is an interaction term)

• The cross-product term is an interaction term that may be called linear-by-linear or bilinear.
• When we have interaction terms, we must change our interpretation of the regression coefficients.

REVIEW: FIRST-ORDER (LINEAR) REGRESSION MODEL WITH TWO PREDICTORS, X1 AND X2

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$$

  $Y_i$: response in the $i$th trial
  $\beta_0$: Y intercept (parameter)
  $\beta_1, \beta_2$: slope coefficients (parameters)
  $X_{i1}, X_{i2}$: values of the predictor variables on the $i$th trial
  $\varepsilon_i$: random error term
  $\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2}$ is the constant/linear component; $\varepsilon_i$ is the random error component.
  (Compare the one-predictor model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$.)

REVIEW: FIRST-ORDER (LINEAR) REGRESSION MODEL WITH TWO PREDICTORS

• $Y_i$ is a random variable.
• Assuming that $E(\varepsilon_i) = 0$, the regression function for this model is $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$.
• $E(Y) = \beta_0 + \beta_1 X_1$ is represented as a line. $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ has an additional variable and must be represented by a plane, also called a regression surface or response surface (2-dimensional, one dimension higher than a 1-dimensional line).

OLD EXAMPLE: $E(Y) = 5 + 11 X_1 + 3 X_2$

REVIEW: REGRESSION COEFFICIENTS

$\beta_0$
  Interpretation: The y-intercept of the regression plane.
  • If the scope of the model includes $X_1 = 0$ and $X_2 = 0$, it represents the mean response $E(Y \mid X_1 = 0, X_2 = 0)$.
  • Otherwise, it has no practical meaning.
  For example: $\beta_0 = 5$ for our example.

$\beta_1$
  Interpretation: Change in the mean response $E(Y \mid X_{i2})$ per unit increase in $X_1$ (note: $X_2$ is held constant).
  For example: the mean response increases by 11 with a 1 unit increase of $X_1$ when $X_2$ is held constant. If $X_2 = 1$, $E(Y \mid X_2 = 1) = 5 + 11 X_1 + 3(1) = 8 + 11 X_1$.
  • This is a straight line with slope 11.
  • The y-intercept will change for each $X_2$ level.
REVIEW: REGRESSION COEFFICIENTS

$\beta_2$
  Interpretation: Change in the mean response $E(Y \mid X_{i1})$ per unit increase in $X_2$ (note: $X_1$ is held constant).
  For example: the mean response increases by 3 with a 1 unit increase of $X_2$ when $X_1$ is held constant. If $X_1 = 4$, $E(Y \mid X_1 = 4) = 5 + 11(4) + 3 X_2 = 49 + 3 X_2$.
  • This is a straight line with slope 3.
  • The y-intercept will change for each $X_1$ level.

ALTERED INTERPRETATIONS OF REGRESSION COEFFICIENTS

Consider a regression model for two quantitative predictors with linear effects on $Y$ and interacting effects of $X_1$ and $X_2$ on $Y$ represented by a cross-product term:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2} + \varepsilon_i$$

The regression function is:

$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$$

• $\beta_1$ and $\beta_2$ now have new interpretations, and what they previously represented is shown by a different term.

ALTERED INTERPRETATIONS OF REGRESSION COEFFICIENTS

$\beta_1 + \beta_3 X_2$ now represents the change in the mean response $E(Y \mid X_{i2})$ with a 1 unit increase in $X_1$ when $X_2$ is held constant. Why?

$$\frac{\partial E(Y)}{\partial X_1} = 0 + \beta_1 + 0 + \beta_3 X_2 = \beta_1 + \beta_3 X_2$$

$\beta_2 + \beta_3 X_1$ now represents the change in the mean response $E(Y \mid X_{i1})$ with a 1 unit increase in $X_2$ when $X_1$ is held constant. Why?

$$\frac{\partial E(Y)}{\partial X_2} = 0 + 0 + \beta_2 + \beta_3 X_1 = \beta_2 + \beta_3 X_1$$

Note: Both changes depend on the level of the other predictor.

EXAMPLE

In Chapter 6, we looked at $E(Y) = 5 + 11 X_1 + 3 X_2$. Recall that the regression surface is a plane. At any given $X_1$ (or $X_2$) level, $E(Y)$ is a line with the same slope $\beta_2$ (or $\beta_1$) and a different intercept, so the lines are parallel.

EXAMPLE

Now consider adding a cross-product term:

$$E(Y) = 5 + 11 X_1 + 3 X_2 + 13 X_1 X_2$$

Here the lines representing $E(Y)$ at different $X_1$ (or $X_2$) levels will have different slopes (i.e., they are not parallel) and still have different intercepts. The response surface is not a plane.
$$E(Y \mid X_1 = 1) = 5 + 11(1) + 3 X_2 + 13(1) X_2 = (5 + 11) + (3 + 13 \cdot 1) X_2 = 16 + 16 X_2$$
$$E(Y \mid X_1 = 2) = 5 + 11(2) + 3 X_2 + 13(2) X_2 = (5 + 22) + (3 + 13 \cdot 2) X_2 = 27 + 29 X_2$$
$$E(Y \mid X_2 = 1) = 5 + 11 X_1 + 3(1) + 13 X_1 (1) = (5 + 3) + (11 + 13 \cdot 1) X_1 = 8 + 24 X_1$$
$$E(Y \mid X_2 = 2) = 5 + 11 X_1 + 3(2) + 13 X_1 (2) = (5 + 6) + (11 + 13 \cdot 2) X_1 = 11 + 37 X_1$$

EXAMPLE

For $X_1 = 1$ and $X_1 = 2$ the response lines are $E(Y \mid X_1 = 1) = 16 + 16 X_2$ and $E(Y \mid X_1 = 2) = 27 + 29 X_2$.

o Note that the slopes increase as $X_1$ (or $X_2$) increases. A 1 unit increase in $X_2$ (or $X_1$) has a larger effect on the response when $X_1$ (or $X_2$) is at a higher level.
  ▪ This interaction effect between $X_1$ and $X_2$ is said to be synergistic (or of reinforcement type) and occurs when $\beta_1$, $\beta_2$, and $\beta_3$ are positive.

[Figure: E(Y) plotted against X2]

EXAMPLE

$$E(Y) = 5 + 11 X_1 + 3 X_2 - 3 X_1 X_2$$
$$E(Y \mid X_1 = 1) = 5 + 11(1) + 3 X_2 - 3(1) X_2 = (5 + 11) + (3 - 3 \cdot 1) X_2 = 16$$
$$E(Y \mid X_1 = 2) = 5 + 11(2) + 3 X_2 - 3(2) X_2 = (5 + 22) + (3 - 3 \cdot 2) X_2 = 27 - 3 X_2$$

If $\beta_1$ and $\beta_2$ are positive but $\beta_3$ is negative, the interaction effect is said to be antagonistic (or of interference type).
  ▪ This means that the slope, the mean change in response per unit increase in a predictor ($X_2$ in this case), will decrease for higher levels of the other predictor ($X_1$ in this case).

[Figure: E(Y) plotted against X2]

QUALITATIVE PREDICTORS: DEFINING DUMMY VARIABLES

INDICATOR/DUMMY VARIABLES

We can incorporate qualitative variables, such as demographics and Yes/No formats, into regression models.
o A common way to do this is by introducing indicator (or dummy) variables that take on a value of 1 or 0.

A qualitative variable with $c$ classes will be represented by $c - 1$ indicator/dummy variables.
  ▪ Why $c - 1$? Why not $c$?

Let’s say we have two predictors, one quantitative and one qualitative. If the qualitative variable has two classes and we define two indicator/dummy variables based on it, we run into problems.
• The classes will be dependent, and we will see that when they are added as columns to the $X$ matrix we get columns that are linearly dependent.

EXAMPLE

This example references the “Copier Maintenance” data from our text. Tricity Office Equipment Corporation sells franchised copiers and performs preventative maintenance and repairs. Data were collected for 45 service calls. Here the response ($Y$), the total number of minutes spent on a service call, is predicted by the number of copiers to be serviced ($X_1$) and the size of the copiers to be serviced (small or large). Note: 17 calls involve small copiers.

If

$$X_{i2} = \begin{cases} 1 & \text{copier is small} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad X_{i3} = \begin{cases} 1 & \text{copier is large} \\ 0 & \text{otherwise} \end{cases}$$

then

$$X = \begin{bmatrix} 1 & X_{11} & 1 & 0 \\ 1 & X_{21} & 0 & 1 \\ 1 & X_{31} & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & X_{44,1} & 0 & 1 \\ 1 & X_{45,1} & 0 & 1 \end{bmatrix}$$

Notice that the sum of the 3rd and 4th columns equals the first column. This means that the columns are linearly dependent.
Recall: If a square matrix has linearly dependent columns, it is singular, and thus has determinant 0 and is not invertible.

EXAMPLE

Since $X$ has linearly dependent columns, $X^T X$ also has linearly dependent columns.

$$X^T X = \begin{bmatrix} \sum_{i=1}^{45} 1 & \sum_{i=1}^{45} X_{i1} & \#\text{small} & \#\text{large} \\ \sum_{i=1}^{45} X_{i1} & \sum_{i=1}^{45} X_{i1}^2 & \sum_{\text{small}} X_{i1} & \sum_{\text{large}} X_{i1} \\ \#\text{small} & \sum_{\text{small}} X_{i1} & \#\text{small} & 0 \\ \#\text{large} & \sum_{\text{large}} X_{i1} & 0 & \#\text{large} \end{bmatrix} = \begin{bmatrix} 45 & \sum_{i=1}^{45} X_{i1} & 17 & 28 \\ \sum_{i=1}^{45} X_{i1} & \sum_{i=1}^{45} X_{i1}^2 & \sum_{\text{small}} X_{i1} & \sum_{\text{large}} X_{i1} \\ 17 & \sum_{\text{small}} X_{i1} & 17 & 0 \\ 28 & \sum_{\text{large}} X_{i1} & 0 & 28 \end{bmatrix}$$

Once again, the 1st column is the sum of the 3rd and 4th. Therefore, $X^T X$ is singular and not invertible. Recall that $b = (X^T X)^{-1} X^T Y$. If $(X^T X)^{-1}$ does not exist, we are unable to find estimators for the regression coefficients of this model.

EXAMPLE

Solution?
Drop one of the dummy variables and use the regression model

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$$

where $X_{i1}$ is the quantitative variable (number of copiers in this case) and $X_{i2}$ is the dummy variable, in this case

$$X_{i2} = \begin{cases} 1 & \text{copier is small} \\ 0 & \text{copier is large} \end{cases}$$

and the regression function is $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$.

QUALITATIVE PREDICTORS: INTERPRETING THE REGRESSION COEFFICIENTS

DUMMY VARIABLES: INTERPRETING REGRESSION COEFFICIENTS

Consider our model with one quantitative predictor and one dummy variable.

If $X_2 = 1$ (“copier is small” in our example),
$$E(Y \mid X_2 = 1) = \beta_0 + \beta_1 X_1 + \beta_2 (1) = (\beta_0 + \beta_2) + \beta_1 X_1.$$

If $X_2 = 0$ (“copier is large” in our example),
$$E(Y \mid X_2 = 0) = \beta_0 + \beta_1 X_1 + \beta_2 (0) = \beta_0 + \beta_1 X_1.$$

• In either case, we have a straight line with slope $\beta_1$.
Example: The mean minutes spent on a service call is a linear function of the number of copiers to be serviced with slope $\beta_1$. $\beta_1$ represents the average change in service time per 1 unit increase in number of copiers ($X_1$), given that the size of the copier ($X_2$) is held constant.

DUMMY VARIABLES: INTERPRETING REGRESSION COEFFICIENTS

• The intercepts, $\beta_0 + \beta_2$ versus $\beta_0$, differ by $\beta_2$.
Example: $\beta_2$ indicates how much longer (or shorter) the mean service time is for calls involving small copiers (coded 1) as compared to large copiers (coded 0), for any given number of copiers to be serviced.
In general, $\beta_2$ indicates how much higher (or lower) the mean response line is for the class coded 1 than the line for the class coded 0, for any given level of $X_1$.

EXAMPLE

• The plot below shows the least-squares regression line for predicting the service time based on the number of copiers to be serviced.
• The estimated regression function is $\hat{Y} = -0.5802 + 15.0352 X_1$.

    Call:
    lm(formula = time ~ number)

    Coefficients:
    (Intercept)       number
        -0.5802      15.0352

EXAMPLE

• The new plot below shows the estimated regression functions for small (in red) and large (in black) copiers separately.
• Both regression functions are based on the estimated regression function $\hat{Y} = -0.9225 + 15.0461 X_1 + 0.7587 X_2$.

    Call:
    lm(formula = time ~ number + size)

    Coefficients:
    (Intercept)       number         size
        -0.9225      15.0461       0.7587

EXAMPLE

                    2.5 %    97.5 %
    (Intercept) -7.177891  5.332945
    number      14.057283 16.035004
    size        -4.851254  6.368698

• Above we see the 95% confidence intervals for the regression parameters/coefficients.
• The 95% confidence interval for $\beta_2$ is $-4.851 \le \beta_2 \le 6.369$.
• We are 95% confident that, for any given number of copiers, the mean service time for calls involving small copiers is somewhere between almost 5 minutes less and just over 6 minutes more than for calls involving large copiers. Generally, we do not expect service times to be drastically different for calls involving small copiers and calls involving large copiers.
• Note: 0 falls within this interval.

EXAMPLE

For testing $H_0: \beta_2 = 0$ vs. $H_A: \beta_2 \ne 0$ we can use a t test or F test (as seen in the previous chapters).

    Call:
    lm(formula = time ~ number + size)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.9225     3.0997  -0.298    0.767
    number       15.0461     0.4900  30.706      ...
    ...

    Analysis of Variance Table

      Res.Df    RSS Df Sum of Sq      F Pr(>F)
    1     43 3416.4
    2     42 3410.3  1    6.0488 0.0745 0.7862

• The p-value is 0.786. We fail to reject $H_0: \beta_2 = 0$.
• We may consider dropping $X_2$ (size), as the size of the copier does not seem to have a large effect on service time.
• This coincides with the plots shown earlier: the original regression model with just number of copiers as a predictor fits well overall, and the separate regression functions for small and large copiers based on the multiple regression model are similar.
• We also saw that 0 DID fall within the 95% confidence interval for $\beta_2$.

REASON FOR USING A MODEL WITH A DUMMY VARIABLE

Why use a regression function with a dummy variable instead of two separate regression functions for small and large copiers?
Our model assumes equal slopes and the same constant error term variance for $X_2 = 1$ and $X_2 = 0$.
The common slope can be best estimated by pooling small and large copiers. Inferences can be made more precisely.

The model we just discussed can be adjusted to include more dummy variables:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_c X_{ic} + \varepsilon_i$$

with $c - 1$ dummy variables $X_{i2}, \ldots, X_{ic}$ representing a qualitative variable with $c$ classes.

QUALITATIVE PREDICTORS: CONSIDERATIONS

ALTERNATIVES FOR CODING INDICATOR VARIABLES

1. Using 1 & -1 rather than 1 & 0.

Example:
$$X_{i2} = \begin{cases} 1 & \text{copier is small} \\ -1 & \text{copier is large} \end{cases}$$

If “copier is small”, $E(Y \mid X_2 = 1) = \beta_0 + \beta_1 X_1 + \beta_2 (1) = (\beta_0 + \beta_2) + \beta_1 X_1$.
If “copier is large”, $E(Y \mid X_2 = -1) = \beta_0 + \beta_1 X_1 + \beta_2 (-1) = (\beta_0 - \beta_2) + \beta_1 X_1$.

• In either case, we still have a straight line with slope $\beta_1$.
• The intercepts, $\beta_0 + \beta_2$ versus $\beta_0 - \beta_2$, have an average of $\beta_0$.
Example: $\beta_2$ indicates how much the intercepts for small and large copiers differ from the average intercept $\beta_0$.

ALTERNATIVES FOR CODING INDICATOR VARIABLES

2. Use $c$ dummy variables for a qualitative variable with $c$ classes and drop the intercept.

Our model with one quantitative variable and one qualitative predictor becomes:

$$Y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{c+1} X_{i,c+1} + \varepsilon_i$$

Example: The qualitative predictor has two classes, as in our Copier Maintenance example. If

$$X_{i2} = \begin{cases} 1 & \text{copier is small} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad X_{i3} = \begin{cases} 1 & \text{copier is large} \\ 0 & \text{otherwise} \end{cases}$$

the response functions are:

If “copier is small”, $E(Y \mid X_2 = 1, X_3 = 0) = \beta_1 X_1 + \beta_2 (1) + \beta_3 (0) = \beta_2 + \beta_1 X_1$.
If “copier is large”, $E(Y \mid X_2 = 0, X_3 = 1) = \beta_1 X_1 + \beta_2 (0) + \beta_3 (1) = \beta_3 + \beta_1 X_1$.

ALTERNATIVES FOR CODING INDICATOR VARIABLES

Note: Other types of coding involve:
• Using allocated codes, such as a satisfaction scale.
• Coding a quantitative variable.
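The coding schemes described above can be compared side by side with a small R sketch. The data are simulated (this is not the Copier Maintenance data set) and the variable names `number`, `small`, `large`, `pm`, and `time` are placeholders chosen for illustration.

```r
# Simulated stand-in: 10 "small" calls and 10 "large" calls (assumed values).
set.seed(1)
number <- rep(1:10, times = 2)            # quantitative predictor X1
small  <- rep(c(1, 0), each = 10)         # 1 = small copier, 0 = large
large  <- 1 - small                       # 1 = large copier, 0 = small
time   <- 5 + 15 * number + 3 * small + rnorm(20)

# Intercept plus BOTH dummies: the column of 1s equals small + large,
# so the design matrix has 4 columns but only rank 3.
X_bad <- cbind(1, number, small, large)
qr(X_bad)$rank

# 0/1 coding with one dummy dropped
fit01 <- lm(time ~ number + small)

# 1/-1 coding
pm    <- ifelse(small == 1, 1, -1)
fitpm <- lm(time ~ number + pm)

# c dummies with the intercept dropped
fit0  <- lm(time ~ 0 + number + small + large)

coef(fit01); coef(fitpm); coef(fit0)
```

All three fits describe the same pair of parallel lines, so their fitted values agree. Only the meaning of the coefficients changes: the 0/1 dummy coefficient is twice the 1/-1 coefficient, and in the no-intercept fit each dummy coefficient is that class's own intercept.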
MODELING INTERACTIONS BETWEEN QUANTITATIVE & QUALITATIVE PREDICTORS

INTERACTIVE MODEL FOR ONE QUANTITATIVE VARIABLE AND ONE DUMMY VARIABLE

To consider the possibility of interaction effects between a quantitative variable $X_{i1}$ and a qualitative variable with two classes, we use the model

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2} + \varepsilon_i$$

and the regression function is:

$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$$

Example: $X_{i1}$ is the quantitative variable (number of copiers in this case) and $X_{i2}$ is the dummy variable, in this case

$$X_{i2} = \begin{cases} 1 & \text{copier is small} \\ 0 & \text{copier is large} \end{cases}$$

If “copier is small”,
$$E(Y \mid X_2 = 1) = \beta_0 + \beta_1 X_1 + \beta_2 (1) + \beta_3 X_1 (1) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_1.$$

If “copier is large”,
$$E(Y \mid X_2 = 0) = \beta_0 + \beta_1 X_1 + \beta_2 (0) + \beta_3 X_1 (0) = \beta_0 + \beta_1 X_1.$$

• Now $\beta_2$ shows how much larger (or smaller) the y-intercept of the response function (service time in minutes in this case) is for the class coded 1 (small copiers here) than for the class coded 0 (large copiers here).
  o This difference holds only at the y-intercept and no longer for any given $X_1$ level, since the slopes differ.
  o With this regression model, the effect of copier size depends on the number of copiers to be serviced. Interaction effects are present.

Possible scenario: For a smaller number of copiers, small copiers take longer to service, but for a larger number of copiers, large copiers take longer to service. This type of interaction is called disordinal.

Another scenario: large copiers always tend to take less time to service, but the effect is much smaller for a larger number of copiers. In this case (when the nonparallel response functions for one quantitative and one qualitative variable do not intersect), we say that the interaction is ordinal.

EXAMPLE

• Now we have the estimated regression function based on the multiple regression model with an added interaction (cross-product) term:

$$\hat{Y} = 2.813 + 14.339 X_1 - 8.141 X_2 + 1.777 X_1 X_2$$

    Call:
    lm(formula = time ~ number * size)

    Coefficients:
    (Intercept)       number         size  number:size
          2.813       14.339       -8.141        1.777

EXAMPLE

For testing $H_0: \beta_3 = 0$ vs. $H_A: \beta_3 \ne 0$ we can, again, use a t test or F test (as seen in the previous chapters).

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   2.8131     3.6468   0.771   0.4449
    number       14.3394     0.6146  23.333      ...
    ...

    Analysis of Variance Table

      Res.Df    RSS Df Sum of Sq     F  Pr(>F)
    1     42 3410.3
    2     41 3154.4  1    255.89 3.326 0.07549

• The p-value is 0.0755. We can only reject $H_0: \beta_3 = 0$ at significance levels of 0.0755 and above, so we fail to reject at $\alpha = 0.05$.
• We do not have enough evidence to support that there is an interaction effect and may keep our model with no interaction term.
• In fact, from all of our analysis we may just use “number” as a predictor of the time of the service call.

MODEL SELECTION & VALIDATION: INTRODUCTION TO THE MODEL-BUILDING PROCESS

GENERAL MODEL-BUILDING PROCESS

1. Data collection and prep
2. Reduction of predictor variables (for exploratory observational studies only)
3. Model refinement and selection
4. Model validation
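Returning to the fitted interaction model for the copier example, the ordinal/disordinal distinction can be checked numerically by finding where the two estimated response lines would cross. The sketch below uses only the coefficients quoted in the `lm(time ~ number * size)` output and assumes, as in the slides, that size is coded 1 for small copiers; the variable names are illustrative.

```r
# Coefficients copied from the quoted output for lm(time ~ number * size)
b0 <- 2.813; b1 <- 14.339; b2 <- -8.141; b3 <- 1.777

# Estimated response lines (assuming size = 1 means "small")
line_large <- c(intercept = b0,      slope = b1)        # size = 0
line_small <- c(intercept = b0 + b2, slope = b1 + b3)   # size = 1

# The lines are equal where b2 + b3 * X1 = 0
x_cross <- -b2 / b3
round(x_cross, 2)   # 4.58
```

Under that coding assumption, the estimated line for small copiers lies below the line for large copiers when fewer than about 4.58 copiers are serviced and above it beyond that point. Whether this counts as a disordinal interaction within the scope of the model depends on the observed range of `number`, which is not given here, and the test above did not find the interaction significant at the 0.05 level.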

Explanation & Answer

Hello, here is your assignment. At the very bottom of the assignment, I also attached the R code I used as reference. Make sure that you change the file name (which I have highlighted in pink) before you submit the assignment. Let me know if you need anything else!

Part 1: Working with Dummy Variables
Instructions: You must show all work and/or provide a full explanation for the following
problems. You should use R or other software for plots.
1. (based on text p. 337: 8.13) Consider a regression model 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝜀, where 𝑋1 is
a numerical variable and 𝑋2 is a dummy variable. Plot the response functions (the graphs of 𝐸(𝑌)
as a function of 𝑋1 for different values of 𝑋2), if 𝛽0=25, 𝛽1=0.2, and 𝛽2=−12.

E{Y} = 25 + 0.2X1 - 12X2

When X2 = 0, the response function is:
E{Y| X2 = 0} = 25 + 0.2X1

When X2 = 1, the response function is:
E{Y| X2 = 1} = 25 + 0.2X1 -12 = 13 + 0.2X1
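
One way to draw these two response functions in R is sketched below. The exercise does not give a range for X1, so the 0 to 100 range is an assumption, and the names `x1`, `ey0`, and `ey1` are placeholders.

```r
# Response functions for E{Y} = 25 + 0.2*X1 - 12*X2 (no interaction).
# The X1 range is assumed for illustration only.
b0 <- 25; b1 <- 0.2; b2 <- -12
x1 <- seq(0, 100, by = 1)

ey0 <- b0 + b1 * x1            # X2 = 0: intercept 25, slope 0.2
ey1 <- (b0 + b2) + b1 * x1     # X2 = 1: intercept 13, slope 0.2

plot(x1, ey0, type = "l", ylim = range(c(ey0, ey1)),
     xlab = "X1", ylab = "E{Y}",
     main = "Response functions without interaction")
lines(x1, ey1, lty = 2)
legend("topleft", legend = c("X2 = 0", "X2 = 1"), lty = c(1, 2))
```

The two lines are parallel (common slope 0.2) and are separated vertically by 12 units at every value of X1.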

2. Continue the previous exercise. Sketch the response curves for the model with interaction, 𝑌 =
𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 +𝛽3𝑋1𝑋2+ 𝜀, given that 𝛽3=−0.2.

E{Y} = 25 + 0.2X1 - 12X2 - 0.2X1 X2

When X2 = 0, the response function is:
E{Y| X2 = 0} = 25 + 0.2X1

When X2 = 1, the response function is:
E{Y| X2 = 1} = 25 + 0.2X1 -12 - 0.2X1 = 13
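
The interaction version can be sketched the same way. Again the 0 to 100 range for X1 is an assumption and the variable names are placeholders.

```r
# Response functions for E{Y} = 25 + 0.2*X1 - 12*X2 - 0.2*X1*X2.
# The X1 range is assumed for illustration only.
b0 <- 25; b1 <- 0.2; b2 <- -12; b3 <- -0.2
x1 <- seq(0, 100, by = 1)

ey0 <- b0 + b1 * x1                    # X2 = 0: intercept 25, slope 0.2
ey1 <- (b0 + b2) + (b1 + b3) * x1      # X2 = 1: intercept 13, slope 0.2 - 0.2 = 0

plot(x1, ey0, type = "l", ylim = range(c(ey0, ey1)),
     xlab = "X1", ylab = "E{Y}",
     main = "Response functions with interaction")
lines(x1, ey1, lty = 2)
legend("topleft", legend = c("X2 = 0", "X2 = 1"), lty = c(1, 2))
```

With the interaction term the lines are no longer parallel: the X2 = 1 line is flat at 13, and the gap between the lines, 12 + 0.2X1, widens as X1 increases.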

3. (based on text p.340: 8.34) In a regression study, three types of banks were involved, namely, (1)
commercial, (2) mutual savings, and (3) savings and loan. Consider the following dummy variables
for the type of bank:

a) Develop the first-order linear regression model (with no interactions) for relating last year’s
profit or loss (𝑌 ) to the size of the bank (𝑋1) and type of bank (𝑋2,𝑋3).
𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝛽3𝑋3 + 𝜀
b) State the response function f...
