MINITAB SOFTWARE IS NEEDED
Assignment 7 Project Part 4 -- The Multiple Regression Forecast -- This assignment is due by midnight November 25th. This completed assignment is worth up to 2.5 extra credit points and may serve as the multiple regression portion of the class project. Late submissions will not be graded.
This assignment is essentially the multiple regression analysis portion of your project. This means that I expect you to develop a good regression model with more than one independent variable (X). Ideally, if you made a good choice of variables in your proposal, you should be able to include all three or more X variables in your regression equation. Be sure to complete each part and support your written responses with Minitab/Excel work. Turn this assignment in to me as a Word document, and include Excel and Minitab tables and graphs in the Word document as required. Be sure to comment on each of the 10 points below.
1. Run scatter plots and a correlation matrix on your project variables and comment on their values and significance. If you have done this earlier, you may use that analysis here.
2. Note any seasonality in your Y data with an ACF (autocorrelation analysis of Y). You may use ACFs that you previously developed.
3. Determine if any of your variables require transformation. If they do, calculate the transformed values, create a scatter plot with a regression line, and run a correlation with Y for each transformed X. Create a table of the Y, X, and transformed X values.
4. Determine if your model requires dummy variables (e.g., for Y variable seasonality or significant events) and include a table of the dummy variable values for regression analysis. To handle seasonality, you may either seasonally adjust your Y variable using decomposition (the centered moving average (CMA) of Y and the seasonal indices (SI)) or use dummy X variables in the regression.
5. Use regression to evaluate the variable combinations to determine the best regression model. Note that if any seasonal dummy variables are used, all of the seasonal dummy variables must be used. Use R-squared and the F statistic as the primary determinants of the best model.
Note the significance of each slope term in the model. Rule -- if a coefficient is not significant, then you may not use the model to forecast.
7. Investigate your best model using appropriate statistics or graphs to comment on possible:
a. Autocorrelation (serial correlation) with the DW statistic
b. Heteroscedasticity with a residuals versus order plot (look for a megaphone effect)
c. Multicollinearity with the VIF statistic
Determine the best remedies for any of the problems identified in item 7 above and make the appropriate changes to your regression model if required. Rerun the model and evaluate the fit again, including error measures, adjusted R-squared, F value, slope coefficient significance, DW, and VIF.
6. Evaluate the accuracy of the best multiple regression model with 2 error measures (RMSE and MAPE), each calculated once for the fit period and again for the forecast period.
9. Evaluate the best model's fit residuals and comment on their randomness using autocorrelation functions (ACFs), a histogram, and a normality plot (you should use a four-in-one graph as well). Comment on the cause of any remaining pattern in the error (trend, cycle, or seasonality) and on whether it is statistically significant.
10. Forecast Y for the holdout period using your holdout X values. In Minitab, you can do this through the Regression - Options menu by placing the columns of holdout X values and any dummy variable values for the holdout period in the "Minitab/Regression/Options/Prediction intervals for new observations" area.
11. Evaluate the forecast error measures and residuals to determine whether the error is acceptable or shows systematic variation. Write your conclusion on the acceptability of the sales forecast.
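
For the ACF work in item 2, Minitab draws the autocorrelation plot for you. If you want to check the arithmetic behind it, the following is a minimal Python sketch, not a required method. The function name `acf`, the sample Y values, and the approximate 95% limit of 2/sqrt(n) are assumptions chosen for illustration; Minitab's own significance limits are computed somewhat differently.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation r_k for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    result = []
    for k in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        result.append(num / denom)
    return result

# Hypothetical quarterly Y values with a repeating four-quarter pattern.
y = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
r = acf(y, 4)
bound = 2 / math.sqrt(len(y))  # rough 95% limit (an assumption, see above)

for lag, value in enumerate(r, start=1):
    flag = "significant" if abs(value) > bound else "not significant"
    print(f"lag {lag}: r = {value:.3f} ({flag})")
```

A spike at the seasonal lag (lag 4 for quarterly data, lag 12 for monthly data) that exceeds the limit is the usual sign of seasonality in Y.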
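
For the dummy variable table in item 4, a small Python sketch may help you see the layout before you build it in Excel or Minitab. It assumes the common convention of S - 1 dummy columns for S seasons, with the last season as the baseline (all zeros). The function name `seasonal_dummies` and its parameters are hypothetical, not part of Minitab.

```python
def seasonal_dummies(n_obs, n_seasons=4, first_season=1):
    """Return one row of (n_seasons - 1) 0/1 dummies per observation.

    The last season is the baseline, so its row is all zeros.
    """
    rows = []
    for t in range(n_obs):
        season = (first_season - 1 + t) % n_seasons + 1  # runs 1..n_seasons
        rows.append([1 if season == s else 0 for s in range(1, n_seasons)])
    return rows

# Two years of quarterly data starting in quarter 1.
table = seasonal_dummies(8)
print("t  Q1 Q2 Q3")
for t, row in enumerate(table, start=1):
    print(f"{t:<2} " + "  ".join(str(v) for v in row))
```

Leaving one season out as the baseline keeps the dummy columns from duplicating the regression constant. Each dummy coefficient is then read as the difference from the baseline season.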
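
For the DW and VIF checks in item 7, Minitab reports both statistics in the regression output. The Python sketch below only shows what is being calculated. The residuals and X values are made up, and the VIF helper covers only the two-predictor case, where VIF = 1 / (1 - r^2) and r is the correlation between the two X variables. With more than two predictors, each VIF comes from regressing that X on all of the others, which this sketch does not attempt.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a model with exactly two X variables."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical residuals in time order and two hypothetical X columns.
resid = [0.5, 0.8, 0.6, -0.2, -0.7, -0.5, 0.1, 0.4]
x1 = [1, 2, 3, 4]
x2 = [1, 2, 3, 5]

print(f"DW  = {durbin_watson(resid):.2f}")
print(f"VIF = {vif_two_predictors(x1, x2):.2f}")
```

A DW value near 2 suggests no first-order autocorrelation. Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation. Compare against the DW table bounds for your sample size before concluding. For VIF, a common rule of thumb treats values above 5 or 10 as a warning sign of multicollinearity.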
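
For the two error measures (RMSE and MAPE), the calculation is the same for the fit period and the holdout period. Only the rows you feed in change. The following Python sketch uses made-up actual and predicted values, and it assumes RMSE divides by the number of observations n. Note that the S value in Minitab's regression output divides by the error degrees of freedom instead, so the two numbers will not match exactly.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, dividing by n (an assumption, see above)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

# Hypothetical holdout-period actuals and forecasts.
actual = [100, 200]
forecast = [110, 180]

print(f"RMSE = {rmse(actual, forecast):.2f}")
print(f"MAPE = {mape(actual, forecast):.1f}%")
```

If the holdout RMSE and MAPE are much larger than the fit-period values, the model may be overfitting the fit period.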