Anonymous
Asked: May 1st, 2020

Question Description

I just need Chapter 7 edited.

My professor gave me some points to fix. He said:

I reviewed your most recent draft of Chapter 7 Exercises (multiple regression).

For question 2a, check the “id” variable to be sure you have the correct case numbers that need to be removed. There are three cases that should be removed that you did not list. Also, 18 and 1129 will automatically be removed since there is missing data for these cases (no MAH_1 value).

Some of your values are different from what I have. Be sure you run the multiple regression with the profile-b data set and only with cases where MAH_1 ≤ 22.458.

For question 2h, you should not include the values for the variables that are not statistically significant. The regression equation should only include those variables that are statistically significant.

For 2i, also mention that these two variables are not statistically significant and provide the p (sig) value.
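The MAH_1 ≤ 22.458 cutoff the professor references can be double-checked outside SPSS. This is a sketch assuming SciPy is available, and assuming the cutoff is the chi-square critical value at α = .001 with df = 6 (the df that reproduces 22.458; texts differ on whether df counts only the IVs):

```python
# Chi-square critical value used as the Mahalanobis-distance cutoff.
# Assumption: alpha = .001 and df = 6, which reproduces the 22.458 in the
# feedback above; with df = 5 (IVs only) the cutoff would be 20.515 instead.
from scipy.stats import chi2

critical = chi2.ppf(1 - 0.001, df=6)
print(round(critical, 3))  # 22.458
```

Any case whose Mahalanobis distance exceeds this value is flagged as a multivariate outlier.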

Unformatted Attachment Preview

1. The following output was generated from conducting a forward multiple regression to identify which IVs (urban, birthrat, lnphone, and lnradio) predict lngdp. The data analyzed were from the SPSS country-a.sav data file.

a. Evaluate the tolerance statistics. Is multicollinearity a problem?

To evaluate the presence of multicollinearity, we can examine the tolerance statistics, calculated as 1 - R² from regressing each IV on the other independent variables already in the equation. A small tolerance indicates that the variable is almost a perfect linear combination of the other independent variables; a value of .1 usually serves as the cutoff. Looking at the table, multicollinearity is not a problem because the tolerance statistics are greater than .1 for all independent variables in both models.

b. What variables create the model to predict lngdp? What statistics support your response?

The model summary indicates that the variables entered by the forward procedure are lnphone (model 1) and lnphone plus birthrat (model 2). Both coefficients are statistically significant in explaining variation in lngdp; lnphone is significant at the 1% level, while birthrat is significant only at the 5% level. Moreover, despite its significance, the coefficient of birthrat is small in magnitude, and the R² change from the lnphone-only model to the model adding birthrat is only .004, suggesting that the explanatory power of birthrat is limited.

c. Is the model significant in predicting lngdp? Explain.

The regression results indicate that the overall two-predictor model (lnphone and birthrat) significantly predicts lngdp: R² = .890, adjusted R² = .888.

d. What percentage of variance in lngdp is explained by the model?

The model accounted for 89% of the variance in lngdp, as given by R².

e. Write the regression equation for lngdp.

lngdp = 6.878 + .663(lnphone) - .013(birthrat)

2. This question utilizes the data sets profile-a.sav and profile-b.sav. You are interested in examining whether the variables shown here in brackets [years of age (age), hours worked per week (hrs1), years of education (educ), years of education for mother (maeduc), and years of education for father (paeduc)] are predictors of individual income (rincmdol). Complete the following steps to conduct this analysis.

a. Using profile-a.sav, conduct a preliminary regression to calculate Mahalanobis distance. Identify the critical value for chi-square. Conduct Explore to identify outliers. Which cases should be removed from further analysis?

To calculate Mahalanobis distance, I conducted a preliminary regression.

Model Summary (DV: respondent's income; predictors: constant, age, educ, hrs1, maeduc, paeduc)
  R = .580, R Square = .336, Adjusted R Square = .331, Std. Error of the Estimate = 4.345

The model summary reports the overall statistics of the regression with all IVs included.

ANOVA (DV: respondent's income)
  Regression: SS = 6136.473, df = 5,   MS = 1227.295
  Residual:   SS = 12123.027, df = 642, MS = 18.883
  Total:      SS = 18259.500, df = 647
  F = 64.994, Sig. = .000

The ANOVA table shows that the model significantly predicts rincmdol; the F test of overall significance tells us that at least one of the predictors is statistically significant, F(5, 642) = 64.994, p < .001.

Coefficients (DV: respondent's income)
  Variable     B        Std. Error   Beta    t        Sig.
  (Constant)   -5.487   1.302                -4.215   .000
  age           .133     .016        .291     8.585   .000
  educ          .507     .071        .256     7.145   .000
  hrs1          .142     .012        .385    11.788   .000
  maeduc        .005     .074        .003      .066   .948
  paeduc        .041     .055        .030      .733   .464

The coefficients table gives the estimates that define the regression equation.

Residuals Statistics (selected rows, N = 648)
  Predicted Value:  min 4.16,    max 24.16,  mean 13.64, SD 3.080
  Residual:         min -15.499, max 13.759, mean .000,  SD 4.329
  Mahal. Distance:  min .059,    max 33.575, mean 4.992, SD 4.223
  Cook's Distance:  min .000,    max .042,   mean .002,  SD .004

Case Processing Summary (Mahalanobis Distance)
  Valid: N = 677 (45.1%); Missing: N = 823 (54.9%); Total: N = 1500

The full sample consisted of 1,500 cases, of which 823 had missing values.

Descriptives (Mahalanobis Distance)
  Mean = 4.993 (SE = .161), 95% CI [4.676, 5.309]
  5% Trimmed Mean = 4.504, Median = 3.843
  Variance = 17.595, SD = 4.195
  Minimum = .059, Maximum = 33.575, Range = 33.516, IQR = 4.150
  Skewness = 2.310 (SE = .094), Kurtosis = 7.910 (SE = .188)

The skewness statistic has a z score of 2.310/.094 = 24.57; the skewness is therefore substantial and the distribution is non-normal. The kurtosis value is consistent with this: 7.910/.188 = 42.07 is also significant, confirming the non-normality. The box plot is likewise non-normal, with outliers at the high end of the distribution. Using a chi-square table, the critical value is 22.458; any case with a Mahalanobis distance greater than 22.458 should be eliminated from the regression analysis. Therefore, cases 406, 508, 18, 1129, and 351 were eliminated. For all subsequent analyses, use profile-b.sav, making sure that only cases with MAH_1 ≤ 22.458 are selected.

b. Create a scatterplot matrix. Can you assume linearity and normality?

The scatterplot matrix with the transformed variables displays elliptical shapes, suggesting that the variables are linearly related and approximately normally distributed.

Tests of Normality (Lilliefors significance correction; df = 609)
  Variable   Kolmogorov-Smirnov (Sig.)   Shapiro-Wilk (Sig.)
  age        .057 (.000)                 .975 (.000)
  educ       .151 (.000)                 .944 (.000)
  hrs1       .184 (.000)                 .960 (.000)
  maeduc     .270 (.000)                 .891 (.000)
  paeduc     .180 (.000)                 .964 (.000)
  rincmdol   .115 (.000)                 .952 (.000)

The Shapiro-Wilk test is particularly useful for testing non-normality. The null hypothesis of this test is that the variable is normally distributed. From the results, we can reject the null hypothesis for every variable, concluding that none of them is normally distributed.

c. From the residual plot, the residuals do not cluster tightly around the horizontal reference line, but they are distributed fairly evenly above and below it. This suggests at most a moderate violation of linearity and homoscedasticity, which should not invalidate the analysis.

d. Conduct multiple regression using the Enter method. Evaluate the tolerance statistics. Is multicollinearity a problem?

Multicollinearity is not a problem because all tolerance statistics are greater than .1.

Descriptive Statistics (N = 609)
  Variable   Mean    Std. Deviation
  rincmdol   13.25    5.058
  age        39.45   11.547
  educ       14.25    2.587
  hrs1       42.88   14.059
  maeduc     11.81    2.802
  paeduc     11.65    3.862

Pearson Correlations (N = 609)
             rincmdol   age     educ    hrs1    maeduc   paeduc
  rincmdol    1.000     .270    .335    .522    .036     .050
  age         .270     1.000   -.017    .053   -.305    -.275
  educ        .335     -.017   1.000    .145    .321     .370
  hrs1        .522      .053    .145   1.000    .037     .049
  maeduc      .036     -.305    .321    .037   1.000     .578
  paeduc      .050     -.275    .370    .049    .578    1.000

The correlation table indicates that hours worked has the highest correlation with income (.522), and highest year of school completed has the second highest (.335). Mother's education (.036) and father's education (.050) have the lowest correlations. All variables were entered using the Enter method.

Model Summary (DV: respondent's income)
  R = .635, R Square = .404, Adjusted R Square = .399, Std. Error of the Estimate = 3.922

ANOVA (DV: respondent's income)
  Regression: SS = 6280.935, df = 5,   MS = 1256.187
  Residual:   SS = 9274.119, df = 603, MS = 15.380
  Total:      SS = 15555.054, df = 608
  F = 81.677, Sig. = .000

The ANOVA table shows that the model significantly predicts income; the F test of overall significance tells us that at least one variable is useful in predicting income, F(5, 603) = 81.677, p < .001. The coefficients table gives the estimates that define the regression equation.

[Collinearity Diagnostics: across the six dimensions, eigenvalues range from 5.716 down to .012 and condition indices from 1.000 to 21.772; the variance-proportion columns were not cleanly recoverable from the output.]

Residuals Statistics (N = 609)
  Predicted Value:      min 3.26,    max 23.15, mean 13.25, SD 3.214
  Residual:             min -15.487, max 8.673, mean .000,  SD 3.906
  Std. Predicted Value: min -3.106,  max 3.082, mean .000,  SD 1.000
  Std. Residual:        min -3.949,  max 2.211, mean .000,  SD .996

e. Does the model significantly predict rincmdol? Explain.

The results indicate that the model significantly predicts rincmdol: R² = .404 (moderate), adjusted R² = .399, F(5, 603) = 81.677, p < .001.

f. Which variables significantly predict rincmdol? Which variable is the best predictor of the DV?

The variables age (B = .110, Beta = .252, t = 7.485, p < .001), educ (B = .531, Beta = .271, t = 7.818, p < .001), and hrs1 (B = .169, Beta = .469, t = 14.741, p < .001) significantly predict the DV. The variable hrs1 is the best predictor of rincmdol, as indicated by its beta weight and the corresponding t and p values.

g. What percentage of variance in rincmdol is explained by the model?

The model accounted for 40.4% of the variance in rincmdol.

h. Write the regression equation for the standardized variables.

Including only the statistically significant predictors, the standardized equation is:

  z(income) = .252 z(age) + .271 z(educ) + .469 z(hrs1)

i. Explain why the variables of mother's and father's education are not significant predictors of rincmdol.

Neither maeduc nor paeduc is statistically significant (both p > .05). Their bivariate and partial correlation coefficients with the DV are very low.
Therefore, there is little evidence that these variables are important in explaining the DV.

[The attachment preview also includes front matter from the course textbook: Mertler, C. A., & Vannatta Reinhart, R. (2017). Advanced and Multivariate Statistical Methods: Practical Application and Interpretation (6th ed.). Routledge.]
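Several of the answers above rest on the tolerance statistic: the tolerance for predictor j is 1 - R² from regressing that predictor on the remaining predictors, and VIF is its reciprocal. A minimal sketch of that computation, using made-up data rather than the actual profile-b.sav values:

```python
# Tolerance / VIF sketch with illustrative (randomly generated) data.
# These are NOT the profile-b.sav numbers; the point is only the formula.
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.normal(40, 11, n)
educ = rng.normal(14, 2.6, n)
hrs1 = rng.normal(43, 14, n)
X = np.column_stack([age, educ, hrs1])

def tolerance(X, j):
    """1 - R^2 from regressing column j on the other columns (with intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1 - r2

for j, name in enumerate(["age", "educ", "hrs1"]):
    tol = tolerance(X, j)
    print(f"{name}: tolerance = {tol:.3f}, VIF = {1 / tol:.3f}")
```

With independent simulated predictors the tolerances come out near 1; in the exercises above, a tolerance below .1 would have signaled problematic multicollinearity.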
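The fitted equation from question 1e can be applied directly to score a new observation. The inputs below are hypothetical, and treating lnphone as the natural log of phones per 1,000 people is an assumption about the country-a.sav coding:

```python
# Applying the question 1e model: lngdp = 6.878 + .663*lnphone - .013*birthrat.
# The predictor values below are invented for illustration only.
import math

def predict_lngdp(lnphone: float, birthrat: float) -> float:
    """Predicted natural-log GDP from the two retained predictors."""
    return 6.878 + 0.663 * lnphone - 0.013 * birthrat

lnphone = math.log(250.0)   # hypothetical phones per 1,000 people, logged
birthrat = 20.0             # hypothetical births per 1,000 people
lngdp_hat = predict_lngdp(lnphone, birthrat)
gdp_hat = math.exp(lngdp_hat)   # back-transform to the original GDP scale
print(round(lngdp_hat, 3))  # 10.279
```

Because the DV is log-transformed, the prediction must be exponentiated to land back on the original GDP scale.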