EC 315 Park University Child Obesity Regression Analysis Paper

Anonymous
Asked: Mar 4th, 2019

Question Description

Please follow the attached example to the letter.

Unformatted Attachment Preview

EFFECTS OF SMOKING ON NON-ACCIDENTAL DEATH RATES

EC 315 – Quantitative Research Methods
Russ Miller
Fall II 2006

TABLE OF CONTENTS

BACKGROUND
REGRESSION ANALYSIS
CONCLUSIONS
BIBLIOGRAPHY
APPENDIX

I. Background

It is widely accepted that tobacco use presents serious health risks. In fact, the Center for Disease Control and Prevention states that “Tobacco use, including cigarette smoking, cigar smoking, and smokeless tobacco use, is the single leading preventable cause of death in the United States” (Center for Disease Control and Prevention, 2006). The purpose of this analysis is to determine the effect of tobacco use (SMOKE) on the non-accidental death rate (DEATH) while holding the effects of alcohol consumption (ALCOHOL), drug use (DRUG), and health insurance (INSUR) constant. This study uses cross-sectional data from the 50 states for the combined 2002-2003 period.

The model (less constant and coefficients) is:

DEATH = SMOKE + ALCOHOL + DRUG - INSUR

The dependent variable, DEATH, is defined as the death rate per 100,000 population for major causes of death in the United States, excluding non-health-related causes such as automobile accidents, homicide, etc., and is extracted from the National Vital Statistics Reports (2006).

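Written out with the constant, the coefficients, and an error term that the shorthand above omits, the specification would take the usual linear form. The symbols \beta_0 through \beta_4, \varepsilon, and the observation index i are notation introduced here for illustration; they do not appear in the paper.

```latex
\mathrm{DEATH}_i = \beta_0
  + \beta_1\,\mathrm{SMOKE}_i
  + \beta_2\,\mathrm{ALCOHOL}_i
  + \beta_3\,\mathrm{DRUG}_i
  + \beta_4\,\mathrm{INSUR}_i
  + \varepsilon_i
```

Each \beta_j is then read as the change in DEATH associated with a one-unit change in that variable, holding the other three constant.
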
Data for SMOKE, ALCOHOL, and DRUG were taken from the Census Bureau’s Statistical Abstract (2006) and are based on results from the National Survey on Drug Use and Health (NSDUH). SMOKE is defined as the number of people over 12 years of age (in thousands) who had smoked a cigarette at least once in the month prior to the study. ALCOHOL and DRUG are similarly defined, with ALCOHOL representing binge drinking and DRUG representing the use of any illicit drug. These three variables were selected because the CDC has stated in the Morbidity and Mortality Weekly Report that “tobacco, alcohol, and other drug use is associated with the leading causes of morbidity and mortality…” (Center for Disease Control and Prevention, 1992). The relationships between DEATH and SMOKE, ALCOHOL, and DRUG should all be positive, since the use of tobacco, drugs, and alcohol is harmful to health.

The independent variable INSUR is defined as the number of people (in thousands) not having insurance, and the data for this variable were also extracted from the Census Bureau’s Statistical Abstract (2006). This variable was selected because an MIT Sloan study indicated that automobile accident victims without health insurance were more likely to die than their insured counterparts because of differences in the medical treatment received (MIT Sloan Management [MIT], 2003). Although this study deals with non-accidental deaths rather than automobile accident victims, it is assumed that the implied lower standard of care for uninsured patients may occur for other causes of death as well. The relationship between DEATH and INSUR should be positive, since not having insurance has been linked to lower-quality health care.

II. Regression Analysis

The model was regressed, and the results are shown in Table 1.

Table 1. Original Regression Results

Dependent Variable: DEATH

Independent Variables   Coefficients   t Statistic   P-Value
SMOKE                   25.4186        7.1866        0.0000
ALCOHOL                 -5.9164        -1.0634       0.2932
DRUG                    37.3578        4.1175        0.0002
INSUR                   -2.1520        -1.1743       0.2463

Adjusted R2 = 0.9802    n = 51

You should now discuss these results: is your R2 good, bad, etc.? What percentage of the variation is explained by the regression? Are the coefficients (signs) as you expected? Which variables are statistically significant?

Next, test for multicollinearity. Insert the correlation matrix and comment on the results, and then compute the Variance Inflation Factors.

Table 2. Cross Correlation Matrix

Independent Variables   SMOKE   ALCOHOL   DRUG   INSUR
SMOKE                   X
ALCOHOL                         X
DRUG                                      X
INSUR                                            X

The rule of thumb for the cross correlation is that all coefficients should be between -0.7 and +0.7. Values in the above table outside that range are problematic. Comment on your specific results.

Variance Inflation Factors

SMOKE   ALCOHOL   DRUG   INSUR

The rule of thumb for VIFs is that they should be less than 10. Comment on your specific results.

Based on your findings above regarding the statistical significance of the independent variables and any multicollinearity issues, attempt to improve the regression by removing independent variables if appropriate. Remove only one at a time and see if the regression is better or worse. Try various combinations as appropriate. If all of your variables are statistically significant and there are no multicollinearity problems, try lagging (time-series data only) or logging the model and see if you can improve the regression. Even if you do end up removing some independent variables, you can still try lagging or logging for additional improvement.

After all attempts to improve the regression, compare the “original” and “final” regressions and discuss the results.

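The multicollinearity checks described above (the cross correlation matrix and the Variance Inflation Factors) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the template's own method, which relies on Excel: the function names are hypothetical, the data are randomly generated stand-ins rather than the paper's data, and the VIFs are read off the diagonal of the inverse correlation matrix, which is equivalent to computing 1 / (1 - R_j^2) from a regression of each variable on the others.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_matrix(columns):
    """Cross correlation matrix for a list of data columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), taken from the diagonal of the inverse
    correlation matrix of the independent variables."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Made-up data for 51 hypothetical observations; NOT the paper's data.
rng = random.Random(0)
population = [rng.uniform(500, 30000) for _ in range(51)]
names = ["SMOKE", "ALCOHOL", "DRUG", "INSUR"]
shares = [0.25, 0.22, 0.08, 0.15]  # hypothetical shares of population
columns = [[s * p * rng.gauss(1.0, 0.1) for p in population] for s in shares]

corr = correlation_matrix(columns)
vifs = variance_inflation_factors(columns)

# Apply the two rules of thumb from the text.
for i, a in enumerate(names):
    for j in range(i + 1, len(names)):
        if abs(corr[i][j]) > 0.7:
            print(f"{a}/{names[j]}: r = {corr[i][j]:.3f} is outside -0.7..+0.7")
for name, v in zip(names, vifs):
    print(f"{name}: VIF = {v:.2f}", "(10 or more)" if v >= 10 else "(below 10)")
```

Because every made-up column here scales with the same population figure, the correlations are expected to be high. Counts measured in thousands of people per state tend to behave the same way, so both checks are worth running before interpreting individual coefficients.
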
                        Original Regression         Final Regression
                        Adjusted R2 = 0.9802        Adjusted R2 = 0.9901
Independent Variables   Coefficient   P-Value       Coefficient   P-Value       Comments
SMOKE
ALCOHOL
DRUG
INSUR

Explain why you selected the final regression that you did: why is it better? Were there any trade-offs?

III. Conclusions

List and discuss your final model.

DEATH = -858.13 + 25.42*SMOKE - 5.92*ALCOHOL + 37.36*DRUG - 2.15*INSUR

What type of relationship did you establish between your primary independent variable and the dependent variable (e.g., strong positive, strong negative, weak positive, moderate positive, none, etc.)? Based on your final regression, quantify the impact of a change in your primary independent variable (e.g., “The SMOKE coefficient of 25.42 indicates that for every 1,000 additional smokers approximately 25 additional deaths would occur.”) If your regression was not very good, what are some possible explanations for the poor fit? If your regression is near perfect, why is that? If you were to research this topic further, what would you change, etc.?

References

Center for Disease Control and Prevention, United States Department of Health and Human Services. (Last reviewed August 3, 2006). Healthy Youth! Health Topics: Tobacco Use. Retrieved December 2, 2006 from http://www.cdc.gov/HealthyYouth/tobacco/index.htm

Center for Disease Control and Prevention, United States Department of Health and Human Services. (1992, September 18). Morbidity and Mortality Weekly Report: Tobacco, Alcohol, and Other Drug Use Among High School Students – United States, 1991. Retrieved December 2, 2006 from http://www.cdc.gov/mmwr/preview/mmwrhtml/00017652.htm

Center for Disease Control and Prevention, United States Department of Health and Human Services. (2006, April 19). National Vital Statistics Report, Volume 54, Number 13, Table 29. Retrieved December 2, 2006 from http://www.cdc.gov/nchs/data/nvsr/nvsr54/nvsr54_13.pdf

MIT Sloan Management News Room Press Releases. (2003, January 22). Uninsured auto crash victims face 37% higher death rate, says MIT Sloan study. Retrieved December 2, 2006 from http://mitsloan.mit.edu/newsroom/2003-doyle.php

United States Census Bureau. (n.d.). The 2006 Statistical Abstract, Table 195: Estimated Use of Selected Drugs by State: 2002-2003. Retrieved December 2, 2006 from http://www.census.gov/compendia/statab/health_nutrition/health_risk_factors/

United States Census Bureau. (n.d.). The 2006 Statistical Abstract, Table 143: Persons With and Without Health Insurance Coverage By State: 2003. Retrieved December 2, 2006 from http://www.census.gov/compendia/statab/health_nutrition/health_insurance/

Appendix

1. Excel Summary output for original regression:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9908
R Square            0.9818
Adjusted R Square   0.9802
Standard Error      5247.6630
Observations        51

[Figure: NO_INS residual plot]

ANOVA
            df  SS           MS           F           Significance F
Regression  4   68271117105  17067779276  619.790823  2.30853E-39
Residual    46  1266746485   27537967.07
Total       50  69537863591

           Coefficients  Standard Error  t Stat   P-value  Lower 95%   Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  -858.1337     1134.6000       -0.7563  0.4533   -3141.9628  1425.6954  -3141.9628   1425.6954
NO_INS     -2.1520       1.8327          -1.1743  0.2463   -5.8410     1.5369     -5.8410      1.5369
TOBACCO    25.4186       3.5370          7.1866   0.0000   18.2991     32.5382    18.2991      32.5382
DRUG       37.3578       9.0729          4.1175   0.0002   19.0950     55.6206    19.0950      55.6206
ALCOHOL    -5.9164       5.5638          -1.0634  0.2932   -17.1156    5.2829     -17.1156     5.2829

2. Excel Summary output for final regression:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9919
R Square            0.9839
Adjusted R Square   0.9825
Standard Error      0.1417
Observations        51

ANOVA
            df  SS           MS           F            Significance F
Regression  4   56.57493941  14.14373485  704.3991372  1.28213E-40
Residual    46  0.923640829  0.020079148
Total       50  57.49858024

           Coefficients  Standard Error  t Stat   P-value  Lower 95%   Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  3.0154        0.1655          18.2249  0.0000   2.6824      3.3485     2.6824       3.3485
LTOBACCO   1.0707        0.1363          7.8548   0.0000   0.7963      1.3451     0.7963       1.3451
LDRUG      -0.1750       0.1248          -1.4018  0.1677   -0.4262     0.0763     -0.4262      0.0763
LALCOHOL   0.1581        0.1398          1.1306   0.2641   -0.1234     0.4396     -0.1234      0.4396
LNO_INS    -0.0298       0.0821          -0.3625  0.7186   -0.1951     0.1356     -0.1951      0.1356

...
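
Two of the figures in the summary outputs above can be recomputed by hand as a sanity check: Adjusted R Square from R Square, and each t Stat from its coefficient and standard error. The sketch below uses the standard textbook formulas with the rounded values printed in the appendix; the function names are hypothetical, and the predictor count of 4 is taken from the regression degrees of freedom in the ANOVA tables.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def t_statistic(coefficient, standard_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


# Rounded values copied from the two Excel summary outputs in the appendix.
original_adj = adjusted_r_squared(0.9818, n_obs=51, n_predictors=4)
final_adj = adjusted_r_squared(0.9839, n_obs=51, n_predictors=4)
tobacco_t = t_statistic(25.4186, 3.5370)    # TOBACCO, original regression
ltobacco_t = t_statistic(1.0707, 0.1363)    # LTOBACCO, final regression

print(f"Original adjusted R2: {original_adj:.4f}")
print(f"Final adjusted R2:    {final_adj:.4f}")
print(f"TOBACCO t Stat:       {tobacco_t:.3f}")
print(f"LTOBACCO t Stat:      {ltobacco_t:.3f}")
```

With the rounded inputs, the adjusted values come out to 0.9802 and 0.9825, matching the Regression Statistics blocks. The t statistics agree with the printed ones to about three decimal places, since the coefficients and standard errors are themselves rounded.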

Tutor Answer

Thomas574
School: New York University

Hello, I'm done, I have att...

Review

Anonymous
Good stuff. Would use again.
