[Scatter plot: Pct Alcohol (Y) versus Calories (X) for the 139 sampled beers, with fitted trendline y = 0.0275x + 0.9526 and R² = 0.814. X axis: 0 to 350 calories; Y axis: 0 to 12 percent alcohol.]
[Data columns: paired Calories and Pct Alcohol values for the 139 sampled beers, the source data for the scatter plot above; the complete data set appears in the brand table later in this file.]
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9022
R Square             0.8140
Adjusted R Square    0.8126
Standard Error       0.5642
Observations         139

ANOVA
             df    SS        MS        F         Significance F
Regression   1     190.8133  190.8133  599.5098  0.0000
Residual     137   43.6047   0.3183
Total        138   234.4180

           Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept  0.9526        0.1777          5.3620   0.0000   0.6013     1.3039
Calories   0.0275        0.0011          24.4849  0.0000   0.0253     0.0297

Calculations
b1, b0 Coefficients         0.0275    0.9526
b1, b0 Standard Error       0.0011    0.1777
R Square, Standard Error    0.8140    0.5642
F, Residual df              599.5098  137.0000
Regression SS, Residual SS  190.8133  43.6047
Confidence level            95%
t Critical Value            1.9774
Half Width b0               0.3513
Half Width b1               0.0022
Lower 95% (b0, b1)          0.6013    0.0253
Upper 95% (b0, b1)          1.3039    0.0297
[Data column: Calories for the 139 sampled beers, used for the descriptive statistics below.]
Calories
Mean                 152.3165468
Median               150
Mode                 110
Minimum              55
Maximum              330
Range                275
Variance             1828.0005
Standard Deviation   42.7551
Coeff. of Variation  28.07%
Skewness             1.1860
Kurtosis             3.3511
Count                139
Standard Error       3.6264
[Data columns: Pct Alcohol split by distribution type, National (78 values) and Regional (61 values), used for the descriptive statistics below.]
Percent Alcohol and Dist. Type
                     National     Regional
Mean                 4.935897436  5.404918033
Median               4.7          4.9
Mode                 4.2          4.7
Minimum              0.4          3.8
Maximum              9.6          10.5
Range                9.2          6.7
Variance             1.4327       1.9428
Standard Deviation   1.1970       1.3938
Coeff. of Variation  24.25%       25.79%
Skewness             0.4212       2.1506
Kurtosis             4.8372       4.8715
Count                78           61
Standard Error       0.1355       0.1785
[Data column: Pct Alcohol for the 139 sampled beers, used for the descriptive statistics below.]
Percent Alcohol
                     Pct Alcohol
Mean                 5.141726619
Median               4.9
Mode                 4.7
Minimum              0.4
Maximum              10.5
Range                10.1
Variance             1.6987
Standard Deviation   1.3033
Coeff. of Variation  25.35%
Skewness             1.3832
Kurtosis             5.3178
Count                139
Standard Error       0.1105
Percent Alcohol
Five-Number Summary
Minimum         0.4
First Quartile  4.4
Median          4.9
Third Quartile  5.6
Maximum         10.5

[Boxplot: Pct Alcohol, scale 0 to 12.]
Percent Alcohol and Dist. Type
Five-Number Summary
                National  Regional
Minimum         0.4       3.8
First Quartile  4.2       4.7
Median          4.7       4.9
Third Quartile  5.6       5.7
Maximum         9.6       10.5

[Boxplots: Percent Alcohol by Dist. Type (National and Regional), scale 0 to 12.]
Calories
Five-Number Summary
Minimum         55
First Quartile  124
Median          150
Third Quartile  166
Maximum         330

[Boxplot: Calories, scale 50 to 350.]
[Data column: DistType for the 139 sampled beers (78 National, 61 Regional), followed by a bar chart of counts by distribution type, scale 0 to 90.]

Count of DistType
DistType     Total
National     78
Regional     61
Grand Total  139
[Bar chart: counts of beers by Dist. Type and Light (Yes/No), scale 0 to 60.]

Count of DistType
DistType     Light: No  Light: Yes  Grand Total
National     53         25          78
Regional     55         6           61
Grand Total  108        31          139
[Data sheet: all 139 sampled beers, one row per brand, with columns Pct Alcohol, Calories, Carbohydrates, DistTypeCODE (1 = National, 0 = Regional), DistType, and Light. The first rows are shown below; the sheet runs from Anchor Steam through Yuengling Premium Beer.]

Brand                         Pct Alcohol  Calories  Carbohydrates  DistTypeCODE  DistType  Light
Anchor Steam                  4.9          153       16.0           1             National  No
Anheuser Busch Natural Ice    5.9          157       8.9            1             National  No
Anheuser Busch Natural Light  4.2          95        3.2            1             National  Yes
Bud Dry                       5.0          130       7.8            1             National  No
Bud Ice                       5.5          123       8.9            1             National  No
[134 additional rows]
Frequency Distribution for Pct Alcohol

bins   Midpts.  Frequency  Percentage
-0.01  --       0          0.0%
0.99   0.5      1          0.7%
1.99   1.5      0          0.0%
2.99   2.5      2          1.4%
3.99   3.5      3          2.2%
4.99   4.5      72         51.8%
5.99   5.5      41         29.5%
6.99   6.5      9          6.5%
7.99   7.5      4          2.9%
8.99   8.5      4          2.9%
9.99   9.5      1          0.7%
10.99  10.5     2          1.4%
Total           139        100.0%
Beer Study
(A random sample of 139 beers.)
Scenario
You are employed as a research assistant at the Alcohol and Tobacco Tax and Trade Bureau (U.S.
Department of the Treasury), and your supervisor, R. Tyler Paterson, asks you to study the characteristics
of beers sold throughout the United States. For this purpose you take a sample of 139 beers.
For each beer you collect data on several variables (see the Variable INFO tab), but for the reports you will
be preparing you decide to focus on one key numerical variable, "Pct Alcohol" (i.e., the alcoholic content
in percentage), and one key categorical variable, "Light" (i.e., whether or not the beer product is
considered to be light). You also decide to use a grouping categorical variable, "Distribution Type," since
each beer is distributed nationally or regionally. This will enable you to compare both the percent
alcohol of the beer and whether or not the beer is considered to be light based on its distribution
(national or regional). In addition, you have also selected the numerical variable "Calories" in the beer to
develop a simple linear regression model to predict "Pct Alcohol."
Introduction to Simple Linear Regression Modeling
(Prepared by Mark L. Berenson)
This chapter is an introduction to regression analysis modeling techniques that enable you
to use a numerical independent variable to predict the values of a numerical dependent
variable of interest. For example, the placement officer at your university can predict the
expected starting salary (in thousands of dollars) of a graduating business student by
developing a simple linear regression model that uses cumulative grade point average as
the numerical independent variable.
Regression analysis is fundamental to business decision-making because it involves prediction, estimation, and forecasting (three words used here synonymously). Below are some examples of practical uses of regression analysis:
• An investment analyst can estimate your credit score rating based on current salary.
• A family doctor can forecast your relative's survivability from surgery based on hours in surgery.
• A financial analyst can predict your company's sustainability based on revenues generated through the year.
• A real estate agent can estimate the value of your house (in dollars) based on its size in square feet.
• The admissions director of an MBA program can forecast your chances of success by estimating your graduate grade point average based on your GMAT score.
In a regression analysis, the dependent variable, given by the symbol Y, is the numerical
variable of interest that you want to predict. The dependent variable is often referred to as
the response variable. The independent variable, given by the symbol X, is the numerical
variable used to make the prediction. The independent variable is often referred to as the
predictor or explanatory variable.
1 Developing the Simple Linear Regression Model
A regression analysis begins by visually observing the relationship between the two
numerical variables in a scatter plot. Therefore, to develop a simple linear regression
model you need two numerical measurements on each item in your sample, such as the
expected starting salary of a graduating student and his/her cumulative grade point
average, or the selling price of a house and its size in square feet. Each pair of
measurements is plotted such that the dependent variable of interest is on the vertical or Y
axis, and the independent or predictor variable is on the horizontal or X axis.
Figure 1 shows scatter plots demonstrating both strong and weak linear relationships.
FIGURE 1 Scatter plots of strong and weak linear relationships
In the top two panels of Figure 1 you observe positive relationships between X and Y; in
the bottom two panels you see negative relationships between X and Y.
Sometimes the cloud of points will appear to follow a curved pattern instead of a straight-line pattern. In such circumstances be sure to consult with a professional statistician; curvilinear regression analysis is outside the scope of this course, which focuses on simple linear regression analysis.
Figure 2 depicts scatter plots with linear and curvilinear relationships.
FIGURE 2 Scatter plots of linear and curvilinear relationships
To demonstrate the development of a simple linear regression model, Figure 3 displays part of an Excel worksheet of a data file constructed at Bergen University ("BU"). Figure 4 is the scatter plot representing the test scores achieved by the sample of 93 business students based on the number of hours the students claimed to have studied for their comprehensive (i.e., all topics covered in the semester) final exam in their core-required operations management course. You can see that the cloud of points plots upward and toward the right. Any straight line drawn through these points would therefore indicate a positive relationship between X and Y: a positive correlation and a positive slope.
FIGURE 3 Excel worksheet of the Bergen University data file containing
93 students and displaying student ID number,
test scores, and hours studied
ID Number  Test Scores  Hours Studied
ID0001     66           8.0
ID0002     72           7.5
ID0003     87           9.5
ID0004     55           2.0
ID0005     64           5.0
ID0006     83           9.5
ID0007     99           11.0
ID0008     79           9.0
:          :            :
ID0086     71           7.5
ID0087     75           8.5
ID0088     94           14.0
ID0089     39           0.0
ID0090     77           8.5
ID0091     70           6.5
ID0092     62           4.0
ID0093     80           8.5
FIGURE 4 Scatter plot of test scores with hours studied
The Simple Linear Regression Equation
The straight line developed from a sample of data is described by two statistics, the sample slope $b_1$ and the sample Y intercept $b_0$. The simple linear regression equation or prediction line is given by:

$$\hat{Y}_i = b_0 + b_1 X_i$$

where
$\hat{Y}_i$ is the predicted value of Y for observation i
$X_i$ is the value of X for observation i
$b_0$ is the sample Y intercept
$b_1$ is the sample slope
The Y intercept $b_0$ represents the mean or average value of Y when X equals 0. It is a necessary component of the simple linear regression equation, but it doesn't always have practical value. For instance, it would not be meaningful to predict the selling price of a house that has 0 square feet; the house doesn't exist! Similarly, it would not be meaningful to predict the expected starting salary of a graduating senior based on a cumulative grade point average of 0.0; that student would have flunked out with a straight F average and not be graduating!
The Y intercept $b_0$ is given by:

$$b_0 = \bar{Y} - b_1 \bar{X}$$

It is the difference between the mean or average value of Y and the product of the slope with the mean or average value of X. Therefore, if using a hand-held calculator or programming in Excel, the slope would have to be computed first.
The slope of the line b1 represents the mean or average amount that Y changes, either
positively or negatively, as a result of a one-unit change in X. Therefore, the slope will
be positive if the points on the scatter plot indicate that as X gets larger Y typically also
becomes larger. On the other hand, the slope will be negative if the points on the scatter
plot indicate that as X gets larger Y usually becomes smaller.
The slope $b_1$ is given by:

$$b_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$
The numerator term is the summation of the product of the difference between each
observation of the independent variable X and its mean with the difference between each
corresponding observation of the dependent variable Y and its mean. This result can be
positive or negative. The denominator term is the summation of the squared differences
between each observation of the independent variable X and its mean. This result can
only be positive. Therefore the slope b1 can be positive or negative.
If you are using Excel, it is easy to obtain $\bar{X}$ and $\bar{Y}$, the means of the respective columns of X and Y, and then compute columns of differences between each X observation and $\bar{X}$ as well as each Y observation and $\bar{Y}$. Once this is accomplished you can compute a column of the product of the latter two columns and, in summing the results, you will have obtained the numerator in the equation for the slope. You can then use your column of differences between each X observation and $\bar{X}$ to form a column of squared differences. In summing the results, you will have obtained the denominator in the equation for the slope. To simplify and expedite matters, however, you could use PHStat to develop your entire simple linear regression model.
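To make the recipe concrete, here is a minimal Python sketch (added for illustration; it is not part of the original chapter) that computes the slope and intercept exactly as described, using the first five student records from the Figure 3 worksheet as sample data:

```python
# A minimal sketch: computing the sample slope b1 and Y intercept b0
# by the column-of-differences recipe described above.
# Data: the first five student records shown in Figure 3.

hours = [8.0, 7.5, 9.5, 2.0, 5.0]    # X: hours studied
scores = [66, 72, 87, 55, 64]        # Y: test scores

n = len(hours)
x_bar = sum(hours) / n               # mean of X
y_bar = sum(scores) / n              # mean of Y

# Numerator: sum of (Xi - Xbar)(Yi - Ybar); denominator: sum of (Xi - Xbar)^2
numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
denominator = sum((x - x_bar) ** 2 for x in hours)

b1 = numerator / denominator         # sample slope
b0 = y_bar - b1 * x_bar              # intercept: the slope is computed first

print(f"prediction line: Y-hat = {b0:.4f} + {b1:.4f} X")
```

Applied to the full 93-record data set, this same computation yields the coefficients reported in Figure 5.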
To demonstrate the development of a simple linear regression equation, suppose you are working as a research assistant to the chairperson of the Operations Management Department at Bergen University ("BU"). She wants to predict the test score that will be attained on a comprehensive (i.e., all topics covered in the semester) final examination based on the amount of time (in hours) that a student claims to have studied for that exam in this core-required course. The scatter plot of Figure 4, representing the test scores earned and the study hours claimed by a sample of 93 business students, indicates the slope will be positive: more study time should result in a higher grade. This scatter plot was created with PHStat. The PHStat printout shown in Figure 5 presents the developed simple linear regression model.
FIGURE 5 PHStat simple linear regression model of test scores by hours
studied
Regression Analysis of Test Scores by Hours Studied

Regression Statistics
Multiple R           0.9220
R Square             0.8500
Adjusted R Square    0.8483
Standard Error       3.9958
Observations         93

ANOVA
            df   SS         MS         F         Significance F
Regression  1    8233.0292  8233.0292  515.6524  0.0000
Residual    91   1452.9278  15.9662
Total       92   9685.9570

               Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept      48.7239       1.1474          42.4637  0.0000   46.4447    51.0031
Hours Studied  3.2988        0.1453          22.7080  0.0000   3.0102     3.5874

Note: Only the highlighted portions of the PHStat printout are important in this course.
In the lower left corner of the PHStat worksheet displayed in Figure 5, you obtain the sample Y intercept $b_0$ and the sample slope $b_1$ for the predictor variable Hours Studied under the Coefficients column. Your sample linear regression equation, or prediction line, is:

$$\hat{Y}_i = 48.7239 + 3.2988 X_i$$
Interpreting the Y Intercept and Slope
Note that the Y intercept $b_0$ is measured in the same units as the dependent variable Y, while the slope $b_1$ is measured in units of Y per one unit of X.
The Y intercept $b_0$ = 48.7239 indicates that if a student does not study for the final exam (that is, X = 0), the test score is predicted on average to be 48.7239, a failing grade. In this situation, the Y intercept is a meaningful prediction because it is plausible that a student, perhaps unwisely, will decide not to study. In fact, one of the 93 students in the sample did not study and scored a 39, a failing grade almost 10 points worse than what on average would be predicted!
The slope $b_1$ = +3.2988 indicates that for each additional one hour of study time, the average predicted change in the test score would be an increase of 3.2988 points.
Using the Sample Regression Equation for Prediction
When using the sample regression equation for prediction, you should only select values of the independent variable X within its relevant range, i.e., choose only values of X between the smallest and largest X that were used in developing the regression equation. Do not extrapolate beyond this relevant range.
In the sample of n = 93 Bergen University graduating business students, the predictor variable hours studied ranged from 0 to 15. The prediction line you developed could be used to predict the test score of any student claiming to have studied within that interval. Therefore, if, owing to an examination conflict, a student takes this (or a comparable) exam three days later and tells you he studied 10 hours, you would be able to predict that on average his test score is expected to be:

$$\hat{Y}_i = 48.7239 + 3.2988(10) = 48.7239 + 32.9880 = 81.7119$$

Therefore, a student who studies 10 hours for this exam could be predicted on average to score approximately 82.
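The prediction step, together with the relevant-range caution, can be expressed as a short Python sketch (an added illustration; the function name and the range check are this sketch's own, not from PHStat):

```python
# A minimal sketch: prediction with a relevant-range guard.
# Coefficients are those reported in Figure 5; the range 0 to 15 hours
# is the observed span of the predictor in the BU sample.

B0, B1 = 48.7239, 3.2988
X_MIN, X_MAX = 0.0, 15.0

def predict_score(hours_studied: float) -> float:
    """Predict a test score, refusing to extrapolate beyond the relevant range."""
    if not (X_MIN <= hours_studied <= X_MAX):
        raise ValueError("hours studied is outside the relevant range")
    return B0 + B1 * hours_studied

print(predict_score(10))   # 81.7119, i.e., approximately 82
```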
The scatter plot and the accompanying prediction line are depicted in Figure 6. Note that as the values of X get larger (i.e., more study time hours), the prediction line passes through the points in an upward direction (i.e., a prediction of a higher test score), so the slope $b_1$ is clearly positive. Also note that the points in the scatter plot distribute in a linear (rather than curvilinear) manner, so fitting a straight line to the data (rather than some curve) is appropriate. Not all the observed data points lie on a perfectly straight line, however; there is a scattering of points above and below the line, so the simple linear regression model developed is not a perfect fit. The next section reviews the various descriptive summary measures that enable you to assess how much variation exists around the fitted prediction line and to express how strong (or weak) your simple linear regression equation is.
FIGURE 6 Scatter plot and prediction line for test scores with hours
studied
2 Measures of Variation in Regression
From the Bergen University data file, a portion of which is extracted in the Excel worksheet of Figure 3, there is evidence of considerable variability in your numerical variable of interest, the test scores achieved by the 93 business students. Figures 7 through 10 depict the five-number summary, the boxplot, the stem-and-leaf display, and a set of descriptive summary statistics of the test scores. You can gain a better understanding of the characteristics of your dependent variable Y by observing its properties of central tendency, variation, and shape.
FIGURE 7 Five-Number Summary of Test Scores Obtained from PHStat

Five-Number Summary of Test Scores
Minimum         39.00
First Quartile  67.00
Median          72.00
Third Quartile  79.50
Maximum         99.00
FIGURE 8 Boxplot of Test Scores Obtained from PHStat
FIGURE 9 Stem-and-Leaf Display of Test Scores Obtained from PHStat
Stem & Leaf Test Scores
Stem unit: 10

Statistics
Sample Size     93
Mean            73.02
Median          72.00
Std. Deviation  10.26
Minimum         39.00
Maximum         99.00

 3 | 9
 4 | 7
 5 | 058
 6 | 11223334445566677777889999
 7 | 000000111111111222222333345555567788999
 8 | 00002233334456679
 9 | 014689
10 |
The test score data are very slightly left-skewed in shape. There is one distinct outlier: the test score of 39 earned by the student with ID0089, as seen in Figure 3. The Z score criterion does not flag any other test score as an outlier; all other Z values are between -3.00 and +3.00. On the other hand, the quartiles and interquartile range criterion signals that the poor exam grade of 47 and the truly excellent test score of 99 are also possible outliers.
The bottom line, however, from this assessment of your numerical random variable of interest is that test scores vary from very poor (39) to truly excellent (99) with a clustering of values in the low seventies (i.e., the mean is 73.02 and the median is 72).
FIGURE 10 Descriptive Summary Statistics for Test Scores Obtained from PHStat

Descriptive Analysis: Test Scores
Mean                 73.02
Median               72.00
Mode                 71.00
Minimum              39.00
Maximum              99.00
Range                60.00
Variance             105.28
Standard Deviation   10.26
Coeff. of Variation  0.14
Skewness             -0.05
Kurtosis             1.26
Count                93
Standard Error       1.06

Note: Only the highlighted portions of the PHStat printout are important in this course.
The question now is how much of the observed variation in the dependent variable test score can be explained or accounted for by building a simple linear regression model that uses claimed hours of study as a numerical independent or predictor variable of test score.
The total variation in the dependent variable Y can be divided into two parts: the "good" or explained variation that is due to the simple linear regression model you have developed, and the "bad" or unexplained variation that may be due either to an unaccounted-for curvilinear relationship with the independent variable, to other possible predictor variables not considered in the model, or simply to naturally occurring random variation.
With respect to the Bergen University sample, the total variation in test scores consists of two parts: the explained variation attributable to using the numerical independent variable hours studied in developing a simple linear regression model to predict test score, and the unexplained or residual error variation that may be due to one or more of several factors. Factors that may contribute to residual error include: an unaccounted-for curvilinear relationship with the numerical independent variable hours studied; other unaccounted-for numerical predictor variables such as hours of sleep the night before the exam or ability in mathematics; unaccounted-for categorical predictor variables such as gender, major, or interest in the subject; or simply random variation, such as why a student taking the same test at 9 a.m. might score a few points more or less than if the test was taken at 8 a.m.
The total variation in Y, the dependent variable of interest, is obtained as the summation of the squared difference between each of the $Y_i$ observations in the sample and the mean of the sample $\bar{Y}$. That is,

$$\text{Total Variation} = SST \text{ or Sum of Squares Total} = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$$

The total variation is often referred to as SST, the sum of squares "total." You should recognize this formula as similar to the numerator in computing the variance $S^2$ and standard deviation $S$ for a numerical variable X that you learned in your basic statistics course.
In this regression analysis, total variation represents the sum of squared differences between each student's test score and the average of all 93 students' test scores in the Bergen University sample. From the Total row of the SS column in the ANOVA (i.e., "analysis of variance") table presented in the PHStat printout in Figure 5, note that the total variation is 9685.9570 squared test score points.
The explained variation in Y, the part of the total variation in Y that is accounted for by the developed simple linear regression model, is obtained as the summation of the squared difference between each of the $\hat{Y}_i$ predicted observations in the sample and the mean of the sample $\bar{Y}$. That is,

$$\text{Explained Variation} = SSR \text{ or Sum of Squares Regression} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$$

The explained variation is often referred to as SSR, the sum of squares "regression."
In this regression analysis, explained variation represents the sum of squared differences between each student's predicted test score and the average of all 93 students' test scores in the Bergen University sample. From the Regression row of the SS column in the ANOVA table displayed in the PHStat printout in Figure 5, note that the explained variation is 8233.0292 squared test score points. Each of the 93 predicted test scores ($\hat{Y}_i$) is obtained by substituting the 93 students' individual values of $X_i$ (i.e., hours studied) into the simple linear regression equation $\hat{Y}_i = 48.7239 + 3.2988 X_i$.
The unexplained variation in Y, the part of the total variation in Y that is not accounted for by the developed simple linear regression model, is obtained as the summation of the squared difference between each of the actual $Y_i$ observations and their corresponding predicted observations $\hat{Y}_i$ in the sample of size n. That is,

$$\text{Unexplained Variation} = SSE \text{ or Sum of Squares Residual Error} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$$

The unexplained variation is often referred to as SSE, the sum of squares "residual error."
In this regression analysis, unexplained variation represents the sum of squared differences between each student's actual test score and predicted test score. From the Residual row of the SS column in the ANOVA table displayed in the PHStat printout in Figure 5, note that the unexplained variation is 1452.9278 squared test score points.
The unexplained variation is a measure of how "off" the predicted values are from the actual values. From Figure 6, you can see that most of the observations are close to the prediction line and therefore the predicted test scores are, for most students, fairly close to their actual test scores.
In a perfectly fitting model, all the observed Y values would lie on the fitted prediction
line and there would be no scatter or residual error above and below the line. In such a
situation, the explained variation SSR would equal total variation SST and the
unexplained variation SSE would be 0. On the other hand, if the chosen predictor
variable X is completely independent of the dependent variable Y then the unexplained
variation SSE would equal the total variation SST and explained variation SSR would be
0. Given that SSR, SSE, and SST are each obtained by summing a set of squared
observations, it is impossible for SSR, SSE, or SST to ever have a negative result.
To summarize,

$$SST = SSR + SSE$$

$$\sum_{i=1}^{n}(Y_i - \bar{Y})^2 = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$$

and, in this regression analysis,

$$9685.9570 = 8233.0292 + 1452.9278$$
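This decomposition can be verified numerically. The following Python sketch (added for illustration, with made-up sample data) fits the least-squares line and checks that SST = SSR + SSE; the identity holds exactly only for the least-squares line fitted to the same data:

```python
# A minimal sketch: verifying the decomposition SST = SSR + SSE
# for a least-squares line. The sample data here are illustrative.

hours = [8.0, 7.5, 9.5, 2.0, 5.0]              # X values
scores = [66, 72, 87, 55, 64]                  # observed Y values

n = len(hours)
x_bar, y_bar = sum(hours) / n, sum(scores) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
      / sum((x - x_bar) ** 2 for x in hours))
b0 = y_bar - b1 * x_bar                        # the least-squares fit

y_hat = [b0 + b1 * x for x in hours]           # predicted values
sst = sum((y - y_bar) ** 2 for y in scores)    # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)   # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(scores, y_hat))  # unexplained variation

print(round(sst, 4), round(ssr + sse, 4))      # the two values agree
```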
Figure 11 displays the different measures of variation for a particular student in the Bergen University sample. This student claims to have studied 11 hours for the exam and achieved the truly excellent grade of 99. Also plotted here are the prediction line $\hat{Y}_i = 48.7239 + 3.2988 X_i$ and the horizontal line at a height of 73.02 representing $\bar{Y}$, the mean test score on this final examination.
From Figure 11 you observe that this student scored 25.98 points higher on the test than the mean of all 93 student performances. The square of this difference represents this student's contribution to SST, the total variation. Moreover, you observe that 85.01, the predicted score for this student and others who also study 11 hours, is, on average, 11.99 points higher on the test than the mean of all 93 student performances. The square of this difference represents this student's contribution to SSR, the explained variation.
Furthermore, you also observe that this overachieving student scored 13.99 points higher on the test than what would be predicted on average for all students who study 11 hours. The square of this difference represents this student's contribution to SSE, the unexplained variation.
Combining this student's three calculations with similar calculations for the test scores of each of the other 92 students in the Bergen University sample would yield the SST, SSR, and SSE results shown here.
FIGURE 11 Measures of variation
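The arithmetic in Figure 11 can be checked directly. This short Python sketch (an added illustration) reproduces the three deviations quoted above for the student who studied 11 hours and scored 99:

```python
# A minimal sketch: one student's contributions to SST, SSR, and SSE.
# Values come from the example in the text (X = 11 hours, Y = 99).

b0, b1 = 48.7239, 3.2988
x_i, y_i, y_bar = 11, 99, 73.02

y_hat_i = b0 + b1 * x_i                   # 85.0107, about 85.01

total_dev = y_i - y_bar                   # 25.98 points above the mean (SST part)
explained_dev = y_hat_i - y_bar           # about 11.99 points (SSR part)
unexplained_dev = y_i - y_hat_i           # about 13.99 points (SSE part)

# Squaring each deviation gives this student's contribution to each sum.
print(round(total_dev, 2), round(explained_dev, 2), round(unexplained_dev, 2))
```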
Coefficient of Determination
The coefficient of determination, given by the symbol $r^2$, is the portion of the total variation in the dependent variable Y that is explained by the variation in the independent variable X in the simple linear regression model developed. That is,

$$r^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}$$

In other words, $r^2$ is the ratio of the explained variation to the total variation. Note that $r^2$ ranges from 0 (a horribly fitting and useless regression model) to 1 (a perfectly fitting regression model).
In the results shown in the ANOVA table of our regression model in Figure 5, note that

$$r^2 = \frac{SSR}{SST} = \frac{8233.0292}{9685.9570} = 0.8500$$

When $r^2$ is reported, it is often converted to a percentage. Therefore, 85.0% of the variation in student test score is explained or accounted for by the variation in hours studied in the simple linear regression model developed. This large $r^2$ indicates a strong linear relationship between test score and hours studied because the fitted regression model has explained 85.0% of the variability in the test scores. Only 15.0% of the variability in the test scores remains unaccounted for.
From the PHStat printout shown in Figure 5, note in the upper left corner under Regression Statistics that the result is displayed as R Square = 0.8500.
Coefficient of Correlation
The coefficient of correlation, given by the symbol r, measures the strength of the relationship between the two numerical variables, X and Y.
You can obtain the coefficient of correlation r by simply taking the square root of the coefficient of determination $r^2$ and then giving the result a "+" sign if the slope $b_1$ of the regression equation is positive or a "-" sign if the slope $b_1$ of the regression equation is negative. That is,

$$r = \sqrt{r^2}$$

where r is positive if $b_1$ is positive or r is negative if $b_1$ is negative.
Note that the coefficient of correlation r can range from -1 (i.e., a perfect negative correlation) to +1 (i.e., a perfect positive correlation), depending on whether the slope $b_1$ of the simple linear regression equation is, respectively, negative or positive. The closer the coefficient of correlation r is to either -1 or to +1, the stronger the relationship between X and Y; the closer r is to 0, the weaker the relationship. If r = 0, there is no association between X and Y.
For the results of our regression model,

$$r = \sqrt{0.8500} = 0.9220$$
Since our slope, $b_1 = +3.2988$, is positive, the coefficient of correlation r is +0.9220, an indication of a very strong positive association between test score and hours studied for the final examination.
From the PHStat printout shown in Figure 5, note in the upper left corner under
Regression Statistics that the result is displayed as Multiple R = 0.9220.
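Both measures follow directly from the sums of squares and the sign of the slope, as this brief Python sketch (added for illustration) shows using the Figure 5 values:

```python
# A minimal sketch: coefficient of determination and coefficient of
# correlation from the sums of squares reported in Figure 5.

import math

ssr, sst = 8233.0292, 9685.9570
b1 = 3.2988                                   # the slope is positive

r_squared = ssr / sst                         # 0.8500: 85% of variation explained
r = math.copysign(math.sqrt(r_squared), b1)   # r takes the sign of b1

print(round(r_squared, 4), round(r, 4))       # 0.85 0.922
```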
Caution
You must realize that correlation does not imply causation. Just because you find a
strong association between two variables does not mean that one caused the other.
Standard Error of the Estimate
The standard error of the estimate, given by the symbol $S_{YX}$, measures the average scatter or variability in a set of paired observations (i.e., the data points on a scatter plot) around the fitted regression line. You may recall from your introductory statistics course that the standard deviation S measures the average scatter or variability in a set of individual observations around the mean of all the observations in a sample. In other words, the standard error of the estimate $S_{YX}$ for a set of paired observations is just like the standard deviation S for a set of individual observations. The former indicates the average spread above and below the prediction line; the latter measures the average spread above and below the sample mean.
The standard error of the estimate $S_{YX}$ is obtained from:

$$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2}}$$

Therefore, the standard error of the estimate $S_{YX}$ is the square root of $S_{YX}^2$, the variance around the fitted regression line. Note that the numerator is the unexplained variation, i.e., the sum of squares of residual error. Just like the standard deviation S, the standard error of the estimate $S_{YX}$ must yield a positive result. $S_{YX}$ would equal 0 only if all the observed data points in the scatter plot were to lie on a straight line. In such a case where the simple linear regression model perfectly fits the data, SSE would equal 0, SSR would equal SST, and $r^2$ would equal 1. The closer the observed data points are to the prediction line, the closer the value of $S_{YX}$ would be to 0, the closer $r^2$ would be to 1, and the better the fitted regression model would be for its purpose of prediction.
Figure 12 depicts two scatter plots, one indicating a strong positive relationship with a
small standard error of the estimate, the other describing a weak positive relationship with
a large standard error of the estimate.
FIGURE 12 Comparing Standard Errors of the Estimate
For the results shown in the ANOVA table of our regression model in Figure 5, note that

$$S_{YX} = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2}} = \sqrt{\frac{1452.9278}{93-2}} = \sqrt{15.9662} = 3.9958$$

$S_{YX}^2$, the variance around the fitted regression line, is 15.9662 squared test score points, so $S_{YX}$, the standard error of the estimate, is 3.9958, or approximately 4 test score points. The standard error of the estimate $S_{YX}$ is measured in the same units as the dependent variable Y.
From the PHStat printout shown in Figure 5, note in the upper left corner under
Regression Statistics that the result is displayed as Standard Error = 3.9958.
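The computation of $S_{YX}$, and its comparison with the standard deviation S discussed next, can be sketched in a few lines of Python (an added illustration using the values reported in Figures 5 and 10):

```python
# A minimal sketch: standard error of the estimate from SSE,
# and its ratio to the standard deviation S of the test scores.

import math

sse, n = 1452.9278, 93
s_yx = math.sqrt(sse / (n - 2))   # sqrt(15.9662) = 3.9958

s = 10.26                         # standard deviation of test scores (Figure 10)
print(round(s_yx, 4))             # about 4 test score points
print(round(s_yx / s, 2))         # 0.39: spread around the line is roughly
                                  # 39% of the spread around the mean
```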
Assessing Variability
The standard error of the estimate $S_{YX}$ is a measure of scatter around the regression line, and the standard deviation S is a measure of scatter around the mean. Here, the standard error of the estimate $S_{YX}$ is approximately 4 test score points. From the PHStat printouts shown in Figures 9 and 10, the standard deviation S for the dependent variable is 10.26 test score points. Therefore, on average, an actual test score is expected to differ from its predicted test score by ±4 points, whereas an actual test score is expected to differ from the mean of all sampled test scores by ±10.26 points. Interestingly, the average spread around the prediction line is only about 39% of the average spread around the mean of all test scores, which shows how much more you learn about the distribution of test scores, and any estimates derived from it, by using a regression model for prediction rather than by studying the test score data alone.
Regression Memo
Assignment Instructions and Supporting Documents

OVERVIEW:
• Prepare a one-page memo with a second-page attachment (this will have the scatterplot) to the target of your assigned data set (so, it is actually a two-page document).
• Explain the regression analysis you have in your Excel file, focusing on what the analysis means to the company and without using any of the statistical terms associated with regression analysis.

SCENARIO:
You are hired as a research assistant to analyze the data from a particular study and write a memo regarding aspects of your analysis and, where appropriate, make recommendations. Your memo should be written to the individual and organization designated in your project theme described in the file BUGN 280 Excel Project Data File Information and should look similar to the sample memo below for Data Set 1 as far as content and formatting.

YOUR MEMO MUST INCLUDE:
1. Your scatterplot with a regression line. By checking the right box in your software, the regression analysis will produce this chart for you. Format the chart to make it easy to read, adding a title and axis labels.
2. Four basic regression statistics: correlation coefficient, coefficient of determination, slope, and y-intercept.
3. An explanation of your regression analysis: why you are using regression, what information is provided by each statistic, and the scatterplot. Explain the difference between correlation and regression, explain what each statistic means, and interpret each in practical terms.
4. Be resourceful. Leverage everything you can to help you. This is what it's like on the job. Deliver your task on time and with the information requested.

This is the Assessment of Learning for the School of Business Learning Goal 2a:
2. Be effective in (a) written and (b) oral communications and in the use of appropriate supporting electronic technologies.
The standard Graduation Writing Requirement Rubric will be used.