Results Section

Subject

Mathematics

Description

Please follow the uploaded checklist (Word document) and apply its requirements to the Excel data sheet (Amazon) that will be uploaded. Please use all the sheets in the Excel workbook for data. This section is longer: it will require 3-4 pages, with references to tables in the document. I will upload a document that has a sample results section that you can read to compare.

Unformatted Attachment Preview

Writing Project: Checklist for the results section-written portion

_____ The results section clearly states the methodology being used.
_____ Dummy variables are used correctly to incorporate zip code in the model.
_____ Multicollinearity is discussed and addressed; correlation coefficients are calculated to identify potential multicollinearity.
_____ Outliers and observations with unusually high leverage are identified and addressed.
_____ The statistical significance of coefficients is discussed and addressed; the possibility of eliminating variables with low statistical significance is considered.
_____ A partial F-test is conducted for the joint statistical significance of the dummy variables.
_____ There is a clear discussion of each model specification presented and why it is preferred or not preferred to other possible specifications.
_____ Both the magnitude and the statistical significance of the coefficients in the preferred specification are discussed.
_____ The discussion of the regression coefficients uses the appropriate units and relevant magnitudes.

Writing Project: Checklist for the results section-table of regression results

_____ The table of regression results has a title.
_____ The table of regression results is laid out in the conventional format and self-explanatory.
_____ The table of regression results contains the coefficients, t-stats or p-values for each, the R-squared and adjusted R-squared for each model, and the sample size (if it changes from model to model).
_____ Coefficients and p-values in the regression results table are rounded consistently.
_____ The text and table(s) are well integrated; the text refers to the table of results where appropriate.
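As a rough illustration of two checklist items (zip-code dummy variables and a correlation check for multicollinearity), the Python sketch below uses a handful of rows copied from the data sheets. The helper names zip_dummies and pearson_r, the subset of rows, and the choice of 98103 as the baseline zip code are assumptions made for this example; they are not part of the assignment.

```python
from math import sqrt

# A few rows copied from the data sheets:
# (price in $1000s, house square feet, bedrooms, zip code).
homes = [
    (710.0, 1780, 3, "98122"),
    (497.5, 880, 2, "98122"),
    (1325.0, 3150, 4, "98109"),
    (625.0, 1440, 3, "98109"),
    (476.0, 1510, 2, "98103"),
    (1032.0, 2370, 3, "98103"),
    (990.0, 1810, 3, "98199"),
    (816.0, 1870, 3, "98199"),
]


def zip_dummies(zip_codes, baseline):
    """Return one 0/1 column per zip code, omitting the baseline category.

    Omitting one category avoids perfect collinearity with the intercept.
    """
    levels = sorted(set(zip_codes) - {baseline})
    return {z: [1 if code == z else 0 for code in zip_codes] for z in levels}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


zips = [row[3] for row in homes]
dummies = zip_dummies(zips, baseline="98103")  # baseline choice is an assumption

sqft = [row[1] for row in homes]
beds = [row[2] for row in homes]
r_sqft_beds = pearson_r(sqft, beds)

for z, column in dummies.items():
    print(z, column)
print("correlation(house sqft, bedrooms) =", round(r_sqft_beds, 2))
```

With four zip codes, three dummy columns enter the regression, and each coefficient is read as the price difference relative to the omitted zip code. A correlation close to 1 between two regressors, such as square footage and bedrooms, would be a signal to discuss multicollinearity.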
Comments/questions:

Vanessa

Home  Price ($)  Bedrooms  Bathrooms  House Ft2  Lot Ft2  Year Built  Zip Code  Address
1     710,000    3         2          1,780      1,306    2015        98122     913 29th Ave
2     497,500    2         3          880        1,379    2006        98122     711 26th Ave
3     625,000    2         2          1,130      2,347    2007        98122     822 17th Ave
4     881,875    4         4          2,808      4,704    1906        98122     1715 Madrona Dr
5     820,000    3         2.5        2,370      7,620    1994        98122     963 22nd Ave
6     525,000    3         1          1,220      2,592    1902        98122     1221 E Jefferson St
7     878,500    3         2.5        2,230      2,700    1987        98122     308 35th Ave
8     856,000    3         2          1,600      4,000    1925        98122     356 27th Ave
9     650,000    3         2          1,630      1,012    2016        98122     1415 E Fir St
10    387,500    3         2          1,050      2,578    1903        98122     713 23rd Ave
11    970,000    3         1          1,840      3,500    1904        98122     2515 E Yesler Way
12    370,000    3         2          1,090      3,049    1901        98122     517 23rd Ave
13    639,000    3         3          1,600      1,742    2016        98122     1523 19th Ave
14    918,000    3         2.5        1,930      1,753    2016        98122     918 15th Ave
15    855,000    3         2          1,260      3,841    1903        98122     723 19th Ave

House  Price $        Bedrooms  Bathrooms  House Ft2  Lot Ft2  Year built  Zipcode  Price (1000) $  Age
1      $1,325,000.00  4         3.5        3,150      5,640    1920        98109    $1,325.00       98 years
2      $625,000.00    3         1.5        1,440      3,400    1910        98109    $625.00         108 years
3      $705,000.00    3         2.75       2,260      4,000    1956        98109    $705.00         62 years
4      $546,000.00    3         1.5        1,850      9,240    1954        98109    $546.00         64 years
5      $1,165,000.00  2         1.5        2,130      6,534    1924        98109    $1,165.00       94 years
6      $751,000.00    4         4.5        1,910      2,090    2007        98109    $751.00         11 years
7      $1,166,000.00  4         3          2,760      4,200    1910        98109    $1,166.00       108 years
8      $1,100,000.00  3         3.5        2,160      3,458    1917        98109    $1,100.00       101 years
9      $1,025,000.00  4         2          2,220      4,000    1924        98109    $1,025.00       94 years
10     $1,820,000.00  5         3.5        3,200      4,000    1929        98109    $1,820.00       89 years
11     $1,675,000.00  3         2          3,200      7,840    1961        98109    $1,675.00       57 years
12     $865,000.00    3         1.5        2,640      3,092    1925        98109    $865.00         93 years
13     $1,285,000.00  5         4          3,600      6,373    1904        98109    $1,285.00       114 years
14     $740,000.00    4         3          1,812      1,429    2007        98109    $740.00         11 years
15     $2,300,000.00  4         3.25       3,500      4,046    2016        98109    $2,300.00       2 years

House  Price (1000)  Bed  Bath  House sqft  Lot sqft  Yr built  Zip    Add.
1      476           2    1     1510        2550      1928      98103  3645 Carr
2      1032          3    2.5   2370        3066      1929      98103  3728 Woodlawn
3      960           3    1.75  3400        8712      1921      98103  4212 Francis
4      575           2    1     1480        2680      1946      98103  1102 N 41
5      650           4    2     2290        4791      1922      98103  4019 Woodland
6      1706          3    5     4101        5702      1990      98103  4315 Bagley
7      1121          4    2     2930        3484      1922      98103  5800 Greenwood
8      378           3    1.5   1370        6840      1946      98103  3721 Eastern
9      860           3    2     2260        3191      1926      98103  1316 N39
10     789           2    1     1130        3301      1900      98103  4223 Dayton
11     672           3    3     1770        1714      2012      98103  3810 Linden
12     678           3    3     2210        1306      1998      98103  3652 Whitman
13     608           2    1     1830        3920      1909      98103  4028 Midvale
14     947           4    2     2250        4356      1912      98103  208 N 54
15     1565          4    3     3048        5227      2007      98103  711 N61

Address               Price ($1000s)  Bedrooms  Bathrooms  House FT2  Lot FT2  YR Built  Zip Code  Age
4018 31ST Ave W       990             3         2          1810       4800     1955      98199     63
2519 34TH Ave W       816             3         1          1870       5998     1938      98199     80
2560 31ST Ave W       650             3         1          1270       5500     1928      98199     90
3051 21ST Ave W       735             3         2          1660       2700     1928      98199     90
2409 Montavista Pl W  1938            4         4          4070       4791     1978      98199     40
4428 36TH Ave W       1425            4         4          3180       4791     2005      98199     13
2915 W Garfield St    1010            3         2          2040       5000     1938      98199     80
2631 34TH Ave W       851             3         2          1521       6534     1941      98199     77
4209 26TH Ave W       705             3         2          1640       5776     1950      98199     68
4516 35TH Ave W       742             2         1          1140       4791     1947      98199     71
4037 Williams Ave W   998             3         3          1880       5500     1947      98199     71
3637 W Commodore Way  1975            5         5          3850       8276     2017      98199     1
5436 40TH Ave W       1186            2         2          1776       4791     1958      98199     60
2223 W Barrett St     755             3         2          2380       6098     1905      98199     113
4315 32ND Ave W       675             3         1.5        1250       4800     1943      98122     75

Empirical Results:

For this writing project concerning unemployment in Oregon for 2009, I estimated regression models for two separate dependent variables: percentage of male unemployment and percentage of female unemployment. For each dependent variable, I estimated three different specifications.
The independent variables used in these regression models are (1) population of 18 to 24 year olds as a percentage, (2) population of naturalized foreign-born citizens as a percentage, (3) population of foreign-born non-citizens as a percentage, (4) black population as a percentage, (5) Asian population as a percentage, (6) Hispanic population as a percentage, (7) percentage of people who have attained a college or graduate degree, (8) percentage of people who work for the government, (9) percentage of people who work in construction, and (10) percentage of people who work white collar jobs.

Male Unemployment:

Table 2 (on Page 6) shows the regression results for male unemployment. In the first specification, I regress male unemployment on all ten of the independent variables. Because the R-Squared Adjusted is relatively low, at about 0.35, I looked to eliminate variables with low t-ratios and high p-values in order to increase this value. With low t-ratios and high p-values, such variables are unlikely to be significant to the model at a 1% significance level. In this specification, the variables with the lowest t-ratios (in absolute value) were the two independent variables concerning citizenship: population of naturalized foreign-born citizens as a percentage and population of foreign-born non-citizens as a percentage. With t-ratios of 0.02 and 0.23, respectively, these variables might be insignificant to the model.

In the second specification for male unemployment, I eliminate both independent variables concerning citizenship. The R-Squared Adjusted increases to around 0.39, which suggests that these two variables add little to the model. To confirm this, I performed a partial F-test of these variables' joint significance to the overall model.
With a null hypothesis that the coefficients on both variables are zero, an alternative hypothesis that at least one of them is nonzero, and an F statistic of 0.03 (far below any conventional critical value), I fail to reject the null hypothesis: these two variables are not jointly statistically significant to the overall model at a 1% significance level. Thus, it is acceptable to eliminate these two variables from the model.

Nonetheless, I still looked to eliminate variables with low t-ratios and high p-values in order to increase the R-Squared Adjusted. In this specification, the variables with the lowest t-ratios (in absolute value), apart from age, were all independent variables concerning race and ethnicity: black population as a percentage, Asian population as a percentage, and Hispanic population as a percentage. With t-ratios of 0.67, -0.43, and -0.40, respectively, these variables might be insignificant to the model. Although the variable for age (population of 18 to 24 year olds as a percentage) actually has the lowest t-ratio in this specification (0.36), this variable will not be eliminated because it is a key part of answering the research question of this project.

In the final specification for male unemployment, I eliminate not only both independent variables concerning citizenship (population of naturalized foreign-born citizens as a percentage and population of foreign-born non-citizens as a percentage) but also all independent variables concerning race and ethnicity (black population as a percentage, Asian population as a percentage, and Hispanic population as a percentage). The R-Squared Adjusted increases to around 0.44, which suggests that the race and ethnicity variables also add little to the model. To confirm this, I performed a partial F-test of these variables' joint significance to the overall model.

Klutho 5
With a null hypothesis that the coefficients on all three variables are zero, an alternative hypothesis that at least one of them is nonzero, and an F statistic of about 0.24, I fail to reject the null hypothesis: the race and ethnicity variables are not jointly statistically significant to the overall model at a 1% significance level. Thus, it is acceptable to eliminate these variables from the overall model.

Therefore, the final model for male unemployment includes only five independent variables: (1) population of 18 to 24 year olds as a percentage, (2) percentage of people who have attained a college or graduate degree, (3) percentage of people who work for the government, (4) percentage of people who work in construction, and (5) percentage of people who work white collar jobs. Although only one variable in this specification (college or graduate degree attainment) is significant according to t-tests, eliminating one or a combination of these variables only served to decrease the R-Squared Adjusted.

Looking at the coefficients of these five variables, education has the largest impact on male unemployment in Oregon. Holding all other variables constant, a 1% increase in the population with college or graduate degrees is associated with a 0.51% decrease in predicted male unemployment. White-collar employment and construction employment also have a relatively large impact on male unemployment. Holding all other variables constant, a 1% increase in the share of employment in white-collar jobs is associated with a 0.41% increase in predicted male unemployment, while a 1% increase in the share of employment in construction is associated with a 0.36% decrease in predicted male unemployment. Government employment also has a positive relationship with male unemployment; a 1% increase in the share of people who work for the government is associated with a 0.18% increase in predicted male unemployment.
Age, however, has the smallest impact on male unemployment in Oregon; holding all other variables constant, a 1% increase in the population of 18-24 year olds is associated with a 0.07% increase in predicted male unemployment. With a t-ratio of only 0.30, the effect of age on predicted male unemployment is not statistically significant.

Table 2: Regression Results for Male Unemployment in Oregon at County Level
Dependent Variable: Male Unemployment (in %)

Variable                                      (1) Full Model   (2) Reduced Model  (3) Final Model
Intercept                                     -0.03 (-0.21)    -0.02 (-0.12)      -0.03 (-0.23)
Pop, 18 to 24 Years, 2009 (in %)              0.13 (0.42)      0.10 (0.36)        0.07 (0.30)
Pop, Foreign Born Naturalized, 2009 (in %)    0.03 (0.02)
Pop, Foreign Born Not a Citizen, 2009 (in %)  0.09 (0.23)
Black Pop, 2009 (in %)                        0.39 (0.60)      0.42 (0.67)
Asian Pop, 2009 (in %)                        -0.26 (-0.41)    -0.23 (-0.43)
Hispanic Pop, 2009 (in %)                     -0.06 (-0.40)    -0.02 (-0.40)
College or Graduate Degree, 2009 (in %)       -0.52 (-2.02)    -0.49 (-2.41)*     -0.51 (-2.75)**
Employment, Government (in %)                 0.18 (1.04)      0.17 (1.12)        0.18 (1.63)
Employment, Construction, 2009 (in %)         -0.37 (-0.75)    -0.40 (-0.86)      -0.36 (-0.94)
Employment, White Collar, 2009 (in %)         0.42 (1.54)      0.39 (1.69)        0.41 (1.93)
R-Squared                                     0.533            0.531              0.519
R-Squared Adjusted                            0.346            0.393              0.439

Number of Observations is 36
T-Ratios are in parentheses
**significant at 1%, *significant at 5%

Female Unemployment:

Table 3 (on Page 9) shows the regression results for female unemployment. In the first specification, I regress female unemployment on all ten of the independent variables. Although it is higher than in the male model, the R-Squared Adjusted for this specification is still relatively low, at about 0.45. To increase this value, I looked to eliminate variables with low t-ratios, which are unlikely to be significant to the model at a 1% significance level.
In this specification, the variables with the lowest t-ratios (in absolute value) were population of foreign-born non-citizens as a percentage and percentage of people who work for the government. With t-ratios of 0.15 and -0.15, respectively, these variables might be insignificant to the model. Although the variable for age (population of 18 to 24 year olds as a percentage) also has one of the lowest t-ratios in this specification (-0.17), this variable will not be eliminated because it is a key part of answering the research question of this project.

In the second specification for female unemployment, I eliminate the independent variables for population of foreign-born non-citizens as a percentage and percentage of people who work for the government. The R-Squared Adjusted increases to around 0.49, which suggests that these two variables add little to the model. To confirm this, I performed a partial F-test of these variables' joint significance to the overall model. With a null hypothesis that the coefficients on both variables are zero, an alternative hypothesis that at least one of them is nonzero, and an F statistic of about 0.03, I fail to reject the null hypothesis: these two variables are not jointly statistically significant to the overall model at a 1% significance level. Thus, it is acceptable to eliminate these two variables from the model.

Nonetheless, I still looked to eliminate variables with low t-ratios and high p-values in order to increase the R-Squared Adjusted. In this specification, the variables with the lowest t-ratios (in absolute value), apart from age, were two independent variables concerning race and ethnicity: black population as a percentage and Asian population as a percentage. With t-ratios of 0.63 and -0.35, respectively, these variables might be insignificant to the model.
As in the previous specification, the variable for age (population of 18 to 24 year olds as a percentage) actually has the lowest t-ratio in this specification (-0.22); once again, this variable will not be eliminated because it is a key part of answering the research question of this project.

In the final specification for female unemployment, I eliminate not only the independent variables for population of foreign-born non-citizens as a percentage and percentage of people who work for the government but also the independent variables of black population as a percentage and Asian population as a percentage. The R-Squared Adjusted increases to around 0.52, which suggests that these two race and ethnicity variables add little to the model. To confirm this, I performed a partial F-test of these variables' joint significance to the overall model. With a null hypothesis that the coefficients on both variables are zero, an alternative hypothesis that at least one of them is nonzero, and an F statistic of about 0.21, I fail to reject the null hypothesis: the variables concerning black and Asian populations are not jointly statistically significant to the overall model at a 1% significance level. Thus, it is acceptable to eliminate these variables from the model.

Therefore, the final model for female unemployment includes only six independent variables: (1) population of 18 to 24 year olds as a percentage, (2) population of naturalized foreign-born citizens as a percentage, (3) Hispanic population as a percentage, (4) percentage of people who have attained a college or graduate degree, (5) percentage of people who work in construction, and (6) percentage of people who work white collar jobs. Although only two variables in this specification (naturalized foreign-born citizens and Hispanic population) are significant according to t-tests, eliminating one or a combination of these variables only served to decrease the R-Squared Adjusted. Looking at the coefficients of these six variables, it is clear that naturalized citizenship has the largest impact on female unemployment in Oregon.
Holding all other variables constant, a 1% increase in the population of naturalized foreign-born citizens is associated with a 1.2% decrease in predicted female unemployment. Compared to all other coefficients in both the male unemployment and female unemployment models, this coefficient has the largest magnitude. Although they do not have as great an impact as naturalized foreign-born citizens, white-collar employment and construction employment also have a relatively large impact on female unemployment. Holding all other variables constant, a 1% increase in the share of employment in white-collar jobs is associated with a 0.30% increase in predicted female unemployment, while a 1% increase in the share of employment in construction is associated with a 0.38% decrease in predicted female unemployment (the t-ratio for this coefficient in Table 3 is negative, as are the construction coefficients in the other two specifications).

Hispanic race/ethnicity also has a positive relationship with female unemployment; a 1% increase in Hispanic population is associated with a 0.22% increase in predicted female unemployment. Education, however, has a negative relationship with female unemployment; a 1% increase in the population with college or graduate degrees is associated with a 0.23% decrease in predicted female unemployment. Compared to the role of education in the male unemployment model, it is surprising that this value is smaller in magnitude; nonetheless, education appears to play a role in determining unemployment for females. Just as in the male unemployment model, age has the smallest impact on female unemployment in Oregon; holding all other variables constant, a 1% increase in the population of 18-24 year olds is associated with a 0.03% decrease in predicted female unemployment. With a t-ratio of only -0.20, the effect of age on predicted female unemployment is not statistically significant.
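The significance statements for the final female specification can be checked with a short Python sketch that compares each reported t-ratio with two-tailed critical values. The t-ratios are the values in parentheses in column (3) of Table 3. The critical values for 29 degrees of freedom (36 observations minus six slope coefficients and an intercept) come from a standard t table, and the function name significance_stars is an assumption made for this illustration.

```python
# Two-tailed critical values of the t distribution with 29 degrees of freedom,
# taken from a standard t table (36 observations - 6 slopes - 1 intercept).
T_CRIT_5PCT = 2.045
T_CRIT_1PCT = 2.756

# t-ratios reported for the final female unemployment specification (Table 3).
final_model_t = {
    "Pop, 18 to 24 Years": -0.20,
    "Pop, Foreign Born Naturalized": -2.97,
    "Hispanic Pop": 4.29,
    "College or Graduate Degree": -1.85,
    "Employment, Construction": -1.80,
    "Employment, White Collar": 2.00,
}


def significance_stars(t_ratio):
    """Return '**' (1% level), '*' (5% level), or '' for a two-tailed t-test."""
    magnitude = abs(t_ratio)
    if magnitude >= T_CRIT_1PCT:
        return "**"
    if magnitude >= T_CRIT_5PCT:
        return "*"
    return ""


stars = {name: significance_stars(t) for name, t in final_model_t.items()}

for name, t in final_model_t.items():
    print(f"{name:32s} t = {t:6.2f} {stars[name]}")
```

Under these assumptions, only the naturalized foreign-born and Hispanic variables clear the 1% threshold. The white-collar t-ratio of 2.00 falls just short of the 5% critical value.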
Table 3: Regression Results for Female Unemployment in Oregon at County Level
Dependent Variable: Female Unemployment (in %)

Variable                                      (1) Full Model   (2) Reduced Model  (3) Final Model
Intercept                                     -0.01 (-0.07)    -0.01 (-0.11)      -0.002 (-0.03)
Pop, 18 to 24 Years, 2009 (in %)              -0.03 (-0.17)    -0.04 (-0.22)      -0.03 (-0.20)
Pop, Foreign Born Naturalized, 2009 (in %)    -1.31 (-1.67)    -1.21 (-1.93)      -1.20 (-2.97)**
Pop, Foreign Born Not a Citizen, 2009 (in %)  0.04 (0.15)
Black Pop, 2009 (in %)                        0.26 (0.59)      0.27 (0.63)
Asian Pop, 2009 (in %)                        -0.15 (-0.35)    -0.14 (-0.35)
Hispanic Pop, 2009 (in %)                     0.21 (2.24)*     0.22 (4.01)**      0.22 (4.29)**
College or Graduate Degree, 2009 (in %)       -0.23 (-1.30)    -0.22 (-1.68)      -0.23 (-1.85)
Employment, Government (in %)                 -0.02 (-0.15)
Employment, Construction, 2009 (in %)         -0.41 (-1.22)    -0.37 (-1.70)      -0.38 (-1.80)
Employment, White Collar, 2009 (in %)         0.32 (1.73)      0.31 (1.96)        0.30 (2.00)
R-Squared                                     0.610            0.609              0.603
R-Squared Adjusted                            0.454            0.493              0.521

Number of Observations is 36
T-Ratios are in parentheses
**significant at 1%, *significant at 5%
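The R-Squared Adjusted values and partial F statistics discussed in this section can be approximately reproduced from the R-Squared values in Tables 2 and 3. The Python sketch below uses the standard R-squared form of both formulas. The function names, and the assumption that each model has 36 observations plus an intercept, are mine and are not taken from the sample paper. Because the tabulated R-Squared values are rounded to three decimals, the results only approximate the reported figures. The first male test is the most sensitive to rounding, since the difference between 0.533 and 0.531 could be anywhere from about 0.001 to 0.003 before rounding.

```python
N_OBS = 36  # number of observations reported under both tables


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k slope coefficients and an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def partial_f(r2_full, r2_reduced, n, k_full, q):
    """Partial F statistic for dropping q variables from a model with k_full slopes."""
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# (R-squared, number of slope coefficients) for each specification.
male = {"full": (0.533, 10), "reduced": (0.531, 8), "final": (0.519, 5)}
female = {"full": (0.610, 10), "reduced": (0.609, 8), "final": (0.603, 6)}

male_adj = {name: adjusted_r2(r2, N_OBS, k) for name, (r2, k) in male.items()}
female_adj = {name: adjusted_r2(r2, N_OBS, k) for name, (r2, k) in female.items()}

# Male model: drop the 2 citizenship variables, then the 3 race/ethnicity variables.
f_male_citizenship = partial_f(0.533, 0.531, N_OBS, k_full=10, q=2)
f_male_race = partial_f(0.531, 0.519, N_OBS, k_full=8, q=3)

# Female model: drop non-citizens and government employment, then black and Asian pop.
f_female_first = partial_f(0.610, 0.609, N_OBS, k_full=10, q=2)
f_female_race = partial_f(0.609, 0.603, N_OBS, k_full=8, q=2)

print({name: round(value, 3) for name, value in male_adj.items()})
print({name: round(value, 3) for name, value in female_adj.items()})
print(round(f_male_citizenship, 2), round(f_male_race, 2))
print(round(f_female_first, 2), round(f_female_race, 2))
```

All four F statistics computed this way are well below 1, so the conclusion that the dropped variables are not jointly significant does not depend on the rounding.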

Tags: analysis variables Stats Regression