Use SAS to do logistic regression


Description

2-Group Analysis:

Perform a Logistic Regression Analysis paralleling the HBAT analysis performed in the textbook and in class, but now use the HATCO dataset (see HATCO_Split60 data file).

(1) Model: X11 = X1 – X7

(2) Split 60-40 Variable: Split60

(3) Classification and Prediction Analysis including Holdout Sample

Provide a brief management summary including key selected results along with the SAS program and output as an appendix.
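
The steps in this assignment could be sketched in SAS roughly as follows. This is only a minimal outline patterned on the class HBAT program, not a worked solution. The file path, the tab delimiter, the column order (ID, Split60, X1-X14), the use of Split60 = 0 for the 60% analysis sample, the 0/1 coding of X11 with X11 = 1 as the event, and the dataset names HATCO60, HATCO40, HatcoModel and HATCO40Score are all assumptions that would need to be checked against the actual HATCO_Split60 file.

```sas
/* Sketch only. Assumptions (not confirmed by the assignment text):
   - HATCO_Split60.txt is tab-delimited with columns ID, Split60, X1-X14
   - Split60 = 0 marks the analysis sample, Split60 = 1 the holdout sample
   - X11 is coded 0/1 and X11 = 1 is the event of interest              */
Data HATCO;
  Infile 'HATCO_Split60.txt' DLM='09'X TRUNCOVER;   /* hypothetical path */
  Input ID Split60 X1-X14;
Run;

/* (2) Split into analysis and holdout samples */
Data HATCO60 HATCO40;
  Set HATCO;
  If Split60 = 0 Then Output HATCO60;
  Else Output HATCO40;
Run;

/* (1) Model: X11 = X1 - X7, estimated on the analysis sample */
Proc Logistic Data=HATCO60 OutModel=HatcoModel;
  Model X11(Event='1') = X1-X7 / LackFit RSquare CTable PProb=0.5;
Run;

/* (3) Score the holdout sample with the stored model */
Proc Logistic InModel=HatcoModel;
  Score Data=HATCO40 Out=HATCO40Score;
Run;

/* Classification crosstab: observed (F_X11) by predicted (I_X11) */
Proc Freq Data=HATCO40Score;
  Tables F_X11*I_X11;
Run;
```

The same SCORE and FREQ steps can be repeated with HATCO60 to obtain the analysis-sample classification for comparison with the holdout result.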

Unformatted Attachment Preview

[HATCO_Split60 data file: raw values for observations 1-100; the column layout of this preview is not recoverable.]

SAS Output

Chapter 6 Logistic Regression Example

Original (analysis) sample, Split60 = 0 (60 observations)

Obs  ID Split60 X4   X6   X7   X8   X9  X10  X11  X12  X13  X14  X15  X16  X17  X18
  1   1       0  1  8.5  3.9  2.5  5.9  4.8  4.9  6.0  6.8  4.7  4.3  5.0  5.1  3.7
  2   2       0  0  8.2  2.7  5.1  7.2  3.4  7.9  3.1  5.3  5.5  4.0  3.9  4.3  4.9
  3   3       0  1  9.2  3.4  5.6  5.6  5.4  7.4  5.8  4.5  6.2  4.6  5.4  4.0  4.5
  4   4       0  1  6.4  3.3  7.0  3.7  4.7  4.7  4.5  8.8  7.0  3.6  4.3  4.1  3.0
  5   5       0  0  9.0  3.4  5.2  4.6  2.2  6.0  4.5  6.8  6.1  4.5  4.5  3.5  3.5
  6   6       0  1  6.5  2.8  3.1  4.1  4.0  4.3  3.7  8.5  5.1  9.5  3.6  4.7  3.3
  7   7       0  1  6.9  3.7  5.0  2.6  2.1  2.3  5.4  8.9  4.8  2.5  2.1  4.2  2.0
  8   8       0  1  6.2  3.3  3.9  4.8  4.6  3.6  5.1  6.9  5.4  4.8  4.3  6.3  3.7
  9  10       0  1  6.4  4.5  5.1  6.1  4.7  5.7  5.7  8.4  5.4  5.3  4.1  5.8  4.4
 10  11       0  0  8.7  3.2  4.6  4.8  2.7  6.8  4.6  6.8  5.8  7.5  3.8  3.7  4.0
 11  12       0  1  6.1  4.9  6.3  3.9  4.4  3.9  6.4  8.2  5.8  5.9  3.0  4.9  3.2
 12  14       0  0  9.2  3.9  5.7  5.5  2.4  8.4  4.8  7.1  6.7  3.0  4.5  2.6  4.2
 13  15       0  1  6.3  4.5  4.7  6.9  4.5  6.8  5.9  8.8  6.0  5.4  4.8  6.2  5.2
 14  16       0  0  8.7  3.2  4.0  6.8  3.2  7.8  3.8  4.9  6.1  5.0  4.3  3.9  4.5
 15  17       0  1  5.7  4.0  6.7  6.0  3.3  5.5  5.1  6.2  6.7  5.4  4.2  6.2  4.5
 16  20       0  1  9.1  4.5  3.6  6.4  5.3  5.3  7.1  8.4  5.8  6.7  4.5  6.1  4.4
 17  24       0  1  9.3  2.4  2.6  7.2  2.2  7.2  4.5  6.2  6.4  4.2  6.7  4.4  4.5
 18  27       0  0  8.5  3.0  7.2  5.8  4.1  7.6  3.7  4.8  6.9  6.7  5.3  3.8  4.4
 19  29       0  0  8.5  3.0  5.7  6.0  2.3  7.6  3.7  4.8  5.8  6.0  5.7  3.8  4.4
 20  30       0  1  7.6  3.6  3.0  4.0  5.1  4.2  4.6  7.7  4.9  7.2  4.7  5.5  3.5
 21  31       0  0  6.9  3.4  8.5  4.3  4.5  6.4  4.7  5.2  7.7  3.3  3.7  2.7  3.3
 22  32       0  1  8.1  2.5  7.2  4.5  2.3  5.1  3.8  6.6  6.8  6.1  3.0  3.5  3.0
 23  33       0  1  6.7  3.7  6.5  5.3  5.3  5.1  4.9  9.2  5.7  4.2  3.5  4.5  3.4
 24  35       0  1  6.7  4.0  5.2  3.9  3.0  5.4  6.8  8.4  6.2  6.0  2.5  4.3  3.5
 25  36       0  0  8.7  3.2  6.1  4.3  3.5  6.1  2.9  5.6  6.1  6.5  3.1  2.9  2.5
 26  37       0  0  9.0  3.4  5.9  4.6  3.9  6.0  4.5  6.8  6.4  4.3  3.9  3.5  3.5
 27  38       0  1  9.6  4.1  6.2  7.3  2.9  7.7  5.5  7.7  6.1  4.4  5.2  4.6  4.9
 28  43       0  0  9.3  5.1  4.6  6.8  5.8  6.6  6.3  7.4  5.1  4.1  4.6  4.6  4.3
 29  44       0  1  5.1  5.1  6.6  6.9  4.4  5.4  7.8  5.9  7.2  5.2  4.9  6.3  4.5
 30  45       0  0  8.0  2.5  4.7  7.1  3.6  7.7  3.0  5.2  5.1  3.9  4.3  4.2  4.7
 31  46       0  1  5.9  4.1  5.7  5.9  5.8  6.4  5.5  8.4  6.4  5.1  5.2  5.8  4.8
 32  47       0  0 10.0  4.3  7.1  6.3  2.9  5.4  4.5  3.8  6.7  3.7  5.0  4.0  3.5
 33  48       0  1  5.7  3.8  6.8  7.5  5.7  5.7  6.0  8.2  6.6  4.8  6.5  7.3  5.2
 34  49       0  1  9.9  3.7  3.7  6.1  4.2  7.0  6.7  6.8  5.9  7.2  4.5  3.4  3.9
 35  50       0  0  7.9  3.9  4.3  5.8  4.4  6.9  5.8  4.7  5.2  3.6  4.1  4.2  4.3
 36  52       0  0  8.2  2.7  3.7  7.4  2.7  7.9  3.1  5.3  5.3  5.0  4.5  4.3  4.9
 37  53       0  1  9.4  2.5  4.8  6.1  3.2  7.3  4.6  6.3  6.3  9.2  4.7  4.6  4.6
 38  54       0  0  6.9  3.4  5.7  4.4  3.3  6.4  4.7  5.2  6.4  4.4  3.2  2.7  3.3
 39  56       0  0  9.3  3.8  7.3  5.7  3.7  6.4  5.5  7.4  6.6  5.9  4.1  3.2  3.4
 40  58       0  0  7.6  3.6  5.2  5.8  5.6  6.6  5.4  4.4  6.7  6.4  4.6  3.9  4.0
 41  60       0  1  9.9  2.8  7.2  6.9  2.6  5.8  3.5  5.4  6.2  7.0  5.6  4.9  4.0
 42  61       0  0  8.7  3.2  8.4  6.1  2.8  7.8  3.8  4.9  7.2  4.5  5.4  3.9  4.5
 43  63       0  0  8.8  3.9  3.8  5.1  4.3  4.7  4.8  5.8  5.0  7.2  4.4  3.7  2.9
 44  64       0  1  7.7  2.2  6.3  4.5  2.4  4.7  3.4  6.2  6.0  4.7  3.3  3.1  2.6
 45  65       0  1  6.6  3.6  5.8  4.1  4.9  4.7  4.8  7.2  6.5  3.9  3.5  3.6  2.8
 46  67       0  1  5.7  4.0  7.9  6.4  2.7  5.5  5.1  6.2  7.5  6.4  5.0  6.2  4.5
 47  68       0  1  5.5  3.7  4.7  5.4  4.3  5.3  4.9  6.0  5.6  2.5  4.5  5.9  4.3
 48  69       0  1  7.5  3.5  3.8  3.5  2.9  4.1  4.5  7.6  5.1  5.2  4.0  5.4  3.4
 49  72       0  0  6.7  3.2  3.0  3.7  4.8  6.3  4.5  5.0  5.2  2.5  2.9  2.6  3.1
 50  79       0  0  9.3  3.5  6.3  7.6  5.5  7.5  5.9  4.6  6.6  3.1  5.2  4.1  4.6
 51  80       0  1  7.1  3.4  4.9  4.1  4.0  5.0  5.9  7.8  6.1  3.5  2.6  3.1  2.7
 52  81       0  0  9.9  3.0  7.4  4.8  4.0  5.9  4.8  4.9  5.9  6.9  3.2  4.3  3.8
 53  86       0  1  7.5  3.5  4.1  4.5  3.5  4.1  4.5  7.6  4.9  2.8  3.4  5.4  3.4
 54  87       0  1  5.0  3.6  1.3  3.0  3.5  4.2  4.9  8.2  4.3  7.6  2.4  4.8  3.1
 55  88       0  0  7.7  2.6  8.0  6.7  3.5  7.2  4.3  5.9  6.9  7.7  5.1  3.9  4.3
 56  92       0  1  7.1  4.2  4.1  2.6  2.1  3.3  4.5  9.9  5.5  3.5  2.0  4.0  2.4
 57  94       0  1  9.3  3.5  5.4  7.8  4.6  7.5  5.9  4.6  6.4  4.9  4.8  4.1  4.6
 58  95       0  0  9.3  3.8  4.0  4.6  4.7  6.4  5.5  7.4  5.3  4.8  3.6  3.2  3.4
 59  98       0  0  8.7  3.2  3.3  3.2  3.1  6.1  2.9  5.6  5.0  4.3  3.1  2.9  2.5
 60 100       0  1  7.9  3.0  4.4  5.1  5.9  4.2  4.8  9.7  5.7  5.8  3.4  5.4  3.5

Holdout (validation) sample, Split60 = 1 (40 observations)

Obs  ID Split60 X4   X6   X7   X8   X9  X10  X11  X12  X13  X14  X15  X16  X17  X18
  1   9       1  1  5.8  3.6  5.1  6.7  3.7  5.9  5.8  9.3  5.9  4.4  4.4  6.1  4.6
  2  13       1  0  9.5  5.6  4.6  6.9  5.0  6.9  6.6  7.6  6.5  5.3  5.1  4.5  4.4
  3  18       1  1  5.9  4.1  5.5  7.2  3.5  6.4  5.5  8.4  6.2  6.3  5.7  5.8  4.8
  4  19       1  1  5.6  3.4  5.1  6.4  3.7  5.7  5.6  9.1  5.4  6.1  5.0  6.0  4.5
  5  21       1  1  5.2  3.8  7.1  5.2  3.9  4.3  5.0  8.4  7.1  4.6  3.3  4.9  3.3
  6  22       1  1  9.6  5.7  6.8  5.9  5.4  8.3  7.8  4.5  6.4  6.5  4.3  3.0  4.3
  7  23       1  0  8.6  3.6  7.4  5.1  3.5  7.3  4.7  3.7  6.7  6.0  4.8  3.4  4.0
  8  25       1  1  6.0  4.1  5.3  4.7  3.5  5.3  5.3  8.0  6.5  3.9  4.7  5.3  4.0
  9  26       1  1  6.4  3.6  6.6  6.1  4.0  3.9  5.3  7.1  6.1  3.7  5.6  6.6  3.9
 10  28       1  1  7.0  3.3  5.4  5.5  2.6  4.8  4.2  9.0  6.5  5.9  4.3  5.2  3.7
 11  34       1  1  8.0  3.3  6.1  5.7  5.5  4.6  4.7  8.7  5.9  3.8  4.7  6.6  4.2
 12  39       1  1  8.2  3.6  3.9  6.2  5.8  4.9  5.0  9.0  5.2  7.1  4.7  6.9  4.5
 13  40       1  1  6.1  4.9  3.0  4.8  5.1  3.9  6.4  8.2  5.1  6.8  4.5  4.9  3.2
 14  41       1  1  8.3  3.4  3.3  5.5  3.1  4.6  5.2  9.1  4.1  1.7  4.6  5.8  3.9
 15  42       1  0  9.4  3.8  4.7  5.4  3.8  6.5  4.9  8.5  4.9  6.2  4.1  4.5  4.1
 16  51       1  1  6.7  3.6  5.9  4.2  3.4  4.7  4.8  7.2  5.7  5.3  4.0  3.6  2.8
 17  55       1  1  8.0  3.3  3.8  5.8  3.2  4.6  4.7  8.7  5.3  4.2  4.9  6.6  4.2
 18  57       1  1  7.4  5.1  4.8  7.7  4.5  7.2  6.9  9.6  6.4  7.4  5.7  6.5  5.5
 19  59       1  0 10.0  4.3  5.3  3.7  4.2  5.4  4.5  3.8  6.7  4.5  3.7  4.0  3.5
 20  62       1  1  8.4  3.8  6.7  5.0  4.5  4.7  5.9  6.7  5.1  4.2  2.7  5.0  3.6
 21  66       1  1  5.7  3.8  3.5  6.7  5.4  5.7  6.0  8.2  5.4  5.0  4.7  7.3  5.2
 22  70       1  1  6.4  3.6  2.7  5.3  3.9  3.9  5.3  7.1  5.2  5.5  4.7  6.6  3.9
 23  71       1  1  9.1  4.5  6.1  5.9  6.3  5.3  7.1  8.4  7.1  5.7  5.4  6.1  4.4
 24  73       1  1  6.5  4.3  2.7  6.6  6.5  6.3  6.0  8.7  4.7  6.3  4.6  5.6  4.6
 25  74       1  1  9.9  3.7  7.5  4.7  5.6  7.0  6.7  6.8  7.2  4.6  4.1  3.4  3.9
 26  75       1  1  8.5  3.9  5.3  5.5  5.0  4.9  6.0  6.8  5.7  3.6  4.4  5.1  3.7
 27  76       1  0  9.9  3.0  6.8  5.0  5.4  5.9  4.8  4.9  7.3  7.6  3.1  4.3  3.8
 28  77       1  1  7.6  3.6  7.6  4.6  4.7  4.6  5.0  7.4  8.1  6.6  4.5  5.8  3.9
 29  78       1  0  9.4  3.8  7.0  6.2  4.7  6.5  4.9  8.5  7.3  2.4  4.3  4.5  4.1
 30  82       1  0  8.7  3.2  6.4  4.9  2.4  6.8  4.6  6.8  6.3  5.1  4.3  3.7  4.0
 31  83       1  0  8.6  2.9  5.8  3.9  2.9  5.6  4.0  6.3  6.1  4.0  2.7  3.0  3.0
 32  84       1  1  6.4  3.2  6.7  3.6  2.2  2.9  5.0  8.4  7.3  6.5  2.0  3.7  1.6
 33  85       1  0  7.7  2.6  6.7  6.6  1.9  7.2  4.3  5.9  6.5  4.1  4.7  3.9  4.3
 34  89       1  0  9.1  3.6  5.5  5.4  4.2  6.2  4.6  8.3  6.5  4.1  4.6  4.3  3.9
 35  90       1  1  5.5  5.5  7.7  7.0  5.6  5.7  8.2  6.3  7.4  4.9  5.5  6.7  4.9
 36  91       1  0  9.1  3.7  7.0  4.1  4.4  6.3  5.4  7.3  7.5  4.6  4.4  3.0  3.3
 37  93       1  0  9.2  3.9  4.6  5.3  4.2  8.4  4.8  7.1  6.2  6.6  4.4  2.6  4.2
 38  96       1  0  8.6  4.8  5.6  5.3  2.3  6.0  5.7  6.7  5.8  3.6  4.9  3.6  3.6
 39  97       1  1  7.4  3.4  2.6  5.0  4.1  4.4  4.8  7.2  4.5  6.4  4.2  5.6  3.7
 40  99       1  1  7.8  4.9  5.8  5.3  5.2  5.3  7.1  7.9  6.0  5.7  4.3  4.9  3.9

The LOGISTIC Procedure

Model Information
Data Set                    WORK.HBAT60
Response Variable           X4 (X4 - Region)
Number of Response Levels   2
Model                       binary logit
Optimization Technique      Fisher's scoring

Note: Most of the typical regression diagnostics (e.g., those in Proc REG and Proc GLM) are also available in Proc Logistic.

Number of Observations Read   60
Number of Observations Used   60

Response Profile
Ordered Value   X4   Total Frequency
            1    0                26
            2    1                34

Probability modeled is X4=0.
Note: This line specifies the response value being modeled.

Stepwise Selection Procedure

Step 0. Intercept entered:

Model Convergence Status

Logistic regression measures model estimation fit with the value of -2 times the log of the likelihood value, referred to as -2LL or -2 log likelihood.
The minimum value for -2LL is 0, which corresponds to a perfect fit (likelihood = 1, so -2LL = 0). Thus, the lower the -2LL value, the better the fit of the model. The -2LL value can be used to compare equations for the change in fit or to calculate measures comparable to the R² measure in multiple regression.

Convergence criterion (GCONV=1E-8) satisfied.

-2 Log L = 82.108

Analysis of Maximum Likelihood Estimates (intercept-only model)
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept    1    -0.2683           0.2605            1.0603       0.3031

Wald statistic: Test used in logistic regression for the significance of the logistic coefficient. Its interpretation is like the F or t values used for the significance testing of regression coefficients.

Residual Chi-Square Test
Chi-Square   DF   Pr > ChiSq
   42.3497   13   [...]

Analysis of Effects Eligible for Entry
Effect   DF   Score Chi-Square   Pr > ChiSq
X6        1            11.9251       0.0006
X7        1             2.0517       0.1520
X8        1             1.6089       0.2046
X9        1             0.8656       0.3522
X10       1             0.7914       0.3737
X11       1            18.3231       [...]
[...]

Step 1. Effect X13 entered:
[...]

Analysis of Maximum Likelihood Estimates (model with X13 and X17)
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept    1    14.1917           3.7123           14.6143       0.0001
X13          1    -1.0791           0.3574            9.1148       0.0025
X17          1    -1.8439           0.6388            8.3314       0.0039

Percent Concordant - A pair of observations with different observed responses is said to be concordant if the observation with the lower ordered response value (= 0) has a lower predicted mean score than the observation with the higher ordered response value (= 1).

Percent Discordant - If the observation with the lower ordered response value has a higher predicted mean score than the observation with the higher ordered response value, then the pair is discordant.

Percent Tied - If a pair of observations with different responses is neither concordant nor discordant, it is a tie.

Pairs - This is the total number of distinct pairs in which one case has an observed outcome different from the other member of the pair.
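
The intercept-only figures in the output can be checked by hand from the response profile (26 observations with X4 = 0, 34 with X4 = 1). The short DATA step below is only an illustrative check, not part of the original program; the variable names n0, n1, p0, Intercept and Neg2LogL are made up for this sketch.

```sas
/* Hand check of the intercept-only model, using the response profile
   counts from the output: 26 cases with X4 = 0 and 34 with X4 = 1.   */
Data _Null_;
  n0 = 26;                      /* events: X4 = 0 is the modeled level */
  n1 = 34;                      /* non-events: X4 = 1                  */
  p0 = n0 / (n0 + n1);          /* observed proportion of events       */
  Intercept = Log(p0 / (1 - p0));                   /* log odds of X4 = 0   */
  Neg2LogL  = -2 * (n0*Log(p0) + n1*Log(1 - p0));   /* -2LL, no predictors  */
  Put Intercept= 8.4 Neg2LogL= 8.3;
Run;
```

The log line written by PUT should agree with the Intercept estimate of -0.2683 and the -2 Log L of 82.108 reported in the output, since with no predictors the fitted probability is simply the observed proportion 26/60.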
Odds Ratio Estimates
Effect   Point Estimate   95% Wald Confidence Limits
X13               0.340        0.169        0.685
X17               0.158        0.045        0.553

The “Analysis of Maximum Likelihood Estimates” table summarizes information regarding the independent variables, including parameter estimates, variability, and significance. The “Odds Ratio Estimates” table summarizes the significant independent variables and indicates their associated odds ratios and confidence limits.

Effect - The predictor variables that are interpreted in terms of odds ratios.

Point Estimate - The odds ratio corresponding to Effect. The odds ratio is obtained by exponentiating the Estimate, exp[Estimate]. The difference in the log of two odds is equal to the log of the ratio of these two odds. The log of the ratio of two odds is the log odds ratio. Hence, the interpretation of Estimate (the coefficient was interpreted as the difference in log-odds) could also be done in terms of the log-odds ratio. When the Estimate is exponentiated, the log-odds ratio becomes the odds ratio. We can interpret the odds ratio as follows: for a one-unit increase in the predictor variable, the odds of the modeled outcome (here X4 = 0) are multiplied by the odds ratio, given the other variables in the model are held constant.

95% Wald Confidence Limits - This is the Wald Confidence Interval (CI) of an individual odds ratio, given the other predictors are in the model. At the 95% confidence level, intervals constructed this way over repeated samples would include the "true" population odds ratio about 95% of the time. The CI is equivalent to the Wald Chi-Square test: if the CI includes one, we'd fail to reject the null hypothesis that a particular regression coefficient equals zero and the odds ratio equals one, given the other predictors are in the model. An advantage of a CI is that it is illustrative; it provides information on where the "true" parameter may lie and the precision of the point estimate for the odds ratio.

Association of Predicted Probabilities and Observed Responses
Percent Concordant   92.1    Somers' D   0.843
Percent Discordant    7.8    Gamma       0.844
Percent Tied          0.1    Tau-a       0.421
Pairs                 884    c           0.921

Residual Chi-Square Test
Chi-Square   DF   Pr > ChiSq
   20.2161   11       0.0425

Analysis of Effects Eligible for Removal
Effect   DF   Wald Chi-Square   Pr > ChiSq
X13       1            9.1148       0.0025
X17       1            8.3314       0.0039

Note: No effects for the model in Step 2 are removed.

Tau-a - Kendall's Tau-a is a modification of Somers' D that takes into account the difference between the number of possible paired observations and the number of paired observations with a different response. It is defined as the ratio of the difference between the number of concordant pairs and the number of discordant pairs to the number of possible pairs, 2(nc - nd)/(N(N - 1)). Usually Tau-a is much smaller than Somers' D, since there would be many paired observations with the same response.

c - This is equivalent to the area under the ROC curve. c ranges from 0.5 to 1, where 0.5 corresponds to the model randomly predicting the response, and 1 corresponds to the model perfectly discriminating the response.

Somers' D - Somers' D is used to determine the strength and direction of the relation between pairs of variables. Its values range from -1.0 (all pairs disagree) to 1.0 (all pairs agree). It is defined as (nc - nd)/t, where nc is the number of pairs that are concordant, nd is the number of pairs that are discordant, and t is the total number of pairs with different responses.
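
As a rough check on the definitions above, the odds ratios, their Wald limits, and the association measures can be recomputed from the printed estimates. The sketch below is illustrative only: the dataset names OddsCheck and AssocCheck and all variable names are invented here, and because the inputs are the rounded values shown in the output, the results should match the printed tables only approximately. The c formula used (concordant plus half of the ties) is the usual definition and is stated here as an assumption about how the printed value was obtained.

```sas
/* Odds ratio = exp(Estimate); Wald limits = exp(Estimate -/+ z * StdErr). */
Data OddsCheck;
  Input Effect $ Estimate StdErr;
  z = Quantile('NORMAL', 0.975);           /* about 1.96 for a 95% interval */
  OddsRatio = Exp(Estimate);
  LowerCL   = Exp(Estimate - z*StdErr);
  UpperCL   = Exp(Estimate + z*StdErr);
  Datalines;
X13 -1.0791 0.3574
X17 -1.8439 0.6388
;
Run;

Proc Print Data=OddsCheck NoObs;
  Format OddsRatio LowerCL UpperCL 6.3;
Run;

/* Association measures from the rounded percentages in the output. */
Data AssocCheck;
  N = 60;                                   /* observations used         */
  Pairs = 26 * 34;                          /* pairs with different X4   */
  PctConc = 92.1;  PctDisc = 7.8;  PctTied = 0.1;
  SomersD   = (PctConc - PctDisc) / 100;
  GammaStat = (PctConc - PctDisc) / (PctConc + PctDisc);
  TauA      = SomersD * Pairs / (N*(N - 1)/2);
  cStat     = (PctConc + 0.5*PctTied) / 100;
Run;

Proc Print Data=AssocCheck NoObs;
  Format SomersD GammaStat TauA cStat 6.3;
Run;
```

Note that Pairs works out to 26 × 34 = 884, the value printed in the association table, because only pairs with one X4 = 0 case and one X4 = 1 case are counted.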
In our example, Somers' D equals the difference between the percent concordant and the percent discordant divided by 100: (92.1 - 7.8)/100 = 0.843.

Gamma - The Goodman-Kruskal Gamma method does not penalize for ties on either variable. Its values range from -1.0 (perfect negative association) through 0 (no association) to 1.0 (perfect positive association). Because it does not penalize for ties, its value will generally be greater than the value of Somers' D.

A Receiver Operating Characteristic (ROC) curve is a standard technique for summarizing classifier performance over a range of trade-offs between true positive (TP) and false positive (FP) error rates.

Analysis of Effects Eligible for Entry
Effect   DF   Score Chi-Square   Pr > ChiSq
X6        1             0.6561       0.4179
X7        1             3.5008       0.0613
X8        1             0.0063       0.9369
X9        1             0.6926       0.4053
X10       1             0.0914       0.7624
X11       1             3.4094       0.0648
X12       1             0.8492       0.3568
X14       1             2.3269       0.1272
X15       1             0.0257       0.8727
X16       1             0.0103       0.9192
X18       1             2.9074       0.0882

Note: No (additional) effects met the 0.05 significance level for entry into the model.

Summary of Stepwise Selection
Step   Effect Entered   Effect Removed   DF   Number In   Score Chi-Square   Wald Chi-Square   Pr > ChiSq   Variable Label
   1   X13                                1           1            21.3297                     [...]
[...]

Hosmer and Lemeshow Goodness-of-Fit Test
Chi-Square   DF   Pr > ChiSq
    9.9225    8       0.2705

Chi-Square-Based Measure. Hosmer and Lemeshow developed a classification test where the cases are first divided into approximately 10 equal classes. Then, the number of actual and predicted events is compared in each class with the chi-square statistic. This test provides a comprehensive measure of predictive accuracy that is based not on the likelihood value, but rather on the actual prediction of the dependent variable. The “Hosmer and Lemeshow Goodness of Fit Test” indicates the quality of model fit. If the associated p-value is significant (p [...]

Chi-Square, DF and Pr > ChiSq - These are the Chi-Square test statistic, Degrees of Freedom (DF) and associated p-value (Pr > ChiSq) corresponding to the specific test that all of the predictors are simultaneously equal to zero.
We are testing the probability (Pr > ChiSq) of observing a Chi-Square statistic as extreme as, or more extreme than, the observed one under the null hypothesis; the null hypothesis is that all of the regression coefficients in the model are equal to zero. The DF defines the distribution of the Chi-Square test statistic and is defined by the number of predictors in the model. Typically, Pr > ChiSq is compared to a specified alpha level, our willingness to accept a type I error, which is often set at 0.05 or 0.01. The small p-value from all three tests would lead us to conclude that at least one of the regression coefficients in the model is not equal to zero.

Classification Table

Prob     ---- Correct ----   --- Incorrect ---   ------------------- Percentages -------------------
Level    Event   NonEvent    Event   NonEvent    Correct   Sensitivity   Specificity   False POS   False NEG
0.400       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.410       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.420       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.430       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.440       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.450       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.460       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.470       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.480       25         27        7          1       86.7          96.2          79.4        21.9         3.6
0.490       24         28        6          2       86.7          92.3          82.4        20.0         6.7
0.500       24         28        6          2       86.7          92.3          82.4        20.0         6.7
0.510       24         28        6          2       86.7          92.3          82.4        20.0         6.7
0.520       24         28        6          2       86.7          92.3          82.4        20.0         6.7
0.530       22         28        6          4       83.3          84.6          82.4        21.4        12.5
0.540       22         28        6          4       83.3          84.6          82.4        21.4        12.5
0.550       22         28        6          4       83.3          84.6          82.4        21.4        12.5
0.560       22         28        6          4       83.3          84.6          82.4        21.4        12.5
0.570       22         28        6          4       83.3          84.6          82.4        21.4        12.5
0.580       20         28        6          6       80.0          76.9          82.4        23.1        17.6
0.590       20         28        6          6       80.0          76.9          82.4        23.1        17.6
0.600       20         28        6          6       80.0          76.9          82.4        23.1        17.6

Scored observations, original (analysis) sample (60 observations)

Obs  X4   X13   X17  F_X4  I_X4      P_0      P_1
  1   1   6.8   5.1     1     1  0.07241  0.92759
  2   0   5.3   4.3     0     0  0.63264  0.36736
  3   1   4.5   4.0     1     0  0.87653  0.12347
  4   1   8.8   4.1     1     1  0.05393  0.94607
  5   0   6.8   3.5     0     0  0.59870  0.40130
  6   1   8.5   4.7     1     1  0.02540  0.97460
  7   1   8.9   4.2     1     1  0.04082  0.95918
  8   1   6.9   6.3     1     1  0.00761  0.99239
  9   1   8.4   5.8     1     1  0.00381  0.99619
 10   0   6.8   3.7     0     0  0.50781  0.49219
 11   1   8.2   4.9     1     1  0.02431  0.97569
 12   0   7.1   2.6     0     0  0.85016  0.14984
 13   1   8.8   6.2     1     1  0.00119  0.99881
 14   0   4.9   3.9     0     0  0.84719  0.15281
 15   1   6.2   6.2     1     1  0.01924  0.98076
 16   1   8.4   6.1     1     1  0.00219  0.99781
 17   1   6.2   4.4     1     1  0.35159  0.64841
 18   0   4.8   3.8     0     0  0.88133  0.11867
 19   0   4.8   3.8     0     0  0.88133  0.11867
 20   1   7.7   5.5     1     1  0.01394  0.98606
 21   0   5.2   2.7     0     0  0.97345  0.02655
 22   1   6.6   3.5     1     0  0.64928  0.35072
 23   1   9.2   4.5     1     1  0.01740  0.98260
 24   1   8.4   4.3     1     1  0.05723  0.94277
 25   0   5.6   2.9     0     0  0.94275  0.05725
 26   0   6.8   3.5     0     0  0.59870  0.40130
 27   1   7.7   4.6     1     1  0.06917  0.93083
 28   0   7.4   4.6     0     1  0.09315  0.90685
 29   1   5.9   6.3     1     1  0.02206  0.97794
 30   0   5.2   4.2     0     0  0.69759  0.30241
 31   1   8.4   5.8     1     1  0.00381  0.99619
 32   0   3.8   4.0     0     0  0.93793  0.06207
 33   1   8.2   7.3     1     1  0.00030  0.99970
 34   1   6.8   3.4     1     0  0.64209  0.35791
 35   0   4.7   4.2     0     0  0.79825  0.20175
 36   0   5.3   4.3     0     0  0.63264  0.36736
 37   1   6.3   4.6     1     1  0.25186  0.74814
 38   0   5.2   2.7     0     0  0.97345  0.02655
 39   0   7.4   3.2     0     0  0.57585  0.42415
 40   0   4.4   3.9     0     0  0.90485  0.09515
 41   1   5.4   4.9     1     1  0.33833  0.66167
 42   0   4.9   3.9     0     0  0.84719  0.15281
 43   0   5.8   3.7     0     0  0.75219  0.24781
 44   1   6.2   3.1     1     0  0.85632  0.14368
 45   1   7.2   3.6     1     1  0.44621  0.55379
 46   1   6.2   6.2     1     1  0.01924  0.98076
 47   1   6.0   5.9     1     1  0.04062  0.95938
 48   1   7.6   5.4     1     1  0.01858  0.98142
 49   0   5.0   2.6     0     0  0.98205  0.01795
 50   0   4.6   4.1     0     0  0.84127  0.15873
 51   1   7.8   3.1     1     0  0.51462  0.48538
 52   0   4.9   4.3     0     0  0.72615  0.27385
 53   1   7.6   5.4     1     1  0.01858  0.98142
 54   1   8.2   4.8     1     1  0.02909  0.97091
 55   0   5.9   3.9     0     0  0.65332  0.34668
 56   1   9.9   4.0     1     1  0.02049  0.97951
 57   1   4.6   4.1     1     0  0.84127  0.15873
 58   0   7.4   3.2     0     0  0.57585  0.42415
 59   0   5.6   2.9     0     0  0.94275  0.05725
 60   1   9.7   5.4     1     1  0.00196  0.99804

The FREQ Procedure

Original Data Classifications Crosstabulations

Table of F_X4 by I_X4 (cell contents: Frequency / Percent / Row Pct / Col Pct)

F_X4 (From: X4)   I_X4 (Into: X4) = 0            I_X4 (Into: X4) = 1            Total
0                 25 / 41.67 / 96.15 / 80.65      1 /  1.67 /  3.85 /  3.45     26 /  43.33
1                  6 / 10.00 / 17.65 / 19.35     28 / 46.67 / 82.35 / 96.55     34 /  56.67
Total             31 / 51.67                     29 / 48.33                     60 / 100.00

Scored observations, holdout (validation) sample (40 observations)

Obs  X4   X13   X17  F_X4  I_X4      P_0      P_1
  1   1   9.3   6.1     1     1  0.00083  0.99917
  2   0   7.6   4.5     0     1  0.09053  0.90947
  3   1   8.4   5.8     1     1  0.00381  0.99619
  4   1   9.1   6.0     1     1  0.00124  0.99876
  5   1   8.4   4.9     1     1  0.01968  0.98032
  6   1   4.5   3.0     1     0  0.97820  0.02180
  7   0   3.7   3.4     0     0  0.98073  0.01927
  8   1   8.0   5.3     1     1  0.01457  0.98543
  9   1   7.1   6.6     1     1  0.00354  0.99646
 10   1   9.0   5.2     1     1  0.00601  0.99399
 11   1   8.7   6.6     1     1  0.00063  0.99937
 12   1   9.0   6.9     1     1  0.00026  0.99974
 13   1   8.2   4.9     1     1  0.02431  0.97569
 14   1   9.1   5.8     1     1  0.00179  0.99821
 15   0   8.5   4.5     0     1  0.03632  0.96368
 16   1   7.2   3.6     1     1  0.44621  0.55379
 17   1   8.7   6.6     1     1  0.00063  0.99937
 18   1   9.6   6.5     1     1  0.00029  0.99971
 19   0   3.8   4.0     0     0  0.93793  0.06207
 20   1   6.7   5.0     1     1  0.09467  0.90533
 21   1   8.2   7.3     1     1  0.00030  0.99970
 22   1   7.1   6.6     1     1  0.00354  0.99646
 23   1   8.4   6.1     1     1  0.00219  0.99781
 24   1   8.7   5.6     1     1  0.00398  0.99602
 25   1   6.8   3.4     1     0  0.64209  0.35791
 26   1   6.8   5.1     1     1  0.07241  0.92759
 27   0   4.9   4.3     0     0  0.72615  0.27385
 28   1   7.4   5.8     1     1  0.01111  0.98889
 29   0   8.5   4.5     0     1  0.03632  0.96368
 30   0   6.8   3.7     0     0  0.50781  0.49219
 31   0   6.3   3.0     0     0  0.86548  0.13452
 32   1   8.4   3.7     1     1  0.15508  0.84492
 33   0   5.9   3.9     0     0  0.65332  0.34668
 34   0   8.3   4.3     0     1  0.06334  0.93666
 35   1   6.3   6.7     1     1  0.00696  0.99304
 36   0   7.3   3.0     0     0  0.68621  0.31379
 37   0   7.1   2.6     0     0  0.85016  0.14984
 38   0   6.7   3.6     0     0  0.58019  0.41981
 39   1   7.2   5.6     1     1  0.01977  0.98023
 40   1   7.9   4.9     1     1  0.03329  0.96671

The FREQ Procedure

Holdout Sample Classifications Crosstabulations

Table of F_X4 by I_X4 (cell contents: Frequency / Percent / Row Pct / Col Pct)

F_X4 (From: X4)   I_X4 (Into: X4) = 0            I_X4 (Into: X4) = 1            Total
0                  9 / 22.50 / 69.23 / 81.82      4 / 10.00 / 30.77 / 13.79     13 /  32.50
1                  2 /  5.00 /  7.41 / 18.18     25 / 62.50 / 92.59 / 86.21     27 /  67.50
Total             11 / 27.50                     29 / 72.50                     40 / 100.00

SAS Program

* HBAT - Logistic Regression Analysis;

ods graphics on;
options ls=80 ps=50 nodate pageno=1;
Title 'Chapter 6 Logistic Regression Example';

* Input HBAT;
Data HBAT;
  Infile 'C:\Documents and Settings\Thomas F Brantle\My Documents\Stevens_2006\Stevens_Teaching\BIA_652_Multivariate_2014_Spring\Class_09 Chapter 5-6\HBAT_Split60.txt' DLM = '09'X TRUNCOVER;
  Input ID Split60 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23;

* Keep only the variables used in the analysis and attach labels;
Data HBAT;
  Set HBAT (Keep = ID Split60 X4 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18);
  Label ID      = 'ID - Identification Number'
        Split60 = 'Split60'
        X4      = 'X4 - Region'
        X6      = 'X6 - Product Quality'
        X7      = 'X7 - E-Commerce'
        X8      = 'X8 - Technical Support'
        X9      = 'X9 - Complaint Resolution'
        X10     = 'X10 - Advertising'
        X11     = 'X11 - Product Line'
        X12     = 'X12 - Salesforce Image'
        X13     = 'X13 - Competitive Pricing'
        X14     = 'X14 - Warranty & Claims'
        X15     = 'X15 - New Products'
        X16     = 'X16 - Order & Billing'
        X17     = 'X17 - Price Flexibility'
        X18     = 'X18 - Delivery Speed';

* Create HBAT Split 60 (Original/Initial) and Split 40 (Validation/Holdout) Datasets;
Data HBAT60;
  Set HBAT;
  If Split60 = 0;

Data HBAT40;
  Set HBAT;
  If Split60 = 1;

Proc Print Data = HBAT60;
Proc Print Data = HBAT40;

* Stepwise Logistic Regression Analysis - X4 = X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18;
* EVENT='category' | keyword specifies the event category for the binary response model;
* SELECTION = option specifies the method used to select the explanatory variables in the model;
* STEPWISE requests stepwise selection;
* SLENTRY = option specifies the significance level for entry into the model;
* SLSTAY = option specifies the significance level for staying in the model;
* DETAILS option produces detailed printout at each step of the model-building process;
* LACKFIT requests Hosmer and Lemeshow goodness-of-fit test;
* RSQUARE displays generalized R^2;
* CTABLE option requests the printing of a classification table for the final model produced by the procedure;
* PPROB = option specifies possibly multiple cutpoints used to classify observations for the CTABLE option.
  The values must be between 0 and 1. If the PPROB= option is not specified, the default is to print the
  classification for a range of probabilities from the smallest estimated probability (rounded below to the
  nearest .02) to the highest estimated probability (rounded above to the nearest .02) with 0.02 increments.
  Note that the PPROB= option has no effect unless the CTABLE option is also specified;
Proc Logistic Data = HBAT60;
  Model X4(event='0') = X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18
        / Selection=Stepwise SLEntry=0.05 SLStay=0.05 Details LackFit RSquare CTable PProb =(0 to 1 by .10);

* Final Resultant Model and Output Model;
Proc Logistic Data = HBAT60 OutModel=Logistic60;
  Model X4(event='0') = X13 X17 / LackFit RSquare CTable PProb =(0.40 to 0.60 by .01);

* Final Split60 model applied to the original (Split60 = 0) data;
Proc Logistic InModel=Logistic60;
  Score Data = HBAT60 (Keep = X4 X13 X17) Out = HBAT60Score;

* Proc Freq Crosstabulations, Original and Holdout Validation Datasets;
Proc Print Data = HBAT60Score;
Proc Freq Data = HBAT60Score;
  Table F_X4 * I_X4;

* Final Split60 model applied to the Split40 (holdout/validation) data;
Proc Logistic InModel=Logistic60;
  Score Data = HBAT40 (Keep = X4 X13 X17) Out = HBAT40Score;
Proc Print Data = HBAT40Score;
Proc Freq Data = HBAT40Score;
  Table F_X4 * I_X4;

* ods graphics off;
Run;
Quit;

Management Brief Template

Team Business Intelligence
Kelsey Douma, Mingjun Han, Yu Hong, Siwei Wang
Assignment Name
Date

Management Brief

Introduction
● Introduce the problem, key issues, results, conclusions and associated recommendations

Problem statement
●

Analysis
● Requirements and constraints, model, analyses, results, conclusions and recommendations

Summary of Conclusions
● Summarize the important conclusions, lessons learned and management recommendations

Appendices
● SAS Code
● SAS Output
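
For the management summary, the hit ratio, sensitivity and specificity of the analysis and holdout samples can be recomputed from the two F_X4 by I_X4 crosstabulations in the SAS output. The sketch below is an illustration only; the dataset name ClassCheck and its variable names are invented, and the event is taken to be X4 = 0 because the program uses event='0'.

```sas
/* Cell counts copied from the two PROC FREQ crosstabs:
   nAB = number of cases observed in group A and classified into group B. */
Data ClassCheck;
  Length Sample $ 8;
  Input Sample $ n00 n01 n10 n11;
  Total       = n00 + n01 + n10 + n11;
  HitRatio    = 100 * (n00 + n11) / Total;   /* percent correctly classified    */
  Sensitivity = 100 * n00 / (n00 + n01);     /* events (X4 = 0) classified as 0 */
  Specificity = 100 * n11 / (n10 + n11);     /* non-events classified as 1      */
  Datalines;
Analysis 25 1 6 28
Holdout 9 4 2 25
;
Run;

Proc Print Data=ClassCheck NoObs;
  Format HitRatio Sensitivity Specificity 6.2;
Run;
```

The analysis-sample counts here (25, 1, 6, 28) differ slightly from the 0.500 row of the CTABLE output (24, 28, 6, 2). One likely reason, stated here as an assumption about this output, is that the CTABLE option classifies each observation with a bias-reduced (approximate leave-one-out) probability, while the SCORE statement uses the fitted probabilities directly.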