Description
Complete the following end-of-chapter exercises for Chapter 3. Submit your response as an MS Word document or PDF after inserting the results from the SPSS output into your document. This exercise uses the data set schools-a.sav, which can be downloaded from the chapter data sources referenced above.
1- You are interested in investigating if being above or below the median income (medloinc) impacts ACT means (act94) for schools. Complete the necessary steps to examine univariate grouped data in order to respond to the questions below. Although deletions and/or transformations may be implied from your examination, all steps will examine original variables.
a. How many participants have missing values for medloinc and act94?
b. Is there a severe split in frequencies between groups?
c. What are the cutoff values for outliers in each group?
d. Which outlying cases should be deleted for each group?
e. Analyzing histograms, normal Q-Q plots, and tests of normality, what is your conclusion regarding normality? If a transformation is necessary, which one would you use?
f. Do the results from Levene’s test for equal variances indicate homogeneity of variance? Explain.
2- You are interested in studying predictors (math94me, loinc93, and read94me) of the percentage graduating in 1994 (grad1994).
a. Examine univariate normality for each variable. What are your conclusions about distributions? What transformation should be conducted?
b. After making the necessary transformations, examine multivariate outliers using Mahalanobis distance. What cases should be deleted?
c. After deleting the multivariate outliers, examine multivariate normality and linearity by creating a Scatterplot Matrix.
d. Examine the variables for homoscedasticity by creating a residuals plot (standardized vs. predicted values). What are your conclusions about homoscedasticity?
Please use APA format with references.
Explanation & Answer
If you run a plagiarism check, the similarity percentage will be high because the questions are copied from the book. Don't worry: I added the book as a reference in the file.
CHAPTER 3
Answers to End Exercises
This exercise utilizes the data set schools-a.sav
1. You are interested in investigating if being above or below the median income (medloinc) impacts
ACT means (act94) for schools.
a. How many participants have missing values for medloinc and act94?
For the variable medloinc there are 0 participants with missing values.
For the variable act94 there are 0 participants with missing values.
Statistics
above or below median loinc
  N   Valid      64
      Missing     0

Case Processing Summary
                                    Cases
                          Valid            Missing          Total
                          N    Percent     N    Percent     N    Percent
average ACT score 1994    64   100.0%      0    0.0%        64   100.0%
b. Is there a severe split in frequencies between groups?
There is no severe split in frequencies between the groups: the split is 50% to 50%. This is
the expected outcome, because the name of the variable (above or below median income) implies
that the data will be distributed evenly between the two groups, given that the number of
participants is even.
Case Processing Summary (average ACT score 1994)
                                                  Cases
                                        Valid            Missing          Total
above or below median loinc             N    Percent     N    Percent     N    Percent
below the median for low inc % 1993     32   100.0%      0    0.0%        32   100.0%
above the median for low inc % 1993     32   100.0%      0    0.0%        32   100.0%
c. What are the cutoff values for outliers in each group?
There are two stem-and-leaf plots below. The first indicates that 1 participant with
income below the median has an ACT score at or above the cutoff of 22.5. The second shows that 2
participants with income above the median have ACT scores at or above the cutoff of 17.0.
Extreme Values (average ACT score 1994)
above or below median loinc                      Rank   Case Number   Value
below the median for low inc % 1993   Highest    1      64            22.5
                                                 2      38            20.9
                                                 3      39            20.6
                                                 4      60            20.0
                                                 5      35            19.6
                                      Lowest     1      24            14.1
                                                 2      42            14.2
                                                 3      9             14.2
                                                 4      13            14.3
                                                 5      55            14.7
above the median for low inc % 1993   Highest    1      26            17.0
                                                 2      57            17.0
                                                 3      43            16.8
                                                 4      30            16.4
                                                 5      48            16.0
                                      Lowest     1      50            13.6
                                                 2      20            13.8
                                                 3      2             14.0
                                                 4      16            14.1
                                                 5      5             14.3
average ACT score 1994 Stem-and-Leaf Plot for
medloinc= below the median for low inc % 1993

 Frequency    Stem &  Leaf
     7.00       14 .  1223789
     9.00       15 .  234478888
     5.00       16 .  12788
     4.00       17 .  1378
     2.00       18 .  09
     1.00       19 .  6
     3.00       20 .  069
     1.00 Extremes    (>=22.5)

 Stem width:  1.0
 Each leaf:   1 case(s)
average ACT score 1994 Stem-and-Leaf Plot for
medloinc= above the median for low inc % 1993

 Frequency    Stem &  Leaf
     2.00       13 .  68
     6.00       14 .  013444
    10.00       14 .  5556678999
     6.00       15 .  000124
     3.00       15 .  559
     2.00       16 .  04
     1.00       16 .  8
     2.00 Extremes    (>=17.0)

 Stem width:  1.0
 Each leaf:   1 case(s)
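The cutoffs reported as "Extremes" in the two stem-and-leaf plots above can be cross-checked outside SPSS. The sketch below is only an illustration, not the procedure SPSS documents: it assumes the usual boxplot rule (Tukey's hinges, with fences 1.5 hinge-spreads beyond each hinge) and it assumes the values read off the stem-and-leaf plots equal the raw act94 values to one decimal place. The helper names (expand, tukey_hinges, outlier_fences) are made up for this example.

```python
def expand(rows, extremes):
    """Turn (stem, leaves) rows from a stem-and-leaf plot into a list of values."""
    values = [round(stem + int(leaf) / 10, 1) for stem, leaves in rows for leaf in leaves]
    return values + list(extremes)


def tukey_hinges(values):
    """Return (lower hinge, upper hinge) using Tukey's depth rule."""
    data = sorted(values)
    median_depth = (len(data) + 1) / 2
    hinge_depth = (int(median_depth) + 1) / 2  # 8.5 when n = 32

    def at_depth(ordered, depth):
        lo = int(depth)                       # 1-based position
        hi = lo if depth == lo else lo + 1    # average two neighbours at half depths
        return (ordered[lo - 1] + ordered[hi - 1]) / 2

    return at_depth(data, hinge_depth), at_depth(data[::-1], hinge_depth)


def outlier_fences(values, k=1.5):
    """Return (lower fence, upper fence); k = 1.5 is the assumed boxplot multiplier."""
    lower, upper = tukey_hinges(values)
    spread = upper - lower
    return lower - k * spread, upper + k * spread


# Values transcribed from the two stem-and-leaf plots (assumed to match the raw data).
below = expand([(14, "1223789"), (15, "234478888"), (16, "12788"), (17, "1378"),
                (18, "09"), (19, "6"), (20, "069")], extremes=[22.5])
above = expand([(13, "68"), (14, "013444"), (14, "5556678999"), (15, "000124"),
                (15, "559"), (16, "04"), (16, "8")], extremes=[17.0, 17.0])

for label, group in (("below the median", below), ("above the median", above)):
    low_fence, high_fence = outlier_fences(group)
    flagged = sorted(v for v in group if v < low_fence or v > high_fence)
    print(f"{label}: fences = ({low_fence:.2f}, {high_fence:.2f}), flagged = {flagged}")
```

Under these assumptions the flagged values should agree with the cases listed as Extremes in each plot; if they do not, the hinge rule or the transcribed values are the first things to recheck.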
d. Which outlying cases should be deleted for each group?
The boxplot for the two groups reveals all three outliers: one in the first group
and two in the second. The case numbers are #64 (first group, income below the median) and #57
and #26 (second group, income above the median). We will alter the value for outlying case #64 by
replacing it with the maximum value that falls within the accepted distribution, which is 20.9
per the stem-and-leaf plot (the row 20 . 069 represents the values 20.0, 20.6, and 20.9), and we
will alter the two outliers #57 and #26 in the second group to 16.8.
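The replacement step described in part d (altering an outlier to the largest value that still falls within the accepted distribution) can be sketched as follows. This is a minimal illustration, not SPSS syntax: the function name replace_high_outliers is hypothetical, the cutoffs are the ones shown as Extremes in the stem-and-leaf plots, and only the five highest cases per group from the Extreme Values table are used as input.

```python
def replace_high_outliers(values, cutoff):
    """Replace every value at or above `cutoff` with the largest value below it."""
    in_range = [v for v in values if v < cutoff]
    if not in_range:
        raise ValueError("no values below the cutoff to use as a replacement")
    ceiling = max(in_range)
    return [ceiling if v >= cutoff else v for v in values]


# Five highest cases per group, keyed by case number (from the Extreme Values table).
below_top5 = {64: 22.5, 38: 20.9, 39: 20.6, 60: 20.0, 35: 19.6}
above_top5 = {26: 17.0, 57: 17.0, 43: 16.8, 30: 16.4, 48: 16.0}

for label, cases, cutoff in (("below the median", below_top5, 22.5),
                             ("above the median", above_top5, 17.0)):
    adjusted = replace_high_outliers(list(cases.values()), cutoff)
    for case_number, new_value in zip(cases, adjusted):
        print(f"{label}: case #{case_number} -> {new_value}")
```

Altering the values in a copy of the variable, rather than overwriting act94, keeps the original scores available, which matters here because the exercise states that all steps examine the original variables.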
e. Analyzing histograms, normal Q-Q plots, and tests of normality, what is your conclusion
regarding normality? If a transformation is necessary, which one would you use?
According to the Descriptive Statistics figure below, the skewness coefficient is .790 for
participants with income below the median and .791 for participants with income above the
median. A positive skew tells us that cases cluster to the left and that the right tail is
extended with only a few cases. The positive kurtosis is supported by the histograms. The normal
Q-Q plots for both groups support this finding, as the observed values deviate somewhat from the
straight line. The Kolmogorov-Smirnov and Shapiro-Wilk tests reject the hypothesis of normality
of ACT scores for the populations of both groups. The detrended normal Q-Q plots show a U-shaped
distribution.
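One way to put the reported skewness coefficients in context is to compare each with its standard error. The sketch below is an illustration under stated assumptions rather than part of the original answer: it assumes SPSS reports the adjusted Fisher-Pearson skewness coefficient and the usual small-sample standard error formula, and the function names are hypothetical. The group size of 32 comes from the Case Processing Summary.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the SPSS Descriptives output)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


def skewness_standard_error(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def skewness_z(skewness, n):
    """Ratio of a skewness coefficient to its standard error."""
    return skewness / skewness_standard_error(n)


# Coefficients reported in the answer above; n = 32 schools per group.
for label, coefficient in (("below the median", 0.790), ("above the median", 0.791)):
    print(f"{label}: skewness = {coefficient}, "
          f"SE = {skewness_standard_error(32):.3f}, z = {skewness_z(coefficient, 32):.2f}")
```

The ratio is only a rough guide and should be read alongside the histograms, Q-Q plots, and the Kolmogorov-Smirnov and Shapiro-Wilk results rather than in place of them.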
To decrease the moderate positive skewness, the transformation proce...