This question is related to Statistics. Please help me with detailed explanation and steps. Thank you.
Here is the question.
In November 1993, the state of Pennsylvania conducted elections for its state legislature. The
result in the Senate election in the 2nd district (based in Philadelphia) was challenged in court,
and ultimately overturned. The Democratic candidate won 19,127 of the votes cast by voting
machine, while the Republican won 19,691 votes cast by voting machine, giving the Republican a
lead of 564 votes. However, the Democrat won 1,391 absentee ballots, while the Republican won
just 366 absentee ballots, more than offsetting the Republican lead based on the votes recorded
by machines on election day. The Republican candidate sued, claiming that many of the absentee
ballots were fraudulent. The judge in the case solicited expert analysis from Orley Ashenfelter,
an economist at Princeton University. Ashenfelter used a regression model to examine the relationship
between absentee vote margins and machine vote margins in 21 previous Pennsylvania
Senate elections in seven districts in the Philadelphia area over the preceding decade.
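
For reference, the vote counts quoted above can be checked with a few lines of arithmetic. This is only a sketch using the numbers stated in the problem; the variable names are my own and are not part of the assignment. Margins are computed as Democrat minus Republican, matching the data file.

```python
# Vote counts as stated in the problem text.
dem_machine, rep_machine = 19127, 19691
dem_absentee, rep_absentee = 1391, 366

# Margins are Democrat - Republican, the same sign convention as the data file.
machine_margin = dem_machine - rep_machine
absentee_margin = dem_absentee - rep_absentee
total_margin = machine_margin + absentee_margin

print(machine_margin, absentee_margin, total_margin)  # -564 1025 461
```

So the contested election has a machine margin of -564 and an absentee margin of 1025, and the absentee ballots turn a 564-vote Republican lead into a 461-vote Democratic lead.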
The file “AbsenteeBallots.txt” on Canvas contains the Year of the election, the District in which
the election was held, and the margins (Democrat - Republican) for absentee ballots (DifferenceAbsentee)
and for voting machines from the day of the election (DifferenceMachine).
(a) Use JMP to create a scatterplot for the difference in absentee ballots as a function of the
difference in voting machine ballots. Explain under which conditions it would make sense
to expect a roughly linear relationship between the two variables. Your conditions should
be assumptions about the behavior of Philadelphia’s voting public. Include your plot in your homework.
(b) Use JMP to fit a simple linear regression model for DifferenceAbsentee as a function of
DifferenceMachine for all but the contested elections. Include the output of the analysis
in your homework. Include both the ANOVA table and the parameter estimates table.
(c) What percentage of the variation in DifferenceAbsentee can be explained by the margins
(Democrat - Republican) for voting machines (DifferenceMachine)?
(d) Find the p-value for the test H0 : β0 = 0 in your output. Explain (as you would to the judge
in this case) what this number tells us. Interpret the results of the test in the context of the problem.
(e) Produce a 95% confidence interval for the regression slope in this problem. Interpret the
confidence interval in the context of the problem.
(f) Dr. Ashenfelter found that “the difference between the Democratic and Republican tallies in
the machine-based vote has been a good indicator of the difference between the two parties’
absentee vote.” Explain how he could draw this conclusion based on your regression output.
(g) The NY Times article states: “Assuming this relationship in the 21 previous elections had
held in the most recent, Professor Ashenfelter estimates that the Republican’s 564-vote edge
on the machines should have led to a 133-vote advantage in absentee ballots.” Explain (as
Dr. Ashenfelter would explain to the judge), how one can come to this conclusion.
(h) In the contested election the voting machine margin was -564. The absentee ballot margin,
however, was 1025. Use your regression model to make a statistical argument for why this
observation is unusual. Assume for now that the contested election was fair. Derive a
probability for observing an absentee ballot margin as large or larger than the one observed
if the election was fair. Explain how you come up with that number.
Note, it’s ok if your answer is not p = 0.06.
(i) Dr. Ashenfelter made an argument similar to the one you just made and came up with a p-value of 0.06
for the test that decides
H0 : the election was fair vs. Ha : there was fraud by the Democrats
The NY Times reporter interpreted this result as follows: “Putting it another way, if past
elections are a reliable guide to current voting behavior, there is a 94 percent chance that
irregularities in the absentee ballots, not chance alone, swung the election to the Democrat,
Professor Ashenfelter concludes.”
Critique the reporter’s interpretation of the p-value. If the interpretation is correct, explain
why. If the interpretation is incorrect, provide a correct interpretation instead.
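
In case it helps anyone answering, here is a rough Python sketch of the calculations behind parts (b), (c), (e), (g) and (h), as a cross-check on the JMP output. It is an assumption on my part that this mirrors what JMP reports, and it is one standard textbook approach (a t statistic built from the prediction error for a new observation), not necessarily the exact method Dr. Ashenfelter used. The function names are hypothetical, and the data from AbsenteeBallots.txt are not reproduced here; x would hold the DifferenceMachine values and y the DifferenceAbsentee values for the 21 uncontested elections.

```python
import math

def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns summary quantities in a dict."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                      # residual degrees of freedom
    s = math.sqrt(sse / df)         # residual standard deviation
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
            "r2": 1 - sse / syy, "s": s, "df": df,
            "se_b1": s / math.sqrt(sxx)}

def t_upper_tail(t, df, steps=20000):
    """P(T >= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = abs(t) / steps              # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                   # integral of the density from 0 to |t|
    tail = 0.5 - area
    return tail if t >= 0 else 1 - tail

def new_obs_tail_prob(fit, x_new, y_new):
    """Prediction at x_new, t statistic for y_new, and one-sided P(Y >= y_new)."""
    pred = fit["b0"] + fit["b1"] * x_new
    se_pred = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x_new - fit["x_bar"]) ** 2 / fit["sxx"])
    t_stat = (y_new - pred) / se_pred
    return pred, t_stat, t_upper_tail(t_stat, fit["df"])
```

With the real data loaded, fit_simple_ols(x, y) would give the slope, intercept and R-squared for parts (b) and (c), and the slope's standard error for the confidence interval in (e) (slope plus or minus the t critical value with n - 2 degrees of freedom times se_b1). Calling new_obs_tail_prob(fit, -564, 1025) would then give the predicted absentee margin for part (g) and the one-sided tail probability for part (h).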