In November 1993, the state of Pennsylvania conducted elections for its state legislature. The

result in the Senate election in the 2nd district (based in Philadelphia) was challenged in court,

and ultimately overturned. The Democratic candidate won 19,127 of the votes cast by voting

machine, while the Republican won 19,691 votes cast by voting machine, giving the Republican a

lead of 564 votes. However, the Democrat won 1,391 absentee ballots, while the Republican won

just 366 absentee ballots, more than offsetting the Republican lead based on the votes recorded

by machines on election day. The Republican candidate sued, claiming that many of the absentee

ballots were fraudulent. The judge in the case solicited expert analysis from Orley Ashenfelter,

an economist at Princeton University. Ashenfelter used a regression model to examine the relationship

between absentee vote margins and machine vote margins in 21 previous Pennsylvania

Senate elections in seven districts in the Philadelphia area over the preceding decade.
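
The vote arithmetic in the paragraph above can be checked with a short Python sketch. It uses only the totals stated in the text; the variable names are illustrative, and margins are taken as Democrat minus Republican.

```python
# Vote totals as stated in the problem text.
machine = {"Democrat": 19_127, "Republican": 19_691}
absentee = {"Democrat": 1_391, "Republican": 366}

# Margins are computed as Democrat minus Republican.
machine_margin = machine["Democrat"] - machine["Republican"]
absentee_margin = absentee["Democrat"] - absentee["Republican"]

# The overall result combines machine and absentee ballots.
total_margin = machine_margin + absentee_margin

print(machine_margin)   # -564 (Republican lead on the machines)
print(absentee_margin)  # 1025 (Democratic lead among absentee ballots)
print(total_margin)     # 461  (Democratic lead overall)
```

A negative margin favors the Republican and a positive margin favors the Democrat, so the absentee ballots turn a 564-vote machine deficit into a 461-vote overall lead.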

The file “AbsenteeBallots.txt” on Canvas contains the Year of the election, the District in which

the election was held, and the margins (Democrat - Republican) for absentee ballots (DifferenceAbsentee)

and for voting machines from the day of the election (DifferenceMachine).

(a) Use JMP to create a scatterplot for the difference in absentee ballots as a function of the

difference in voting machine ballots. Explain under which conditions it would make sense

to expect a roughly linear relationship between the two variables. Your conditions should

be assumptions on the behavior of Philadelphia’s voting public. Include your plot in your

homework.

(b) Use JMP to fit a simple linear regression model for DifferenceAbsentee as a function of

DifferenceMachine for all but the contested election. Include the output of the analysis

in your homework, with both the ANOVA table and the parameter estimates table.

(c) What percentage of the variation in DifferenceAbsentee can be explained by the margins

(Democrat - Republican) for voting machines (DifferenceMachine)?

(d) Find the p-value for the test H0 : β0 = 0 in your output. Explain (as you would to the judge

in this case) what this number tells us. Interpret the results of the test in the context of the

problem.

(e) Produce a 95% confidence interval for the regression slope in this problem. Interpret the

confidence interval in the context of the problem.

(f) Dr. Ashenfelter found that “the difference between the Democratic and Republican tallies in

the machine-based vote has been a good indicator of the difference between the two parties’

absentee vote.” Explain how he could draw this conclusion based on your regression output.

(g) The NY Times article states: “Assuming this relationship in the 21 previous elections had

held in the most recent, Professor Ashenfelter estimates that the Republican’s 564-vote edge

on the machines should have led to a 133-vote advantage in absentee ballots.” Explain (as

Dr. Ashenfelter would explain to the judge), how one can come to this conclusion.

(h) In the contested election the voting machine margin was -564. The absentee ballot margin,

however, was 1025. Use your regression model to make a statistical argument for why this

observation is unusual. Assume for now that the contested election was fair. Derive a

probability of observing an absentee ballot margin at least as large as the one observed

if the election was fair. Explain how you arrived at that number.

(i) Dr. Ashenfelter made an argument similar to the one you just made and came up with a p-value of 0.06

for the test of

H0 : the election was fair vs. Ha : there was fraud by the Democrats

The NY Times reporter interpreted this result as follows: “Putting it another way, if past

elections are a reliable guide to current voting behavior, there is a 94 percent chance that

irregularities in the absentee ballots, not chance alone, swung the election to the Democrat,

Professor Ashenfelter concludes.”

Critique the reporter’s interpretation of the p-value. If the interpretation is correct, explain

why. If the interpretation is incorrect, provide a correct interpretation instead.
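
The assignment asks for JMP output, but the quantities behind parts (b) through (h) can also be sketched in Python as a cross-check. The sketch below is a minimal illustration, not Dr. Ashenfelter's actual computation: the function names are invented for this example, and the toy numbers are HYPOTHETICAL placeholders, not the contents of AbsenteeBallots.txt. It fits a simple linear regression by least squares, reports R-squared and the standard error of the slope, and standardizes the gap between a new observation and its predicted value using the usual prediction standard error.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns summary quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))        # residual standard error
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1, "s": s,
        "r_squared": 1 - sse / syy,     # share of variation in y explained by x
        "se_b1": s / math.sqrt(sxx),    # standard error of the slope
    }


def prediction_t_statistic(fit, x_new, y_new):
    """Standardized distance of a new point (x_new, y_new) from its prediction."""
    y_hat = fit["b0"] + fit["b1"] * x_new
    se_pred = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x_new - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return (y_new - y_hat) / se_pred


# HYPOTHETICAL toy data, used only to show the calls; replace with the
# DifferenceMachine (x) and DifferenceAbsentee (y) columns from the data file.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.0, 4.0, 5.0, 4.0, 5.0]

fit = fit_simple_regression(toy_x, toy_y)
print(fit["b0"], fit["b1"], fit["r_squared"])
print(prediction_t_statistic(fit, x_new=3.0, y_new=4.0))
```

With the real data, the statistic returned by prediction_t_statistic for x_new = -564 and y_new = 1025 would be compared with a t distribution on n - 2 degrees of freedom, using the upper tail for the one-sided alternative in part (i). A 95% confidence interval for the slope has the form b1 plus or minus the appropriate t quantile times se_b1.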

