## Florida 2000 Presidential Election — A Flawed Measurement

An election is a type of measurement. In many ways, it is like the scientific measurement of a physical process, but in other ways it is fundamentally different. The biggest difference is that the quantity being measured is not defined, except in terms of the measurement itself.

What do we measure when we go to the polls for a presidential election? Like some of the topics that I post on the MathRec site, there are many similar ways to frame the question, and similar questions may have different answers. The Florida 2000 Presidential Election has all of the features of a flawed measurement. I have never seen another event outside the laboratory that so clearly points out the difference between a measured quantity and the measurement itself.

I have seen a blatant example of a flawed measurement inside the laboratory. While I was in graduate school, we had a visiting scientist working in our group. One afternoon he came into our professor's office and announced that he had discovered a new effect in one of our established experiments. The professor firmly told him that it was impossible and sent him back to check. Eventually the professor (and a small crowd of onlookers) was dragged into the laboratory to watch a demonstration. As a piece of apparatus was slid into place, the needle that indicated signal level showed a strong deflection. Proudly, our visitor announced "See. It's true."

If you've worked in a research laboratory, you might see immediately where this is headed. Our visitor had spent all of his time trying to get the needle to move with little attention to what was going on in the experiment. The professor quickly found the mistake that had corrupted the measurement. Over time the incident grew in the retelling, until it was regularly asserted that our visitor would take a magnet to the side of the meter in order to increase the signal from the experiment.

That assessment of our visitor was surely a bit harsh, but conducting research is a matter of careful attention to sources of systematic errors. A valid measurement may finish with a meter reading, but it is crucial not to confuse the reading on the meter with the measurement itself.

The United States generally runs astoundingly fair and accurate elections. But when I state this, I am not comparing the accuracy of an election process to a scientific measurement. An election is not a scientific measurement. There is no underlying quantity that is being measured. The vote total itself is the measured quantity. In effect, we have a situation where the reading of the meter is the measurement.

Now, you may think that an election measures our choice for President. If I were making a different point, I wouldn't argue with you, but "our choice for President" is one of those poorly defined quantities. And how we define the quantity has everything to do with the result that we get. We've decided to define "our choice" in terms of votes, not bullets, for example.

OK, so we have to define the space that we use to measure votes. One "man" one vote, for example. Still I say that the measured quantity is the votes themselves. We are measuring the reading on the meter, not an underlying quantity. To be sure, we do have laws which prohibit the electoral equivalents of moving the needle with a magnet. Still, an election is a contest, not a measurement. Candidates can't stuff the ballot boxes with ballots, but they can stuff the voting booths with voters.

Well, I suppose you may be getting annoyed with my nitpicking. Of course we are measuring the opinion of the voters, not of those who stay home. In fact, I'll cut through a bunch of layers here and concede that we can talk about the opinion of those who are allowed to vote, those who choose to vote, those who succeed in getting to the voting booth, and those who actually complete and return a ballot. I'll even skip all discussion of the convoluted metric involving representation in the Electoral College. I'll just point out that all of these had a very direct effect on the U.S. Presidential election in 2000.

I will concede that almost every American believes that our Presidential election is trying to measure this quantity, but the quantity is still not defined. The votes are the measured quantity. Our election is run according to laws, and the final count of the votes under the laws is the object of each candidate. All we have done is to refine the rules so that the easiest ways to affect the final count are to change public opinion and to increase turnout among voters who are most likely to support your candidate.

Let's look at the 2000 U.S. Presidential election as a measurement of voter choice. The final, official, count of the votes in Florida gave a 537-vote margin to George W. Bush after the U.S. Supreme Court effectively barred any further official examination of the ballots. A subsequent independent examination of the key ballots has determined that the official margin of victory was accurate to within a few hundred votes under the laws and regulations in place at the time that the election was conducted. As chaotic as the process was, the official result was within one part in 10,000 of the underlying quantity. This is the result that I think is astoundingly accurate.

Furthermore, the independent review of the ballots determined that using other standards which were under discussion at the time could have given results outside of this range. In fact, some proposed methods of counting the ballots would have resulted in a different winner. Still, the final margin using any of these various methods would have been within two parts in 10,000 of the official results.

Now you should not get a false sense that this is an accurate measurement of voter choice. It is not. It is an accurate tally of the votes themselves, and that is the only way in which the concept of voter choice is defined. Let me point out three systematic effects which had a clear effect on the margin of victory.

First, approximately 2.3% of the votes cast were not for either George W. Bush or for Albert Gore. The opinion of these voters regarding the choice between these candidates was not recorded. Unless the opinion of these voters was divided more closely than 50.2/49.8 between Bush and Gore, then this systematic effect was bigger than the inaccuracies in the tally. Opinions in this regard vary, but no one seriously believes that Gore would have failed to get at least several percent more votes from this group than Bush, primarily due to the demographics of the Nader candidacy. (More than 70% of the other votes were for Nader.) Of course, these voters made their choice within the rules of the system. I am simply pointing out that there is no underlying measured quantity other than the final vote tally.
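The 50.2/49.8 figure can be checked with a few lines of Python. This is only a rough sketch: the 2.3% share and the 537-vote margin come from the text, but the statewide total of roughly 5.96 million votes is an outside assumption, and the function name is purely illustrative.

```python
# Figures quoted in the text
OTHER_SHARE = 0.023        # share of votes cast for neither Bush nor Gore
OFFICIAL_MARGIN = 537      # Bush's official margin in Florida

# ASSUMPTION (not from the text): roughly 5.96 million presidential votes were cast
TOTAL_VOTES = 5_960_000


def break_even_split(total_votes, other_share, margin):
    """Return (other_votes, gore_share).

    gore_share is the fraction of third-party voters who would have had to
    prefer Gore over Bush for their net preference to equal the margin.
    """
    other_votes = total_votes * other_share
    # If a fraction g prefers Gore and (1 - g) prefers Bush, Gore nets
    # (2g - 1) * other_votes. Setting that equal to the margin gives g.
    gore_share = 0.5 + margin / (2 * other_votes)
    return other_votes, gore_share


other_votes, gore_share = break_even_split(TOTAL_VOTES, OTHER_SHARE, OFFICIAL_MARGIN)
print(f"Third-party votes: about {other_votes:,.0f}")
print(f"Break-even split: {gore_share:.1%} / {1 - gore_share:.1%}")
```

Under that assumed total, any split more lopsided than about 50.2/49.8 among these voters outweighs the official margin.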

Up to this point, we have talked about effects that we all agree are within the "rules of the game." The opinions of voters only count if they actually vote, and voters who choose third-party candidates do not get to register their opinion regarding the principal candidates. But we are certainly not done with the systematic effects which contribute to the results of our election.

There was a lot of publicity regarding the layout of the Palm Beach County ballot in that election. In short, there was a high rate of incorrectly executed ballots in Palm Beach County. The "problem" was that there were too many candidates for President (and Vice President) to fit into one column and still keep the type size that the Palm Beach County board of elections wanted to have (so that their large population of elderly voters would be better able to read the ballots). Their solution was the (now infamous) "butterfly ballot".

This was a county which went heavily for Albert Gore. (Gore got about 62.3% of the vote in Palm Beach to 35.3% for Bush.) The errored ballots could reasonably be presumed to include a large number of ballots which would have favored Gore. The ballots themselves strongly support this inference. On their web site, CNN reported:

> According to the study, 5,277 voters made a clean punch for Gore and a clean punch for Reform Party nominee Pat Buchanan, candidates whose political philosophies are poles apart. An additional 1,650 voters made clean punches for Bush and Buchanan.
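To put the CNN figures next to the official result, here is a minimal Python sketch. It rests on a loudly hypothetical assumption: that every one of these double punches was intended for the major-party candidate and not for Buchanan. The counts and the 537-vote margin are from the text; the function name is illustrative only.

```python
# Double-punched ballots reported in the CNN quote
GORE_BUCHANAN = 5_277   # clean punches for both Gore and Buchanan
BUSH_BUCHANAN = 1_650   # clean punches for both Bush and Buchanan

OFFICIAL_MARGIN = 537   # Bush's official statewide margin


def net_gore_gain(gore_double, bush_double):
    """Net votes Gore would gain over Bush if (hypothetically) every
    double punch had been a single vote for the major-party candidate."""
    return gore_double - bush_double


net = net_gore_gain(GORE_BUCHANAN, BUSH_BUCHANAN)
gore_fraction = GORE_BUCHANAN / (GORE_BUCHANAN + BUSH_BUCHANAN)

print(f"Net Gore gain under this assumption: {net:,}")
print(f"Gore's share of these double punches: {gore_fraction:.1%}")
print(f"Exceeds the official margin: {net > OFFICIAL_MARGIN}")
```

Even if only a fraction of these voters meant what this sketch assumes, the net effect is comparable in size to the official margin.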

There were about 17,500 ballots in Palm Beach County which recorded more than one vote for President. Buchanan drew a negligible number of votes in Florida, although he drew more votes in Palm Beach County (whose demographics did not favor his candidacy) than he did in any other Florida county: 3,407 votes (0.79%) in Palm Beach County versus 1,013 in Pinellas County. Buchanan did poll a higher percentage in some counties, but those were small counties; the largest number of votes involved was 108 in Suwannee County, for 0.87%. Buchanan polled less than 0.3% statewide. Even clearer is a graph that I saw of Buchanan votes versus Reform Party registration for the Florida counties. The graph forms a straight line with about 20% noise, except for Palm Beach County, which is off by several times the best-fit value. A careful statistical analysis by Greg Thorson comes to the same conclusion: over 2,000 Buchanan votes were not intended for Buchanan.

It is clear that the layout of the Palm Beach County ballot caused a systematic effect in the number of valid votes cast, and that effect was thousands of votes—about one part in 1,000 of the total votes cast in Florida. Palm Beach County gained a lot of publicity because of the anomalous showing that Pat Buchanan made there, but other counties had similar problems. From the same CNN article:

> Eighteen other counties used another confusing ballot design known as the "caterpillar" or "broken" ballot, where six or seven presidential candidates are listed in one column and the names of the remaining minor party candidates appeared at the top of a second one. According to the study, more than 15,000 people who voted for either Gore or Bush also selected one candidate in the second column, apparently thinking the second column represented a new race.

Had many of these voters not marked a minor candidate in the second column, Gore would have netted thousands of additional votes as compared with Bush.

So, the vagaries of ballot layout clearly are a part of the system. Note very specifically that these mismarked ballots "were clearly invalid under any interpretation of the law." No one has charged that these ballots were intentionally laid out in such a manner as to affect the outcome of the election, but this type of systematic effect is part of our election system. In this particular race, the chips fell in such a way as to change the outcome.

Still, even when the ballots are laid out "well", there are differences between the performance of different voting equipment. In the most extreme case, some counties in Florida used optical-scan equipment which immediately rejected ballots with two votes in the same race and gave the voters the opportunity to correct their ballots. In other counties, such spoiled ballots were accepted at the polling place, but were later ignored when the ballots were counted.

The equipment is not distributed randomly. Urban counties tend to use punch-card equipment, while rural counties tend to use the optical-scan equipment. Gore polled about 864,000 more votes in punch-card counties than in optical-scan counties. Bush polled only about 463,000 more votes in punch-card counties than in optical-scan counties. Since the "overvote" rate was about 1.4% greater with the punch-card system than with the optical-scan system, Gore lost about 2,800 more votes than Bush (assuming that the errors happened in proportion to the votes successfully cast for each candidate). The greater part of that difference in error rate was caused by a few counties with butterfly or caterpillar ballots. After removing these counties, the "background" error rate is still over 0.5% higher for punch-card counties. So, even the background systematic effects due to the difference in voting equipment in urban versus rural areas were still greater than the uncertainty in the tabulation of the valid votes.
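The 2,800-vote figure can be reproduced with a short Python sketch. This is one reading of the arithmetic, not necessarily the calculation originally used. It assumes that the two candidates' statewide totals were essentially equal and that every vote was cast in either a punch-card or an optical-scan county. Under those assumptions, Gore's lead within the punch-card counties is half the difference between the two quoted figures. The function names are illustrative.

```python
# Figures quoted in the text
GORE_PUNCH_MINUS_OPTICAL = 864_000   # Gore: punch-card votes minus optical-scan votes
BUSH_PUNCH_MINUS_OPTICAL = 463_000   # Bush: punch-card votes minus optical-scan votes
EXTRA_OVERVOTE_RATE = 0.014          # extra overvote rate in punch-card counties
BACKGROUND_EXTRA_RATE = 0.005        # extra rate without butterfly/caterpillar counties


def gore_lead_in_punch_counties(gore_diff, bush_diff):
    """Gore's lead over Bush within the punch-card counties.

    ASSUMPTIONS: statewide totals are equal (G_p + G_o = B_p + B_o) and only
    two equipment types exist. Then (G_p - B_p) = -(G_o - B_o), and
    (G_p - G_o) - (B_p - B_o) = 2 * (G_p - B_p).
    """
    return (gore_diff - bush_diff) / 2


def net_votes_lost(lead, extra_rate):
    """Net votes lost by the leading candidate when errors occur in
    proportion to the votes cast for each candidate."""
    return lead * extra_rate


lead = gore_lead_in_punch_counties(GORE_PUNCH_MINUS_OPTICAL, BUSH_PUNCH_MINUS_OPTICAL)
lost_total = net_votes_lost(lead, EXTRA_OVERVOTE_RATE)
lost_background = net_votes_lost(lead, BACKGROUND_EXTRA_RATE)

print(f"Gore's lead in punch-card counties: about {lead:,.0f}")
print(f"Net Gore loss at 1.4% extra overvotes: about {lost_total:,.0f}")
print(f"Net Gore loss at 0.5% background rate: about {lost_background:,.0f}")
```

With these inputs the background effect alone comes to roughly 1,000 votes, which is larger than both the 537-vote margin and the few-hundred-vote uncertainty in the tally.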

There are other systematic effects which exist in our system. Certainly it is easier for some voters to get to the polls than for others. There have been specific allegations, for example, that the way that voting locations are chosen in minority areas is less favorable than the way that voting locations are chosen in affluent areas. These systematic effects, however, do not show up in the ballots themselves.

Systematic effects decide the outcome of elections all the time. Some of these systematic effects are deliberately caused to influence the outcome of elections. In our country, we work especially hard to eliminate most of these effects (such as unequal minority access to polls), while tolerating others within limits (such as the political process of drawing boundaries for Congressional districts).

In this race, we had a very unusual opportunity to see the outcome of a Presidential contest determined by systematic effects that we do not normally acknowledge to be a part of our process. Without butterflies and caterpillars and centralized tabulation in the right counties, we would have had a different outcome. But even if these effects were absent, I still would not assert that the outcome was a measurement of our choice for President. An election is a measurement of the votes and nothing else.