The Jury is Out on the Correctness of Democratic Decisions

18 Mar

Majority preferences were seen by Rousseau (The Social Contract) as an expression of general will. With Condorcet, the general will has also been imbued with the notion of correctness. As contradictory evidence is ample, it is time to remove the false comfort of any epistemic benefit accruing from such aggregation.

The Condorcet Jury Theorem (CJT), originally postulated by the Marquis de Condorcet and formalized by Duncan Black, runs as follows:

If,

  1. The jury has to decide between two options using simple majority
  2. Each juror’s probability of being correct is greater than half (~ competence)
  3. Each juror has an equal probability of being correct (~ homogeneity)
  4. Each juror votes independently (~ independence)

Then,

  1. Any jury with an odd number of jurors is more likely to arrive at the correct answer than any single juror
  2. As n, the number of jurors, increases, the probability of arriving at the correct answer approaches 1
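The two conclusions can be checked numerically. Below is a minimal sketch, assuming the theorem's own conditions (two options, independent jurors, a common competence p); the function name majority_correct and the value p = 0.55 are illustrative choices, not something taken from Condorcet or Kaniovski.

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability that a simple majority of n independent jurors, each
    correct with probability p, picks the correct option (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd to rule out ties")
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

# Illustrative competence, only slightly better than a coin flip.
for n in (1, 3, 11, 101, 1001):
    print(n, round(majority_correct(n, 0.55), 4))
```

With p above half, the printed probabilities rise with n. Setting p below half in the same function makes them fall instead, which is the flip side that the rest of the argument leans on.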

The above summary is paraphrased from ‘Aggregation of Correlated Votes and CJT’ by Serguei Kaniovski.

There have been multiple attempts at generalizing Condorcet, mostly by showing that violations of one or more of the assumptions don’t automatically doom the possibility of achieving an ‘epistemically’ superior outcome. One of the generalizations, offered by Christian List and Bob Goodin, is that the result still holds if people are offered k options and are more likely to vote for the correct option than for any single wrong option (which implies a higher than 1/k chance of being correct).

Suppose there are k options and that each voter/juror has independent probabilities p1, p2, …, pk of voting for options 1, 2, …, k, respectively, where the probability, pi, of voting for the “correct” outcome, i, exceeds each of the probabilities, pj, of voting for any of the “wrong” outcomes, j ≠ i. Then the “correct” option is more likely than any other option to be the plurality winner. As the number of voters/jurors tends to infinity, the probability of the “correct” option being the plurality winner converges to 1.
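A small simulation makes the plurality claim concrete. This is a sketch under stated assumptions: option 0 is arbitrarily labeled the “correct” one, the vote probabilities (0.4, 0.3, 0.3) are made up for illustration, and ties are counted as misses. The function name plurality_correct_rate is hypothetical.

```python
import random
from collections import Counter

def plurality_correct_rate(probs, n_voters, trials=1000, seed=0):
    """Estimate how often option 0 (taken to be the 'correct' one) is the
    strict plurality winner when each voter independently picks option i
    with probability probs[i]."""
    rng = random.Random(seed)
    options = list(range(len(probs)))
    wins = 0
    for _ in range(trials):
        votes = Counter(rng.choices(options, weights=probs, k=n_voters))
        top = max(votes.values())
        leaders = [o for o, c in votes.items() if c == top]
        if leaders == [0]:  # a tie for first place counts as a miss
            wins += 1
    return wins / trials

# Illustrative probabilities: the correct option is favored, but by no majority.
for n in (11, 101, 501):
    print(n, plurality_correct_rate([0.4, 0.3, 0.3], n))
```

Even though no voter is more than 40% likely to pick the correct option, the estimated win rate climbs toward 1 as the electorate grows, provided the independence assumption holds.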

Other generalizations deal with cases where independence doesn’t hold or where competence is unevenly distributed.

One way to summarize the theorems is that the math works only to the extent that the assumptions hold. When we transpose CJT to democracy, the assumptions are at best poorly realized, and at worst inapplicable.

Applying Condorcet’s Jury theorem to Electoral Democracy

To apply CJT to democracy, we must treat the citizenry as a jury and the decision task in front of it as choosing the “right” party or candidate.

The word jury is saddled with association with courts in the American context, and it is important to disambiguate how the citizenry differs from a jury of citizens summoned by the court. Disambiguation will allow us to cover key issues that affect the epistemic utility of any “aggregations” of human beings.

In the court system, a jury is randomly (~ within certain guidelines) selected from the community. Its members are generally subject to a battery of voir dire questions meant to assess their independence, lack of conflict of interest, biases, etc. The jury is sworn to render a “rational” and “impartial” verdict. It is instructed in the applicable law, including evidentiary law. Members of the jury are asked not to learn about the case from any source other than what is presented within the court, which itself is subject to reasonably stringent evidentiary guidelines. The jury is also guarded against undue influence, for example, bribes by interested parties. Finally, the jury is made to at least sit through extensive presentations from ‘both sides,’ and their rebuttals, and is generally asked to deliberate over the evidence among what is generally a ‘diverse’ pool before reaching a verdict.

On the other hand, the citizenry that comes to vote is a self-selected sample (roughly half of the total body). It is highly and admissibly ‘non-independent’ in how it looks at the evidence, generally sworn to ‘parties’, unconstrained by law in what evidence to look at and how to look at it, and generally extensively manipulated by interested ‘parties’. It is rarely informed about the ‘basis’, rarely arrives at a decision after learning about the arguments of ‘both sides’, and rarely ever deliberates.

The comparison provides a rough template for arguing against positive comparisons between the epistemic competence of juries and that of the citizenry. However, Condorcet’s argument is a bit different, though many of the above lessons apply, and hinges on the enormous n in a democracy. The only other assumption that one then needs is that each juror has a better than ½ chance of getting it right, or some variation thereof. Contentions against Condorcet can come from two sources: theorization of the sources and extent of violations of the assumptions (for example, independence and competence), and inapplicability due to incongruence. The various contentions emerging from the two sources are covered below (in no particular order).

Rational Voting, Sincere Voting

This is one of the weaker cases against applying Condorcet, mostly because the counterargument imagines a ‘rational’ voter. The argument still deserves some attention, mostly because of its salience in the political science literature. One of the axioms of political science, since Downs, has been that information acquisition is costly. It follows that as the decision-making body becomes larger, and as the chance of being a ‘pivotal voter’ goes down, the incentives to shirk (free-ride) increase.
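How quickly the chance of being pivotal shrinks can be sketched with a simple model. The assumptions here are mine, for illustration only: n is odd, and each of the other n - 1 voters independently backs one of two options with probability p. A voter is pivotal only when those other voters split exactly evenly. The function name pivotal_probability is hypothetical.

```python
from math import exp, lgamma, log

def pivotal_probability(n: int, p: float) -> float:
    """Chance that the other n - 1 voters split exactly evenly, so that one
    more vote decides the outcome. Assumes n is odd and that every other
    voter independently backs the same option with probability p."""
    if n % 2 == 0:
        raise ValueError("n must be odd so the other n - 1 voters can tie")
    m = (n - 1) // 2
    # log of C(2m, m) * p^m * (1 - p)^m, kept in logs to avoid overflow
    log_prob = lgamma(2 * m + 1) - 2 * lgamma(m + 1) + m * (log(p) + log(1 - p))
    return exp(log_prob)

for n in (11, 1001, 100001):
    print(n, pivotal_probability(n, 0.5), pivotal_probability(n, 0.55))
```

Even in the knife-edge case of p = 0.5 the probability decays roughly like one over the square root of n, and with any lean in the electorate it collapses far faster, which is what drives the incentive to free-ride on information.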

Austen-Smith and Banks, among others, have shown that ‘sincere voting’ (voting for the best choice based on one’s information signal) is not equilibrium behavior, as a rational voter votes based not only on the signal but also on the chance of being pivotal. Feddersen and Pesendorfer (1998, APSR) take the claim (the perils of strategic voting) to its logical extreme and apply it to the ‘unanimity rule’ (not majority rule, though similar, less stark contentions apply there, as they note). They show that as jury size increases, the probability of convicting the innocent increases.

Extreme Non-independence

Given that p > half (jurors performing better than random) is a ‘reasonably high’ threshold, especially in circumstances of misinformation, problems can arise quickly.

In the current state, about 90% of the voters exhibit high forms of non-independence emerging from apathy and partisanship. It also stands to reason that a reduction in either one will lead to a higher probability of the citizenry choosing the ‘better choice’ on offer, and arguably to better choices on offer. Partisanship also means that people have different utilities that they intend to maximize. The other 10% err on the side of manipulation.

What’s on the Menu?

To the extent there are two inferior choices to choose from, one can imagine that in the best case the polity will choose the slightly better of the two. Condorcet offers no comfort about what kind of choices are on offer, which is perhaps the central concern of any normative conception of democracy. In fact, the quality of choices on offer (‘correctness of choices’) is likely to be a function of the probability with which a body politic knows about the ‘optimally correct’ choice, and the probability that it chooses the ‘optimally correct’ choice (which is likely to be collinear with the odds of picking the ‘better choice’).

Policy Choices

Policy choices are an array of infinite counterfactuals. Choosing the ‘most correct’ one would need a population informed enough to disinter the right choice with a higher probability than any other, wrong choice. Given infinite choices, the bar set for each citizen is very high. The chances of the current citizenry crossing that bar are non-existent.

3 or More Choices

The manipulability of systems offering more than two choices is well documented and filed variously under Condorcet’s Paradox and Arrow’s impossibility theorem. Much work has been done to show that the propensity for cycles in a democracy is not great (for example, Gerry Mackie, ‘Democracy Defended’). One contention, however, remains unanswered for the binary choice version: American democracy often reduces larger sets into two options. One can imagine that the preference order for citizens will depend on unoffered choices. Depending on how multiple choices are reduced to two, one can think of ways ‘cycling’ can work even in the offered binary choices (David Austen-Smith). More succinctly, all binary decisions in democratic politics can be thought to come from larger option sets, and hence the threat of cycling is omnipresent.

Correct Decisions

CJT is a trivial result from probability. Applied to voting with two choices, it says only that an electorate is most likely to arrive at the choice that each of its members is more likely to make. The probability of arriving at that choice approaches 1 as n increases.

If we assume that electoral democracy is a competition between interests, then we just get majoritarian opinions, not ‘correct’ answers. That is, there is no common utility function but a different set of utilities for different groups, so people look at a common information signal and split based on their group interests. In that case, the ‘correctness’ of the decision really reduces to the ‘winning’ decision.

Median Voter: Condorcet in Reverse

Applying Condorcet to democracy is in many ways applying things in reverse. We know that politicians create policies that appeal to the ‘median voter’ (not to be confused with the median citizen, or anything to do with ‘correctness’). Politicians work to cobble together a ‘majority’ such that the probability of the majority picking them is the greatest. Significantly, policy preferences that can be sold to the majority carry no claim to correctness of the kind CJT makes. Another important conclusion that can be drawn from the above is that since the options on offer can manipulate the population, it is likely that the errors are not random.

Democratic Errors Don’t Cancel

Benjamin Page and Bob Shapiro, in ‘The Rational Public,’ argue that one of the benefits of aggregation is that errors cancel out. Errors may be seen to cancel if they are ‘random’, but if they are systematic, and strongly predicted by sociodemographics, they are likely to have political consequences. For example, we know that such ‘errors’ will reduce the likelihood of certain constituencies making a demand, or coalescing to raise political demands in line with their interests.

Formation of Preferences, Aggregation of Preferences

Applying CJT to democracy, we can roughly assume that preferences emerge from available data. Assuming people have the perfect lens to the hazy data, the “probability that the correct alternative will win under majority voting converges to the probability that the body of evidence is not misleading.” (Franz Dietrich and Christian List, ‘A Model of Jury Decisions Where All Jurors Have the Same Evidence’)
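The ceiling imposed by shared evidence can be sketched with a stripped-down model. This is my simplification for illustration, not Dietrich and List's own formalization: the common body of evidence points to the truth with probability p_evidence_ok, and each juror independently reads the evidence the way it points with probability p_read. The majority is then right either when good evidence is followed or when misleading evidence is misread.

```python
from math import comb

def majority_follows_evidence(n: int, p_read: float) -> float:
    """Probability that a majority of n jurors (n odd) votes the way the
    shared evidence points, each doing so independently with p_read."""
    need = n // 2 + 1
    return sum(comb(n, k) * p_read**k * (1 - p_read)**(n - k)
               for k in range(need, n + 1))

def majority_correct_shared_evidence(n: int, p_read: float,
                                     p_evidence_ok: float) -> float:
    """Majority is right if the evidence is sound and followed, or if the
    evidence is misleading and the majority happens to go against it."""
    follows = majority_follows_evidence(n, p_read)
    return p_evidence_ok * follows + (1 - p_evidence_ok) * (1 - follows)

# Illustrative values: decent readers, evidence sound 80% of the time.
for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_shared_evidence(n, 0.7, 0.8), 4))
```

Under these assumptions the probability of a correct majority levels off at p_evidence_ok rather than at 1, however large n gets.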

Even the probability calculated this way is optimistic, as we know that the evidence isn’t the same for all jurors, and the lens of most jurors is foggy. Still, it is a good start for thinking about what data is available to the jurors (citizens), how they use it, and what the consequences are of different distributions of information and ‘analytic lenses’.

Letting the Experts Speak

If our interest is limited to getting the ‘correct outcome,’ then we ought to do better (in terms of likelihood of arriving at the correct decision) by polling people with higher probabilities of getting it right. We will also save on resources. Another version of the idea would be to do a weighted poll, with weights that increase with the probability of being correct. The optimal strategy is to have weights proportional to log[p(correct)/p(incorrect)]. (Nitzan and Paroush, 1982; Shapley and Grofman, 1984)
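The gain from weighting can be seen in a three-person example. This is a sketch with made-up competences (one expert at 0.9, two others at 0.6), assuming independent votes; accuracy is a hypothetical helper that enumerates every pattern of right and wrong votes and counts ties as wrong.

```python
from itertools import product
from math import log

def accuracy(competences, weights):
    """Exact probability that the weighted vote is correct when juror i is
    independently right with probability competences[i]."""
    total = 0.0
    for outcome in product((True, False), repeat=len(competences)):
        prob = 1.0   # probability of this pattern of right/wrong votes
        score = 0.0  # weighted margin in favor of the correct option
        for right, p, w in zip(outcome, competences, weights):
            prob *= p if right else 1 - p
            score += w if right else -w
        if score > 0:  # a zero margin (tie) is counted as a wrong decision
            total += prob
    return total

competences = [0.9, 0.6, 0.6]  # illustrative values only
equal_weights = [1.0] * len(competences)
log_odds_weights = [log(p / (1 - p)) for p in competences]

print("simple majority:", accuracy(competences, equal_weights))
print("log-odds weights:", accuracy(competences, log_odds_weights))
```

In this example the log-odds weights let the expert outvote the other two combined, so the weighted poll is as accurate as the expert alone and beats simple majority.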

This isn’t so much a contention as a prelude to the following conclusion: any serious engagement with epistemic worthiness as a prime motive in governance will probably mean serious adjustments to the shape and nature of democracy, and in all likelihood the abandonment of mass democracy.

60% is Different from 51%

The key consideration in CJT is choosing the ‘right’ option from the two on offer. Under this system, 51% doesn’t differ from 60% or 90%, for all yield the same ‘right choice.’ Politics works differently. Presidents tout their ‘mandates,’ and base their policy agendas on them. The House and Senate have a slew of procedural and legislative rules that buckle under larger numbers. Thinking about the House and Senate brings new complications, and here’s why: while the election of each member may be justified by CJT, the benefit produced by elected representatives needs another round of aggregation, without some of the large n benefits of mass democracy. Here again, we may note, as McCarty and Poole have relentlessly shown, that the ‘jury’ is extremely ‘non-independent,’ prone to systematic biases, etc. In addition, choice is no longer limited to two, though each choice task can be broken down into a series of Boolean decisions (arriving at the ‘right decision’ in this kind of linear aggregation over the choice spectrum will follow a complex function of p(correct choice) for each binary decision).

Summary

Conjectures about the epistemic utility of electoral democracy are particularly rife with problems when seen through the lens of Condorcet. This isn’t to say that no such benefits exist but that alternative frameworks are needed to understand those benefits.