The paradox(es) of Condorcet’s jury theorem when applied to democracy

18 Mar

Majority preferences were seen by Rousseau (The Social Contract) as an expression of the general will. With Condorcet, the ‘general will’ has also been imbued with the notion of ‘correctness’. As contradictory evidence is ample, it is time to abandon the false comfort of any epistemic benefit accruing from such aggregation.

The Condorcet Jury Theorem, formalized by Duncan Black from the elliptical essays of the Marquis de Condorcet, runs roughly as follows –

If –

  1. A jury has to decide between two options using simple majority rule
  2. Each juror’s probability of being correct is greater than one half (~ competence)
  3. Each juror has an equal probability of being correct (~ homogeneity)
  4. Each juror votes independently (~ independence)

Then –

  1. Any jury with an odd number of jurors is more likely to arrive at the correct answer than any single juror
  2. As n increases, the probability of arriving at the correct answer approaches 1

(The above summary of the key points is paraphrased from ‘Aggregation of Correlated Votes and CJT’ by Serguei Kaniovski.)
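
A minimal numerical sketch of the two conclusions, assuming independent jurors who are each correct with probability 0.55 (an illustrative number, not one from the text): the exact probability that a simple majority of an odd-sized jury is correct rises toward 1 as n grows.

```python
# Exact probability that a simple majority of n independent jurors is correct,
# each juror being correct with probability p. Values are illustrative.
from math import comb

def majority_correct(n: int, p: float) -> float:
    """P(at least (n + 1) / 2 of n independent jurors are correct), n odd."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101, 1001):
    print(f"n={n:5d}  p=0.55  P(majority correct)={majority_correct(n, 0.55):.4f}")
```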

There have been multiple attempts at ‘generalizing’ Condorcet, mostly by showing that violations of one or more of the assumptions don’t automatically doom the possibility of achieving an ‘epistemically’ superior outcome. One of the generalizations, offered by Christian List and Bob Goodin, is that the result still holds if people face k options and each person is more likely to pick the correct option than any particular wrong one.

Suppose there are k options and that each voter/juror has independent probabilities p1, p2, …, pk of voting for options 1, 2, …, k, respectively, where the probability, pi, of voting for the “correct” outcome, i, exceeds each of the probabilities, pj, of voting for any of the “wrong” outcomes, j ≠ i. Then the “correct” option is more likely than any other option to be the plurality winner. As the number of voters/jurors tends to infinity, the probability of the “correct” option being the plurality winner converges to 1.
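
A rough Monte Carlo sketch of that plurality claim, using invented probabilities: with four options where the ‘correct’ one attracts each vote with probability 0.4 and the others less, the share of trials in which the correct option is the plurality winner climbs toward 1 as the electorate grows.

```python
# Monte Carlo sketch of the List-Goodin plurality generalization.
# The vote-share probabilities below are illustrative assumptions.
import random
from collections import Counter

def plurality_correct_rate(n_voters, probs, correct=0, trials=2000, seed=7):
    rng = random.Random(seed)
    options = list(range(len(probs)))
    wins = 0
    for _ in range(trials):
        votes = Counter(rng.choices(options, weights=probs, k=n_voters))
        winner, _ = votes.most_common(1)[0]        # ties resolved arbitrarily
        wins += (winner == correct)
    return wins / trials

probs = [0.4, 0.3, 0.2, 0.1]                       # option 0 is the "correct" one
for n in (11, 101, 1001):
    print(f"n={n:5d}  P(correct option is plurality winner) ~ {plurality_correct_rate(n, probs):.3f}")
```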

Other ‘generalizations’ outline modified versions of the theorem for cases where independence doesn’t hold, where competence is unevenly distributed, and so on.

One way to summarize these theorems is that the math works to the extent the assumptions hold. The assumptions are at best poorly realized, and at worst inapplicable, when we transpose CJT to democracy.

Problems of Applying Condorcet’s Jury Theorem to Electoral Democracy

Introduction

To apply CJT to democracy, we must treat the citizenry as a jury, and the decision task in front of it as choosing the “right” party or candidate.

The word ‘jury’ is saddled with courtroom associations in the current American context, and it is important to spell out how the citizenry differs from the ‘jury’ of citizens summoned by a court, for the contrast covers the central issues that affect the epistemic utility of any “aggregation” of human beings. In the court system, a jury is selected (randomly, within certain guidelines) from the community; generally subjected to a battery of ‘voir dire’ questions meant to assess independence, conflicts of interest, and bias; sworn to render a “rational” and “impartial” verdict; instructed in the applicable law (including evidentiary law); asked not to learn about the case from any source other than what is presented in court (which is itself subject to reasonably stringent evidentiary guidelines); guarded from undue influence (for example, bribes by interested parties); made to sit through extensive presentations from ‘both sides’, and their rebuttals; and generally asked to deliberate over the evidence (within what is usually a ‘diverse’ pool) before reaching a verdict. The citizenry that turns out to vote, by contrast, is a self-selected sample (roughly half of the eligible body); highly and admittedly ‘non-independent’ in how it looks at the evidence; often sworn to ‘parties’; unconstrained by law in what evidence it considers and how it considers it; extensively manipulated by interested ‘parties’; rarely informed about the basis for its choices; rarely arriving at a decision after hearing arguments from ‘both sides’; and rarely deliberating at all.

The comparison provides a rough template for arguing against optimistic comparisons between the epistemic competence of juries and that of the citizenry. Condorcet’s argument, however, is a bit different (though many of the above lessons apply): it hinges on the enormous n in a democracy. The only other assumption one then needs is that each juror has a better than ½ chance of getting it right, or some variation thereof. The central contentions against Condorcet come from two sources: theorizing the sources and extent of violations of the assumptions (for example, independence and competence), and inapplicability due to incongruence between the jury setting and the democratic one. The various contentions, emerging from these two sources, are covered below in no particular order.

Rational voting, Sincere voting

While this is one of the weaker cases against applying Condorcet, mostly because the counterargument imagines a ‘rational’ voter, it deserves some attention because of its salience in the political science literature. One of the axioms of political science since Downs has been that information acquisition is costly. It follows that as the decision-making body becomes larger and the chance of being the ‘pivotal voter’ shrinks, the incentive to shirk (free-ride) on acquiring information increases.
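
To give a rough sense of how quickly pivotality vanishes, here is a back-of-the-envelope sketch, assuming (purely for illustration) that the other voters split 50/50 at random: the probability that a single vote decides the outcome falls roughly like 1/sqrt(n).

```python
# Probability that the other voters tie exactly, so that one vote is decisive.
# Computed in log space to stay stable for very large electorates.
from math import exp, lgamma, log

def prob_pivotal(n_others: int, p: float = 0.5) -> float:
    """P(an even number of other voters splits exactly in half)."""
    half = n_others // 2
    log_binom = lgamma(n_others + 1) - lgamma(half + 1) - lgamma(n_others - half + 1)
    return exp(log_binom + half * log(p) + (n_others - half) * log(1 - p))

for n_others in (10, 100, 10_000, 1_000_000):
    print(f"other voters={n_others:>9,}  P(pivotal) ~ {prob_pivotal(n_others):.2e}")
```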

Austen-Smith and Banks, among others, have shown that ‘sincere voting’ (voting for the best choice given one’s information signal) is not equilibrium behavior, because a rational voter conditions not only on the signal but also on the chance of being pivotal. Feddersen and Pesendorfer (1998, APSR), taking the perils of strategic voting to their logical extreme and applying them to the unanimity rule (not majority rule, though similar if less stark contentions apply, as they note), show that as jury size increases, the probability of convicting an innocent defendant can increase.

Extreme Non-independence

Given that p > ½ is a reasonably demanding threshold (jurors must perform better than random), problems can arise quickly, especially in circumstances of misinformation: if competence slips even slightly below one half, the same aggregation logic drives the majority toward the wrong answer as n grows.

In the current state, roughly 90% of voters exhibit strong forms of non-independence rooted in apathy and partisanship. (It also follows that a reduction in either should raise the probability of the citizenry choosing the ‘better choice’ on offer, and arguably of better choices being on offer.) Partisanship also means that people have different utilities they intend to maximize. The remaining 10% err, on average, on the side of manipulation.
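
A toy model of non-independence (all parameters are assumptions invented for illustration): suppose each voter follows a shared partisan or media signal with probability 0.7 and otherwise judges independently with competence 0.55, and the shared signal is right only 60% of the time. Majority accuracy then plateaus near the signal’s own reliability instead of approaching 1.

```python
# Herding on a shared signal caps the benefit of aggregation.
# All parameters below are illustrative assumptions.
import random

def majority_accuracy(n, trials=2000, p=0.55, follow=0.7, signal_ok=0.6, seed=3):
    rng = random.Random(seed)
    correct_majorities = 0
    for _ in range(trials):
        signal_is_correct = rng.random() < signal_ok   # one shared signal per "election"
        correct_votes = 0
        for _ in range(n):
            if rng.random() < follow:                  # herd on the shared signal
                correct_votes += signal_is_correct
            else:                                      # independent judgement
                correct_votes += rng.random() < p
        correct_majorities += correct_votes > n / 2
    return correct_majorities / trials

for n in (11, 101, 1001):
    print(f"n={n:5d}  P(majority correct) ~ {majority_accuracy(n):.3f}")
```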

The flawed choice task

To the extent there are two inferior choices on offer, the best one can hope for is that the polity picks the slightly better of the two. Condorcet offers no comfort about what kind of choices are on offer, which is perhaps the central and pivotal concern of any normative conception of democracy. In fact, the quality of the choices on offer (the ‘correctness’ of the choices) is likely a function of the probability with which the body politic knows about the ‘optimal correct choice’ and the probability that it would choose it (which is likely collinear with the odds of picking the ‘better choice’ of the two on offer).

Policy Choices

Policy choices are an array of effectively infinite counterfactuals. To choose the ‘most correct’ one would require a population informed enough to disinter the right choice with a higher probability than any wrong choice. Given so many options, the bar set for each citizen is very high, and the chances of the citizenry as currently constituted crossing that bar are close to non-existent.

The well-known paradox of three or more choices

The manipulability of systems offering more than two choices is well documented, filed alternately under Condorcet’s paradox and Arrow’s impossibility theorem. Much work has been done to show that the propensity for cycles in actual democracies is not great (for example, Gerry Mackie, ‘Democracy Defended’). One contention, however, remains unanswered for the binary-choice version: American democracy often reduces larger option sets to two choices. Citizens’ preference orderings will depend on the unoffered choices, and depending on how the larger set is reduced to two, ‘cycling’ can reappear even within the offered binary choice (David Austen-Smith). More succinctly, all binary decisions in democratic politics can be thought of as coming from larger option sets, and the threat of cycling is hence omnipresent.
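
A minimal worked example of the cycling worry, using the textbook three-voter profile (invented here for illustration): every option loses some head-to-head contest, so the outcome of a ‘binary’ election depends entirely on which pairing is put on the ballot.

```python
# Classic Condorcet cycle: pairwise majorities produce A > B, B > C, C > A.
from itertools import combinations

profiles = [("A", "B", "C"),   # voter 1: A > B > C
            ("B", "C", "A"),   # voter 2: B > C > A
            ("C", "A", "B")]   # voter 3: C > A > B

def pairwise_winner(x, y):
    """Option preferred by a majority when only x and y are on the ballot."""
    x_votes = sum(ranking.index(x) < ranking.index(y) for ranking in profiles)
    return x if x_votes > len(profiles) / 2 else y

for x, y in combinations("ABC", 2):
    print(f"{x} vs {y}: majority prefers {pairwise_winner(x, y)}")
```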

‘Correct decisions’

CJT, a trivial result from probability, when applied to voting over two choices says only this: an electorate is most likely to arrive at whichever choice each of its members is individually more likely to make, and the probability of doing so approaches 1 as n increases.

If we assume that electoral democracy is a competition between interests, then aggregation yields majoritarian opinions, not ‘correct’ answers. There is no common utility function but different sets of utilities for different groups, so people look at a common information signal and split along their group interests. In that case, the ‘correctness’ of the decision reduces to whichever decision wins.

Median Voter – Condorcet in reverse

Applying Condorcet to democracy is in many ways applying things in reverse. Politicians craft policies that appeal to the ‘median voter’ (not to be confused with the median citizen, or with anything to do with ‘correctness’). They work to cobble together a ‘majority’ such that the probability of that majority picking them is greatest. Significantly, policy preferences that can be sold to a majority carry no claim to correctness of the kind made by CJT. Another important conclusion follows: since the options on offer can themselves manipulate the population, the errors are unlikely to be random.

Democratic errors don’t cancel

Benjamin Page and Bob Shapiro, in ‘The Rational Public’, argue that one of the benefits of aggregation is that errors cancel out. Errors can be seen to cancel if they are random, but if they are heteroskedastic and strongly predicted by sociodemographics, they are likely to have political consequences. For example, such errors may reduce the likelihood of certain constituencies making a demand at all, or coalescing to raise political demands in line with their interests.
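
A small sketch of why shared errors survive aggregation (all numbers are illustrative assumptions): purely idiosyncratic misperceptions average out as the sample grows, but a bias shared by a sociodemographic group shifts the aggregate no matter how many people are polled.

```python
# Idiosyncratic noise cancels in the aggregate; a shared group-level bias does not.
import random

def aggregate_perception(n, truth=50.0, biased_share=0.0, bias=-10.0, seed=11):
    rng = random.Random(seed)
    total = 0.0
    for i in range(n):
        noise = rng.gauss(0, 15)                       # idiosyncratic error
        shift = bias if i < biased_share * n else 0.0  # shared, group-level error
        total += truth + noise + shift
    return total / n

print("random errors only :", round(aggregate_perception(100_000), 2))
print("40% share one bias :", round(aggregate_perception(100_000, biased_share=0.4), 2))
```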

Formation of preferences, aggregation of preferences

Applying CJT to democracy, we can roughly posit that preferences emerge from the available data. Assuming people have a perfect lens onto that hazy data, the “probability that the correct alternative will win under majority voting converges to the probability that the body of evidence is not misleading.” (Franz Dietrich and Christian List, ‘A Model of Jury Decisions Where All Jurors Have the Same Evidence’)

Even the probability calculated this way is optimistic, since the evidence is not the same for all jurors and the lens of most jurors is foggy. Still, it is a good start for thinking about what data is available to the jurors (citizens), how they use it, and what the consequences of different distributions of information and ‘analytic lenses’ are.
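
A short exact calculation of the quoted convergence, under illustrative assumptions: a single shared body of evidence points to the truth with probability 0.7, and each juror independently reads it correctly with probability 0.8. As n grows, the probability that the majority is correct converges to 0.7, the probability that the evidence is not misleading, rather than to 1.

```python
# Shared-evidence sketch: condition on whether the common evidence is sound.
from math import comb

def p_majority_exceeds_half(n: int, p: float) -> float:
    """P(more than n/2 successes among n independent trials with success prob p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

def p_majority_correct(n, p_evidence_ok=0.7, p_read_right=0.8):
    # Sound evidence: a juror votes for the truth by reading it correctly.
    # Misleading evidence: a juror lands on the truth only by misreading it.
    return (p_evidence_ok * p_majority_exceeds_half(n, p_read_right)
            + (1 - p_evidence_ok) * p_majority_exceeds_half(n, 1 - p_read_right))

for n in (11, 101, 1001):
    print(f"n={n:5d}  P(majority correct) = {p_majority_correct(n):.4f}")
```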

Letting experts speak

If our interest is limited to getting the ‘correct outcome’, then we ought to do better (in terms of the likelihood of arriving at the correct decision) by polling only people with higher probabilities of getting it right, saving resources in the bargain. Another version of the idea is a weighted poll, with weights increasing in the probability of being correct; the optimal weights are proportional to log(p(correct)/p(incorrect)). (Nitzan and Paroush, 1982; Shapley and Grofman, 1984)
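
A small sketch of the weighted-expert idea, in the spirit of the Nitzan-Paroush and Shapley-Grofman weights cited above: five hypothetical experts with invented competences vote on a binary question, and one-person-one-vote majority is compared against a weighted majority with weights log(p / (1 − p)). The competences are assumptions made up for illustration.

```python
# Compare simple majority with log-odds-weighted majority for a small expert panel.
import random
from math import log

competences = [0.80, 0.70, 0.65, 0.60, 0.55]          # assumed, for illustration
weights = [log(p / (1 - p)) for p in competences]     # optimal log-odds weights

def simulate(trials=20_000, seed=5):
    rng = random.Random(seed)
    simple_hits = weighted_hits = 0
    for _ in range(trials):
        votes = [rng.random() < p for p in competences]   # True = voted for the truth
        simple_hits += sum(votes) > len(votes) / 2
        weight_for_truth = sum(w for w, v in zip(weights, votes) if v)
        weighted_hits += weight_for_truth > sum(weights) / 2
    return simple_hits / trials, weighted_hits / trials

simple, weighted = simulate()
print(f"simple majority : {simple:.3f}")
print(f"log-odds weights: {weighted:.3f}")
```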

This isn’t so much a contention as a prelude to the following conclusion: any serious engagement with epistemic worthiness as a prime motive in governance will probably mean serious adjustments to the shape and nature of democracy, and in all likelihood the abandonment of mass democracy.

When 60% differs from 51%

The key consideration in CJT is choosing the ‘right’ option of the two on offer. Under this framing, 51% doesn’t differ much from 60% or 90%, for all yield the same ‘right choice’. Politics works differently: presidents tout and base their policy agendas on ‘mandates’, and the House and Senate have a slew of procedural and legislative rules that buckle under larger numbers. Thinking about the House and Senate brings new complications: while the election of each member may be justified by CJT, the benefit produced by the elected representatives requires another round of aggregation, without some of the large-n benefits of mass democracy. Here again we may note, as McCarty and Poole have relentlessly shown, that this ‘jury’ is extremely ‘non-independent’ and prone to systematic biases. In addition, the choice is no longer limited to two options, though each choice task can be broken down into a series of Boolean decisions (arriving at the ‘right decision’ in this kind of linear aggregation over a choice spectrum will follow a complex function of p(correct choice) for each binary decision).
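
A toy calculation for that parenthetical point, under the simplifying (and admittedly strong) assumption that the stages are independent: if reaching the ‘right’ final outcome requires a chain of binary majority decisions, each individually correct with probability q, the chance that the whole chain ends in the right place decays like q raised to the number of stages.

```python
# Chain of binary decisions: per-stage accuracy q, a given number of stages,
# stages assumed independent purely for illustration.
for q in (0.95, 0.80, 0.60):
    for stages in (3, 10, 30):
        print(f"per-stage accuracy {q:.2f}, {stages:2d} stages -> "
              f"P(whole chain correct) ~ {q ** stages:.3f}")
```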

In Summary

Conjectures about the epistemic utility of electoral democracy are rife with problems when seen through the lens of Condorcet. This isn’t to say that no such benefits exist, but that alternative frameworks are needed to understand them.