Why were the polls so accurate?

The Quant. Interwebs have overflowed with joy since the election. Poll aggregating works. And so indeed does polling, though you won’t hear as much about it on the news, which is likely biased towards celebrity intellects rather than the hardworking many. But why were the polls so accurate?

One potential explanation: because they do some things badly. For instance, most fail at collecting “random samples” these days because of considerable nonresponse. The resulting nonresponse bias – if correlated with propensity to vote – may actually push up the accuracy of the vote choice means. There are a few ways to check this theory.

One way to check this hypothesis: were results from polls using likely voter screens different from those not using them? If not, why not? From the political science literature, we know that people who vote (not just those who say they vote) do vary a bit from those who do not vote, even on things like vote choice. For instance, there is a larger proportion of ‘independents’ among them.

Other kinds of evidence will take the form of failure to match population or other benchmarks. For instance, election polls would likely fare poorly when predicting how many people voted in each state, or when tallying up Spanish-language households or the number of registered voters. Another way of saying this is that the bias will vary by what parameter we estimate from these polling data.

So let me reframe the question – how do polls get election numbers right even when they undercount Spanish speakers? One explanation is positive correlation between selection into polling, and propensity to vote, which makes vote choice means much more reflective of what we will see come election day.
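A minimal simulation sketch of this argument follows. All rates in it (turnout, candidate preference among voters and nonvoters, response rates) are made-up assumptions chosen for illustration, not estimates from any poll. It compares three quantities: support for a candidate among actual voters, among all adults (what a perfect random sample of adults would recover), and among poll respondents when responding is correlated with voting.

```python
import random


def simulate_poll(n=200_000, seed=0):
    """Return (electorate_share, adult_share, poll_share) supporting candidate A.

    Every probability below is a hypothetical value picked for illustration.
    """
    rng = random.Random(seed)
    voters = voters_a = adults_a = respondents = respondents_a = 0
    for _ in range(n):
        votes = rng.random() < 0.60  # assumed turnout rate
        # Assumption: voters and nonvoters differ in candidate preference.
        prefers_a = rng.random() < (0.52 if votes else 0.40)
        # Assumption: voters are far more likely to answer a poll.
        responds = rng.random() < (0.15 if votes else 0.03)
        adults_a += prefers_a
        if votes:
            voters += 1
            voters_a += prefers_a
        if responds:
            respondents += 1
            respondents_a += prefers_a
    return voters_a / voters, adults_a / n, respondents_a / respondents


electorate, adults, poll = simulate_poll()
print(f"actual voters:          {electorate:.3f}")
print(f"all adults (random):    {adults:.3f}")
print(f"self-selected poll:     {poll:.3f}")
```

Under these assumed rates, the self-selected poll lands closer to the election result than a flawless random sample of adults would, even though the poll badly misrepresents the adult population.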

The other possible explanation for all this: post-stratification or other post hoc adjustments to the numbers, or innovations in how sampling is done (matching, stratification, etc.). Doing so uses additional knowledge about the population and can shrink standard errors and improve accuracy. One way to test for such non-randomness: overly tight confidence bounds. Many polls tend to do wonderfully on multiple uncorrelated variables – for instance, census region proportions, gender, etc. – something random samples cannot regularly produce.
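To make the adjustment concrete, here is a small sketch of post-stratification. The function name, the cell labels, and the toy data are all hypothetical. The idea is to compute the outcome mean within each cell and then weight the cell means by known population shares rather than by the shares that happened to turn up in the sample.

```python
from collections import defaultdict


def poststratify(sample, population_shares):
    """Weight within-cell means by known population shares.

    sample: iterable of (cell, outcome) pairs.
    population_shares: dict mapping cell -> population share (sums to 1).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, outcome in sample:
        totals[cell] += outcome
        counts[cell] += 1
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in cells: {sorted(missing)}")
    return sum(
        share * totals[cell] / counts[cell]
        for cell, share in population_shares.items()
    )


# Toy data: men are over-represented in the sample (4 of 6 respondents).
sample = [("men", 1), ("men", 1), ("men", 0), ("men", 0),
          ("women", 1), ("women", 1)]
raw_mean = sum(y for _, y in sample) / len(sample)
adjusted = poststratify(sample, {"men": 0.5, "women": 0.5})
print(raw_mean, adjusted)
```

In this toy case the raw mean is 4/6, while the adjusted estimate is 0.75 because the women's cell mean (1.0) gets its full population weight.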

A potential source of bias in estimating impact of televised campaign ads

One popular strategy for estimating the impact of televised campaign ads is to exploit ‘accidental spillover’ (see Huber and Arceneaux 2007). The identification strategy builds on the following facts: ads on local television can only be targeted at the DMA level, and DMAs sometimes span multiple states. Where DMAs span battleground and non-battleground states, ads targeted at residents of battleground states are also seen by those in non-battleground states. In short, people in non-battleground states are ‘inadvertently’ exposed to the ‘treatment’. The behavior and attitudes of the residents who were inadvertently exposed are then compared to those of other (unexposed) residents in those states. The benefit of this identification strategy is that it allows television ads to be decoupled from the ground campaign and other campaign activities, such as presidential visits (though people in the spillover region are exposed to television coverage of the visits). It also decouples ad exposure from strategic targeting of people based on the characteristics of the battleground DMA. There is evidence that the content, style, and volume of television ads are ‘context aware’: they vary depending on the DMA they run in. (After accounting for the cost of running ads in the DMA, some variation in volume and content across DMAs within states can be explained by the partisan profile of the DMA.)

By decoupling strategic targeting from message volume and content, we only get an estimate of the ‘treatment’ targeted dumbly. If one wants an estimate of the ‘strategic treatment’, quasi-experimental designs relying on accidental spillover may be inappropriate. How, then, to estimate the impact of strategically targeted televised campaign ads? First, estimate how ads are targeted depending on the characteristics of areas and people (political interest, for example, moderates the impact of political ads; see e.g. Ansolabehere and Iyengar 1995). Next, estimate the effect of messages using the Huber and Arceneaux strategy. Then re-weight the effect using the estimates of how the ads are targeted.
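One way to write down the re-weighting step, as a sketch under stated assumptions rather than a definitive estimator: suppose people are divided into strata k (for example, levels of political interest), the spillover design yields an effect estimate for each stratum, p_k is the share of stratum k among those inadvertently exposed, and w_k is the estimated share of strategically targeted exposure that falls on stratum k. All of this notation is introduced here for illustration.

```latex
\hat{\tau}_{\text{dumb}} = \sum_{k} p_k \,\hat{\tau}_k ,
\qquad
\hat{\tau}_{\text{strategic}} = \sum_{k} w_k \,\hat{\tau}_k ,
\qquad
\sum_{k} p_k = \sum_{k} w_k = 1 .
```

The two quantities differ only when effects vary across strata and targeting shifts exposure towards some strata (w_k differs from p_k).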

One can also try to estimate the effect of ‘strategy’ by comparing adjusted treatment effect estimates (adjusted by regressing out other campaign activity) in DMAs where the treatment was targeted with those in DMAs where it wasn’t.

Interviewer Assessments of Respondents’ Level of Political Information

In the National Election Studies (NES), interviewers have been asked to rate each respondent’s level of political information – “Respondent’s general level of information about politics and public affairs seemed – Very high, Fairly high, Average, Fairly low, Very low.” John Zaller, among others, has argued that these ratings measure political knowledge reasonably well. However, there is some evidence that challenges the claim. For instance, there is considerable unexplained inter- and intra-interviewer heterogeneity in ratings – people with similar levels of knowledge (as measured via closed-ended items) are rated very differently (Levendusky and Jackman 2003). It also appears that mean interviewer ratings have been rising over the years, compared to the relatively flat trend observed in more traditional measures (see Delli Carpini and Keeter 1996; Gilens, Vavreck, and Cohen 2004, etc.).

Part of the increase is explained by higher ratings of respondents with less than a college degree; ratings of respondents with a Bachelor’s degree or more have remained somewhat flatter. As a result, the difference in ratings between people with a Bachelor’s degree or more and those with less than a college degree is decreasing over time. Correlations between interviewer ratings and other criteria like political interest are also trending downward (though the decline is less sharp). This conflicts with evidence for an increasing ‘knowledge gap’ (Prior 2005).

The other notable trend is the sharp negative correlation (over .85 in magnitude) between the intercept and slope of within-year regressions of interviewer ratings on political interest, education, etc. This sharp negative correlation hints at possible ceiling effects. And indeed there is some evidence for that.

Interviewer Measure – The measure is sometimes from the pre-election wave only, other times in the post-election wave only, and still other times in both waves. Where both pre and post measures were available, they were averaged. The correlation between pre-election and post-election rating was .69. The average post-election ratings are lower than pre-election ratings.

The Worry about Anna (Hazare)

The following piece is in response to Arundhati Roy’s opinion published in The Hindu.

That Anna’s proposal for Lokpal is deeply flawed is inarguable. Whether Anna is also a bigoted RSS sympathizer, if not their agent, propelled by foreign money, as Roy would have us believe, is more in doubt. Since the debate about the latter point is rendered moot by the overwhelming support that Anna seems to enjoy, I focus on some important, though well-trodden and long understood, questions around corruption raised by Roy in her polemical screed.

Corruption is ubiquitous in India. Ration shops (considerable adulteration, the skim sold off), government employment schemes (ghost employees), admission to government schools (bribes must be paid to the principal), allocation of telecom and mining licenses (bribes paid for getting licenses for cheaper than what a fair auction would fetch), ultrasound clinics providing prenatal gender identification (bribes paid to police to keep these running) etc. are but a few examples of this widespread practice.

That corruption has serious negative consequences is also not in doubt. The poor get lower quality produce, if anything at all, as a result of corruption in ration shops. Ghost employees in government employment schemes mean that public money yields inadequate public goods (e.g. canals) and that some intended beneficiaries are denied benefits. Sex-selective abortions result from the continued operation of prenatal ultrasound clinics. And considerable loss in government revenue (which could be used to provide public goods) results from corruption in the granting of licenses.

On occasion, corruption may increase welfare of those most in need. For example, if some laws are arrayed against the poor, and if the poor can pay a nominal bribe to circumvent the law, corruption may benefit the poor. The overall impact of corruption on the poor is still likely to be heavily negative, if only because the loss to the public exchequer via the widely suspected significantly greater corruption among the rich is expected to be far greater. There also exists some empirical evidence to support the claim that corruption causes poverty (Gupta et al., 2002). However, an argument can be made to not enforce anti-corruption laws in some spheres, if successful attempts to amend the law that warrants circumvention can’t be mounted.

In all, the case for reducing corruption is strong. However, schemes for solving corruption by creating a bureaucracy to go after the corrupt may be upended by bureaucrats going rogue. Stories of the almost limitless power of a ‘Vigilance Commissioner’ to harass and extort are almost legend.

‘Who shall mind the minders?’ is one of the central questions in institutional design. The traditional solution to the problem has been to institute a system of checks and balances to supplement accountability via ‘free and fair’ elections (which themselves need a functioning institutional framework). The system only works within limits, though innovative institutional designs to solve the problem can be thought of. The only other fruitful directions for reducing corruption have been to increase transparency (via RTI, post-facto disclosures of all bids in an auction, etc.) and to increase automation (cutting out the middle men, keeping bids blind from the committee so as to prevent certain kinds of collusion, etc.) – something the government is slowly and unevenly (depending on vested interests) working towards.

Corruption in enforcement is harder to tackle. Agents sent to enforce pollution laws have been known to extort from factory owners by threatening to falsely implicate them using deliberately adulterated samples. There, automating testing and scrambling the identity of the source during analysis may prove useful.

Bibliography
Gupta, Sanjeev, Hamid Davoodi, and Rosa Alonso-Terme. 2002. Does corruption affect income inequality and poverty? Economics of Governance 3: 23–45.

Lawyers!

(Based on data from the 111th Congress)

Law is the most popular degree on Capitol Hill (as it has been for a long time) – nearly 52% of senators and 36% of representatives have a degree in law. There are some differences across parties and across chambers, with Republicans likelier to have a law degree than Democrats in the Senate (58% to 48%), and the reverse holding true in the House, where more Democrats have law degrees than Republicans (40% to 32%). Less than 10% of members of Congress have a degree in the natural sciences or engineering. Nearly 8% have a degree from Harvard, making Harvard’s the largest alumni contingent at the Capitol. Yale is a distant second with less than half the number that went to Harvard.

Does children’s gender cause partisanship?

More women identify themselves as Democrats than as Republicans. The disparity is yet greater among single women. It is possible (perhaps even likely) that this difference in partisan identification is due to (perceived) policy positions of Republicans and Democrats.

Now let’s do a thought experiment: Imagine a couple about to have a kid. Also assume that the couple doesn’t engage in sex-selection. Two things can happen – the couple can have a son or a daughter. It is possible that having a daughter persuades the parent to change his or her policy preferences towards a direction that is perceived as more congenial to women. It is also possible that having a son has the opposite impact – persuading parents to adopt more male congenial political preferences. Overall, it is possible that gender of the child makes a difference to parents’ policy preferences. With panel data one can identify both movements. With cross-sectional data, one can only identify the difference between those who had a son, and those who had a daughter.

Let’s test this using cross-sectional data from Jennings and Stoker’s “Study of Political Socialization: Parent-Child Pairs Based on Survey of Youth Panel and Their Offspring, 1997”.

Let’s assume that a couple’s partisan affiliation doesn’t impact the gender of their kid.

Number of kids, however, is determined by personal choice, which in turn may be impacted by ideology, income, etc. For example, it is likely that conservatives have more kids as they are less likely to believe in contraception, etc. This is also supported by the data. (Ideology is a post-treatment variable. This may not matter if impact of having a daughter is same in magnitude as impact of having a son, and if there are similar numbers of each across people.)

Hence one may conceptualize ‘treatment’ as gender of the kids, conditional on number of kids.

Understandably, we only study people who have one or more kids.

Conditional on number of kids, the more daughters a respondent has, the less likely the respondent is to identify as a Republican (b = -.342, p < .01) when the dependent variable is restricted to a dichotomous Republican/Democrat variable. The relationship holds – indeed becomes stronger – if the dependent variable is coded as an ordinal trichotomous variable (Republican, Independent, Democrat) and an ordered multinomial model is estimated.
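A rough sketch of what ‘conditional on number of kids’ can mean in practice follows. It is not the model behind the coefficient reported above. Instead it swaps in a linear probability model with a separate intercept for each family size, which is equivalent to demeaning both variables within each number-of-kids group and then fitting one pooled slope. The function name and the toy rows are hypothetical.

```python
from collections import defaultdict


def within_family_size_slope(rows):
    """Slope of Republican ID on number of daughters, holding family size fixed.

    rows: iterable of (n_kids, n_daughters, republican) with republican in {0, 1}.
    Both variables are demeaned within each n_kids group before pooling.
    """
    groups = defaultdict(list)
    for n_kids, n_daughters, republican in rows:
        groups[n_kids].append((n_daughters, republican))

    sxy = sxx = 0.0
    for members in groups.values():
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        for x, y in members:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("number of daughters does not vary within any family size")
    return sxy / sxx


# Toy rows, invented for illustration: (n_kids, n_daughters, republican)
rows = [(1, 0, 1), (1, 1, 0), (2, 0, 1), (2, 1, 1), (2, 2, 0)]
print(within_family_size_slope(rows))
```

Because comparisons are made only among respondents with the same number of kids, differences in family size (which may itself depend on ideology) do not drive the slope.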

Future –

If what we observe is true, then as party stances evolve, the impact of a child’s gender on a parent’s policy preferences should vary. One should also be able to test this cross-nationally.

Some other findings –

  1. Probability of having a son (limiting to live births in the U.S.) is about .51. This ‘natural rate’ varies slightly by income – daughters are somewhat more likely to be born to lower-income parents. However, the effect of income is extremely modest in the U.S., to the point of being ignorable. The live birth ratio is marginally rebalanced by the higher child mortality rate among males. As a result, among those aged 0-21, the ratio of men to women is about equal in the U.S.

    In the sample, there are significantly more daughters than sons. The female/male ratio is 1.16. This is ‘significantly’ unusual.

  2. If families are less likely to have kids after the birth of a boy, number of kids will be negatively correlated with proportion sons. Among people with just one kid, number of sons is indeed greater than number of daughters, though the difference is insignificant. Overall correlation between proportion sons and number of kids is also very low (corr. = -.041).
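The sex-ratio check in item 1 can be sketched as a one-sample test of a proportion. The post reports only the female/male ratio (1.16), not the underlying counts, so the counts below are hypothetical placeholders that merely reproduce that ratio. The benchmark probability of a daughter, .49, follows from the .51 figure for sons. The test uses the normal approximation to the binomial.

```python
import math


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of an observed proportion against p0.

    Uses the normal approximation to the binomial.
    Returns (z, p_value).
    """
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical counts giving a female/male ratio of 1.16.
daughters, sons = 1160, 1000
z, p = proportion_z_test(daughters, daughters + sons, p0=0.49)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With counts of this size the gap from the benchmark would be far too large to attribute to chance. With a much smaller sample the same ratio could be unremarkable, which is why the actual counts matter.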