For the data and scripts used to generate the graphs, see https://github.com/soodoku/pollbias.
I am pleased to announce the release of TV and Cable Factbook Data (1997–2002; 1998 coverage is modest). Use of the data is restricted to research purposes.
In 2007, Stefano DellaVigna and Ethan Kaplan published a paper that used data from Warren’s Factbook to identify the effect of the introduction of Fox News Channel on Republican vote share (link to paper). Since then, a variety of papers exploiting the same data and identification scheme have been published (see, for instance, Hopkins and Ladd, Clinton and Enamorado, etc.).
In 2012, I embarked on a similar project: using the data to study the impact of the introduction of Fox News Channel on attitudes and behaviors related to climate change. However, I found the original data to be limited: DellaVigna and Kaplan had used a team of research assistants to manually code a small number of variables for a few years. So I planned to extend the data in two ways: adding more years, and adding ‘all’ the data for each year. To that end, I developed custom software. The collection and parsing of a few thousand densely packed, inconsistently formatted pages (see below) into a usable CSV (see below) finished sometime in early 2014. (To make it easier to create a crosswalk with other geographical units, I merged the data with town latitude/longitude (centroid) and elevation data from http://www.fallingrain.com/world/US/.)
Soon after I finished the data collection, however, I became aware of a paper by Martin and Yurukoglu. They found some inconsistencies between the Nielsen data and the Factbook data (see Appendix C1 of the paper), and traced the inconsistencies to delays in updating the Factbook data: “Updating is especially poor around [DellaVigna and Kaplan] sample year. Between 1999 and 2000, only 22% of observations were updated. Between 1998 and 1999, only 37% of observations were updated.” Based on their paper, I abandoned the plan to use the data, though I still believe the data can be used for a variety of important research projects, including estimating the impact of the introduction of Fox News. Based on that belief, I am releasing the data.
Three goals: impart information, spur deep(er) thought about the topic (and the social world more generally), and inculcate care in thinking. As is perhaps clear, working towards any one of these goals creates positive externalities that help achieve the other goals. For instance, care in exposition, which is a necessary (though not sufficient) condition for imparting correct information, may also inadvertently produce, either through mimesis or through further thought, care in how students think about questions.
Supplement such synergies by actively seeking and using pertinent opportunities, during both class-wide discussions about the materials and one-to-one discussions about research projects, to raise (and clarify) relevant points. During discussions, encourage students to seriously consider questions about epistemology, which are fundamental to science but also more generally to reasoning and discourse, by weaving in questions such as, “What is the claim that we are making?”, and “When can we make this claim and why?”.
Some of the epistemological questions are most naturally (and perhaps best) handled when students are engaged in working on their own research projects. Guiding students as they collect and analyze their own data provides unique opportunities to discuss issues related to research design, and logic. And it is my hunch that students are more engaged with the material (and hence learn more of it, and think more about it) when they work on their own projects than when asked to learn the materials through lectures alone. For instance, undergraduates at Stanford often excel at knowing the points made in the text, but often have yet to spend time thinking about the topic itself. My sense is (and some experience corroborates it) that thinking broadly about an issue allows students to gain new insights, and helps them contextualize their findings better. It also spurs curiosity about the social world and the broader set of questions about society. Hence, in addition to the above, ask students to discuss the topics that they are working on more generally, and think carefully and deeply about what else could be going on.
The paucity of women in Computer Science, Math and Engineering in the US is widely, and justly, lamented. Sometimes, the imbalance is attributed to gender stereotypes. But only a small fraction of men study these fields. And in absolute terms, the proportion of women in these fields is not a great deal lower than the proportion of men. So in some ways, the assertion that these fields are stereotypically male is in itself a misunderstanding.
For greater clarity, a contrived example: Say that the population is split between two similarly sized groups, A and B. Say only 1% of Group A members study X, while the proportion of Group B members studying X is 1.5%. This means that 60% of those who study X belong to Group B. Or in more dramatic terms: activity X is stereotypically Group B. However, 98.5% of Group B doesn’t study X. And that number is not a whole lot different from 99%, the percentage of Group A that doesn’t study X.
When people say activity X is stereotypically Group B, many interpret it as ‘activity X is quite popular among Group B.’ (That is one big stereotype about stereotypes.) That clearly isn’t so. In fact, the difference between Group A and Group B in the preference for studying X, as inferred from choices (assuming the same choice set and utility function), is likely pretty small.
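The arithmetic in the contrived example can be checked with a few lines of Python. This is only a sketch: the function name and the equal group sizes are assumptions made for illustration, and the rates are the 1% and 1.5% from the example.

```python
def p_group_b_given_x(p_x_given_a, p_x_given_b, size_a=0.5, size_b=0.5):
    """Share of those studying X who belong to Group B (Bayes' rule)."""
    from_a = size_a * p_x_given_a  # population share: in Group A and studying X
    from_b = size_b * p_x_given_b  # population share: in Group B and studying X
    return from_b / (from_a + from_b)


# Rates from the contrived example; equal group sizes are assumed.
p_x_given_a = 0.01
p_x_given_b = 0.015

composition = p_group_b_given_x(p_x_given_a, p_x_given_b)
rate_difference = p_x_given_b - p_x_given_a

print(f"p(Group B | X): {composition:.1%}")
print(f"p(X | Group B) - p(X | Group A): {rate_difference:.1%}")
```

The first number describes the composition of the field, the second the difference in how likely members of each group are to enter it. The same small difference in rates yields a lopsided composition because both rates are tiny.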
Obliviousness to the point is quite common. For instance, it is behind arguments linking terrorism to Muslims. And Muslims typically respond with a version of the argument laid out above—they note that an overwhelming majority of Muslims are peaceful.
One straightforward conclusion from this exercise is that we may be able to make headway in tackling disciplinary stereotypes by elucidating the point in terms of the difference between p(X | Group A) and p(X | Group B) rather than in terms of p(Group A | X).
Papers at hand:
Two empirical points that we learn from the papers:
1. Partisan gaps are highly variable and the mean gap is reasonably small (without money, control condition). See also: Partisan Retrospection?
(The point is never explicitly commented on by either of the papers. The point has implications for proponents of partisan retrospection.)
2. When respondents are offered money for the correct answer, the partisan gap shrinks by about half on average.
Question in front of us: Interpretation of point 2.
Why are there partisan gaps on knowledge items?
1. Different Beliefs: People believe different things to be true because they learn different things. For instance, Republicans learn that Obama is a Muslim, and Democrats that he is an observant Christian. For a clear exposition on what I mean by ‘beliefs’, see Waters of Casablanca.
2. Systematic Lazy Guessing: The number one thing people lie about on knowledge items is that they have the remotest clue about the question being asked. And the reluctance to acknowledge ‘Don’t Know’ is in itself a serious point worthy of investigation and careful interpretation. (My sense is that it tells us something important about humans.) When people guess on items with partisan implications, they may use inference rules. For instance, a Republican, when asked whether the unemployment rate under Obama had increased or decreased, may reason that Obama is a socialist and, since socialism is bad policy, that the unemployment rate must have increased.
3. Cheerleading: Even when people know that things that reflect badly on their party happened, they lie. (I will be surprised if this is common.)
The Quantity of Interest: Different Beliefs.
We do not want: Different Beliefs + Systematic Lazy Guessing
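To make the distinction concrete, here is a toy accounting of how lazy partisan guessing can inflate the observed gap on a single item, such as whether unemployment increased. It is a sketch only, not the estimator used in either paper. All the numbers and names below are hypothetical, chosen purely for illustration.

```python
def share_giving_answer(believes, dont_know, guess_rate):
    """Share of a group answering 'increased': believers plus guessers."""
    return believes + dont_know * guess_rate


# Hypothetical inputs (not taken from either paper).
believes_r = 0.30       # Republicans who believe unemployment increased
believes_d = 0.20       # Democrats who believe unemployment increased
dont_know = 0.50        # share in each party with no belief either way
congenial_guess = 0.80  # chance a guesser picks the party-congenial answer

# 'Increased' is the congenial answer for Republicans and the
# uncongenial one for Democrats, so guessers pick it at different rates.
observed_r = share_giving_answer(believes_r, dont_know, congenial_guess)
observed_d = share_giving_answer(believes_d, dont_know, 1 - congenial_guess)

belief_gap = believes_r - believes_d      # the quantity of interest
observed_gap = observed_r - observed_d    # what the survey item shows
guessing_part = observed_gap - belief_gap

print(f"belief gap: {belief_gap:.2f}")
print(f"observed gap: {observed_gap:.2f}")
print(f"part due to partisan guessing: {guessing_part:.2f}")
```

Under these made-up inputs, most of the observed gap comes from guessing rather than from beliefs. Setting congenial_guess to 0.5 (guessing that is not partisan) makes the observed gap equal the belief gap.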
Why would money reduce partisan gaps?
1. Reducing Systematic Lazy Guessing: Bullock et al. use pay for DK, offering people a small incentive (much smaller than the pay for a correct answer) to confess to ignorance. The estimate should be closer to the quantity of interest: ‘Different Beliefs.’
2. Considered Guessing: On being offered money for the correct answer, respondents replace the ‘lazy’ (though, for a boundedly rational human, optimal) partisan heuristic described above with more effortful guessing. Replacing Systematic Lazy Guessing with Considered Guessing is good to the extent that Considered Guessing is less partisan. If it is so, the estimate will be closer to the quantity of interest: ‘Different Beliefs.’ (Think of it as a version of correlated measurement error. And we are now replacing systematic measurement error with error that is more evenly distributed, if not ‘randomly’ distributed.)
3. Looking up the Correct Answer: People look up answers to take the money on offer. Both papers go some way to show that cheating isn’t behind the narrowing of the partisan gap. Bullock et al. use ‘placebo’ questions, and Prior et al. use timing, etc.
4. Reduces Cheerleading: Respondents for whom the utility from lying is less than the money on offer stop lying. The estimate will be closer to the quantity of interest: ‘Different Beliefs.’
5. Demand Effects: Respondents take the offer of money as a cue that their instinctive response isn’t correct. Estimate may be further away from the quantity of interest: ‘Different Beliefs.’