January 2010


Republican ‘Con-census’

The party that dislikes the census, and that put a hold on the nominee to head the Census Bureau, is now sending out fraudulent mail surveys that look as if they came from the Census Bureau. Read here, here, and here. The charge of being ‘fraudulent’ stems not only from the use of ‘census’ but also from the mailing being an attempt at “frugging”, the practice of cloaking a fundraising appeal in what appears to be research. (“Suggers” sell using surveys.)

Who Knew
One of the people who has benefited the most from Macaulay’s reforms recently sent an email, part of a larger email chain of the kind that now generally implies some travesty, which quoted Macaulay as having said the following (on Feb 2nd, 1835 in India, the same day he gave his Minute on Education in Britain):

I have traveled across the length and breadth of India and I have not seen one person who is a beggar, who is a thief. Such wealth I have seen in this country, such high moral values, people of such caliber, that I do not think we would ever conquer this country, unless we break the very backbone of this nation, which is her spiritual and cultural heritage, and, therefore, I propose that we replace her old and ancient education system, her culture, for if the Indians think that all that is foreign and English is good and greater than their own, they will lose their self-esteem, their native self-culture and they will become what we want them, a truly dominated nation?

Of course, Lord Macaulay never said such a thing. He said many other tawdry things, but never did he eulogize India in such absolutely hokum, glowing terms. While we all want glorious histories, the past isn’t as glorious. Nor are confessions of colonial rapacity generally so naked.

Suggestion: For greater imagined glories one must go further back than just 1835, by when the ‘Muggles’ had had their way (as a Hindu may point out) and writing had been invented, into the mists of more obscure pre-history, where one can have one’s way with truth. How about the crowning glories of Lord Rama and his return on the airplane-like ‘pushpak’ vahan?

Future of Journalism

There was a time when each medium produced its own kind of journalism – television journalism, newspaper journalism, and radio journalism. The future, we are told, doesn’t care for such distinctions. So while in the future we may be able to talk about ‘journalism’ without a qualifier bolted on in front, for now, discussing the future of each variety separately may be more appropriate, especially since the prognosis for each is different.

Making forecasts about the various forms of ‘journalism’ has of late been reduced to extending the trend line from the recent past well into the future. Such exertions typically produce the following forecast: newspapers are dead or dying. Both network newscasts and cable news channels are shedding audiences and will continue to do so. Radio is generally ignored.


A Deliberative Poll™ works as follows: a random sample of people is surveyed. Out of the initial sample, a random subset is invited to deliberate, given balanced briefing materials, randomly assigned to small groups led by trained moderators, given the opportunity to quiz experts, and, in the end, surveyed again.

Reports and papers on Deliberative Polls often carry comparisons of participants with non-participants on a host of attitudinal and demographic variables (e.g. see here, and here). The analysis purports to answer whether people who came to the Deliberative Poll were different from those who didn’t, and to compare the participant sample to the population. This sounds about right, except for one thing: the comparison is made between participants and a pool of two likely distinct sub-populations: people who were never invited (probably a representative, random set) and people who were invited but never came. Under plausible assumptions, such pooling biases against finding a difference.
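To see why pooling works against finding a difference, here is a minimal sketch with made-up numbers. It assumes the never-invited group has exactly the population mean (it is a random subset), and that the population is a mix of would-be acceptors and would-be decliners; the function name and all values are hypothetical.

```python
def pooled_gap(mean_accept, mean_decline, accept_rate, n_never_invited, n_declined):
    """Gap between participants and the pooled 'non-participant' group.

    Assumption: the never-invited are a random subset, so their mean equals
    the population mean, a mix of acceptors and decliners.
    """
    population_mean = accept_rate * mean_accept + (1 - accept_rate) * mean_decline
    pooled_mean = (
        n_never_invited * population_mean + n_declined * mean_decline
    ) / (n_never_invited + n_declined)
    return mean_accept - pooled_mean


# Hypothetical attitude means: 0.60 among those who came, 0.40 among those who declined.
true_gap = 0.60 - 0.40
observed_gap = pooled_gap(0.60, 0.40, accept_rate=0.25,
                          n_never_invited=1000, n_declined=750)
print(round(true_gap, 3), round(observed_gap, 3))
```

Because the never-invited group itself contains would-be participants, the pooled comparison understates the gap between those who accepted and those who declined.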

The key thing we want to measure is self-selection bias: was there a difference between the people who accepted the invitation and those who did not? The correct way to estimate the bias, using only the people who were invited, would be as follows:
(Participant/Didn’t come) ~ socio-demographics (gender, education, income, party id, age) + knowledge + attitude extremity

Effect sizes can be provided to summarize the extent of bias. This kind of analysis can account for the fact that bias may occur not at the first marginals (e.g. gender) but at the second marginals (e.g. low-educated females). (All of this can be theory driven or more descriptive in purpose.) The analysis also allows smaller effects to be seen, as the variance within cells is reduced.
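A purely descriptive version of the point about second marginals can be sketched by tabulating acceptance rates among invitees, first by gender alone and then by gender and education. The counts below are invented for illustration, and this is a tabulation, not the regression above.

```python
from collections import defaultdict

# Hypothetical invited sample: (gender, education) -> (number invited, number who came).
invited = {
    ("female", "low"): (100, 10),
    ("female", "high"): (100, 40),
    ("male", "low"): (100, 40),
    ("male", "high"): (100, 10),
}


def acceptance_rates(cells, key=lambda cell: cell):
    """Acceptance rate among invitees, grouped by key(cell)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [invited, came]
    for cell, (n_invited, n_came) in cells.items():
        group = key(cell)
        totals[group][0] += n_invited
        totals[group][1] += n_came
    return {group: came / n for group, (n, came) in totals.items()}


by_gender = acceptance_rates(invited, key=lambda cell: cell[0])
by_cell = acceptance_rates(invited)
print(by_gender)
print(by_cell)
```

In this made-up example men and women accept at the same rate overall, yet low-educated women accept far less often than high-educated women, which a comparison on gender alone would miss.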

p values
When the conservative thing to do is to reject the null hypothesis, think a little less about p-values.

Assume the initial survey approximates a ‘representative’ sample of the entire population, and that we want to infer how ‘representative’ the participants are of the entire population. Then it makes sense to just report mean differences without p-values.

The survey sample estimates stand in for the entire population. Census numbers for the entire population come without standard errors, or with very small ones, so even small differences from them tend to come out significant.

By comparing to an uncertain estimate of the population, one cannot say whether the participant sample was representative of the entire population. The comparison is unbiased but suffers from the following problem: the more uncertain the population estimate, the less likely one is to reject the null, and the more likely one is to conclude that the participant sample is representative. One way to deal with this is to take the 95% confidence band of the sample estimate of the population, calculate the maximum and minimum difference between it and the participant sample, and report those.
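As a rough sketch of that suggestion for a single proportion: build a normal-approximation 95% band around the survey estimate and report the smallest and largest difference from the participant figure. The function name and numbers are hypothetical, and the participant proportion is treated as fixed for simplicity.

```python
import math


def difference_band(participant_prop, survey_prop, survey_n, z=1.96):
    """Smallest and largest participant-minus-population difference,
    taking the population value anywhere in the survey's 95% band."""
    se = math.sqrt(survey_prop * (1 - survey_prop) / survey_n)
    lower = survey_prop - z * se
    upper = survey_prop + z * se
    diffs = (participant_prop - upper, participant_prop - lower)
    return min(diffs), max(diffs)


# Hypothetical: 60% of participants vs. 50% in an initial survey of 400 people.
smallest, largest = difference_band(0.60, 0.50, 400)
print(round(smallest, 3), round(largest, 3))
```

Reporting both ends makes the uncertainty in the population estimate visible rather than letting it quietly favor the conclusion that participants are representative.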

Name calling
Calling the analysis a ‘representativeness’ analysis seems misleading on two counts:

  1. While a clear representation question can be answered by some analysis, no such question is answered, or can be answered, by the analysis presented. Moreover, it isn’t clear whether the analysis relates to some larger, politically meaningful variable. For example, one question that can be posed is whether the participant sample resembles the population at large. To answer it, one would want to compare participant sample estimates to census estimates (which have near-zero variance, so t-tests etc. would be pointless).
  2. In a series of papers in the 1970s, Kruskal and Mosteller (citations at the end) rightly excoriate the use of ‘representativeness’, which is fuzzy and open to much abuse.

Citations
Kruskal, W; Mosteller, F. (1979) Representative sampling I: non-scientific literature. Intern Stat Rev. 47:13-24.
Kruskal, W; Mosteller, F. (1979) Representative sampling II: scientific literature. Intern Stat Rev. 47:111-127.
Kruskal, W; Mosteller, F. (1979) Representative sampling III: the current statistical literature. Intern Stat Rev. 47:245-265.

Media scholars have long complained about the lack of good measures of media use. Survey self-reports have been shown to be notoriously unreliable, especially for news, where there is significant over-reporting. Without good measures, research lags. The same is true for most research in marketing.

Until recently, the state-of-the-art aggregate measures of media use were Nielsen ratings, which put a ‘meter’ in a few households or asked people to keep a diary of what they saw. In short, the aggregate measures were pretty bad as well. Digital media, which allow for effortless tracking, and the rise of Internet polling, however, for the first time provide an opportunity to create ‘panels’ of respondents for whom we have near-perfect measures of media use. The proposal is quite simple: create a hybrid of Nielsen on steroids and the YouGov/Polimetrix or Knowledge Networks kind of recruiting of individuals.

Logistics: Give people free cable and Internet (~80/month) in return for 2 hours of their time per month and monitoring of their media consumption. Pay people who already have cable (~100/month) to install a device and software. Recording the channel is enough for TV, but the Internet equivalent of a channel, the main domain, clearly isn’t, as people can self-select within websites. So we need to monitor only the channel for TV, but more than the domain for the Internet.

While the number of devices on which people browse the Internet and watch TV has multiplied, there generally remains only one ‘pipe’ per house. We can install a monitoring device at the central hub for cable, and either automatically install software on any device that connects to the Internet router or do passive monitoring on the router. Monitoring can also be done through applications on mobile devices.

Monetizability: Consumer companies (say Kellogg’s, Ford), communication researchers, and political hacks (e.g. how many watched campaign ads) will all pay for it. The crucial (if modest) innovation is adding the ability to survey people on a broad range of topics, on top of getting great media use measures.

Addressing Privacy concerns:

  1. Limit recording to certain channels/websites, e.g. the ones on which customers advertise. This changing list can, say, be made subject to approval by the individual.
  2. Provide a web interface where people can look at and suppress the data before it is sent out. Of course, reconfirm that all data is anonymous, to deter such censoring.
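The first idea, recording only an approved list of sites, could look roughly like the sketch below. The function name, the example URLs, and the rule that subdomains of an approved domain also count are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def filter_visits(urls, approved_domains):
    """Keep only visits whose host is an approved domain or a subdomain of one."""
    kept = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if any(host == d or host.endswith("." + d) for d in approved_domains):
            kept.append(url)
    return kept


# Hypothetical browsing log and panelist-approved list.
visits = [
    "https://news.example.com/politics/a",
    "https://bank.example.org/login",
    "http://example.net/ads?id=3",
]
approved = {"example.com", "example.net"}
recorded = filter_visits(visits, approved)
print(recorded)
```

Note that the full URL is kept for approved sites, since the domain alone does not say what people chose to read within a website.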

Ensuring privacy may lead to some data censoring, and we can try to prorate the data we get in a couple of ways:

  • Survey people on media use
  • Use Television Rating Points (TRP) by sociodemographics to weight data.
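One possible reading of the second option is a simple ratio adjustment: within each sociodemographic group, scale the panel's (possibly censored) ratings up to an external benchmark. The group labels, numbers, and function name below are hypothetical, and this is only a sketch of the idea.

```python
def benchmark_weights(panel_rating, benchmark_rating):
    """Per-group weight = external benchmark rating / rating observed in the panel."""
    weights = {}
    for group, observed in panel_rating.items():
        if observed > 0:  # a group with no observed viewing cannot be scaled up
            weights[group] = benchmark_rating[group] / observed
    return weights


# Hypothetical rating points for one program, by age group.
panel_rating = {"18-34": 8.0, "35-54": 10.0, "55+": 12.0}
benchmark_rating = {"18-34": 10.0, "35-54": 10.0, "55+": 15.0}
weights = benchmark_weights(panel_rating, benchmark_rating)
print(weights)
```

A weight above 1 suggests that the group's viewing is under-recorded in the panel, for example because of suppression, and should be scaled up accordingly.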