How do we know?

17 Aug

How can fallible creatures like us (claim to) know something? The scientific method is about answering that question well. To do so, we have made at least three big innovations:

1. Empiricism. But no privileged observer. What you observe should be reproducible by all others.

2. Openness to criticism: If you are not convinced by the method of observation or the claims being made, criticize. Offer reason or proof.

3. Mathematical Foundations: Reliance on math or formal logic to deduce what claims can be made if certain conditions are met.

These three innovations, along with two more, have allowed us to ‘scale.’ Foremost among the innovations that let us scale is our ability to work together. And our ability to preserve information on stone, paper, and electrons allows us to collaborate with, and build on the work of, people who are now dead. The same principle that allows us to build structures as gargantuan as the Hoover Dam and entire cities allows us to learn about complex phenomena. And that takes us to the final principle of science.

Motivated Citations

13 Jan

The best kind of insight is the ‘duh’ insight: catching something that is exceedingly common, almost routine, but that no one talks about. I believe this is one such insight: the standards for citing congenial research (research that supports the hypothesis of choice) are considerably lower than the standards for citing uncongenial research. It is an important kind of academic corruption. And it means that the prospects of teleological progress toward truth within science, as currently practiced, are bleak. An alternate ecosystem that provides objective ratings for each piece of research is likely to be more successful. (As opposed to the ‘echo-system’ in place today: here are the people who find stuff that ‘agrees’ with what I find.)

An empirical implication of the point is that the average ranking of the journals in which cited congenial research is published is likely to be lower than that of the journals in which cited uncongenial research is published. Though, for many of the ‘conflicts’ in science, all sides of a conflict will have top-tier publications, which is to say that the measure is somewhat crude.
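As a crude, minimal sketch of how one might test this implication with data, compare the mean rank (lower = better) of the journals behind congenial and uncongenial citations; the numbers below, and the choice of significance test, are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical journal ranks (1 = best) for works a paper cites in
# support of its hypothesis ("congenial") vs. against it ("uncongenial").
congenial_ranks = np.array([12, 45, 30, 8, 60, 25, 40, 18])
uncongenial_ranks = np.array([3, 9, 1, 15, 6, 11])

print(f"mean congenial rank:   {congenial_ranks.mean():.1f}")
print(f"mean uncongenial rank: {uncongenial_ranks.mean():.1f}")

# A simple two-sample test of the difference in mean rank.
t, p = stats.ttest_ind(congenial_ranks, uncongenial_ranks, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3f}")
```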

The deeper point is that readers generally do not judge the quality of the work cited in support of specific arguments, taking many of the arguments at face value. This, in turn, means that the role of journal rankings is somewhat limited. Or, more provocatively: to improve science, we need to make sure that even research published in low-ranked journals is of sufficient quality.

Reviewing the Peer Review

24 Jul

Update: Current version is posted here.

Science is a process. And for a good deal of time, peer review has been an essential part of that process. Looked at independently, by people with no experience of it, it makes a fair bit of sense. For there is only one well-known way of increasing the quality of an academic paper: additional independent thinking. And who better to provide it than engaged, trained colleagues?

But this seemingly sound part of the process is creaking. Today, you can’t bring two academics together without them venting their frustration with the broken review system. The plaint is that the current system is a lose-lose-lose: all the parties — the authors, the editors, and the reviewers — lose lots and lots of time. And the change in quality as a result of suggested revisions is variable, generally small, and sometimes negative. Given how critical peer review is to the production of science, it deserves closer attention, preferably with good data.

But data on peer review aren’t available to be analyzed. Thus, some anecdotal data. Of the 80 or so reviews that I have filed and for which editors have been kind enough to share the comments of other reviewers, two things have jumped out at me: a) hefty variation in the quality of reviews, and b) equally hefty variation in recommendations about the final disposition. It would be good to quantify both. The latter is easy enough to quantify.
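As a minimal sketch of the kind of quantification I have in mind, here is Fleiss’ kappa computed from scratch over a manuscripts-by-categories table of reviewer recommendations; the manuscripts, counts, and categories are all invented for illustration.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a manuscripts x categories table of rating counts.

    counts[i, j] = number of reviewers who gave manuscript i
    recommendation j (e.g., reject / revise / accept).
    Assumes the same number of reviewers per manuscript.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum(axis=1)[0]  # reviewers per manuscript
    # Observed agreement: for each manuscript, the share of
    # reviewer pairs that agree.
    p_i = (np.sum(counts ** 2, axis=1) - n) / (n * (n - 1))
    p_bar = p_i.mean()
    # Chance agreement from the marginal category proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = np.sum(p_j ** 2)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical data: 5 manuscripts, 3 reviewers each,
# columns = (reject, revise-and-resubmit, accept).
recommendations = [
    [2, 1, 0],
    [0, 3, 0],
    [1, 1, 1],
    [0, 1, 2],
    [3, 0, 0],
]
print(f"Fleiss' kappa: {fleiss_kappa(recommendations):.2f}")  # ~0.27 here
```

A kappa near zero would say that reviewers agree on the final disposition barely more often than chance.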

The reliability of the review process has implications for how many reviewers we need before we can reliably accept or reject the same article. Counter-intuitively, increasing the number of reviewers per manuscript may not increase the overall burden of reviewing. Partly because everyone knows that the review process is so noisy, there is an incentive to submit articles that people know aren’t good enough. Some submitters likely reason that there is a reasonable chance of a ‘low quality’ article being accepted at top places. Thus, low-reliability peer review systems may actually increase the number of submissions. A greater number of submissions, in turn, increases editors’ and reviewers’ loads, which reduces the quality of reviews and lowers the reliability of recommendations still further. It is a vicious cycle. And the answer may be as simple as making the peer review process more reliable. At any rate, these data ought to be publicly released. Alongside, editors should consider experimenting with the number of reviewers to collect more data on the point.
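A quick simulation illustrates the first link of that cycle. Suppose each reviewer reports the manuscript’s true quality plus independent noise, and the editor accepts when the mean review score clears a threshold. All numbers below (quality scale, noise level, threshold) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def acceptance_rate(true_quality, n_reviewers, threshold=0.0,
                    noise_sd=1.0, n_sims=100_000):
    """Probability of acceptance when each reviewer observes
    true_quality + Gaussian noise and the editor accepts if the
    mean review score exceeds the threshold."""
    scores = true_quality + rng.normal(0, noise_sd, size=(n_sims, n_reviewers))
    return (scores.mean(axis=1) > threshold).mean()

# A below-threshold paper still gets in often with few, noisy
# reviewers; its chances shrink as reviewers are added.
for k in (1, 2, 3, 5, 10):
    print(f"{k:2d} reviewers: P(accept weak paper) = "
          f"{acceptance_rate(true_quality=-0.5, n_reviewers=k):.2f}")
```

Under these made-up numbers, a single noisy reviewer waves the weak paper through roughly 31% of the time; ten reviewers cut that to under 6%.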

Quantifying the quality of reviews is a much harder problem. What do we mean by a good review? A review that points to important problems in the manuscript and, where possible, suggests solutions? Likely so. But this is much trickier to code. And perhaps there isn’t as much of a point to quantifying it. What is needed, perhaps, is guidance. Much like child-rearing, reviewing has no manual. There really should be one. What should reviewers attend to? What are they missing? And, most critically, how do we incentivize the process?

When thinking about incentives, there are three parties whose incentives we need to restructure — the author, the editor, and the reviewer. Authors’ incentives can be restructured by making the process less noisy, as discussed above, and by making submissions costly. All editors know this: electronic submissions have greatly increased the number of submissions. (It would be useful to study what the consequences of the move to electronic submission have been for the quality of articles.) As for the editors: if editors are not blinded to the author (and the author knows this), they are likely to factor in the author’s status when choosing the reviewers, when deciding whether or not to defer to the reviewers’ recommendations, and when making the final call. Thus we need triple-blind pipelines.

Whether or not the reviewer’s identity is known to the editor when s/he is reading the reviewer’s comments also likely affects the reviewer’s contributions, in both good and bad ways. For instance, there is every chance that junior scholars, in trying to impress editors, file more negative reviews than they would if they knew that the editor had no way of tying the identity of the reviewer to the review. Beyond altering anonymity, one way to incentivize reviewers would be to publish the reviews publicly, perhaps as part of the paper. Just like online appendices, we can have a set of reviews published online with each article.

With that, some concrete suggestions beyond the ones already discussed. Expectedly — given they come from a quantitative social scientist — they fall into two broad brackets: releasing and learning from the data already available, and collecting more data.

Existing Data

A fair bit of data can potentially be released without violating anonymity. For instance (a minimal sketch of a possible release schema follows the list):

  • Whether manuscript was desk rejected or not
  • How many reviewers were invited
  • Time taken by each reviewer to accept (NA for those from whom you never heard)
  • Total time in review for each article (till R&R or reject), with a separate set of columns for each revision
  • Time taken by each reviewer
  • Recommendation by each reviewer
  • Length of each review
  • How many reviewers did the author(s) suggest?
  • How often were suggested reviewers followed up on?
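To make the sketch concrete: one hypothetical shape for such a release, with one row per (manuscript, reviewer) pair and identities stripped. Every column name here is invented.

```python
import csv

# Hypothetical columns for a per-manuscript, per-reviewer release.
COLUMNS = [
    "manuscript_id",                    # anonymized
    "desk_rejected",                    # True/False
    "n_reviewers_invited",
    "reviewer_id",                      # anonymized within journal
    "days_to_accept_invite",            # NA if never heard back
    "days_to_file_review",
    "recommendation",                   # e.g., reject / revise / accept
    "review_length_words",
    "n_reviewers_suggested_by_authors",
    "suggested_reviewer_used",          # True/False
    "total_days_in_review",
    "revision_round",                   # 0 for initial submission
]

with open("peer_review_release.csv", "w", newline="") as f:
    csv.writer(f).writerow(COLUMNS)
```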

In fact, much of the data submitted in multiple-choice format can probably be released easily. If editors are hesitant, a group of scholars can come together and crowdsource the collection of review data: people can deposit their reviews and the associated manuscripts, in a specific format, on a server. To maintain confidentiality, we can sandbox these data, allowing scholars to run a variety of pre-screened scripts on them. Or else journals can institute similar mechanisms.

Collecting More Data

  • In economics, journals have tried instituting shorter deadlines for reviewers, with the effect of reducing review times. We can try that out.
  • In terms of incentives, it may be a good idea to try out cash, but also perhaps to experiment with a system where reviewers are told that their comments will be made public. I, for one, think it would lead to more responsible reviewing. It would also be good to experiment with triple-blind reviewing.

If you have additional thoughts on the issue, please propose them at: https://gist.github.com/soodoku/b20e6d31d21e83ed5e39

Here’s to making advances in the production of science and our pursuit of truth.

Capuchin Monkeys and Fairness: I want at least as much as the other

1 Dec

In a much-heralded experiment, a capuchin monkey rejects a reward (food) for doing a task after seeing another monkey being rewarded with something more appetizing for the same task. This has been interpreted as evidence for our ‘instinct for fairness’. But there is more to the evidence. The fact that the monkey that gets the heftier reward doesn’t protest the more meager reward given to the other monkey is not commented upon, though it is highly informative. Ideally, any weakly reasoned deviation from equality should provoke a negative reaction. Monkeys who get the longer end of the stick, even when aware that others are getting the shorter end, don’t complain. Primates are peeved only when they are made aware that they themselves are getting the short end of the stick, not so much when someone else gets it. My sense is that this is true for most humans as well: people care far more about holding the short end of the stick themselves than about others holding it. It is thus incorrect to attribute such behavior to an ‘instinct for fairness’. A better attribution may be to the following rule: I want at least as much as the others are getting.

‘Fairly’ Random

15 Mar

A lottery is a way to assign disproportionate rewards (or punishments) fairly. Procedural fairness, an equal chance of selection, provides legitimacy to this system of disproportionate allocation.

Given that the purpose of a lottery is unequal allocation, it is important that informed consent be sought from the participants, and that lotteries be used in consequential arenas only when necessary.

Fairness over the longer term
One particular use of a lottery is in the fair assignment of scarce, indivisible resources. For example, think of a good school with only a hundred open seats that receives a thousand applications from candidates who are indistinguishable (or only weakly distinguishable) — given the limitations of data — from each other in matters of ability. One fair way of assigning seats would be to do it randomly.
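As a minimal sketch (the applicant IDs and the seed are made up), the lottery itself is a single uniform draw without replacement:

```python
import random

# 1,000 hypothetical, indistinguishable applicants; 100 seats.
applicants = [f"applicant_{i:04d}" for i in range(1, 1001)]
seats = 100

rng = random.Random(20240315)             # a published seed makes the draw auditable
admitted = rng.sample(applicants, seats)  # each applicant has an equal 10% chance

print(sorted(admitted)[:5])
```

Publishing the procedure (and, where feasible, the seed) in advance is one way to make the ‘equal chance of selection’ verifiable, which is what lends the lottery its legitimacy.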

One may choose to consider the matter closed at this point. However, that means making peace with disproportionate outcomes. Alternatives exist. For example, one may ask the winners of the lottery to give back to those who didn’t win, say by sharing the portion of their income attributable to going to a good school, by producing public goods, or by some other mutually agreed mechanism.

Fair Selection
Random selection is a fair method of selection over objects when we have little or no reason to prefer one over another. When objects are observably (as far as the data can tell us) the same, or the same within some margin, random selection is fair.

One may extend this to objects that differ through no discretionary action of their own, say people with physical or mental disabilities, though competing concerns, such as lower efficiency, exist. More generally, selection based on some commonly agreed metric, say the maximal increase in public good, may also be considered fair.

As is clear, those who aren’t selected don’t deserve less, and indeed adequate compensation ought to be built into the formal basis of selection, unless of course rewards once earned cannot be transferred (say a lottery for a liver transplant, which leaves the others dead and hence unable to receive any compensation, though one can imagine compensation being transferred to relatives, etc.).

Structural Inequality

27 Nov

Nick Clegg, leader of the Liberal Democrats, recently spoke about social mobility. He said,

My particular focus is on inter-generational social mobility – the extent to which a person’s income or social class is influenced by the income or social class of their parents. Social mobility is a measure of the degree to which the patterns of advantage and disadvantage in one generation are passed on to the next. How far, if you like, the sins of the father are visited on the son.

There is of course plenty of argument within the social science community about precise measures, international comparisons and preferred metrics. But I think intergenerational social mobility speaks to most people’s definition of fairness.

Fairness means everyone having the chance to do well, irrespective of their beginnings. Fairness means that no one is held back by the circumstances of their birth. Fairness demands that what counts is not the school you went to or the jobs your parents did, but your ability and your ambition.

In other words, fairness means social mobility.

Social mobility is only half imagined — as movement from the lower rungs to the upper rungs, not vice versa. Society, as currently constructed, offers a relatively fixed (likely declining) number of top-shelf jobs, and it thus follows that for every n people transitioning to the upper echelons, a similar number ought to transition to the lower rungs. Now, a politician wouldn’t sell his idea that way — that he wants a certain number of rich people to make way for the poor and, in turn, take their place — but then we all expect such diplomacy from politicians.

Fairness as a level playing field, or as a fair lottery, is widely accepted as an ideal. But wide acceptance is no insurance against fundamental problems. To help illustrate the problems, here’s an example.

Imagine a ‘fair’ marriage in which, at the beginning, husband and wife flip a coin: heads, the wife does all the chores for the entire tenure of the marriage; tails, she never has to do chores. Of course, marriages based on this fair coin toss don’t seem fair to us — we would ideally want all couples to share the unpleasant chores equally, or by some such equitable arrangement arrived at by mutual agreement.

Carrying the analogy over to society: we would want everyone to take part in unpleasant chores, and everyone to take part in more pleasant activities, equally; we don’t simply want everyone to have an equal shot. Of course, such a lack of specialization makes for a very inefficient system. So perhaps one can prorate wages to the unpleasantness of the work, with people stuck doing unpleasant work being paid at higher rates, given greater leisure time, etc., exactly the opposite of the system we have in place now.

To summarize: current society is unfair not only because not everyone has a similar chance of success, but also because there are only a few good opportunities — mandating a large set of losers and a small set of winners.

Discussion on education and economic equality can be accessed here.